On Wednesday 29th September 2021 at 12:20 UTC, we stopped processing web behavior session data in our European region (region 1). We continued to accept new session data and none was lost. We resumed normal processing at 15:12 UTC.
Customers using web behavior data may have experienced the following problems:
Our web behavior tracking service relies on “indexes”, which many database systems use to improve performance when processing large volumes of data. Periodically, the web behavior tracking service checks these indexes to see whether they need rebuilding. In this case, the index upgrades took far longer than expected. While indexes are rebuilding, the service doesn’t process new information, so a backlog of session data developed.
During this incident, we prepared a software fix to temporarily prevent indexes from being rebuilt. However, before we rolled it out, the indexes finished rebuilding and the service resumed processing sessions, so we didn’t complete the rollout.
We apologize for any inconvenience caused during this incident. As a follow-up action, we will change our web behavior tracking service so that it can rebuild indexes and process new data at the same time, rather than pausing the processing of new data while it waits for indexes to finish rebuilding.
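
To illustrate the kind of change described above, here is a minimal Python sketch of a processor that keeps working through its backlog while an index rebuild runs in the background. It is not our production code: the class name SessionProcessor, its methods, the session names and the simulated rebuild delay are all assumptions made up for this example. The idea is to build the new index off to one side and swap it in once it is ready, so that processing never waits on the rebuild.

```python
import queue
import threading
import time


class SessionProcessor:
    """Hypothetical processor that keeps consuming sessions during an index rebuild."""

    def __init__(self):
        self.backlog = queue.Queue()    # sessions accepted but not yet processed
        self.index = {}                 # live index: session id -> position
        self.processed = []             # sessions processed so far, in order
        self._lock = threading.Lock()   # guards index and processed

    def rebuild_index(self, build_seconds=0.05):
        """Build a fresh index from a snapshot, then swap it in under the lock."""
        with self._lock:
            snapshot = list(self.processed)
        time.sleep(build_seconds)       # stands in for a slow rebuild (assumed)
        new_index = {s: i for i, s in enumerate(snapshot)}
        with self._lock:
            # Catch up on sessions processed while the rebuild was running.
            for s in self.processed[len(snapshot):]:
                new_index[s] = len(new_index)
            self.index = new_index

    def process_pending(self):
        """Drain the backlog; only ever waits for the brief snapshot or swap."""
        while True:
            try:
                session = self.backlog.get_nowait()
            except queue.Empty:
                return
            with self._lock:
                self.index[session] = len(self.processed)
                self.processed.append(session)


processor = SessionProcessor()
for i in range(5):
    processor.backlog.put(f"session-{i}")   # assumes unique session ids

rebuild = threading.Thread(target=processor.rebuild_index)
rebuild.start()
processor.process_pending()   # runs while the rebuild is still in progress
rebuild.join()
```

In this sketch the lock is held only for the short snapshot and swap steps, not for the rebuild itself, which is why the backlog can drain while the slow part of the rebuild is underway.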