At approximately 14:40 BST, Thursday 24th October 2019 (01:40 AEDT, Friday 25th October 2019) the APAC Engagement Cloud API service went offline. All subsequent API requests failed until service was restored at 15:10 BST (02:10 AEDT, Friday 25th October 2019). All other global and regional services were unaffected.
As part of our continuous platform reviews and seasonal preparations, additional APAC API resources were being deployed ahead of forecasted usage.
Whilst the update process successfully increased the available API resources, it also assigned a new IP address to the API load balancer. During the IP swap, the load balancer terminated all internet traffic to the API service.
The incorrect IP assignment was possible because our deployment process has two modes: it can either deploy additional resources to an existing application or, in a disaster recovery scenario, launch a new installation, which assigns a new IP address. Human error led to the wrong mode being applied.
The fault was identified before the update completed, so we were able to revert the load balancer to its original configuration immediately.
We will remove the requirement for our engineers to input parameters to this procedure; instead, the existing infrastructure will dictate the safest mode of operation.
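As an illustration only, the idea of letting existing infrastructure choose the deployment mode could look something like the minimal Python sketch below. It is not our actual tooling: the names `DeployMode`, `LoadBalancerState` and `select_mode`, and the rule that a load balancer with an assigned IP and at least one healthy backend counts as "existing", are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DeployMode(Enum):
    # The two modes described in this report (names are hypothetical).
    SCALE_EXISTING = "scale_existing"      # add resources, keep the current IP
    NEW_INSTALLATION = "new_installation"  # disaster recovery, assigns a new IP


@dataclass(frozen=True)
class LoadBalancerState:
    """Hypothetical snapshot of what is currently deployed."""
    ip_address: Optional[str]  # None when no load balancer exists yet
    healthy_backends: int      # backends currently serving traffic


def select_mode(state: LoadBalancerState) -> DeployMode:
    """Choose the mode from observed infrastructure, not from engineer input.

    Assumed rule: if a load balancer already has an IP and is serving
    traffic, never reassign the IP; only scale the existing application.
    """
    if state.ip_address is not None and state.healthy_backends > 0:
        return DeployMode.SCALE_EXISTING
    return DeployMode.NEW_INSTALLATION


if __name__ == "__main__":
    live = LoadBalancerState(ip_address="203.0.113.10", healthy_backends=4)
    empty = LoadBalancerState(ip_address=None, healthy_backends=0)
    print(select_mode(live).value)
    print(select_mode(empty).value)
```

With a check of this kind, a live load balancer would always resolve to the scaling mode, so the IP reassignment that caused this incident could not be selected by mistake.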