At approximately 17:26 BST on Monday 30th October 2019, we began to see delayed messages within the CPaaS platform. During this period, new messages submitted via api.comapi.com and portal.comapi.com were accepted but not delivered. This affected all message types: SMS, Facebook Messenger, Twitter and Native Push. We restored full service for new traffic at 17:56 BST, and the backlog was cleared by 18:45 BST.
The backlog was caused by an increase in load of four times the usual traffic to our primary message queuing system over a 10-minute period. The increased load consumed more memory and triggered a mechanism that throttles the volume of message queue operations. This feature of our message queuing software is designed to protect the system from overload.
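To illustrate the kind of protection described above, here is a minimal sketch of a memory-based throttle. It is not our message queuing software: the class `ThrottledQueue`, its parameters, and the example watermark fraction are all hypothetical, chosen only to show how a queue can keep running yet refuse new work once memory use crosses a threshold.

```python
from collections import deque
from typing import Optional


class ThrottledQueue:
    """Toy in-memory queue that refuses new publishes above a memory watermark.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, memory_limit_bytes: int, high_watermark: float = 0.4):
        # Publishing is throttled once used memory reaches this many bytes.
        self.threshold = memory_limit_bytes * high_watermark
        self.used_bytes = 0
        self.messages = deque()

    @property
    def throttled(self) -> bool:
        return self.used_bytes >= self.threshold

    def publish(self, payload: bytes) -> bool:
        """Enqueue a message; return False if the throttle is active."""
        if self.throttled:
            return False
        self.messages.append(payload)
        self.used_bytes += len(payload)
        return True

    def consume(self) -> Optional[bytes]:
        """Dequeue the oldest message, freeing its memory."""
        if not self.messages:
            return None
        payload = self.messages.popleft()
        self.used_bytes -= len(payload)
        return payload


queue = ThrottledQueue(memory_limit_bytes=1000, high_watermark=0.4)
results = [queue.publish(b"x" * 100) for _ in range(5)]
```

In this sketch the queue accepts messages until used memory reaches the watermark, then rejects further publishes until consumers drain enough of the backlog. A sudden burst of traffic fills memory faster than consumers can empty it, which is how a throttle of this kind turns a load spike into delayed delivery rather than a crash.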
We redirected traffic to our secondary site at 17:56 BST. This allowed any new requests to be delivered via separate infrastructure and gave our primary site time to catch up with the backlog.
At 18:45 BST the backlog of messages in our primary site had cleared. Following successful testing, we routed traffic back to our primary site.
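The failover and return sequence above can be sketched as a small routing rule. This is an assumption-laden illustration, not our actual routing layer: `Site`, `FailoverRouter`, and the `tests_passed` flag are hypothetical names standing in for the manual steps we took.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    backlog: int = 0  # number of undelivered messages held at this site


class FailoverRouter:
    """Hypothetical router that sends new traffic to one active site."""

    def __init__(self, primary: Site, secondary: Site):
        self.primary = primary
        self.secondary = secondary
        self.active = primary

    def fail_over(self) -> None:
        # Send new traffic to the secondary so the primary can drain.
        self.active = self.secondary

    def fail_back(self, tests_passed: bool) -> bool:
        # Return to the primary only once its backlog is empty
        # and testing has succeeded.
        if self.primary.backlog == 0 and tests_passed:
            self.active = self.primary
            return True
        return False

    def route(self) -> str:
        return self.active.name


router = FailoverRouter(Site("primary", backlog=5), Site("secondary"))
router.fail_over()
```

The point of the two conditions in `fail_back` is that traffic returns to the primary only after both the backlog has cleared and testing has passed, mirroring the order of events described above.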
We’re going to: