At approximately 12:10 UTC on 13th October 2020, we experienced delays and failures with a number of services, including campaign sends in our European region (Region 1). We restored normal services at 12:45 UTC on 13th October 2020, but a number of services experienced delays up until 15:55 UTC on 13th October 2020. Some campaign sends didn’t complete successfully. We have contacted the affected customers directly to advise they’ll need to re-issue their sends.
This issue was caused by a misconfiguration in the messaging system, which occurred during a change to normalize our message queue addresses. Messages sent during this period were either stored incorrectly or failed to deliver, which caused the services that depend on them to fail.
The timeline of the incident and its resolution was (all times in UTC):
12:10: We changed our messaging system configuration. Sent messages were stored, but were unavailable to receiving services.
12:20: We changed the messaging system configuration again. Message sending stopped working.
12:33: We identified that parts of our infrastructure were unavailable
12:35: We identified the issue was related to the messaging system
12:45: We reverted our messaging system configuration to the original value
15:55: The incorrectly stored messages were re-delivered, but some customers’ sends needed to be restarted or re-issued. Our Support Team proactively reached out to affected customers.
In our next major platform release, we’ll include a change to prevent incorrect configuration from being applied.
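As an illustration only, a check of this kind could reject a malformed queue address before it is applied. The sketch below is a minimal example under stated assumptions: the URL-style address format, the allowed schemes, the service name and the function names are all hypothetical and do not describe our actual messaging system or the planned change.

```python
from urllib.parse import urlsplit

# Hypothetical rule set: the real address format is not described in this report.
ALLOWED_SCHEMES = {"amqp", "amqps"}


def validate_queue_addresses(addresses: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the config may be applied."""
    problems = []
    for service, address in addresses.items():
        parts = urlsplit(address)
        if parts.scheme not in ALLOWED_SCHEMES:
            problems.append(f"{service}: unsupported scheme {parts.scheme!r}")
        if not parts.hostname:
            problems.append(f"{service}: missing host")
        if not parts.path.strip("/"):
            problems.append(f"{service}: missing queue name")
    return problems


def apply_if_valid(current: dict[str, str], proposed: dict[str, str]) -> dict[str, str]:
    """Return the proposed config only if it validates; otherwise keep the current one."""
    return current if validate_queue_addresses(proposed) else proposed
```

With a guard like this, a proposed configuration that fails validation would leave the current configuration in place rather than being rolled out.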