Access to Engagement Cloud - All Regions
Incident Report for dotdigital
Postmortem

RCA: Access to Engagement Cloud - All Regions

Summary of impact:

At approximately 14:45 BST on Tuesday 2nd July 2019, all Engagement Cloud websites in all regions went offline. The impact was widespread; most critically, the following was observed:

  • Users could not log in to their accounts
  • All API calls failed
  • Link and open tracking data was not collected
  • Images within sent campaigns were missing
  • Surveys and Forms, and Landing Pages, did not load

Our CPaaS One API remained online throughout this incident.

Service was restored at 15:12 BST.

Root cause:

Our services went offline because of an outage at Cloudflare. Cloudflare are a popular security and CDN provider who sit in front of our websites. They also noticed the disruption and made an announcement on their status page.

Cloudflare have since published details of the event, identifying a misconfigured Web Application Firewall rule as the cause.

Mitigation:

Service was restored after Cloudflare temporarily disabled their Web Application Firewall and removed the offending rule.

Next steps:

Cloudflare has become a key Internet service that is relied upon by millions of organisations, including many large and well-known companies. Cloudflare typically have a good uptime record; however, we are now engaging with them to understand how their processes will be improved to prevent future incidents.

Posted Jul 03, 2019 - 11:25 BST

Resolved
This incident has now been resolved. This was caused by network performance issues with Cloudflare.
We apologise for the inconvenience.
Posted Jul 02, 2019 - 15:35 BST
Monitoring
Cloudflare have implemented a fix and we will continue to monitor the situation.
Posted Jul 02, 2019 - 15:23 BST
Identified
The problem has been traced to an issue with Cloudflare. We are in communication with them to establish when it will be fixed.
Posted Jul 02, 2019 - 14:55 BST
Investigating
We are currently experiencing issues with access to Engagement Cloud including our API. This is currently being investigated and an update will be provided in due course.
Posted Jul 02, 2019 - 14:52 BST
This incident affected: Europe - Engagement Cloud r1 (Europe - Web Application, Europe - API, Europe - Open and Link Tracking, Europe - Surveys and Forms, Europe - Landing Pages), North America - Engagement Cloud r2 (North America - Web Application, North America - API, North America - Open and Link Tracking, North America - Surveys and Forms, North America - Landing Pages), Asia Pacific - Engagement Cloud r3 (Asia Pacific - Web Application, Asia Pacific - API, Asia Pacific - Open and Link Tracking, Asia Pacific - Surveys and Forms, Asia Pacific - Landing Pages), and Global - Website, Global - Login Page.