Contact imports, email open tracking data and images not loading
Incident Report for Dotdigital
Postmortem

Summary of impact:

Between 06:30 and 07:50 UTC on 13th April 2022, as we deployed the latest version of Dotdigital, errors began to occur in the event processing and contact import services. Some customers may have experienced issues with contact imports and open reporting for email campaign sends.

Between 08:40 and 09:15 UTC, we experienced errors in the image resizing service, meaning customers may have seen missing images in our EasyEditor. However, images in sent campaigns continued to load normally.

Root Cause:

Contact imports

This release included a bug fix. Once deployed, the fix caused errors in any import that contained a suppressed contact with empty data in mapped fields, and those imports failed.
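For illustration only, the failure mode above (one problematic row causing the whole import to fail) can be avoided by handling each row defensively. The following is a minimal sketch, not the Dotdigital importer: the function name `import_rows`, the `email` key, and the row layout are all hypothetical.

```python
def import_rows(rows, suppressed_emails):
    """Import rows one at a time so a single awkward row cannot fail the file.

    Hypothetical sketch: `rows` is a list of dicts with an "email" key plus
    mapped fields; `suppressed_emails` is a set of lower-case addresses.
    """
    imported, skipped = [], []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        # Treat blank mapped values as None instead of raising on them.
        fields = {
            key: (value.strip() or None) if isinstance(value, str) else value
            for key, value in row.items()
            if key != "email"
        }
        if not email or email in suppressed_emails:
            # Suppressed (or unaddressable) contacts are skipped, not fatal.
            skipped.append(email)
            continue
        imported.append({"email": email, **fields})
    return imported, skipped
```

In this sketch a suppressed contact with empty mapped fields ends up in `skipped`, and the remaining rows are still imported.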

Opens

As part of routine improvements to our services, we made a change to how we process open event messages. As a result, some messages failed to process correctly and the open data they carried was lost.
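For illustration only, one common way to avoid losing data when a message fails to process is to set the failed message aside for later replay rather than discarding it. The sketch below assumes JSON-encoded messages and a caller-supplied `handle` function; both are assumptions, and none of this is taken from the actual event parser service.

```python
import json


def process_open_events(raw_messages, handle):
    """Apply handle() to each decoded message; keep failures for replay.

    Hypothetical sketch: messages are assumed to be JSON strings, and
    `handle` is any callable that may raise KeyError or ValueError.
    """
    processed = 0
    dead_letters = []
    for raw in raw_messages:
        try:
            handle(json.loads(raw))  # JSONDecodeError is a ValueError
            processed += 1
        except (ValueError, KeyError) as exc:
            # Keep the original payload so it can be replayed after a fix.
            dead_letters.append({"raw": raw, "error": repr(exc)})
    return processed, dead_letters
```

With this shape, a faulty release increases the size of `dead_letters` instead of dropping open data, and the stored payloads can be reprocessed once the service is reverted or fixed.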

Images

As part of our ongoing application modernization work, we converted the image resizer to run on the latest framework. We are still identifying the root cause of this issue.

Mitigation:

Our mitigation steps for these issues were (times stated in UTC): 

Contact imports

06:31: We began our release

06:35: We saw errors in the contact importer

06:45: We rolled back the contact importer and contacts could be imported successfully.

Opens

06:31: We began our release

06:35: We saw errors in the event parser service and investigations began.

07:05: Our investigations revealed opens were being lost and we reverted the service to the previous version. At this point, email open data was being captured as normal.

Images

08:04: We noticed a small number of errors on the image resizer application. This initially appeared as a brief database connection issue.

09:04: We began receiving customer reports of missing images in our EasyEditor. Our teams began an investigation.

09:18: We reverted the image resizer to the previous version and images loaded as expected in our EasyEditor.

Next Steps:

The intended changes for the contact importer and opens have been reworked and were released successfully later that day.

We are still investigating the images issue. Once we fully understand it, we will investigate running the old and new sites side by side. We will then be able to migrate traffic across gradually, detecting any issues before customers are affected.
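For illustration only, running two versions side by side and migrating traffic across is often done with a weighted split that grows while the new version stays healthy. The sketch below is a generic example under stated assumptions: the function names, the 1% error threshold, and the 10% step are hypothetical and do not describe the planned implementation.

```python
import random


def choose_backend(new_weight, rng=random.random):
    """Route a request to "new" with probability new_weight, else "old"."""
    return "new" if rng() < new_weight else "old"


def next_weight(current_weight, error_rate, max_error_rate=0.01, step=0.1):
    """Grow the new version's share while healthy; otherwise pull it back.

    Hypothetical policy: any error rate above max_error_rate sends all
    traffic back to the old version.
    """
    if error_rate > max_error_rate:
        return 0.0
    return min(1.0, current_weight + step)
```

Starting from a small `new_weight`, calling `next_weight` after each observation window either widens the rollout or returns all traffic to the old version, so a fault is seen by only a fraction of requests.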

We have learned lessons from this incident and will adjust our future testing and deployment plans to prevent a recurrence.

Posted Apr 19, 2022 - 11:40 BST

Resolved
We’re sorry, but we had a small issue during today’s release. Between 06:30 and 09:11 UTC on 13th April 2022, some customers may have noticed some intermittent issues:
- Contact imports failing
- Disruption to email open tracking
- Images failing to load in campaigns.
During our continuous deployment process, we introduced some changes to our contact importer and open tracking services. After releasing the changes we noticed increased error rates, so we immediately rolled back the changes to these services. All other updates we released today were successful.
We’re really sorry this happened today and for any disruption it may have caused. We’ll post a report explaining the issue and its resolution in more detail in a couple of days.
Posted Apr 13, 2022 - 07:30 BST