Proxy layer restarts causing increased errors and slower log ingestion
Incident Report for ESS (Public)
Resolved
The incident is resolved.
Posted Jun 27, 2019 - 17:34 UTC
Monitoring
The ingest delays in eu-west-1 have returned to normal. We will be monitoring this incident for the next 30 minutes.
Posted Jun 27, 2019 - 16:25 UTC
Update
The logging delay in AWS ap-southeast-1 is now resolved. We are still working on eu-west-1.
Posted Jun 27, 2019 - 15:25 UTC
Update
We have rolled back the proxy release across all regions, and 5xx rates have dropped to normal in AWS eu-west-1 and AWS ap-southeast-1. We are currently working on improving the log ingestion rates in those two regions. Logs are currently delayed by approximately one hour.
Posted Jun 27, 2019 - 14:58 UTC
Identified
We have successfully rolled back the proxy release in the ap-southeast-1 region, which has since seen a significant drop in 5xx errors.

We are still working on improving log ingestion rates and rolling back the proxy release in the following regions: ap-northeast-1, ap-southeast-2, sa-east-1, eu-west-1, eu-central-1, GCP us-west1 and GCP europe-west3.
Posted Jun 27, 2019 - 13:58 UTC
Investigating
We have noticed an increased rate of proxy restarts following an upgrade to our proxy layer. We are currently assessing its impact on our platform. Preliminary findings indicate slowdowns in log ingestion. The most affected regions are eu-west-1 and ap-southeast-1.

We have decided to perform a rollback and are currently working on it. We will provide more information in the next hour.
Posted Jun 27, 2019 - 13:31 UTC