Impacted availability - sending messages, logging in, all functions

Incident Report for Discord

Resolved

This incident is resolved.

The root cause was on us. We were performing maintenance on our API cluster and unintentionally reduced the API cluster capacity too far, leading to latency. Unfortunately, this also caused our Envoy front proxy to become unhappy and get into an unrecoverable state.

We resolved the incident by doing a rolling restart of the Envoy proxy after we fixed the API cluster capacity to handle user traffic.
Posted Apr 03, 2020 - 01:36 PDT

Monitoring

Latency and message send rate should be back to normal. We are still monitoring and will resolve this incident once we're confident everything is 100%.
Posted Apr 03, 2020 - 00:51 PDT

Update

We are still working to resolve the latency that this situation has caused. Engineers are online and working the problem and we hope to have service restored in a few minutes.
Posted Apr 03, 2020 - 00:43 PDT

Identified

This was related to a maintenance we were performing, we unintentionally overloaded our API cluster. We are working on recovering now.
Posted Apr 03, 2020 - 00:31 PDT

Investigating

We are investigating an issue with Discord.
Posted Apr 03, 2020 - 00:28 PDT