All Systems Operational
API Operational
99.99% uptime over the past 90 days
Gateway Operational
99.99% uptime over the past 90 days
CloudFlare Operational
Media Proxy Operational
100.0% uptime over the past 90 days
Voice Operational
EU West Operational
EU Central Operational
Singapore Operational
Sydney Operational
US Central Operational
US East Operational
US South Operational
US West Operational
Brazil Operational
Hong Kong Operational
Russia Operational
Japan Operational
South Africa Operational
Past Incidents
Oct 16, 2019

No incidents reported today.

Oct 15, 2019

No incidents reported.

Oct 14, 2019

No incidents reported.

Oct 13, 2019

No incidents reported.

Oct 12, 2019

No incidents reported.

Oct 11, 2019

No incidents reported.

Oct 10, 2019

No incidents reported.

Oct 9, 2019

No incidents reported.

Oct 8, 2019

No incidents reported.

Oct 7, 2019

No incidents reported.

Oct 6, 2019

No incidents reported.

Oct 5, 2019
Resolved - Turns out we didn't need to disable typing events (this time); we were able to get the cluster stable just by reducing the number of API servers by about 10%.

Presence is back up, running, and happy, and DMs are flowing like normal.

We've got a fix in the works for the underlying root cause (overstressing our service discovery system, which prevented the presence system from restarting with all nodes). We should have this deployed next week, but in the meantime, things should be stable.

Thank you very much for your patience as we worked through this, and we are super sorry for the inconvenience!
Oct 5, 14:57 PDT
Monitoring - We discovered an underlying issue: our etcd cluster was overloaded, which was having knock-on effects on restarting our presence cluster.

To reduce load on etcd, we've had to scale down our API cluster, and to ensure we can still provide service, we've had to temporarily disable typing events globally. We are working on a fix for the etcd load issue, but the good news is that DMs look to be recovering.
Oct 5, 14:45 PDT
Identified - Unfortunately we continue to have issues with the service. We've brought in more engineers (the Jake) and are working on it.
Oct 5, 14:28 PDT
Monitoring - The cluster has been restarted successfully and is recovering. It usually takes 10 to 15 minutes for all online users to re-establish connections to the presence cluster, at which point service will be restored.
Oct 5, 14:17 PDT
Identified - We are having issues with our presence cluster (which handles direct messages) and are currently in the process of restarting it.
Oct 5, 14:11 PDT
Investigating - The team is aware of a major impact on sending direct messages right now. We are online and working to fix the issue.
Oct 5, 14:04 PDT
Oct 4, 2019

No incidents reported.

Oct 3, 2019

No incidents reported.

Oct 2, 2019

No incidents reported.