All systems operational
Gateway: Operational
API: Operational (the guts of the application)
Media Proxy: Operational (the service responsible for serving images, audio, and video; it relies on our CDN)

Incident history


✓ Resolved after 27m of downtime

US East Connection Issues

May 25, 2018 at 4:13 AM

Resolved - We believe all users experiencing issues have been able to connect at this time. (May 25, 2018 - 05:54)

Monitoring - We believe the connectivity issues are being caused by an isolated ISP issue. We’ve had reports that switching to Google's public DNS servers (see https://developers.google.com/speed/public-dns/docs/using) resolves the problem for users. (May 25, 2018 - 04:40)

Investigating - We’re aware of reports that users are experiencing connection issues on the East Coast of the United States. We’re currently investigating these issues, and apologize for any inconvenience they may be causing you. (May 25, 2018 - 04:13)

✓ Resolved after 1h 36m of downtime

Unavailable Guilds & Connection Issues

April 13, 2018 at 3:54 PM

Post-mortem

At approximately 14:01, a Redis instance acting as the primary for a highly available cluster used by our API services was automatically migrated by Google Cloud Platform. The migration caused the node to drop offline incorrectly, which forced the cluster to rebalance and triggered known issues with the way our API instances handle Redis failover. After this partial outage was resolved, unnoticed issues on other services caused a cascading failure through Example Chat App’s real-time system. The impact was severe enough that Example Chat App’s engineering team was forced to fully restart the service, reconnecting millions of clients over a period of 20 minutes.
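For readers wondering what handling a failover can look like on the client side, here is a minimal sketch. It is not Example Chat App's actual code, and the post-mortem does not say what the known issues were. It only illustrates one common pattern: look up the current primary again after a connection error, and back off with random jitter before retrying. Every name in it (`call_with_failover`, `resolve_primary`, `operation`) is hypothetical, and no real Redis client is involved.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_failover(
    resolve_primary: Callable[[], str],
    operation: Callable[[str], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation` against the current primary.

    After a ConnectionError the primary is resolved again, because a
    failover may have promoted a different node in the meantime.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        # Hypothetical lookup: ask for the primary on every attempt
        # instead of caching an address that may be stale.
        address = resolve_primary()
        try:
            return operation(address)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time below an exponentially
            # growing cap, so many clients do not all retry at once.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))

    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests. The jitter matters most when many clients retry at the same moment, which is the situation a full service restart creates.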


Update - A fix has been implemented and we are monitoring the results. Looks like this has been fixed. (Apr 13, 2018 - 17:30)

Monitoring - After hitting the ole reboot button, Example Chat App is now recovering. We’re going to continue to monitor as everyone reconnects. (Apr 13, 2018 - 16:50)

Investigating - We’re aware of users experiencing unavailable guilds and issues when attempting to connect. We’re currently investigating. (Apr 13, 2018 - 15:54)