[Global][Public Cloud] - Databases API instabilities
Resolved
Start time : 02/09/2024 10:30 UTC
End time : 26/09/2024 15:30 UTC
Root cause : A change in the underlying metrics infrastructure caused a latency increase.
We would like to inform you that the incident has been resolved and the situation is back to normal.
We thank you for your understanding and patience throughout this incident.
Posted Oct 01, 2024 - 09:19 UTC
Monitoring
A change in the underlying metrics infrastructure on August 21 increased latency for metrics calls under certain conditions. This caused the September 2 event, during which all such calls failed for a period of hours. Other parts of the system treated calls with excessive latency as errors, resulting in an error rate of about 3% on metrics endpoints since then.
Increasing the latency tolerance in some parts of the system has allowed us to mitigate the impact and reach an error rate of almost 0%. We continue to monitor and work on improvements to return to nominal behavior. Further details will be provided on Monday 30/09/2024.
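As an illustration only, the relationship between latency tolerance and error rate described above can be sketched as follows. The function name, the latency samples, and both tolerance values are hypothetical and are not taken from the affected system. They are chosen so that the strict setting reproduces an error rate of about 3%.

```python
from typing import Iterable


def error_rate(latencies_ms: Iterable[float], tolerance_ms: float) -> float:
    """Return the fraction of calls treated as errors because they exceed the tolerance."""
    samples = list(latencies_ms)
    if not samples:
        return 0.0
    failed = sum(1 for latency in samples if latency > tolerance_ms)
    return failed / len(samples)


# Hypothetical sample: 97 fast calls and 3 slow calls (values in milliseconds).
latencies = [120.0] * 97 + [6500.0] * 3

# Assumed strict tolerance: the 3 slow calls are counted as errors.
strict = error_rate(latencies, tolerance_ms=5000.0)

# Assumed relaxed tolerance: the same slow calls are now accepted.
relaxed = error_rate(latencies, tolerance_ms=10000.0)

print(f"strict tolerance:  {strict:.0%} errors")
print(f"relaxed tolerance: {relaxed:.0%} errors")
```

In this sketch the calls themselves are unchanged. Only the threshold at which a slow call is classified as an error differs, which is why raising the tolerance mitigates the impact without restoring nominal latency.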
Posted Sep 27, 2024 - 13:48 UTC
Update
The incident is still ongoing. We would like to assure you that we are doing our utmost to resolve this situation as quickly as possible.
As soon as the situation evolves or the incident is resolved, we will keep you informed.
Posted Sep 16, 2024 - 09:25 UTC
Update
The incident is still ongoing. We would like to assure you that we are doing our utmost to resolve this situation as quickly as possible.
As soon as the situation evolves or the incident is resolved, we will keep you informed.
Posted Sep 06, 2024 - 10:08 UTC
Identified
Ongoing actions : Errors have been observed again since 05/09/2024 08:00 UTC.
Our providers are currently working on a fix for this issue.
Posted Sep 05, 2024 - 15:29 UTC
Update
Ongoing actions : Errors were observed again on 04/09/2024 between 01:00 UTC and 02:00 UTC, and between 07:00 UTC and 08:00 UTC.
Our teams are currently monitoring the situation.
Posted Sep 04, 2024 - 08:46 UTC
Monitoring
Ongoing actions : No further impact has been observed since 02/09/2024 09:30 UTC.
Our teams are currently monitoring the situation.
Posted Sep 03, 2024 - 13:05 UTC
Investigating
Service impact : Errors are being observed again on API calls to metrics endpoints.
Customers may encounter error 500 and error 200.
Database service functionality is not impacted and should continue to work as expected.
Ongoing actions : Our technical teams are currently working with our partner to solve the issue.
Updates will be posted as significant progress is made.
Posted Sep 03, 2024 - 07:57 UTC
Monitoring
Ongoing actions : No further impact has been observed since 02/09/2024 20:00 UTC.
Our teams are currently monitoring the situation.
Posted Sep 03, 2024 - 07:17 UTC
Update
The incident is still ongoing. We would like to assure you that our providers are doing their utmost to resolve this situation as quickly as possible.
As soon as the situation evolves or the incident is resolved, we will keep you informed.
Thank you for your understanding.
Posted Sep 02, 2024 - 17:32 UTC
Update
Our providers are continuing to work on a fix for this issue.
Posted Sep 02, 2024 - 12:54 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Sep 02, 2024 - 12:33 UTC
Investigating
Start time : 02/09/2024 10:30 UTC
Service impact : API calls to metrics endpoints are returning errors. Database service functionality is not impacted and should continue to work as expected.
Ongoing actions : Investigating
Our providers are working on the issue.
Updates will be posted as significant progress is made.
Posted Sep 02, 2024 - 12:33 UTC
This incident affected: Databases || Kafka (BHS, DE, GRA, SBG, UK, WAW), Databases || Cassandra (BHS, DE, SBG, UK, WAW), Databases || Grafana (BHS, DE, GRA, SBG, UK, WAW), Databases || Kafka Connect (BHS, DE, GRA, SBG, UK, WAW), Databases || Kafka MirrorMaker (BHS, DE, GRA, SBG, UK, WAW), Databases || MySQL (BHS, DE, GRA, SBG, UK, WAW), Databases || OpenSearch (BHS, DE, GRA, SBG, UK, WAW), Databases || PostgreSQL (BHS, DE, GRA, SBG, UK, WAW), Databases || Caching (BHS, DE, GRA, SBG, UK, WAW), Databases || M3 Aggregator (BHS, DE, GRA, SBG, UK, WAW), and Databases || M3DB (BHS, DE, GRA, SBG, UK, WAW).