Resolver Portal Status and Alerting Delay Issues

Minor incident EU-01 Admin Portal
2024-11-21 09:40 CET · 1 week, 1 day, 52 minutes

Updates

Resolved

We are pleased to inform you that the recent intermittent delays and issues with resolver data and alerts on the portal for customers connected to our EU-01 cluster have been resolved.

Actions Taken
Our engineering team worked diligently to restore normal operations:

  • Resource Expansion: Added nodes to increase cluster capacity.
  • Configuration Improvements: Adjusted system resources and scheduling settings.
  • Data Cleanup: Removed unnecessary data to reduce system load.
  • System Optimization: Made adjustments to improve performance and prevent resource bottlenecks.

Current Status

  • System Performance Restored: Normal operations have resumed, and system performance has significantly improved.
  • Monitoring: We are closely monitoring cluster performance to ensure continued stability.

We apologize for any inconvenience this incident may have caused and appreciate your patience and understanding. Our commitment to providing you with reliable and efficient service remains our top priority.

November 29, 2024 · 09:54 CET
Retroactive

Summary
We are currently experiencing intermittent delays and issues with resolver data and alerts on the portal for customers connected to our EU-01 cluster. Some customers may notice:

  • Resolvers displayed as “Connected” instead of “Active.”
  • Delays in receiving alerts and notifications.
  • Resolver metrics showing “Waiting for data” messages.

Impact

  • Portal Display Issues: Resolver statuses and metrics may not reflect real-time data.
  • Delayed Alerts: Notifications for certain events may be delayed.

Please note: DNS resolution and domain filtering services are operating normally and are not affected by this issue.

Cause
Unexpectedly high load on our data processing systems has delayed the processing of resolver data and alerts.

Actions Taken

  • Resource Expansion: Added nodes to increase cluster capacity.
  • Configuration Improvements: Adjusted system resources and scheduling settings.

Current Status
System performance has improved, and normal operations are resuming for many customers. However, some intermittent issues may still occur until the system is fully stabilized.

Next Steps

  • Data Cleanup: We will remove unnecessary data to reduce load.
  • Ongoing Monitoring: We will keep a close watch on cluster performance and make adjustments as needed.
  • Optimization: We will continue performance tuning and re-enable background processes once stability is confirmed.
November 27, 2024 · 13:27 CET
