Observability

In modern distributed systems, failures are inevitable. When something goes wrong, the key to a swift recovery is not just monitoring for known failure modes but having the ability to ask questions about your system's behavior, even questions you didn't foresee. This is the essence of observability.

The Contain Platform provides a comprehensive, built-in observability solution that gives you deep insights into the health and performance of both the platform and your applications.

The Three Pillars of Observability

We provide a unified solution that covers the three pillars of observability: metrics, logs, and traces. All three are collected, processed, and presented in a centralized service, giving you a single pane of glass for your telemetry.

Metrics: These are numerical measurements aggregated over time, such as CPU usage, memory consumption, or request rates. Metrics are powerful for understanding the health of your system, identifying trends, and triggering alerts when a threshold is breached.

"What was the average API response time over the last hour?"

Logs: These are timestamped text records of discrete events, either structured or unstructured. Logs are invaluable for debugging because they provide detailed, event-specific context that helps you understand what happened at a specific time.

"What specific error did a user encounter when their payment failed?"

Traces: A trace represents the end-to-end journey of a single request as it moves through the services in your application. Traces are essential for pinpointing bottlenecks and understanding the complex interactions in a microservices architecture.

"Why is the user profile page loading slowly, and which downstream service is responsible for the delay?"
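To make the distinction between the three signal types concrete, here is a minimal sketch that uses only the Python standard library. It is not the platform's API: the in-memory stores and the names `log_event`, `span`, and `handle_request` are hypothetical and exist only for illustration. The sketch shows one request producing a log record, two spans, and duration samples, all tied together by a shared trace ID.

```python
import json
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager

# Illustrative in-memory stores (assumption); a real setup would export
# this data to a telemetry backend instead of keeping it in the process.
metrics = defaultdict(list)   # metric name -> list of numeric samples
logs = []                     # structured log records (dicts)
spans = []                    # finished spans, linked by trace_id


def log_event(trace_id, level, message, **fields):
    """Log: a timestamped record of one discrete event."""
    record = {"ts": time.time(), "level": level, "trace_id": trace_id,
              "message": message, **fields}
    logs.append(record)
    return json.dumps(record)


@contextmanager
def span(trace_id, name):
    """Trace: time one step of a request and tag it with the trace ID."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        spans.append({"trace_id": trace_id, "name": name,
                      "duration_s": duration})
        # Metric: a numeric sample that can later be aggregated over time.
        metrics[f"{name}.duration_s"].append(duration)


def handle_request(user_id):
    trace_id = uuid.uuid4().hex  # one ID shared by the whole request
    with span(trace_id, "handle_request"):
        with span(trace_id, "load_profile"):
            time.sleep(0.01)  # stand-in for a downstream service call
        log_event(trace_id, "info", "profile loaded", user_id=user_id)
    return trace_id


trace_id = handle_request("user-123")
samples = metrics["handle_request.duration_s"]
print("average duration (s):", sum(samples) / len(samples))
print("spans in trace:",
      [s["name"] for s in spans if s["trace_id"] == trace_id])
```

Because every record carries the same `trace_id`, a slow request seen in the metrics can be followed to its spans and then to the log lines written while it was being handled.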

Key Benefits of Our Observability Solution

By providing a managed, "batteries-included" observability stack, we help you:

Troubleshoot Faster

Quickly move from detecting a problem to understanding its context and root cause without switching between different tools.

Improve System Reliability

Set up proactive alerts on key performance indicators to detect and address issues before they impact your users.

Optimize Application Performance

Identify and resolve performance bottlenecks in your distributed services by analyzing trace data.

Enhance Security & Compliance

Use detailed logs as an audit trail to track significant events, investigate security incidents, and meet compliance requirements.
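The proactive alerting mentioned under Improve System Reliability comes down to evaluating a rule against recent metric samples. The sketch below is a simplified illustration of that idea and not how the platform configures alerts: the `ThresholdAlert` class, the metric name, and the threshold values are all hypothetical.

```python
from collections import deque
from statistics import mean


class ThresholdAlert:
    """Fires when the mean of the last `window` samples exceeds `threshold`."""

    def __init__(self, name, threshold, window=5):
        self.name = name
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # keeps only the newest samples

    def observe(self, value):
        """Record one sample; return an alert message, or None if healthy."""
        self.samples.append(value)
        # Wait for a full window so one early outlier does not fire the alert.
        if len(self.samples) < self.samples.maxlen:
            return None
        current = mean(self.samples)
        if current > self.threshold:
            return f"ALERT {self.name}: mean {current:.1f} > {self.threshold}"
        return None


# Hypothetical latency samples in milliseconds.
alert = ThresholdAlert("api_latency_ms", threshold=200, window=3)
for latency_ms in [120, 150, 180, 260, 310]:
    message = alert.observe(latency_ms)
    if message:
        print(message)  # prints "ALERT api_latency_ms: mean 250.0 > 200"
```

Averaging over a window instead of reacting to single samples is a common way to reduce noisy alerts; the trade-off is that the alert fires slightly later.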

Getting Started

Our platform is designed to make observability seamless. Platform-level telemetry is collected automatically. To add data from your own applications, you typically only need to instrument your code and add a few annotations to your deployment manifests.

To learn more about the specific tools we use and how to integrate your applications, please see the documentation for our Observability Service.