Introduction to Application Observability

The Application Observability service is a fully managed service for the collection, storage, and analysis of your application's telemetry. It processes your application's metrics, logs, and traces to provide unified, deep insights into its performance, health, and usage patterns.

The service is built entirely on an open-source foundation, giving your teams the power of industry-standard tools like Grafana, Prometheus, and OpenTelemetry without the operational burden of managing them. We provide you with a dedicated, secure Grafana instance, giving you a single pane of glass to troubleshoot faster, optimize performance, and understand your application's behavior.

Unified Insights for Your Entire Team

The service is designed to support core functions across your technical and business teams, breaking down data silos and providing a common platform for collaboration.

For Operations: Monitor application performance, response times, and resource consumption in real time. Use correlated data and powerful alerting to perform root cause analysis and reduce Mean Time to Recovery (MTTR).
For Development: Debug code performance in any environment, from staging to production. Gain immediate technical insight into the impact of new deployments and identify performance bottlenecks before they affect users.
For Business: Analyze application usage patterns and user behavior to make data-driven decisions. Understand which features are being used and how the application is performing from the end-user's perspective.

A Fully Managed Service

Application Observability is a complete, managed solution. We handle the entire lifecycle of the observability platform, allowing your teams to focus on instrumenting your applications and gaining insights from the data.

Our managed service commitment includes:

High-Availability Platform: We provision and configure the entire LGTM (Loki, Grafana, Tempo, Mimir) stack for high availability and resilience.
Platform Scaling and Performance: We continuously monitor and horizontally scale the platform components to meet your application's ingestion and query loads.
Maintenance and Upgrades: We manage all software patching, security updates, and version upgrades for the platform components, keeping the system secure and up to date.
Data Integration and Correlation: We configure the platform to ensure your data sources (metrics, logs, and traces) are seamlessly correlated within your Grafana instance.

Built on an Open Foundation

We believe in providing powerful tools without creating vendor lock-in. Our Application Observability service is built entirely on open-source software, open standards, and open APIs.

Visualization: Grafana is the central interface for all your data. We provide a dedicated, secure Grafana instance for your organization.
Metrics: Grafana Mimir provides a scalable, multi-tenant backend for Prometheus metrics, enabling long-term retention and powerful queries.
Logs: Grafana Loki provides a cost-effective log aggregation system that makes it easy to search and analyze your application logs.
Traces: Grafana Tempo provides a distributed tracing backend that integrates seamlessly with your logs and metrics.
Collection: Data collection is handled by OpenTelemetry, the industry standard for application instrumentation.

By using these open standards, your data remains portable, and your teams build valuable skills on industry-leading tools.

Key Features

The service is built with security, reliability, and cost-effectiveness as core principles.

Secure Multi-Tenancy

The platform is architected from the ground up for multi-tenancy and complete data segregation.

Dedicated Grafana Instance: You get your own dedicated Grafana instance, secured on a unique domain (e.g., your-org.observability.netic.dk).
Data Isolation: Your telemetry data is logically isolated at every stage (ingestion, storage, and querying), so you can only ever see your own data.
Encryption in Transit: All telemetry data sent from your clusters to our platform is secured using mutual TLS (mTLS), with access control managed by unique client certificates.
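
To illustrate what mutual TLS means from the client side, here is a minimal Python sketch using only the standard library. In the service itself the OpenTelemetry Collectors handle this for you; the function name and the certificate file paths below are hypothetical and serve only to show the two halves of the handshake: the client verifies the server, and the client presents its own certificate.

```python
import ssl
from typing import Optional


def build_mtls_client_context(
    client_cert: str, client_key: str, ca_bundle: Optional[str] = None
) -> ssl.SSLContext:
    """Create a TLS client context that verifies the server and presents a client certificate."""
    # The default client context already requires a valid server
    # certificate and checks the hostname.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_bundle)
    # Loading a client certificate and key is what makes the handshake mutual:
    # the server can now authenticate the client as well.
    context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


if __name__ == "__main__":
    # Hypothetical file names; replace with the certificate material issued to you.
    ctx = build_mtls_client_context("client.crt", "client.key")
    print(ctx.verify_mode)
```

Because access is tied to the client certificate, a client without a valid certificate would be rejected during the TLS handshake, before any telemetry is exchanged.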

Reliable and Automated Data Collection

We deploy a robust, two-tier OpenTelemetry Collector architecture in your clusters to ensure data is collected reliably, even in the event of network interruptions.

Local Collection: An agent on each node handles local data collection and discovery.
Automatic Metrics Scraping: The service automatically discovers and scrapes Prometheus endpoints from correctly annotated pods, simplifying metric collection.
Persistent Buffering: Both the local agents and a central gateway in your cluster use persistent queues. If the connection to our platform is interrupted, your telemetry data is buffered locally and forwarded once the connection is restored, preventing data loss.
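
The store-and-forward idea behind persistent buffering can be sketched in a few lines of Python. This is a conceptual illustration only, not how the OpenTelemetry Collector implements its queues; the class name, the one-file-per-item layout, and the use of `ConnectionError` to signal an outage are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class PersistentBuffer:
    """Tiny store-and-forward queue: each item is written to disk before it is sent."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory
        self.directory.mkdir(parents=True, exist_ok=True)
        # Continue numbering after any items left over from a previous run.
        existing = sorted(self.directory.glob("*.json"))
        self._next_id = int(existing[-1].stem) + 1 if existing else 0

    def enqueue(self, item: dict) -> None:
        # Zero-padded names keep lexical order equal to arrival order.
        path = self.directory / f"{self._next_id:012d}.json"
        path.write_text(json.dumps(item))
        self._next_id += 1

    def pending(self) -> int:
        return len(list(self.directory.glob("*.json")))

    def flush(self, send: Callable[[dict], None]) -> int:
        """Forward items oldest first; stop at the first failure and keep the rest on disk."""
        sent = 0
        for path in sorted(self.directory.glob("*.json")):
            try:
                send(json.loads(path.read_text()))
            except ConnectionError:
                break  # still offline: leave this item and all later ones buffered
            path.unlink()  # delete only after a successful send
            sent += 1
        return sent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        buffer = PersistentBuffer(Path(tmp))
        buffer.enqueue({"metric": "http_requests_total", "value": 1})

        def offline(_item: dict) -> None:
            raise ConnectionError("platform unreachable")

        buffer.flush(offline)            # outage: the item stays on disk
        print(buffer.pending())
        buffer.flush(lambda item: None)  # connection restored: the item is forwarded
        print(buffer.pending())
```

The key design choice is that an item is removed from disk only after it has been sent successfully, so a restart or a network interruption leaves unsent data in place for the next attempt.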

Granular Cost Control

We provide you with the tools to manage your data ingestion and control your costs.

Namespace Opt-in: Data collection is only active for namespaces that you explicitly enable via a Kubernetes label (application-observability.netic.dk/enabled: "true"). This gives you granular control over which applications send data.
Usage Dashboard: Your Grafana instance includes a pre-configured dashboard that provides real-time tracking of your consumption against the service's primary cost drivers, giving you full transparency into your usage.
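
As a rough illustration of the namespace opt-in described above, the following Python sketch builds a Kubernetes Namespace manifest that carries the opt-in label. The label key and value are taken from this page; the namespace name `payments` and the helper function names are hypothetical, and only the standard library is used.

```python
import json

# Label key as documented for namespace opt-in.
OPT_IN_LABEL = "application-observability.netic.dk/enabled"


def namespace_manifest(name: str, enabled: bool = True) -> dict:
    """Return a Namespace manifest; the opt-in label is set only when enabled."""
    # Kubernetes label values are strings, hence "true" rather than True.
    labels = {OPT_IN_LABEL: "true"} if enabled else {}
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": labels},
    }


def is_opted_in(manifest: dict) -> bool:
    """Check whether a namespace manifest carries the opt-in label."""
    labels = manifest.get("metadata", {}).get("labels", {})
    return labels.get(OPT_IN_LABEL) == "true"


if __name__ == "__main__":
    # "payments" is a hypothetical namespace name.
    print(json.dumps(namespace_manifest("payments"), indent=2))
```

Running the script prints a JSON manifest that could be piped to `kubectl apply -f -`. Setting the same label on an existing namespace with `kubectl label namespace` would have the same effect.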

Pricing, Legal and Support

Tip

For general information about pricing, legal matters, or support for the platform, its services, or its components, consult your contract or see the contact page.