Observability
OpenSearch provides observability capabilities for monitoring applications, infrastructure, and AI agents. Choose the path that matches your use case.
Ingesting observability data
Before exploring your data, you need to ingest it into OpenSearch. Use OpenSearch Data Prepper to transform unstructured log data into structured data for improved querying and filtering.
Get started with log ingestion
Exploring and analyzing observability data
OpenSearch provides the following tools for exploring and analyzing observability data:
- Event analytics – Turn data-driven events into visualizations using Piped Processing Language (PPL).
- Application analytics – Create custom observability applications to view system availability status.
- Trace analytics – Visualize and analyze distributed traces from your applications.
- Metric analytics – Query and visualize Prometheus metrics data.
- Using Discover for observability – Analyze logs, metrics, and traces using specialized interfaces within observability workspaces.
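As a rough illustration of the event analytics workflow, the following Python sketch builds the JSON body for a PPL query that counts events per field value. The index name `application-logs` and the field `status_code` are hypothetical, and the sketch assumes the query would be submitted to the PPL endpoint (`POST _plugins/_ppl`). It only constructs the request and does not contact a cluster.

```python
import json


def build_ppl_request(index: str, field: str, limit: int = 10) -> dict:
    """Build a request body for a PPL query that counts events per field value."""
    query = (
        f"source = {index} "
        f"| stats count() as event_count by {field} "
        f"| sort - event_count "
        f"| head {limit}"
    )
    return {"query": query}


# Hypothetical index and field names, used only for illustration.
body = build_ppl_request("application-logs", "status_code", limit=5)
print(json.dumps(body, indent=2))
```

The resulting query groups events by the chosen field, sorts the groups in descending order of count, and keeps the top five, which is the kind of result that is typically turned into a bar or pie visualization.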
Monitoring applications
For specialized application monitoring, OpenSearch provides two focused solutions: Application Performance Monitoring (APM) for traditional microservices and agent traces for AI/LLM applications.
| | APM | Agent traces |
|---|---|---|
| Purpose | Monitor microservices and web applications | Monitor AI agents and large language models (LLMs) |
| Metrics | RED metrics (Rate, Errors, Duration) | Token usage, model calls, agent steps |
| Visualization | Service maps, latency charts, error tracking | Execution graphs (DAGs), trace trees, timelines |
| Conventions | OpenTelemetry standard conventions | OpenTelemetry generative AI semantic conventions |
| Best for | APIs, microservices, web services | Chatbots, AI agents, LLM applications |
APM
APM monitors distributed applications using service topology, RED metrics, and performance tracking. APM requires the following components:
- An OpenSearch cluster and OpenSearch Dashboards with workspaces enabled.
- An OpenTelemetry Collector.
- OpenSearch Data Prepper.
- Prometheus for metrics storage.
- Applications instrumented using OpenTelemetry.
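To make the RED metrics concrete, here is a minimal Python sketch that derives rate, error ratio, and average duration from a batch of spans. The `Span` record and its fields are hypothetical simplifications, not the schema that APM or Data Prepper actually stores, and a real deployment would typically use latency percentiles rather than a plain average.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Span:
    """Hypothetical, simplified span record for one handled request."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans: list[Span], window_seconds: float) -> dict:
    """Compute Rate, Errors, and Duration for spans observed in one time window."""
    total = len(spans)
    if total == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "avg_duration_ms": 0.0}
    errors = sum(1 for span in spans if span.is_error)
    return {
        "rate_per_s": total / window_seconds,                # Rate: requests per second
        "error_ratio": errors / total,                       # Errors: share of failed requests
        "avg_duration_ms": mean(s.duration_ms for s in spans),  # Duration: mean latency
    }


# Made-up sample data for a single service over a 60-second window.
spans = [
    Span("checkout", 120.0, False),
    Span("checkout", 80.0, False),
    Span("checkout", 400.0, True),
    Span("checkout", 200.0, False),
]
metrics = red_metrics(spans, window_seconds=60)
print(metrics)
```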
Agent traces
Agent traces observe generative AI applications and LLM agents using specialized tracing for AI workloads. Agent traces require the following components:
- An OpenSearch cluster with OpenSearch Dashboards.
- OpenSearch Data Prepper for trace processing.
- Applications instrumented with OpenTelemetry generative AI semantic conventions.
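The sketch below shows how token usage might be summarized from agent spans. It assumes the spans carry attribute keys from the OpenTelemetry generative AI semantic conventions (`gen_ai.operation.name`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`). The span dictionaries, the model name `example-model`, and the helper `token_usage_by_model` are hypothetical and only illustrate the idea.

```python
# Attribute keys assumed from the OpenTelemetry generative AI semantic
# conventions; the span values themselves are made up for illustration.
OPERATION = "gen_ai.operation.name"
MODEL = "gen_ai.request.model"
INPUT_TOKENS = "gen_ai.usage.input_tokens"
OUTPUT_TOKENS = "gen_ai.usage.output_tokens"


def token_usage_by_model(spans: list[dict]) -> dict:
    """Sum input and output tokens per model across chat spans."""
    usage: dict = {}
    for span in spans:
        attrs = span["attributes"]
        if attrs.get(OPERATION) != "chat":
            continue  # Skip tool calls and other non-model steps.
        model = attrs.get(MODEL, "unknown")
        totals = usage.setdefault(model, {"input": 0, "output": 0, "calls": 0})
        totals["input"] += attrs.get(INPUT_TOKENS, 0)
        totals["output"] += attrs.get(OUTPUT_TOKENS, 0)
        totals["calls"] += 1
    return usage


example_spans = [
    {"name": "chat example-model",
     "attributes": {OPERATION: "chat", MODEL: "example-model",
                    INPUT_TOKENS: 120, OUTPUT_TOKENS: 45}},
    {"name": "execute_tool lookup",
     "attributes": {OPERATION: "execute_tool"}},
    {"name": "chat example-model",
     "attributes": {OPERATION: "chat", MODEL: "example-model",
                    INPUT_TOKENS: 80, OUTPUT_TOKENS: 30}},
]
usage = token_usage_by_model(example_spans)
print(usage)
```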
Query performance
Use Query Insights to monitor and optimize the performance of queries running in your cluster. Identify slow queries, analyze query patterns, and improve cluster efficiency.
Get started with Query Insights
Organizing visualizations
After creating visualizations, organize them into dashboards and reports for sharing with your team:
- Notebooks – Combine visualizations, code blocks, and narrative text to create reports, runbooks, and documentation.
- Operational panels – Organize PPL visualizations into dashboards for monitoring and analysis.
Alerting and detection
OpenSearch provides tools for detecting issues and sending notifications:
- Alerting – Create monitors that query your data on a schedule, define triggers for alert conditions, and execute actions when alerts fire.
- Anomaly detection – Automatically detect anomalies in your time-series data using machine learning with the Random Cut Forest (RCF) algorithm.
- Forecasting – Predict future values in your time-series data using the RCF model to anticipate threshold breaches before they occur.
- Notifications – Configure channels for sending alerts through Slack, email, Amazon SNS, webhooks, and other communication services.
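The monitor, schedule, and trigger relationship described for Alerting can be sketched as a request body. The following Python example builds a monitor definition in the general shape the Alerting plugin's monitor API is understood to accept (`POST _plugins/_alerting/monitors`). Treat the field layout as an assumption to verify against the API reference for your version. The index name, the `level` and `@timestamp` fields, and the threshold are hypothetical, and actions are left empty because they depend on notification channels configured in your cluster.

```python
import json


def build_error_monitor(index: str, threshold: int, interval_minutes: int = 5) -> dict:
    """Build a monitor that triggers when recent ERROR log count exceeds a threshold."""
    return {
        "type": "monitor",
        "name": f"{index}-error-count",
        "monitor_type": "query_level_monitor",
        "enabled": True,
        # Schedule: how often the monitor runs its query.
        "schedule": {"period": {"interval": interval_minutes, "unit": "MINUTES"}},
        # Input: the search executed on each run (hypothetical field names).
        "inputs": [{
            "search": {
                "indices": [index],
                "query": {
                    "size": 0,
                    "query": {"bool": {"filter": [
                        {"match": {"level": "ERROR"}},
                        {"range": {"@timestamp": {"gte": f"now-{interval_minutes}m"}}},
                    ]}},
                },
            }
        }],
        # Trigger: the condition evaluated against the query results.
        "triggers": [{
            "name": "too-many-errors",
            "severity": "2",
            "condition": {"script": {
                "source": f"ctx.results[0].hits.total.value > {threshold}",
                "lang": "painless",
            }},
            "actions": [],  # Left empty; actions reference cluster-specific channels.
        }],
    }


monitor = build_error_monitor("application-logs", threshold=100)
print(json.dumps(monitor, indent=2))
```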
OpenSearch Observability Stack
The OpenSearch Observability Stack provides a complete, preconfigured observability platform that you can run locally using Docker Compose. The Observability Stack includes:
- All APM and agent trace capabilities.
- A GenAI SDK for Python or TypeScript instrumentation.
- An Agent Health tool for local debugging and evaluation.
- A Docker Compose setup with example applications.
- A preconfigured OpenTelemetry Collector, OpenSearch Data Prepper, and Prometheus.
Learn more about the Observability Stack