Observability

OpenSearch provides observability capabilities for monitoring applications, infrastructure, and AI agents. Choose the path that matches your use case.

Ingesting observability data

Before exploring your data, you need to ingest it into OpenSearch. Use OpenSearch Data Prepper to transform unstructured log data into structured data for improved querying and filtering.
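
To make the idea of structuring a log line concrete, the following minimal sketch parses one Apache-style access log line into named fields using plain Python. It is not Data Prepper configuration and does not call Data Prepper. The pattern, the `parse_access_log` function name, and the sample line are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern for a common-log-format style line. A real pipeline
# would typically use a grok-like processor instead of a hand-written regex.
ACCESS_LOG = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_access_log(line):
    """Return a dict of structured fields, or None if the line does not match."""
    match = ACCESS_LOG.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields so they can be filtered and aggregated as numbers.
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


# Illustrative sample line (documentation-range IP address).
sample = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
record = parse_access_log(sample)
```

Once a line is split into typed fields such as `status` and `path`, queries can filter on those fields directly instead of searching raw text.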

Get started with log ingestion

Exploring and analyzing observability data

OpenSearch provides the following tools for exploring and analyzing observability data:

Event analytics – Turn data-driven events into visualizations using Piped Processing Language (PPL).
Application analytics – Create custom observability applications to view system availability status.
Trace analytics – Visualize and analyze distributed traces from your applications.
Metric analytics – Query and visualize Prometheus metrics data.
Using Discover for observability – Analyze logs, metrics, and traces using specialized interfaces within observability workspaces.
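
As a rough illustration of what a PPL query for event analytics can look like, the following sketch builds a request for the PPL endpoint without sending it. The index name `web_logs`, the `status` and `path` fields, and the `build_ppl_request` helper are hypothetical; adjust them to match your own data.

```python
import json


def build_ppl_request(index, min_status):
    """Build the HTTP method, path, and JSON body for a PPL query.

    No network call is made; send the result with the HTTP client of your choice.
    """
    # Pipe-separated commands: choose a source, filter rows, then aggregate.
    query = (
        f"source={index} "
        f"| where status >= {int(min_status)} "
        f"| stats count() as error_count by path"
    )
    return {
        "method": "POST",
        "path": "/_plugins/_ppl",
        "body": json.dumps({"query": query}),
    }


# Hypothetical index and threshold: count server errors per request path.
request = build_ppl_request("web_logs", 500)
```

The resulting aggregation (one row per `path` with an `error_count`) is the kind of result that event analytics can render as a visualization.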

Monitoring applications

For specialized application monitoring, OpenSearch provides two focused solutions: Application Performance Monitoring (APM) for traditional microservices and agent traces for AI/LLM applications.

Feature	APM	Agent traces
Purpose	Monitor microservices and web applications	Monitor AI agents and large language models (LLMs)
Metrics	RED metrics (Rate, Errors, Duration)	Token usage, model calls, agent steps
Visualization	Service maps, latency charts, error tracking	Execution graphs (DAGs), trace trees, timelines
Conventions	OpenTelemetry standard conventions	OpenTelemetry generative AI semantic conventions
Best for	APIs, microservices, web services	Chatbots, AI agents, LLM applications

APM

APM monitors distributed applications using service topology, RED metrics, and performance tracking. APM requires the following components:

An OpenSearch cluster and OpenSearch Dashboards with workspaces enabled.
An OpenTelemetry Collector.
OpenSearch Data Prepper.
Prometheus for metrics storage.
Applications instrumented using OpenTelemetry.
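
To show what RED metrics measure, the following sketch computes rate, error ratio, and duration statistics from a list of spans in one time window. It is a simplified illustration, not how APM computes these values internally. The `Span` class and `red_metrics` function are hypothetical names.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span record for one handled request."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Compute Rate, Errors, and Duration for spans observed in one window."""
    if not spans or window_seconds <= 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "median_ms": 0.0, "max_ms": 0.0}
    durations = [s.duration_ms for s in spans]
    errors = sum(1 for s in spans if s.is_error)
    return {
        "rate_per_s": len(spans) / window_seconds,    # Rate: requests per second
        "error_ratio": errors / len(spans),           # Errors: share of failed requests
        "median_ms": statistics.median(durations),    # Duration: typical latency
        "max_ms": max(durations),                     # Duration: worst-case latency
    }


# Illustrative data: four requests to one service during a 2-second window.
spans = [
    Span("checkout", 10.0, False),
    Span("checkout", 20.0, False),
    Span("checkout", 30.0, True),
    Span("checkout", 40.0, False),
]
metrics = red_metrics(spans, window_seconds=2)
```

In practice, duration is usually tracked with percentiles such as p95 or p99 rather than only the median and maximum; those are omitted here to keep the sketch short.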

Agent traces

Agent traces provide observability for generative AI applications and LLM agents through tracing designed for AI workloads. Agent traces require the following components:

An OpenSearch cluster with OpenSearch Dashboards.
OpenSearch Data Prepper for trace processing.
Applications instrumented with OpenTelemetry generative AI semantic conventions.
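
As an illustration of the kind of data that agent traces carry, the following sketch totals token usage per model from span attributes represented as plain dictionaries. The attribute keys are intended to match the OpenTelemetry generative AI semantic conventions, but verify them against the version your instrumentation emits. The `token_usage_by_model` function and the model names are hypothetical.

```python
from collections import defaultdict


def token_usage_by_model(spans):
    """Sum input and output tokens per model across span attribute dicts."""
    totals = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0, "calls": 0})
    for attrs in spans:
        model = attrs.get("gen_ai.request.model")
        if model is None:
            continue  # Not a model call (for example, a tool or agent step span).
        entry = totals[model]
        entry["input_tokens"] += attrs.get("gen_ai.usage.input_tokens", 0)
        entry["output_tokens"] += attrs.get("gen_ai.usage.output_tokens", 0)
        entry["calls"] += 1
    return dict(totals)


# Illustrative span attributes; model names are placeholders.
spans = [
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 30},
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 80, "gen_ai.usage.output_tokens": 20},
    {"gen_ai.request.model": "model-b", "gen_ai.usage.input_tokens": 50, "gen_ai.usage.output_tokens": 10},
    {"tool.name": "search"},  # Span without model attributes is skipped.
]
usage = token_usage_by_model(spans)
```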

Get started with agent traces

Query performance

Use Query Insights to monitor and optimize the performance of queries running in your cluster. Identify slow queries, analyze query patterns, and improve cluster efficiency.

Get started with Query Insights

Organizing visualizations

After creating visualizations, organize them into dashboards and reports for sharing with your team:

Notebooks – Combine visualizations, code blocks, and narrative text to create reports, runbooks, and documentation.
Operational panels – Organize PPL visualizations into dashboards for monitoring and analysis.

Alerting and detection

OpenSearch provides tools for detecting issues and sending notifications:

Alerting – Create monitors that query your data on a schedule, define triggers for alert conditions, and execute actions when alerts fire.
Anomaly detection – Automatically detect anomalies in your time-series data using machine learning with the Random Cut Forest (RCF) algorithm.
Forecasting – Predict future values in your time-series data using the RCF model to anticipate threshold breaches before they occur.
Service-level objectives (Experimental) – Define availability and latency targets for your services and track error budgets and consumption rates against a Prometheus-compatible ruler.
Notifications – Configure channels for sending alerts through Slack, email, Amazon SNS, webhooks, and other communication services.
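
The error budget and consumption rate concepts behind service-level objectives can be expressed as simple arithmetic. The following sketch assumes an availability target given as a fraction (for example, 0.99) and uses the common "burn rate" definition: the observed error ratio divided by the allowed error ratio. The function names are hypothetical, and this is not the calculation performed by the experimental feature itself.

```python
def error_budget(slo_target):
    """Allowed fraction of failed requests, for example about 0.01 for a 0.99 target."""
    return 1.0 - slo_target


def burn_rate(failed, total, slo_target):
    """Observed error ratio divided by the error budget.

    A value of 1.0 means the budget is being consumed exactly as fast as the
    target allows; values above 1.0 mean it will run out before the SLO
    period ends if the trend continues.
    """
    if total == 0:
        return 0.0
    budget = error_budget(slo_target)
    if budget <= 0:
        raise ValueError("slo_target must be less than 1.0")
    return (failed / total) / budget


# Illustrative numbers: 30 failures out of 1,000 requests against a 99% target.
rate = burn_rate(failed=30, total=1000, slo_target=0.99)
```

With these illustrative numbers, the observed error ratio is 3% against a 1% budget, so the budget is being consumed roughly three times faster than the target allows.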

OpenSearch Observability Stack

The OpenSearch Observability Stack provides a complete, preconfigured observability platform that you can run locally using Docker Compose. The Observability Stack includes:

All APM and agent trace capabilities.
A GenAI SDK for Python or TypeScript instrumentation.
An Agent Health tool for local debugging and evaluation.
A Docker Compose setup with example applications.
A preconfigured OpenTelemetry Collector, OpenSearch Data Prepper, and Prometheus.

Learn more about Observability Stack


