
Observability

OpenSearch provides observability capabilities for monitoring applications, infrastructure, and AI agents. Choose the path that matches your use case.


Ingesting observability data

Before exploring your data, you need to ingest it into OpenSearch. Use OpenSearch Data Prepper to transform unstructured log data into structured data for improved querying and filtering.
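As a rough illustration of what turning unstructured log data into structured data means, the following Python sketch parses one web server access log line into named fields. It is not Data Prepper configuration or code: the regular expression, the `parse_log_line` function, and the field names are assumptions made for this example only.

```python
import re
from typing import Optional

# Hypothetical pattern for an Apache-style access log line.
# Data Prepper's own parsing rules are not reproduced here.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return a dict of structured fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields so they can be filtered and aggregated as numbers.
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


sample = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
record = parse_log_line(sample)
```

Once a line is split into fields such as `status` and `path`, queries can filter on those fields directly instead of searching raw text.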

Get started with log ingestion


Exploring and analyzing observability data

OpenSearch provides the following tools for exploring and analyzing observability data:


Monitoring applications

For specialized application monitoring, OpenSearch provides two focused solutions: Application Performance Monitoring (APM) for traditional microservices and agent traces for AI/LLM applications.

              | APM                                          | Agent traces
Purpose       | Monitor microservices and web applications   | Monitor AI agents and large language models (LLMs)
Metrics       | RED metrics (Rate, Errors, Duration)         | Token usage, model calls, agent steps
Visualization | Service maps, latency charts, error tracking | Execution graphs (DAGs), trace trees, timelines
Conventions   | OpenTelemetry standard conventions           | OpenTelemetry generative AI semantic conventions
Best for      | APIs, microservices, web services            | Chatbots, AI agents, LLM applications
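To make the RED metrics concrete, the following Python sketch computes rate, error ratio, and average duration from a handful of spans. The `Span` class, the `red_metrics` function, and the sample values are hypothetical and only illustrate the arithmetic. They do not show how OpenSearch computes or stores these metrics.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical, simplified span record used only for this example."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans: List[Span], window_seconds: float) -> Dict[str, float]:
    """Compute Rate, Errors, and Duration over a fixed time window."""
    count = len(spans)
    if count == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "avg_duration_ms": 0.0}
    errors = sum(1 for span in spans if span.is_error)
    total_duration = sum(span.duration_ms for span in spans)
    return {
        "rate_per_s": count / window_seconds,        # Rate: requests per second
        "error_ratio": errors / count,               # Errors: share of failed requests
        "avg_duration_ms": total_duration / count,   # Duration: mean latency
    }


spans = [
    Span("checkout", 120.0, False),
    Span("checkout", 80.0, False),
    Span("checkout", 400.0, True),
    Span("checkout", 200.0, False),
]
metrics = red_metrics(spans, window_seconds=60)
```

In practice, duration is often reported as percentiles (for example, p95 or p99) rather than a mean. The mean is used here to keep the sketch short.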

APM

APM monitors distributed applications using service topology, RED metrics, and performance tracking. APM requires the following components:

Agent traces

Agent traces observe generative AI applications and LLM agents using specialized tracing for AI workloads. Agent traces require the following components:

Get started with agent traces
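As an illustration of the token usage and model call metrics that agent traces surface, the following Python sketch totals tokens per model from a list of span records. It assumes that spans carry attributes named after the OpenTelemetry generative AI semantic conventions (`gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`). Attribute names can differ between convention versions, and the span dictionaries and the `token_usage_by_model` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def token_usage_by_model(spans: List[dict]) -> Dict[str, dict]:
    """Sum token counts and model calls per model across span records."""
    totals = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0, "calls": 0})
    for span in spans:
        attrs = span.get("attributes", {})
        model = attrs.get("gen_ai.request.model")
        if model is None:
            continue  # Not a model call (for example, a tool or agent step span).
        entry = totals[model]
        entry["input_tokens"] += attrs.get("gen_ai.usage.input_tokens", 0)
        entry["output_tokens"] += attrs.get("gen_ai.usage.output_tokens", 0)
        entry["calls"] += 1
    return dict(totals)


# Hypothetical sample data: two model calls and one tool span.
sample_spans = [
    {"name": "chat", "attributes": {
        "gen_ai.request.model": "model-a",
        "gen_ai.usage.input_tokens": 100,
        "gen_ai.usage.output_tokens": 40,
    }},
    {"name": "chat", "attributes": {
        "gen_ai.request.model": "model-a",
        "gen_ai.usage.input_tokens": 250,
        "gen_ai.usage.output_tokens": 60,
    }},
    {"name": "lookup_tool", "attributes": {}},
]
usage = token_usage_by_model(sample_spans)
```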


Query performance

Use Query Insights to monitor and optimize the performance of queries running in your cluster. Identify slow queries, analyze query patterns, and improve cluster efficiency.

Get started with Query Insights


Organizing visualizations

After creating visualizations, organize them into dashboards and reports for sharing with your team:

  • Notebooks – Combine visualizations, code blocks, and narrative text to create reports, runbooks, and documentation.
  • Operational panels – Organize PPL visualizations into dashboards for monitoring and analysis.

Alerting and detection

OpenSearch provides tools for detecting issues and sending notifications:

  • Alerting – Create monitors that query your data on a schedule, define triggers for alert conditions, and execute actions when alerts fire.
  • Anomaly detection – Automatically detect anomalies in your time-series data using machine learning with the Random Cut Forest (RCF) algorithm.
  • Forecasting – Predict future values in your time-series data using the RCF model to anticipate threshold breaches before they occur.
  • Notifications – Configure channels for sending alerts through Slack, email, Amazon SNS, webhooks, and other communication services.
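The monitor, trigger, and action flow described for Alerting can be sketched in Python as follows. This is a conceptual model only: the `Trigger` class, the `run_monitor` function, and the shape of `query_result` are assumptions made for this example, and OpenSearch monitors are defined through their own configuration rather than Python code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trigger:
    """Hypothetical trigger: a named condition plus the actions to run when it is met."""
    name: str
    condition: Callable[[dict], bool]
    actions: List[Callable[[str], None]]


def run_monitor(query_result: dict, triggers: List[Trigger]) -> List[str]:
    """Evaluate each trigger against one query result and run the actions of those that fire."""
    fired = []
    for trigger in triggers:
        if trigger.condition(query_result):
            message = f"Trigger '{trigger.name}' fired"
            for action in trigger.actions:
                action(message)  # For example, send the message to a notification channel.
            fired.append(trigger.name)
    return fired


# Stand-in for a notification channel: collect messages in a list.
sent_messages: List[str] = []

high_error_count = Trigger(
    name="high-error-count",
    condition=lambda result: result["hits"]["total"] > 100,
    actions=[sent_messages.append],
)

# In a real monitor, this evaluation would run on a schedule with fresh query results.
fired = run_monitor({"hits": {"total": 150}}, [high_error_count])
```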

OpenSearch Observability Stack

The OpenSearch Observability Stack provides a complete, preconfigured observability platform that you can run locally using Docker Compose. The Observability Stack includes:

  • All APM and agent trace capabilities.
  • A GenAI SDK for Python or TypeScript instrumentation.
  • An Agent Health tool for local debugging and evaluation.
  • A Docker Compose setup with example applications.
  • A preconfigured OpenTelemetry Collector, OpenSearch Data Prepper, and Prometheus.

Learn more about the Observability Stack