Live queries

Introduced 3.0

Use the Live Queries API to retrieve currently running search queries across the cluster or on specific nodes. Monitoring live queries using Query Insights allows you to get real-time visibility into the search queries that are currently executing within your OpenSearch cluster. This is useful for identifying and debugging queries that might be running for an unexpectedly long time or consuming significant resources at the moment.

The API returns a list of currently executing search queries, sorted by a specified metric (defaulting to latency) in descending order. The response includes details for each live query, such as the query status, start time, total resource usage aggregated across coordinator and shard tasks, and individual task-level breakdowns.

Endpoints

GET /_insights/live_queries

Query parameters

The following table lists the available query parameters. All query parameters are optional.

Parameter	Data type	Description
`verbose`	Boolean	Whether to include detailed query information in the output. Default is `true`.
`nodeId`	String	A comma-separated list of node IDs used to filter the results. If omitted, queries from all nodes are returned.
`sort`	String	The metric to sort the results by. Valid values are `latency`, `cpu`, or `memory`. Default is `latency`.
`size`	Integer	The number of query records to return. Must be a positive integer. Default is `100`.
`wlmGroupId`	String	Filters results to only return queries belonging to the specified workload management group. If omitted, queries from all groups are returned.
`use_finished_cache`	Boolean	When set to `true`, the response includes a `finished_queries` array containing recently completed queries from the finished query cache. Default is `false`.

Finished query cache

When use_finished_cache=true is specified, the API also returns recently completed queries alongside the currently running ones. This is useful for correlating live queries with queries that have just completed, providing a more comprehensive view of recent query activity.

Cache lifecycle

The finished query cache remains inactive when a node starts and consumes no resources until you enable it. The lifecycle works as follows:

The cache activates on the first API call that includes use_finished_cache=true. Once the cache is active, the node begins capturing completed queries into the cache.
While active, the cache stores up to 1,000 recently completed queries. Individual records are retained for 5 minutes before being automatically evicted. Each API call returns up to 50 of the most recent records.
If no API call containing use_finished_cache=true is made within the idle timeout period (by default, 5 minutes), the cache automatically deactivates and clears its data.
After an idle deactivation, the cache automatically reactivates on the next API call with use_finished_cache=true and begins capturing queries again.

Because the cache only activates on demand, queries that complete before the first use_finished_cache=true call are not captured. To ensure complete coverage, make an initial API call containing use_finished_cache=true before running the queries you want to monitor.

Cache settings

You can configure the idle timeout using the following dynamic cluster setting:

PUT _cluster/settings
{
  "persistent": {
    "search.insights.live_queries.cache.idle_timeout": "5m"
  }
}

The search.insights.live_queries.cache.idle_timeout setting accepts a time value. Set to 0 to disable the cache entirely and stop it immediately. Non-zero values must be between 2m and 10m. Default is 5m. Changing from 0 to a non-zero value reactivates the cache without requiring a node restart.

For more information, see Dynamic settings.

Example requests

The following example request fetches the top 10 queries sorted by CPU usage, with verbose output disabled:

GET /_insights/live_queries?verbose=false&sort=cpu&size=10

The following example request fetches live queries along with recently completed queries:

GET /_insights/live_queries?use_finished_cache=true

The following example request filters live queries by workload management group:

GET /_insights/live_queries?wlmGroupId=DEFAULT_WORKLOAD_GROUP

Example response

{
  "live_queries": [
    {
      "id": "troGHNGUShqDj3wK_K5ZIw:512",
      "status": "running",
      "start_time": 1745359226777,
      "total_latency_millis": 13959,
      "total_cpu_nanos": 405000,
      "total_memory_bytes": 3104,
      "coordinator_task": {
        "task_id": "troGHNGUShqDj3wK_K5ZIw:512",
        "node_id": "troGHNGUShqDj3wK_K5ZIw",
        "action": "indices:data/read/search",
        "status": "running",
        "description": "indices[my-index-*], search_type[QUERY_THEN_FETCH], source[{\"size\":20,\"query\":{\"term\":{\"user.id\":{\"value\":\"userId\",\"boost\":1.0}}}}]",
        "start_time": 1745359226777,
        "running_time_nanos": 13959364458,
        "cpu_nanos": 305000,
        "memory_bytes": 2048
      },
      "shard_tasks": [
        {
          "task_id": "Y6eBnbdISPO6XaVfxCBRgg:101",
          "node_id": "Y6eBnbdISPO6XaVfxCBRgg",
          "action": "indices:data/read/search[phase/query]",
          "status": "running",
          "description": "id[0], type[query], indices[my-index-*]",
          "start_time": 1745359226800,
          "running_time_nanos": 13900000000,
          "cpu_nanos": 100000,
          "memory_bytes": 1056
        }
      ]
    }
  ]
}

The preceding response shows a single live query:

The top-level fields (id, status, start_time, total_latency_millis, total_cpu_nanos, total_memory_bytes) provide a summary of the entire search request. The total_* metrics are aggregated across the coordinator task and all shard tasks, giving you a single view of the query’s overall resource consumption.
The coordinator_task object describes the task on the coordinator node that received the search request and is orchestrating the query across shards. In this example, the coordinator node (troGHNGUShqDj3wK_K5ZIw) has been running for approximately 13.9 seconds and has consumed 305,000 nanoseconds of CPU time and 2,048 bytes of memory. The description field includes the target indexes, search type, and the full query source.
The shard_tasks array lists the individual shard-level tasks spawned by the coordinator node. Each shard task runs on a specific data node and executes a phase of the search (for example, search[phase/query]). In this example, one shard task is running on node Y6eBnbdISPO6XaVfxCBRgg, consuming 100,000 nanoseconds of CPU and 1,056 bytes of memory. A query that spans multiple shards or data nodes will have multiple entries in this array.

When use_finished_cache=true is specified, the response also includes a finished_queries array:

{
  "live_queries": [],
  "finished_queries": [
    {
      "timestamp": 1745359230000,
      "id": "troGHNGUShqDj3wK_K5ZIw:512",
      "top_n_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "status": "completed",
      "node_id": "troGHNGUShqDj3wK_K5ZIw",
      "source": {
        "size": 20,
        "query": {
          "term": {
            "user.id": {
              "value": "userId",
              "boost": 1.0
            }
          }
        }
      },
      "indices": ["my-index-*"],
      "search_type": "query_then_fetch",
      "measurements": {
        "latency": {
          "number": 13959364458,
          "count": 1,
          "aggregationType": "NONE"
        },
        "cpu": {
          "number": 405000,
          "count": 1,
          "aggregationType": "NONE"
        },
        "memory": {
          "number": 3104,
          "count": 1,
          "aggregationType": "NONE"
        }
      }
    }
  ]
}

The id field in a completed query record uses the same nodeId:taskId format as the live query id (for example, troGHNGUShqDj3wK_K5ZIw:512). This lets you correlate a completed query with the live query it originated from. The top_n_id is a separate UUID that links the completed query to its corresponding record in the top N queries store. If the query did not qualify as a top N query, top_n_id is null.

Response fields

The following table lists the fields in each object in the live_queries array.

Field	Data type	Description
`id`	String	The unique identifier of the search request (the coordinator node task ID in `nodeId:taskId` format).
`status`	String	The current status of the query. Valid values are `running` or `cancelled`.
`start_time`	Long	The time at which the query started, in milliseconds since the epoch.
`wlm_group_id`	String	The workload management group ID associated with the query. Only present if the query belongs to a workload group.
`total_latency_millis`	Long	The total elapsed time of the query in milliseconds, aggregated across coordinator and shard tasks.
`total_cpu_nanos`	Long	The total CPU time consumed by the query in nanoseconds, aggregated across coordinator and shard tasks.
`total_memory_bytes`	Long	The total heap memory used by the query in bytes, aggregated across coordinator and shard tasks.
`coordinator_task`	Object	Details about the coordinator task for this query. See Task fields.
`shard_tasks`	Array	A list of shard-level task details for this query. Each element follows the same structure as Task fields.

Task fields

Each coordinator_task object and each member of the shard_tasks array contains the following fields.

Field	Data type	Description
`task_id`	String	The task identifier in `nodeId:taskId` format.
`node_id`	String	The ID of the node on which the task is running.
`action`	String	The action performed by the task (for example, `indices:data/read/search` for coordinator tasks or `indices:data/read/search[phase/query]` for shard tasks).
`status`	String	The current status of the task.
`description`	String	A description of the task, including the target indexes, search type, and query source. Only included if `verbose` is `true`.
`start_time`	Long	The time at which the task started, in milliseconds since the epoch.
`running_time_nanos`	Long	The elapsed time of the task, in nanoseconds.
`cpu_nanos`	Long	The CPU time consumed by the task, in nanoseconds.
`memory_bytes`	Long	The amount of heap memory used by the task, in bytes.

The finished_queries array fields

When use_finished_cache=true is specified, the finished_queries array contains query objects with the following fields.

Field	Data type	Description
`timestamp`	Long	The time at which the query completed, in milliseconds since the epoch.
`id`	String	The live query identifier (in `nodeId:taskId` format) for correlation with live queries.
`top_n_id`	String	The UUID linking this record to the corresponding top N query record. May be `null` if the query did not qualify as a top N query.
`status`	String	The completion status of the query (for example, `completed`).
`node_id`	String	The coordinator node ID.
`source`	Object	The query source body.
`indices`	Array	The list of indexes targeted by the query.
`search_type`	String	The search execution type (for example, `query_then_fetch`).
`phase_latency_map`	Object	A breakdown of latency by search phase.
`task_resource_usages`	Array	Per-task resource usage details.
`measurements`	Object	An object containing the final performance metrics for the query. Each metric (`latency`, `cpu`, `memory`) contains `number`, `count`, and `aggregationType` fields.

Endpoints
Query parameters
Finished query cache
- Cache lifecycle
- Cache settings
Example requests
Example response
Response fields
- Task fields
- The finished_queries array fields

WAS THIS PAGE HELPFUL?

✔ Yes ✖ No

Tell us why

350 characters left

Have a question? Ask us on the OpenSearch forum.

Want to contribute? Edit this page or create an issue.