Scroll API

Introduced 1.0

You can use the scroll operation to retrieve a large number of results. For example, for machine learning jobs, you can request an unlimited number of results in batches.

To use the scroll operation, add a scroll query parameter to the search request to tell OpenSearch how long to keep the search context open. This duration needs to be long enough to process a single batch of results.

Because search contexts consume a lot of memory, we suggest you don’t use the scroll operation for frequent user queries. Instead, use the sort parameter with the search_after parameter to scroll responses for user queries.
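As a minimal sketch of the sort plus search_after alternative, the helper below pages through results without opening a search context. It assumes an opensearch-py style client whose search method accepts index and body, as in the examples on this page. The function name iter_search_after and the line_id sort field in the usage note are illustrative assumptions, not part of the API.

```python
from typing import Any, Dict, Iterator, List


def iter_search_after(client: Any, index: str, sort: List[Dict[str, str]],
                      page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every hit by paging with sort + search_after instead of scroll."""
    body: Dict[str, Any] = {"size": page_size, "sort": sort}
    while True:
        response = client.search(index=index, body=body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means there is nothing left to fetch
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        body["search_after"] = hits[-1]["sort"]
```

For example, iter_search_after(client, "shakespeare", [{"line_id": "asc"}]) would walk the index in line_id order, assuming that field exists and is unique. If the sort values are not unique, documents that tie at a page boundary can be skipped, so a unique tiebreaker field is usually added to the sort.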

Endpoints

GET _search/scroll
POST _search/scroll
GET _search/scroll/<scroll-id>
POST _search/scroll/<scroll-id>

Path parameters

Parameter	Type	Description
scroll_id	String	The scroll ID for the search.

Query parameters

All scroll parameters are optional.

Parameter	Type	Description
scroll	Time	Specifies the amount of time the search context is maintained.
scroll_id	String	The scroll ID for the search.
rest_total_hits_as_int	Boolean	Whether the `hits.total` property is returned as an integer (`true`) or an object (`false`). Default is `false`.
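Because rest_total_hits_as_int changes the shape of hits.total, client code that reads the total may need to handle both forms. The helper below is a small sketch of that. The name total_hits is an illustrative assumption.

```python
from typing import Any, Dict


def total_hits(response: Dict[str, Any]) -> int:
    """Return the hit count for either shape of hits.total."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # Default shape (rest_total_hits_as_int=false): {"value": ..., "relation": ...}
        return total["value"]
    # Shape when rest_total_hits_as_int=true: a plain integer
    return int(total)
```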

Example request

The following example demonstrates the scroll workflow from initiating a scroll operation to retrieving all results.

Step 1: Start the scroll operation

To begin scrolling, send an initial search query with a scroll parameter that specifies how long to keep the search context alive (for example, 10m for 10 minutes). Use the size parameter to set how many results to return in each batch:

GET /shakespeare/_search?scroll=10m
{
  "size": 10000
}

response = client.search(
  index = "shakespeare",
  params = { "scroll": "10m" },
  body =   {
    "size": 10000
  }
)

OpenSearch caches the results and returns a scroll ID to access them in batches:

"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAUWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ=="

Step 2: Retrieve subsequent batches

Pass this scroll ID to the scroll operation to get back the next batch of results:

GET /_search/scroll
{
  "scroll": "10m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAUWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ=="
}

response = client.scroll(
  body =   {
    "scroll": "10m",
    "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAUWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ=="
  }
)

Using this scroll ID, you get results in batches of 10,000 as long as the search context is still open. Typically, the scroll ID does not change between requests, but it can change, so always use the most recent scroll ID. If you don’t send the next scroll request before the search context times out, the context expires and the scroll operation can no longer return results.

Detecting the end of results

When you’ve scrolled through all results, the final batch contains an empty hits array:

{
  "_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAUWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ==",
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

When hits.hits is an empty array, you’ve retrieved all available results and should stop scrolling. Make sure to close the scroll context to free up resources.
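The whole workflow (start the scroll, fetch batches with the latest scroll ID, stop on an empty hits array, and close the context) can be sketched as one generator. This assumes an opensearch-py style client with the search, scroll, and clear_scroll calls used in the examples on this page. The function name scroll_batches is an illustrative assumption.

```python
from typing import Any, Dict, Iterator, List


def scroll_batches(client: Any, index: str, body: Dict[str, Any],
                   keep_alive: str = "10m") -> Iterator[List[Dict[str, Any]]]:
    """Yield batches of hits until an empty batch, then clear the context."""
    response = client.search(index=index, params={"scroll": keep_alive}, body=body)
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            yield response["hits"]["hits"]
            response = client.scroll(
                body={"scroll": keep_alive, "scroll_id": scroll_id}
            )
            # The ID can change between requests, so keep the latest one.
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Free the search context even if the caller stops early or an error occurs.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
```

A caller might write for batch in scroll_batches(client, "shakespeare", {"size": 10000}) and process each batch within the keep-alive window. The try/finally block is a design choice so that the context is released without waiting for the timeout.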

Using sliced scroll

If you expect billions of results, use a sliced scroll. Slicing allows you to perform multiple scroll operations for the same request, but in parallel. Set the ID and the maximum number of slices for the scroll:

GET /shakespeare/_search?scroll=10m
{
  "slice": {
    "id": 0,
    "max": 10
  },
  "query": {
    "match_all": {}
  }
}

response = client.search(
  index = "shakespeare",
  params = { "scroll": "10m" },
  body =   {
    "slice": {
      "id": 0,
      "max": 10
    },
    "query": {
      "match_all": {}
    }
  }
)

Each slice has its own scroll ID and returns a separate portion of the results. Because max is 10 in this example, you can run up to 10 slices, with IDs 0 through 9.
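As a rough sketch of running slices in parallel, the code below drains each slice in its own thread and merges the hits. It assumes an opensearch-py style client that is safe to share across threads and that exposes the search, scroll, and clear_scroll calls shown on this page. The names scroll_slice and scroll_all_slices are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def scroll_slice(client: Any, index: str, slice_id: int, max_slices: int,
                 keep_alive: str = "10m") -> List[Dict[str, Any]]:
    """Drain one slice of a sliced scroll and return its hits."""
    body = {
        "slice": {"id": slice_id, "max": max_slices},
        "query": {"match_all": {}},
    }
    response = client.search(index=index, params={"scroll": keep_alive}, body=body)
    scroll_id = response.get("_scroll_id")
    hits: List[Dict[str, Any]] = []
    try:
        while response["hits"]["hits"]:
            hits.extend(response["hits"]["hits"])
            response = client.scroll(
                body={"scroll": keep_alive, "scroll_id": scroll_id}
            )
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Each slice has its own search context, so each one is cleared separately.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
    return hits


def scroll_all_slices(client: Any, index: str, max_slices: int) -> List[Dict[str, Any]]:
    """Run slices 0..max_slices-1 in separate threads and merge their hits."""
    with ThreadPoolExecutor(max_workers=max_slices) as pool:
        futures = [pool.submit(scroll_slice, client, index, i, max_slices)
                   for i in range(max_slices)]
        return [hit for future in futures for hit in future.result()]
```

Collecting every hit in memory keeps the sketch short. For very large result sets, each thread would more likely process or write out its batches as they arrive.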

Step 3: Close the scroll context

Close the search context when you’re done scrolling because the scroll operation continues to consume computing resources until the timeout:

DELETE /_search/scroll/DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAcWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ==

response = client.clear_scroll(
  scroll_id = "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAcWdmpUZDhnRFBUcWFtV21nMmFwUGJEQQ=="
)

To close all open scroll contexts:

DELETE /_search/scroll/_all

response = client.clear_scroll(
  scroll_id = "_all"
)

The scroll operation corresponds to a specific timestamp. It doesn’t consider documents added after that timestamp as potential results.

Example response

{
  "succeeded": true,
  "num_freed": 1
}
