Memory-optimized search
Introduced 3.1
Memory-optimized search allows the Faiss engine to run efficiently without loading the entire vector index into off-heap memory. Without this optimization, Faiss typically loads the full index into memory, which can become unsustainable if the index size exceeds available physical memory. With memory-optimized search, the engine memory-maps the index file and relies on the operating system’s file cache to serve search requests. Only the parts of the index that a search actually touches are read from disk, and repeated reads of the same data are served directly from the system cache.
Memory-optimized search affects only search operations. Indexing behavior remains unchanged.
Limitations
The following limitations apply to memory-optimized search in OpenSearch:
- Supported only for the Faiss engine with the HNSW method
- Does not support IVF or product quantization (PQ)
- Requires closing and reopening the index to enable or disable on an existing index
 
If you use IVF or PQ, the engine loads data into memory regardless of whether memory-optimized mode is enabled.
Configuration
To enable memory-optimized search, set index.knn.memory_optimized_search to true when creating an index:
PUT /test_index
{
  "settings": {
    "index.knn": true,
    "index.knn.memory_optimized_search": true
  },
  "mappings": {
    "properties": {
      "vector_field": {
        "type": "knn_vector",
        "dimension": 128,
        "method": {
          "name": "hnsw",
          "engine": "faiss"
        }
      }
    }
  }
}
To enable memory-optimized search on an existing index, you must close the index, update the setting, and then reopen the index:
POST /test_index/_close

PUT /test_index/_settings
{
  "index.knn.memory_optimized_search": true
}

POST /test_index/_open
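To confirm that the change took effect after reopening the index, you can read the setting back. The following is a minimal sketch that assumes the same test_index used above and relies on the standard get settings API, filtered by setting name:

```json
GET /test_index/_settings/index.knn.memory_optimized_search
```

To disable memory-optimized search, repeat the same close, update, and reopen sequence with the setting set to false.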
Integration with disk-based search
When you configure a field with on_disk mode and 1x compression, memory-optimized search is automatically enabled for that field, even if memory optimization isn’t enabled at the index level. For more information, see Memory-optimized vectors.
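As an illustration, the following request sketches a mapping that combines on_disk mode with 1x compression. The index name test_index_on_disk is hypothetical, and the example assumes the mode and compression_level mapping parameters of the knn_vector field type. Note that index.knn.memory_optimized_search is not set at the index level:

```json
PUT /test_index_on_disk
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "vector_field": {
        "type": "knn_vector",
        "dimension": 128,
        "mode": "on_disk",
        "compression_level": "1x",
        "method": {
          "name": "hnsw",
          "engine": "faiss"
        }
      }
    }
  }
}
```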
Memory-optimized search differs from disk-based search because it doesn’t use compression or quantization. It only changes how vector data is loaded and accessed during search.
Performance optimization
When memory-optimized search is enabled, the warm-up API performs only the minimal setup needed for search operations, such as opening streams to the underlying Faiss index file, instead of loading the vectors themselves. This minimal warm-up results in:
- Faster initial searches.
- Reduced memory overhead.
- More efficient resource utilization.
 
For fields where memory-optimized search is disabled, the warm-up process loads vectors into off-heap memory.
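For reference, a warm-up request for the example index might look like the following. This sketch assumes the k-NN plugin's warm-up endpoint and the test_index created earlier on this page:

```json
GET /_plugins/_knn/warmup/test_index
```

The request is the same whether or not memory-optimized search is enabled. Only the amount of data loaded during warm-up differs.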