Workload parameters

Workload parameters let you customize workload behavior without editing the workload files directly. You can control settings such as bulk size, number of shards, index name, and search configuration by passing parameters at runtime.

OpenSearch Benchmark workloads use Jinja2 templates. When you pass parameters using the --workload-params flag, OpenSearch Benchmark injects them into the workload JSON files before execution.

For example, a workload’s index.json might contain the following settings:

{
  "settings": {
    "index.number_of_shards": {{ number_of_shards | default(1) }},
    "index.number_of_replicas": {{ number_of_replicas | default(0) }}
  }
}

When you run a benchmark with --workload-params='{"number_of_shards": 3}', OpenSearch Benchmark replaces {{ number_of_shards | default(1) }} with 3. Parameters you don’t override use their default values.

Passing parameters

You can pass parameters in the following ways.

JSON file (recommended for many parameters)

Create a JSON file containing your parameters:

{
  "number_of_shards": 3,
  "number_of_replicas": 1,
  "bulk_size": 5000,
  "target_index_name": "my_index"
}

Then reference it as follows:

opensearch-benchmark run --workload=geonames --workload-params=my-params.json

Inline JSON

Pass parameters directly on the command line:

opensearch-benchmark run --workload=geonames --workload-params='{"number_of_shards": 3, "bulk_size": 5000}'

Comma-separated key-value pairs

Use the following format to pass key-value pairs:

opensearch-benchmark run --workload=geonames --workload-params="number_of_shards:3,bulk_size:5000"

The comma-separated format only supports string values. Use JSON file or inline JSON for numbers, Boolean values, or nested objects.

Parameter precedence

When the same parameter is defined in multiple sources, OpenSearch Benchmark applies them in the following order (highest priority first):

--workload-params (CLI flag): Overrides all other values.
Workload template defaults: Default values specified in {{ var | default(value) }} expressions in the workload JSON files (for example, {{ number_of_shards | default(1) }}).
Undefined: If no default is specified and the parameter is not provided, OpenSearch Benchmark raises a template rendering error.

Template syntax

This section describes the most common template patterns used in workload files.

Variable with a default value

Use the default filter to specify a fallback value when a parameter is not provided. For example, if bulk_size is not provided in --workload-params, the expression evaluates to 5000:

{{ bulk_size | default(5000) }}

Boolean values

Use the tojson filter for Boolean values to ensure correct JSON output. For example, the following expression evaluates to false (without quotation marks), not "false":

{{ query_cache_enabled | default(false) | tojson }}

String values

Wrap string variables in quotation marks:

"{{ conflicts | default('random') }}"

Conditional sections

Use {% if %} blocks to include or exclude sections based on whether a parameter is defined or on the parameter value.

Including a field only when defined

The following template conditionally includes the target-throughput field. If target_throughput is not provided using --workload-params, the entire field is omitted from the rendered output:

{% if target_throughput is defined %}
"target-throughput": {{ target_throughput }},
{% endif %}

If/else for alternative values

Use {% else %} to provide a fallback. For example, if use_zstd is set to true in --workload-params, the rendered output sets the source-file parameter to documents.json.zst. Otherwise, it sets the source-file to documents.json.bz2:

{% if use_zstd %}
"source-file": "documents.json.zst",
{% else %}
"source-file": "documents.json.bz2",
{% endif %}

Conditionally adding index fields

This pattern is commonly used to define optional fields in vectorsearch workload templates. The {%- endif %} (with the dash) trims trailing whitespace and newline characters, preventing empty lines from appearing and avoiding invalid JSON formatting (such as trailing commas or misaligned structure):

"properties": {
  {% if id_field_name is defined and id_field_name != "_id" %}
  "{{ id_field_name }}": {
    "type": "keyword"
  },
  {%- endif %}
  "embedding": {
    "type": "knn_vector",
    "dimension": {{ target_index_dimension }}
  }
}

Version-based conditionals

Some workloads adapt their behavior based on distribution_version, which OpenSearch Benchmark sets automatically according to the target cluster. This pattern allows a single workload to support multiple OpenSearch versions by conditionally including version-specific operations or settings:

{% if distribution_version is not defined %}
  {% set distribution_version = "2.11.0" %}
{% endif %}

{% if distribution_version.split('.') | map('int') | list >= "2.19.1".split('.') | map('int') | list %}
  {# Include features available in 2.19.1+ #}
{% endif %}

For loops

Use {% for %} loops to generate repeated structures:

{% for i in range(1, 101) %}
{
  "name": "query-{{ i }}",
  "operation-type": "search",
  "body": { ... }
},
{% endfor %}

Integer conversion

Use the int filter when a parameter must be an integer:

{{ target_index_dimension | default(768) | int }}

Including external files

Workloads are typically organized into multiple files for readability. The {{ benchmark.collect }} helper composes a single workload definition from multiple JSON files at render time.

Importing the helper

Every workload.json that uses benchmark.collect must import it at the top of the file:

{% import "benchmark.helpers" as benchmark with context %}

The with context clause ensures that all workload parameters are available in the included files.

Collecting operations and test procedures

A typical workload.json delegates its operations and test procedures to separate files:

{% import "benchmark.helpers" as benchmark with context %}
{
  "version": 2,
  "description": "My workload",
  "indices": [ ... ],
  "corpora": [ ... ],
  "operations": [
    {{ benchmark.collect(parts="operations/*.json") }}
  ],
  "test_procedures": [
    {{ benchmark.collect(parts="test_procedures/*.json") }}
  ]
}

The parts argument accepts glob patterns. The pattern operations/*.json matches all JSON files in the operations/ directory and includes their contents, separated by commas. This keeps the main workload.json concise while the operation and test procedure definitions are defined in separate files.

Composing schedules from shared parts

Test procedures can reuse common schedule fragments. For example, the vectorsearch workload has shared schedules under test_procedures/common/:

test_procedures/
  common/
    index-only-schedule.json
    search-only-schedule.json
    force-merge-schedule.json
    vespa-search-only-schedule.json
  default.json

A test procedure in default.json composes its schedule from these parts:

{
  "name": "no-train-test",
  "default": true,
  "schedule": [
    {{ benchmark.collect(parts="common/index-only-schedule.json") }},
    {{ benchmark.collect(parts="common/force-merge-schedule.json") }},
    {{ benchmark.collect(parts="common/search-only-schedule.json") }}
  ]
}

Each collected file contains one or more schedule entries. Parameters such as {{ target_index_name }} in those files are resolved from the same --workload-params passed on the command line because the with context import propagates all parameters to the included files.

Index body files

The body field in an index definition references a separate JSON file for mappings and settings:

"indices": [
  {
    "name": "geonames",
    "body": "index.json"
  }
]

The index.json file is a Jinja2 template like any other workload file, so it can use parameters:

{
  "settings": {
    "index.number_of_shards": {{ number_of_shards | default(1) }}
  },
  "mappings": { ... }
}

Discovering available parameters

To view the parameters supported by a workload, use the info command. This command lists the workload’s test procedures along with their configurable parameters and default values:

opensearch-benchmark info --workload=geonames

You can also inspect the workload source directly. Parameters appear as {{ variable_name | default(value) }} in workload JSON files. The main workload files are the following:

workload.json – The top-level workload definition.
index.json – The index settings and mappings.
test_procedures/default.json – The test procedure schedules.
_operations/default.json – The operation definitions.

Common parameters

The following parameters are supported by most official workloads.

Parameter	Description	Default
`number_of_shards`	The primary shard count for created indexes.	`1`
`number_of_replicas`	The replica count for created indexes.	`0`
`bulk_size`	The number of documents per bulk request.	`5000` or `10000`
`bulk_indexing_clients`	The number of concurrent bulk indexing clients.	`8`
`ingest_percentage`	The percentage of the document corpus to ingest.	`100`
`target_throughput`	The target number of operations per second per client.	Unthrottled
`search_clients`	The number of concurrent search clients.	`1`
`cluster_health`	The required cluster health status before proceeding.	`green`
`source_enabled`	Whether to store the `_source` field.	`true`

Vector search workload parameters

The vectorsearch workload supports additional parameters for vector search benchmarking.

Parameter	Description	Default
`target_index_name`	The vector index name.	`target_index`
`target_field_name`	The vector field name.	`target_field`
`target_index_dimension`	The number of vector dimensions.	`768`
`target_index_space_type`	The distance metric. Valid values are `l2`, `innerproduct`, and `cosinesimil`.	Varies
`target_index_body`	The path to index settings file.	`indices/faiss-index.json`
`target_index_bulk_size`	The number of documents per bulk request.	`500`
`target_index_bulk_index_data_set_format`	The corpus format. Valid values are `hdf5` and `bigann`.	`hdf5`
`target_index_bulk_index_data_set_corpus`	The corpus name (for example, `cohere-1m`).	Varies
`target_index_bulk_indexing_clients`	The number of concurrent indexing clients.	`10`
`target_index_max_num_segments`	The number of segments after force merge.	`1`
`hnsw_ef_construction`	The HNSW graph build-time exploration factor.	`256`
`hnsw_ef_search`	The HNSW search-time exploration factor.	`256`
`query_k`	The number of nearest neighbors to retrieve.	`100`
`query_count`	The number of queries to run. Use `-1` for all queries.	`-1`
`query_data_set_format`	The query vector format. Valid values are `hdf5` and `bigann`.	`hdf5`
`query_data_set_corpus`	The query vector corpus name.	Varies
`search_clients`	The number of concurrent search clients.	`1`
`neighbors_data_set_corpus`	The ground-truth neighbors corpus used for recall evaluation.	Varies
`neighbors_data_set_format`	The neighbors dataset format.	`hdf5`

Example vector search parameter file

The following example shows a complete parameter file for a vectorsearch workload:

{
  "target_index_name": "vector_1m",
  "target_field_name": "embedding",
  "target_index_body": "indices/faiss-index.json",
  "target_index_primary_shards": 1,
  "target_index_replica_shards": 0,
  "target_index_dimension": 768,
  "target_index_space_type": "innerproduct",
  "target_index_bulk_size": 500,
  "target_index_bulk_index_data_set_format": "hdf5",
  "target_index_bulk_index_data_set_corpus": "cohere-1m",
  "target_index_bulk_indexing_clients": 10,
  "target_index_max_num_segments": 1,
  "hnsw_ef_construction": 200,
  "hnsw_ef_search": 256,
  "query_k": 100,
  "query_data_set_format": "hdf5",
  "query_data_set_corpus": "cohere-1m",
  "query_count": 10000,
  "search_clients": 1,
  "neighbors_data_set_corpus": "cohere-1m",
  "neighbors_data_set_format": "hdf5"
}

To use this parameter file, save it as params.json and run the benchmark with the --workload-params flag:

opensearch-benchmark run \
  --pipeline=benchmark-only \
  --workload-path=/path/to/vectorsearch \
  --workload-params=params.json \
  --target-hosts=localhost:9200

Passing parameters
Parameter precedence
Template syntax
Discovering available parameters
Common parameters
Vector search workload parameters
- Example vector search parameter file

WAS THIS PAGE HELPFUL?

✔ Yes ✖ No

Tell us why

350 characters left

Have a question? Ask us on the OpenSearch forum.

Want to contribute? Edit this page or create an issue.