Split string processor
The split_string processor splits a string field into an array using either a delimiter character or a regular expression.
Configuration
The following table describes the options you can use to configure the split_string processor.
| Option | Required | Type | Description |
|---|---|---|---|
| entries | Yes | List | A list of entries. Each entry can contain the source, delimiter, and delimiter_regex options. |
| source | N/A | N/A | The key of the field to split. |
| delimiter | No | N/A | The separator character used to split the field. Cannot be defined together with delimiter_regex. Either delimiter or delimiter_regex must be defined. |
| delimiter_regex | No | N/A | The regular expression used to split the field. Cannot be defined together with delimiter. Either delimiter or delimiter_regex must be defined. |
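The rule that an entry must define either delimiter or delimiter_regex, but not both, can be sketched as a small check. This is only an illustration in Python: the function name `validate_entry` is hypothetical and is not part of the processor, which may report configuration errors differently.

```python
def validate_entry(entry: dict) -> None:
    """Raise ValueError if an entry breaks the delimiter rules from the table.

    Hypothetical helper for illustration; not the processor's own validation.
    """
    has_delimiter = "delimiter" in entry
    has_regex = "delimiter_regex" in entry

    # The two options cannot be defined at the same time.
    if has_delimiter and has_regex:
        raise ValueError("define only one of 'delimiter' or 'delimiter_regex'")

    # At least one of the two options must be defined.
    if not has_delimiter and not has_regex:
        raise ValueError("define either 'delimiter' or 'delimiter_regex'")


# Valid entries pass silently.
validate_entry({"source": "tags", "delimiter": "|"})
validate_entry({"source": "semicolons", "delimiter_regex": r";\s*"})
```

An entry that sets both options, or neither, raises a `ValueError` in this sketch.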
Example
To get started, create the following pipeline.yaml file:
split-string-all-configs-pipeline:
  source:
    http:
      path: /logs
      ssl: false
  processor:
    - split_string:
        # 1) The top-level list of split "entries"
        entries:
          # 2) Use `source` + `delimiter` (comma)
          - source: csv_line
            delimiter: ","
          # 3) Another `source` + `delimiter` (pipe)
          - source: tags
            delimiter: "|"
          # 4) `source` + `delimiter` (slash) to split a path
          - source: path
            delimiter: "/"
          # 5) `source` + `delimiter_regex` (semicolon + optional spaces)
          - source: semicolons
            delimiter_regex: ";\\s*"
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: admin
        password: admin_pass
        index_type: custom
        index: split-string-demo-%{yyyy.MM.dd}
You can test the pipeline using the following command:
curl -sS -X POST "http://localhost:2021/logs" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "csv_line": "x,y",
      "tags": "beta|test",
      "path": "usr/local/bin",
      "semicolons": "alpha;beta ; gamma"
    }
  ]'
The document stored in OpenSearch contains the following information:
{
  ...
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "split-string-demo-2025.10.15",
        "_id": "YSAz6JkBrcmuDURMmTeo",
        "_score": 1,
        "_source": {
          "csv_line": [
            "x",
            "y"
          ],
          "tags": [
            "beta",
            "test"
          ],
          "path": [
            "usr",
            "local",
            "bin"
          ],
          "semicolons": [
            "alpha",
            "beta ",
            "gamma"
          ]
        }
      }
    ]
  }
}
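To see why the semicolons field keeps the trailing space in "beta ", it can help to reproduce the splits outside the pipeline. The following Python sketch approximates the four entries from the example. It is not the processor's implementation: the function name `split_event` is hypothetical, skipping missing or non-string fields is an assumption, and edge cases such as empty trailing elements may be handled differently by the processor.

```python
import re

# The four entries from the example pipeline, written as Python dicts.
ENTRIES = [
    {"source": "csv_line", "delimiter": ","},
    {"source": "tags", "delimiter": "|"},
    {"source": "path", "delimiter": "/"},
    {"source": "semicolons", "delimiter_regex": r";\s*"},
]


def split_event(event: dict, entries: list) -> dict:
    """Return a copy of the event with each configured field split into a list."""
    result = dict(event)
    for entry in entries:
        key = entry["source"]
        value = result.get(key)
        if not isinstance(value, str):
            continue  # assumption: missing or non-string fields are left unchanged
        if "delimiter_regex" in entry:
            result[key] = re.split(entry["delimiter_regex"], value)
        else:
            result[key] = value.split(entry["delimiter"])
    return result


event = {
    "csv_line": "x,y",
    "tags": "beta|test",
    "path": "usr/local/bin",
    "semicolons": "alpha;beta ; gamma",
}
print(split_event(event, ENTRIES))
```

The pattern `;\s*` consumes whitespace only after each semicolon, so the space before the second semicolon stays attached to "beta ". In this sketch, a pattern such as `\s*;\s*` would remove whitespace on both sides of the separator.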