
rex

The rex command extracts fields from a raw text field using regular expression named capture groups. It uses Java regex patterns. For more information, see the Java regular expression documentation.

The rex and parse commands compared

The rex and parse commands both extract information from text fields using Java regular expressions with named capture groups. The following table compares the capabilities of the rex and parse commands.

| Feature | rex | parse |
| --- | --- | --- |
| Pattern type | Java regex | Java regex |
| Named groups required | Yes | Yes |
| Multiple named groups | Yes | No |
| Multiple matches | Yes | No |
| Text substitution | Yes | No |
| Offset tracking | Yes | No |
| Special characters in group names | No | No |

Syntax

The rex command has the following syntax:

rex [mode=<mode>] field=<field> <pattern> [max_match=<int>] [offset_field=<string>]

Parameters

The rex command supports the following parameters.

| Parameter | Required/Optional | Description |
| --- | --- | --- |
| field | Required | The field to extract data from. The field must be a string. |
| <pattern> | Required | The regular expression pattern with named capture groups used to extract new fields. The pattern must contain at least one named capture group using the (?<name>pattern) syntax. Group names must start with a letter and contain only letters and digits. |
| mode | Optional | The pattern-matching mode. Valid values are extract and sed. The extract mode creates new fields from regular expression named capture groups. The sed mode performs text substitution using sed-style patterns: s/pattern/replacement/ with flags, y/from_chars/to_chars/ transliteration, and backreferences. |
| max_match | Optional | The maximum number of matches to extract. If the value is greater than 1, the extracted fields are returned as arrays. A value of 0 requests unlimited matches, but the effective number of matches is still capped by the configured limit. The limit is 10 by default and can be changed using plugins.ppl.rex.max_match.limit (see the note following this table). Default is 1. |
| offset_field | Optional | Valid in extract mode only. The name of the field in which to store the character offset positions of the matches. |

You can set the max_match limit in the plugins.ppl.rex.max_match.limit cluster setting. For more information, see SQL settings. Setting this limit to a large value is not recommended because it can lead to excessive memory consumption, especially with patterns that match empty strings (for example, \d* or \w*).
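The warning about patterns that match empty strings can be reproduced with the plain Java regex engine. The following is a minimal sketch, not the rex implementation; the class and method names are hypothetical. It counts how many times a pattern matches a string, showing that \d* produces a match at every position of an input that contains no digits at all:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchSketch {
    // Counts every match that Matcher.find() reports for the given pattern.
    // Hypothetical helper for illustration only.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // \d* may match the empty string, so it matches before each
        // character and once more at the end of the input.
        System.out.println(countMatches("\\d*", "abc")); // 4
        // \d+ needs at least one digit, so it does not match at all.
        System.out.println(countMatches("\\d+", "abc")); // 0
    }
}
```

With a large limit and long field values, such empty matches would fill the result array without extracting anything useful, so a quantifier like + is usually a safer choice than * in extraction patterns.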

Example 1: Extracting the error type from log messages

The following query extracts the error type from Java exception log messages. Non-matching rows return null for the extracted field:

source=otellogs
| where severityText = 'ERROR'
| rex field=body "(?<errtype>[A-Z][a-zA-Z]+Exception)"
| fields body, errtype
| head 3

The query returns the following results:

| body | errtype |
| --- | --- |
| Payment failed: connection timeout to payment gateway after 30000ms | null |
| NullPointerException in CheckoutService.placeOrder at line 142 | NullPointerException |
| Out of memory: Java heap space - shutting down pod payment-6f8d4b-ht7q3 | null |
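To see why only the second row yields a value, the same pattern can be applied in plain Java. This is a rough sketch of the extract behavior under the assumption that a non-matching row simply produces null; the class and method names are hypothetical and this is not the rex implementation:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractSketch {
    // Same pattern as in the query above.
    static final Pattern ERRTYPE =
            Pattern.compile("(?<errtype>[A-Z][a-zA-Z]+Exception)");

    // Returns the first value captured by the errtype group,
    // or null when the pattern does not match anywhere in the body.
    static String extractErrtype(String body) {
        Matcher m = ERRTYPE.matcher(body);
        return m.find() ? m.group("errtype") : null;
    }

    public static void main(String[] args) {
        System.out.println(extractErrtype(
                "NullPointerException in CheckoutService.placeOrder at line 142"));
        System.out.println(extractErrtype(
                "Payment failed: connection timeout to payment gateway after 30000ms"));
    }
}
```

The first call prints NullPointerException. The second prints null because the message contains no word ending in Exception.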

Example 2: Extracting multiple words using max_match

The following query uses the rex command with the max_match parameter to extract multiple words from the body field. The extracted field is returned as an array of strings:

source=otellogs
| where severityText = 'WARN'
| rex field=body "(?<word>[A-Za-z]+)" max_match=3
| fields body, word

The query returns the following results:

| body | word |
| --- | --- |
| Slow query detected: SELECT * FROM products WHERE category = 'electronics' took 3200ms | [Slow,query,detected] |
| Connection pool 80% utilized on database replica db-replica-02 | [Connection,pool,utilized] |
| SSL certificate for api.example.com expires in 14 days | [SSL,certificate,for] |
| Rate limit threshold reached: 450/500 requests per minute for API key ending in …abc789 | [Rate,limit,threshold] |
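The effect of max_match=3 can be approximated in plain Java by calling find() repeatedly and stopping after three matches. This is a sketch of the idea only; the helper name is hypothetical and the actual rex implementation may differ:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaxMatchSketch {
    // Same pattern as in the query above.
    static final Pattern WORD = Pattern.compile("(?<word>[A-Za-z]+)");

    // Collects the word group from successive matches, stopping once
    // maxMatch values have been gathered (maxMatch is assumed to be >= 1).
    static List<String> extractWords(String body, int maxMatch) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(body);
        while (words.size() < maxMatch && m.find()) {
            words.add(m.group("word"));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extractWords(
                "Connection pool 80% utilized on database replica db-replica-02", 3));
    }
}
```

The token 80% contains no letters, so it is skipped and the third match is utilized, which agrees with the second row of the results.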

Example 3: Replacing text using sed mode

The following query uses sed mode to mask IP addresses in log messages for privacy compliance:

source=otellogs
| where LIKE(body, '%authenticated%') OR LIKE(body, '%credentials%')
| rex field=body mode=sed "s/[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+/xxx.xxx.xxx.xxx/"
| fields body

The query returns the following results:

body
User U300 authenticated via OAuth2 from xxx.xxx.xxx.xxx
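The substitution can be sketched in plain Java with the same regular expression. Two assumptions are made here. First, the unmasked IP addresses are hypothetical, because the source shows only the masked output. Second, the sketch follows the usual sed convention that s/.../.../ without a g flag replaces only the first occurrence; this page does not list which flags rex supports.

```java
import java.util.regex.Pattern;

class SedSketch {
    // Same regular expression as in the sed expression above.
    static final Pattern IPV4 =
            Pattern.compile("[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+");
    static final String MASK = "xxx.xxx.xxx.xxx";

    // Analogous to s/pattern/replacement/ (first occurrence only).
    static String maskFirst(String body) {
        return IPV4.matcher(body).replaceFirst(MASK);
    }

    // Analogous to s/pattern/replacement/g in conventional sed.
    static String maskAll(String body) {
        return IPV4.matcher(body).replaceAll(MASK);
    }

    public static void main(String[] args) {
        // 10.0.0.1 and 10.0.0.2 are made-up addresses for illustration.
        System.out.println(maskFirst(
                "User U300 authenticated via OAuth2 from 10.0.0.1"));
        System.out.println(maskFirst("10.0.0.1 -> 10.0.0.2"));
        System.out.println(maskAll("10.0.0.1 -> 10.0.0.2"));
    }
}
```

If a message can contain more than one address, the difference between the two variants matters: masking only the first occurrence would leave later addresses visible.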

Example 4: Tracking match positions using offset_field

The following query records the character positions at which matches occur, which is useful for highlighting matches in a UI:

source=otellogs
| where severityText = 'ERROR'
| rex field=body "(?<errtype>[A-Z][a-zA-Z]+Exception)" offset_field=pos
| where NOT ISNULL(errtype)
| fields body, errtype, pos

The query returns the following results:

| body | errtype | pos |
| --- | --- | --- |
| NullPointerException in CheckoutService.placeOrder at line 142 | NullPointerException | errtype=0-19 |
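Judging from this result, the offset value appears to be the group name followed by the zero-based start offset and the inclusive end offset: NullPointerException is 20 characters long and occupies positions 0 through 19. The following plain Java sketch reproduces that value under this assumption; the helper name is hypothetical and the format is inferred from the example rather than from a specification:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OffsetSketch {
    // Same pattern as in the query above.
    static final Pattern ERRTYPE =
            Pattern.compile("(?<errtype>[A-Z][a-zA-Z]+Exception)");

    // Formats the first match as "errtype=<start>-<end>", or returns
    // null when there is no match.
    static String offsets(String body) {
        Matcher m = ERRTYPE.matcher(body);
        if (!m.find()) {
            return null;
        }
        // Matcher.end() is exclusive, so subtract 1 for an inclusive end.
        int start = m.start("errtype");
        int endInclusive = m.end("errtype") - 1;
        return "errtype=" + start + "-" + endInclusive;
    }

    public static void main(String[] args) {
        System.out.println(offsets(
                "NullPointerException in CheckoutService.placeOrder at line 142"));
    }
}
```

A UI that highlights matches would need to account for the inclusive end when slicing the original string, for example by using endInclusive + 1 as the exclusive bound.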

Capture group names cannot contain underscores because of Java regex limitations. For example, (?<error_type>\w+) is invalid; use (?<errortype>\w+) instead.
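The restriction comes from the Java regex engine itself, which accepts only letters and digits in a group name and requires the first character to be a letter. A small plain Java check illustrates this; the helper name is hypothetical:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class GroupNameSketch {
    // Returns true if the Java regex engine accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("(?<error_type>\\w+)")); // false
        System.out.println(compiles("(?<errortype>\\w+)"));  // true
        System.out.println(compiles("(?<errorType2>\\w+)")); // true
    }
}
```

Mixed case or a trailing digit, as in errorType2, is one way to keep longer group names readable without an underscore.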

For detailed Java regex pattern syntax and usage, refer to the official Java Pattern documentation.