Parse XML processor
The parse_xml processor parses XML data for an event.
Configuration
You can configure the parse_xml processor with the following options.
| Option | Required | Type | Description |
|---|---|---|---|
| source | No | String | Specifies which event field to parse. |
| destination | No | String | The destination field for the parsed XML. Defaults to the root of the event. Cannot be "", /, or any whitespace-only string because these are not valid event fields. |
| pointer | No | String | A JSON pointer to the field to be parsed. The value is null by default, meaning that the entire source is parsed. The pointer can also access JSON array indexes. If the JSON pointer is invalid, then the entire source data is parsed into the outgoing event object. If the key that is pointed to already exists in the event object and the destination is the root, then the pointer uses the entire path of the key. |
| parse_when | No | String | Specifies the condition under which the processor performs parsing. Default is no condition. Accepts an OpenSearch Data Prepper expression string that follows the expression syntax. |
| overwrite_if_destination_exists | No | Boolean | Overwrites the destination if set to true. Set to false to prevent changing a destination value that already exists. Defaults to true. |
| delete_source | No | Boolean | If set to true, deletes the source field. Defaults to false. |
| tags_on_failure | No | List | A list of strings specifying the tags to set on the event when the processor fails or an unknown exception occurs during parsing. |
| handle_failed_events | No | String | Determines how to handle events containing XML processing errors. Valid values are skip (log the error and send the event downstream to the next processor) and skip_silently (send the event downstream to the next processor without logging the error). Default is skip. |
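The following sketch shows how the parse_when, tags_on_failure, and handle_failed_events options from the table above might be combined. The pipeline name, the type field used in the condition, and the tag value are hypothetical and chosen only for illustration:

```yaml
# Illustrative sketch: the pipeline name, the "type" field, and the tag value are assumptions.
conditional-parse-xml-pipeline:
  source:
    stdin:
  processor:
    - parse_xml:
        source: "my_xml"
        # Parse only events whose (assumed) "type" field equals "xml".
        parse_when: '/type == "xml"'
        # Tag events that fail parsing so they can be identified downstream.
        tags_on_failure: ["xml_parse_failure"]
        # Send failed events to the next processor without logging the error.
        handle_failed_events: "skip_silently"
  sink:
    - stdout:
```

With a configuration along these lines, events that do not match the condition would be expected to pass through unparsed, while events with invalid XML would continue downstream carrying the failure tag.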
Usage
The following examples show how to use the parse_xml processor in your pipeline.
Example: Minimum configuration
The following example shows the minimum configuration for the parse_xml processor:
parse-xml-pipeline:
  source:
    stdin:
  processor:
    - parse_xml:
        source: "my_xml"
  sink:
    - stdout:
When the input event contains the following data:
{ "my_xml": "<Person><name>John Doe</name><age>30</age></Person>" }
The processor parses the event into the following output:
{ "name": "John Doe", "age": "30" }
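As a variation on the minimum configuration, the sketch below writes the parsed XML to a dedicated field instead of the event root. The pipeline name and the person destination field are hypothetical; the options themselves are the ones described in the configuration table:

```yaml
# Illustrative sketch: the pipeline name and the "person" field are assumptions.
parse-xml-destination-pipeline:
  source:
    stdin:
  processor:
    - parse_xml:
        source: "my_xml"
        # Write the parsed result to this field rather than the event root.
        destination: "person"
        # Leave an existing "person" value unchanged instead of overwriting it.
        overwrite_if_destination_exists: false
        # Remove the raw XML string once it has been parsed.
        delete_source: true
  sink:
    - stdout:
```

Under these settings, the parsed fields would be expected to appear under person, and the my_xml field would be removed after parsing. The exact output shape is not shown here because it depends on the processor's behavior for your input.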
Metrics
The following table describes common Abstract processor metrics.
| Metric name | Type | Description |
|---|---|---|
| recordsIn | Counter | Metric representing the ingress of records to a pipeline component. |
| recordsOut | Counter | Metric representing the egress of records from a pipeline component. |
| timeElapsed | Timer | Metric representing the time elapsed during execution of a pipeline component. |
The parse_xml processor includes the following custom metrics.
Counter
- parseErrors: The number of parse errors resulting from invalid XML in events. This indicates that the XML format could not be parsed.
- processingFailures: The number of processing failures that have occurred in the parse_xml processor. This indicates unexpected errors not related to invalid XML format.