
File source

The file plugin reads events from a local file once when the pipeline starts. It’s useful for loading seed data, testing processors and sinks, or replaying a fixed dataset. This source does not monitor the file for new lines after startup.

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| path | Yes | String | An absolute path to the input file inside the Data Prepper container, for example, /usr/share/data-prepper/data/input.jsonl. |
| format | No | String | Specifies how to interpret the file content. Valid values are json and plain. Use json when your file has one JSON object per line or a JSON array. Use plain for raw text lines. Default is plain. |
| record_type | No | String | The type of output record produced by the source. Valid values are event and string. Use event to produce structured events expected by downstream processors and the OpenSearch sink. Default is string. |
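
For a quick check of how these options behave before wiring up a real sink, a minimal sketch such as the following may help. It assumes that Data Prepper's stdout sink is available in your distribution and reuses the example path from the path option. The pipeline name file-to-stdout is hypothetical.

```yaml
# Hypothetical pipeline name; adjust the path to match your container.
file-to-stdout:
  source:
    file:
      path: /usr/share/data-prepper/data/input.jsonl
      format: json          # interpret the file content as JSON
      record_type: event    # emit structured events rather than strings
  sink:
    - stdout:               # assumed sink: prints each event to the console
```

Because the source reads the file only once at startup, restart the pipeline to replay the file after changing it.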

Examples

The following examples demonstrate how different file types can be processed.

JSON file

The following example processes a JSON file:

file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/input.ndjson
      format: json
      record_type: event
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: file-demo
        username: admin
        password: admin_pass
        insecure: true

Plain text file

A raw text file can be processed using the following pipeline:

plain-file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/app.log
      format: plain
      record_type: event
  processor:
    - grok:
        match:
          message:
            - '%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{GREEDYDATA:msg}'
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: plain-file-demo
        username: admin
        password: admin_pass
        insecure: true

CSV file

You can process a CSV file using the csv processor:

csv-file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/ingest.csv
      format: plain
      record_type: event
  processor:
    - csv:
        column_names: ["time","level","message"]
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: csv-demo
        username: admin
        password: admin_pass
        insecure: true
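
With format set to plain and record_type set to event, each raw line is expected to arrive under the message key, which is also the field the grok example matches against. The following processor fragment is a sketch that spells out where the csv processor reads from. It assumes the csv processor accepts source and delimiter options with defaults of message and a comma; verify both names against your Data Prepper version.

```yaml
  processor:
    - csv:
        source: message       # assumed default: field holding the raw CSV line
        delimiter: ","        # assumed default: column separator
        column_names: ["time","level","message"]
```

Because one of the column names is also message, the parsed third column is likely to replace the raw line in that field.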
