
Prometheus sink

The Prometheus sink buffers OpenTelemetry metrics and exports them in Prometheus time series format using the Remote Write API. It supports both open-source Prometheus and Amazon Managed Service for Prometheus (AMP).

The prometheus sink processes only metric data. All other data types are sent to the dead-letter queue (DLQ) pipeline, if one is configured.

To ensure compatibility, the Prometheus sink sorts metrics by timestamp within each batch before sending them to the server. It also supports an out-of-order window, which allows ingestion of metrics with older timestamps.
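As a minimal sketch of how the out-of-order window might be tuned, the following fragment sets the out_of_order_time_window option (listed under Configuration) on the sink. The endpoint URL and the 30s value are illustrative assumptions, not recommendations:

```yaml
pipeline:
  # Source and processors omitted
  sink:
    - prometheus:
        url: "https://prometheus.example.com/api/v1/write"  # hypothetical endpoint
        # Assumed value: data points older than 30 seconds relative to the
        # latest point are dropped (the documented default is 10s)
        out_of_order_time_window: 30s
```

A wider window admits more late-arriving data; a narrower one drops late points sooner.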

Usage

The following examples configure the Prometheus sink for different deployment scenarios.

Open-source Prometheus with no authentication

To use an open-source Prometheus instance, provide an https:// URL. To use http://, set insecure to true. No aws block is needed. Prometheus must be started with the --web.enable-remote-write-receiver flag:

pipeline:
  ...
  sink:
    - prometheus:
        url: "http://localhost:9090/api/v1/write"
        insecure: true
        threshold:
          max_events: 1000
          flush_interval: PT5S

Open-source Prometheus with HTTP Basic authentication

To authenticate with HTTP Basic credentials (for example, when Prometheus is behind a reverse proxy with basic authentication enabled), use the authentication block:

pipeline:
  ...
  sink:
    - prometheus:
        url: "https://localhost:9090/api/v1/write"
        authentication:
          http_basic:
            username: "promuser"
            password: "prompass"

AMP

To use AMP, provide the aws configuration block. An https:// URL is required when using AWS authentication:

pipeline:
  ...
  sink:
    - prometheus:
        url: "https://aps-workspaces.us-east-2.amazonaws.com/workspaces/ws-xxxxxxxx-xxxx/api/v1/remote_write"
        aws:
          region: "us-east-2"
          sts_role_arn: "arn:aws:iam::123456789012:role/data-prepper-prometheus-role"
        threshold:
          max_events: 1000
          flush_interval: PT5S

IAM permissions

When using AMP, configure AWS Identity and Access Management (IAM) to grant OpenSearch Data Prepper permission to write to AMP. You can use a policy similar to the following JSON:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "amp-access",
            "Effect": "Allow",
            "Action": [
                "aps:RemoteWrite"
            ],
            "Resource": "arn:aws:aps:<region>:<account-id>:workspace/<workspace-id>"
        }
    ]
}

Configuration

Use the following options when customizing the prometheus sink.

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| url | Yes | String | The Prometheus Remote Write endpoint URL. Supports https:// by default. To use http://, set insecure to true. When aws is configured, https:// is required. |
| insecure | No | Boolean | When set to true, allows http:// URLs. By default, only https:// URLs are permitted. Default is false. |
| encoding | No | String | The compression format used for requests. Only snappy is supported. Default is snappy. |
| remote_write_version | No | String | The version of the Prometheus remote write protocol. Only 0.1.0 is supported. |
| content_type | No | String | The MIME type of the body. Only application/x-protobuf is supported. |
| out_of_order_time_window | No | Duration | The time window allowed for late-arriving data points. Data older than this window relative to the latest point will be dropped. Default is 10s. |
| sanitize_names | No | Boolean | Determines whether metric and label names are sanitized in order to comply with Prometheus naming conventions. Default is true. |
| connection_timeout | No | Duration | The maximum amount of time allowed to establish an HTTP connection. Default is 60s. |
| idle_timeout | No | Duration | The maximum amount of time an idle HTTP connection remains open before being closed. Default is 60s. |
| request_timeout | No | Duration | The maximum amount of time allowed for a full end-to-end HTTP request to complete. Default is 60s. |
| threshold | No | Threshold configuration | Configuration for batching and flushing time-series data. |
| max_retries | No | Integer | The maximum number of attempts for failed ingestion requests. Uses exponential backoff with jitter on retryable status codes (429, 502, 503, or 504). Default is 5. |
| aws | No | AWS configuration | AWS configuration for AWS Signature Version 4 signing. When present, requests are signed with AWS credentials. Cannot be used with authentication. |
| authentication | No | Authentication configuration | HTTP Basic authentication credentials. Cannot be used with aws. |
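The following fragment is a sketch of how the timeout and retry options might be combined on a single sink. The endpoint URL and every non-default value are assumptions chosen for illustration:

```yaml
pipeline:
  # Source and processors omitted
  sink:
    - prometheus:
        url: "https://prometheus.example.com/api/v1/write"  # hypothetical endpoint
        connection_timeout: 10s   # assumed value; default is 60s
        request_timeout: 30s      # assumed value; default is 60s
        idle_timeout: 60s         # same as the default
        max_retries: 3            # assumed value; default is 5
        sanitize_names: true      # same as the default
```

Shorter timeouts with fewer retries surface a failing endpoint sooner, at the cost of giving up earlier on a slow one.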

Threshold configuration

Use the following options to configure batching and flushing behavior for the Prometheus sink.

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| max_events | No | Integer | The maximum number of events to accumulate before flushing to Prometheus. Default is 1000. |
| max_request_size | No | String | The maximum size of the request payload before flushing. Default is 1mb. |
| flush_interval | No | Duration | The maximum amount of time to wait before flushing events. Default is 10s. |
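As a sketch, the following fragment sets all three thresholds together. The endpoint URL and the values are assumptions, and the comments reflect the likely reading that a flush happens when any one limit is reached:

```yaml
pipeline:
  # Source and processors omitted
  sink:
    - prometheus:
        url: "https://prometheus.example.com/api/v1/write"  # hypothetical endpoint
        threshold:
          max_events: 500         # assumed value; flush after 500 buffered events
          max_request_size: 2mb   # assumed value; flush when the payload reaches 2 MB
          flush_interval: PT2S    # assumed value; flush at least every 2 seconds
```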

AWS configuration

When an aws block is present, requests are automatically signed with Signature Version 4. An https:// URL is required. The AWS configuration supports the following options.

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| region | No | String | The AWS Region to use for credentials. Defaults to standard SDK behavior to determine the region. |
| sts_role_arn | No | String | The STS role to assume for requests to AWS. Defaults to null, which uses standard SDK credential behavior. |
| sts_header_overrides | No | Map | A map of header overrides to make when assuming the IAM role. |
| sts_external_id | No | String | An optional external ID to use when assuming the IAM role. |
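The following sketch extends the AMP example with the optional role-assumption settings. The external ID, header name, and header value are hypothetical placeholders:

```yaml
pipeline:
  # Source and processors omitted
  sink:
    - prometheus:
        url: "https://aps-workspaces.us-east-2.amazonaws.com/workspaces/ws-xxxxxxxx-xxxx/api/v1/remote_write"
        aws:
          region: "us-east-2"
          sts_role_arn: "arn:aws:iam::123456789012:role/data-prepper-prometheus-role"
          sts_external_id: "example-external-id"    # hypothetical external ID
          sts_header_overrides:
            x-example-header: "example-value"       # hypothetical header override
```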

Authentication configuration

The authentication block supports HTTP Basic authentication. It cannot be used together with aws (Signature Version 4 signing). The authentication configuration supports the following options.

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| http_basic.username | Yes | String | The username for HTTP Basic authentication. |
| http_basic.password | Yes | String | The password for HTTP Basic authentication. |