AWS Lambda sink

This page explains how to configure and use AWS Lambda with OpenSearch Data Prepper, enabling Lambda functions to serve as both processors and sinks.

Configuration

Configure the Lambda sink using the following parameters.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| function_name | String | Yes | The name of the AWS Lambda function to invoke. |
| invocation_type | String | No | Specifies the invocation type. Default is event. |
| aws.region | String | Yes | The AWS Region in which the Lambda function is located. |
| aws.sts_role_arn | String | No | The Amazon Resource Name (ARN) of the role to assume before invoking the Lambda function. |
| max_retries | Integer | No | The maximum number of sink-level retries if the Lambda invocation fails. This controls Data Prepper’s retry logic. Default is 3. |
| client.max_retries | Integer | No | The maximum number of AWS SDK client-level retries for individual API calls. This controls the underlying SDK retry mechanism for network or service errors. Default is 3. |
| client.api_call_timeout | Duration | No | The total timeout for the entire API call including all retries. Default is 60s. |
| client.api_call_attempt_timeout | Duration | No | The timeout for each individual retry attempt. If not specified, AWS SDK defaults are used. |
| client.connection_timeout | Duration | No | The SDK connection timeout. Default is 60s. |
| client.read_timeout | Duration | No | The amount of time the SDK waits for data to be read from an established connection. If not specified, AWS SDK defaults are used. |
| client.max_concurrency | Integer | No | The maximum number of concurrent threads in the client. Default is 200. |
| client.base_delay | Duration | No | The base delay for the exponential backoff. Default is 100ms. |
| client.max_backoff | Duration | No | The maximum backoff time for the exponential backoff. Default is 20s. |
| batch | Object | No | Optional batch settings for Lambda invocations. Default is key_name=events. Default threshold is event_count=100, maximum_size="5mb", and event_collect_timeout=10s. |
| lambda_when | String | No | A conditional expression that determines when to invoke the Lambda sink. |
| dlq | Object | No | The dead-letter queue (DLQ) configuration for failed invocations. |

Example configuration

sink:
  - aws_lambda:
      function_name: "my-lambda-sink"
      invocation_type: "event"
      aws:
        region: "us-west-2"
        sts_role_arn: "arn:aws:iam::123456789012:role/my-lambda-sink-role"
      max_retries: 5
      client:
        max_retries: 3
        api_call_timeout: PT60S
        api_call_attempt_timeout: PT30S  # Optional: per-attempt timeout
        connection_timeout: PT60S
        read_timeout: PT15M              # Optional: for long-running Lambda functions
        max_concurrency: 200
        base_delay: PT0.1S
        max_backoff: PT20S
      batch:
        key_name: "events"
        threshold:
          event_count: 50
          maximum_size: "3mb"
          event_collect_timeout: PT5S
      lambda_when: "event['type'] == 'log'"
      dlq:
        region: "us-east-1"
        sts_role_arn: "arn:aws:iam::123456789012:role/my-sqs-role"
        bucket: "<<your-dlq-bucket-name>>"
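
For comparison with the full example, the following is a minimal sketch that sets only the two fields marked as required in the parameter table. The function name and Region are placeholders, and every omitted option is assumed to fall back to the defaults listed in the table.

```yaml
sink:
  - aws_lambda:
      # Required: name of the Lambda function to invoke (placeholder value)
      function_name: "my-lambda-sink"
      aws:
        # Required: Region where the function is deployed (placeholder value)
        region: "us-west-2"
      # Omitted options use the documented defaults, for example
      # invocation_type: event and max_retries: 3.
```

Starting from a minimal block like this and adding options one at a time can make it easier to see which setting changes the sink's behavior.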

Timeout configuration

The AWS Lambda sink supports multiple timeout layers following AWS SDK best practices:

  • api_call_timeout: The total amount of time for the entire API call including all retries.
  • api_call_attempt_timeout: The time limit for each individual attempt.
  • read_timeout: The amount of time to wait for data from an established connection.

For Lambda functions that run for longer than 60 seconds, configure both api_call_timeout and read_timeout to appropriate values.
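
As an illustration of how these timeouts relate to each other, the following sketch targets a hypothetical function that takes up to about 4 minutes to respond. The function name and all numbers are assumptions chosen for the arithmetic, not recommended values. The sketch also assumes that client.max_retries counts retries after the first attempt, so 2 retries allow up to 3 attempts.

```yaml
sink:
  - aws_lambda:
      function_name: "my-long-running-function"  # hypothetical name
      invocation_type: "request-response"        # the sink waits for the function to finish
      aws:
        region: "us-west-2"
      client:
        max_retries: 2                  # assumed to mean up to 3 attempts in total
        api_call_attempt_timeout: PT5M  # per attempt: above the ~4 minute run time
        read_timeout: PT5M              # keep the connection open while the function runs
        api_call_timeout: PT16M         # total: 3 attempts x 5 minutes, plus backoff headroom
```

The point of the sketch is the ordering: read_timeout and api_call_attempt_timeout should each exceed the expected function duration, and api_call_timeout should cover all attempts combined.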

Usage

The Lambda sink supports the following invocation types and related features:

  • event (Default): Executes functions asynchronously without waiting for responses.
  • request-response (Sink only): Executes functions synchronously, though responses are not processed.
  • batch: Automatically groups events based on configured thresholds.
  • dlq: Supports the DLQ configuration for failed invocations after retry attempts.
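
The options above can be combined in one sink definition. The following sketch uses synchronous invocation and writes out the batch defaults from the parameter table explicitly, so the grouping behavior is visible in the configuration. The function name is a placeholder, and the durations use the same ISO 8601 notation as the example configuration.

```yaml
sink:
  - aws_lambda:
      function_name: "my-lambda-sink"      # placeholder name
      invocation_type: "request-response"  # synchronous; responses are not processed
      aws:
        region: "us-west-2"
      batch:
        key_name: "events"                 # documented default
        threshold:
          event_count: 100                 # documented default
          maximum_size: "5mb"              # documented default
          event_collect_timeout: PT10S     # documented default of 10s
```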

Data Prepper assumes the AWS Identity and Access Management (IAM) role specified in aws.sts_role_arn to invoke the Lambda function securely, and it respects Lambda’s concurrency limits during event processing. For more information, see the AWS Lambda documentation.

Developer guide

Integration tests must be executed separately from the main Data Prepper build. Execute them with the following command:

./gradlew :data-prepper-plugins:aws-lambda:integrationTest -Dtests.sink.lambda.region="us-east-1" -Dtests.sink.lambda.functionName="lambda_test_function" -Dtests.sink.lambda.sts_role_arn="arn:aws:iam::123456789012:role/dataprepper-role"
