
Olawale Adepoju for AWS Community Builders


Kinesis Producers


A producer for Amazon Kinesis Data Streams is an application that feeds user data records into a Kinesis data stream (also called data ingestion). The Kinesis Producer Library (KPL) makes it easier to construct producer applications by allowing developers to achieve high write throughput to a Kinesis data stream.

There are several ways to stream data into Amazon Kinesis Data Streams:

  • Kinesis SDK
  • Kinesis Producer Library (KPL)
  • Kinesis Agent

Third-party libraries and tools that can also act as producers include:

Spark, Log4J Appenders, Flume, Kafka Connect, NiFi

Kinesis Producer SDK - PutRecord(s)

  • The SDK exposes two APIs: PutRecord (one record) and PutRecords (many records).
  • PutRecords batches several records into one request, which means fewer HTTP calls and higher throughput.
  • Also available through the AWS Mobile SDKs (Android, iOS, etc.)
  • Managed Amazon Web Services sources for Kinesis Data Streams:

    • AWS IoT
    • CloudWatch Logs
    • Kinesis Data Analytics

Use cases:
low-throughput producers, workloads that tolerate higher latency, cases where a simple API is preferred, AWS Lambda
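
To make the batching point concrete, here is a minimal Python sketch of sending many records with as few PutRecords calls as possible. It assumes `client` is a boto3 Kinesis client (for example `boto3.client("kinesis")`). The helper names `to_entry`, `chunk`, and `put_all` are made up for illustration. The sketch only respects the 500-records-per-request limit; it does not check payload size limits, and it does not retry failed records.

```python
from typing import Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_entry(data: bytes, partition_key: str) -> dict:
    """Build one PutRecords entry in the shape the Kinesis API expects."""
    return {"Data": data, "PartitionKey": partition_key}


def chunk(entries: List[dict], size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` entries."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


def put_all(client, stream_name: str, entries: List[dict]) -> int:
    """Send every entry, one PutRecords call per chunk.

    Returns the total number of records the service reported as failed.
    """
    failed = 0
    for batch in chunk(entries):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed
```

With 1,200 entries this makes three HTTP calls instead of the 1,200 that PutRecord would need. Note that PutRecords can succeed for some records and fail for others in the same call, so a real producer should inspect the response rather than only counting failures.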

Kinesis Producer Library (KPL)

  • Easy to use and highly configurable C++/Java library
  • Used for building high-performance, long-running producers
  • Automated and configurable retry mechanism
  • Synchronous or Asynchronous APIs (better performance for async)
  • Submits metrics to CloudWatch for monitoring.
  • Batching (both kinds turned on by default) increases throughput and decreases cost:
    • Collect: gather records and write to multiple shards in the same PutRecords API call.
    • Aggregate: pack multiple user records into a single Kinesis record, at the cost of increased latency.

Kinesis Producer Library (KPL) Batching

Batching efficiency can be tuned by introducing some delay with RecordMaxBufferedTime (default 100 ms): the longer records may wait in the buffer, the more of them fit into each batch.


NOTE: When not to use the Kinesis Producer Library

  • The KPL can incur an additional processing delay of up to RecordMaxBufferedTime within the library (user-configurable)
  • Larger values of RecordMaxBufferedTime result in higher packing efficiencies and better performance
  • Applications that cannot tolerate this additional delay may need to use the AWS SDK directly
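
The trade-off in this note can be sketched with a toy buffer in Python. This is not the KPL's implementation, only an illustration under simplifying assumptions: the class name `TimedBuffer` is hypothetical, and the deadline is checked only when a new record arrives, whereas the real library flushes on its own timer.

```python
import time
from typing import Callable, List, Optional


class TimedBuffer:
    """Hold records until the oldest one has waited `max_buffered_ms`."""

    def __init__(self, max_buffered_ms: float = 100.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_buffered_s = max_buffered_ms / 1000.0
        self.clock = clock
        self.records: List[bytes] = []
        self.oldest: Optional[float] = None  # arrival time of the first buffered record

    def add(self, record: bytes) -> Optional[List[bytes]]:
        """Buffer a record; return a batch to send once the deadline has passed."""
        now = self.clock()
        if self.oldest is None:
            self.oldest = now
        self.records.append(record)
        if now - self.oldest >= self.max_buffered_s:
            return self.flush()
        return None

    def flush(self) -> List[bytes]:
        """Hand back everything buffered so far and start a new batch."""
        batch, self.records, self.oldest = self.records, [], None
        return batch
```

A larger `max_buffered_ms` lets more records accumulate per batch (better packing), but the first record in each batch waits up to that long before it is sent. That wait is the extra delay a latency-sensitive application cannot accept.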


Kinesis Agent

  • Monitors log files and sends them to Kinesis Data Streams
  • Java-based agent, built on top of the KPL
  • Installs on Linux-based server environments

Features:

  • Write from multiple directories and write to multiple streams
  • Routing feature based on directory/log file
  • Pre-process data before sending to streams (single line, CSV to JSON, log to JSON)
  • The agent handles file rotation, checkpointing, and retry upon failures
  • Emits metrics to CloudWatch for monitoring
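
As a rough illustration of the routing and pre-processing features, the Python snippet below builds an agent configuration with two flows and prints it as JSON. The file paths, stream names, and field names are hypothetical. The setting names (`flows`, `filePattern`, `kinesisStream`, `dataProcessingOptions`, `optionName`, `cloudwatch.emitMetrics`) follow the agent's configuration format as I understand it, so check them against the Kinesis Agent documentation before relying on them.

```python
import json

# Each flow maps a file pattern to a destination stream (routing by directory/log file).
agent_config = {
    "cloudwatch.emitMetrics": True,
    "flows": [
        {
            # Hypothetical path and stream name: Apache access logs converted to JSON
            "filePattern": "/var/log/app/access.log*",
            "kinesisStream": "access-log-stream",
            "dataProcessingOptions": [
                {"optionName": "LOGTOJSON", "logFormat": "COMMONAPACHELOG"}
            ],
        },
        {
            # Hypothetical path and stream name: CSV rows converted to JSON objects
            "filePattern": "/var/log/app/orders/*.csv",
            "kinesisStream": "orders-stream",
            "dataProcessingOptions": [
                {"optionName": "CSVTOJSON", "customFieldNames": ["order_id", "amount"]}
            ],
        },
    ],
}

print(json.dumps(agent_config, indent=2))
```

The printed JSON is the kind of content that would go into the agent's configuration file (typically `/etc/aws-kinesis/agent.json` on a default install).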

AWS Kinesis API - Exceptions

ProvisionedThroughputExceededException

  • Happens when a producer sends more data than a shard allows (exceeding the MB/s or TPS limit of any shard)
  • Make sure you don't have a hot shard (for example, a poorly chosen partition key that routes too much data to one shard)

Solution:

  • Retry with backoff
  • Increase the number of shards (scaling)
  • Ensure your partition key distributes records evenly across shards
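
A minimal Python sketch of the "retries with backoff" option follows. It assumes `client` is a boto3 Kinesis client and that `entries` are dicts with `Data` and `PartitionKey` keys. The function name `put_with_backoff` and its default parameters are illustrative choices, and full jitter is just one common backoff strategy, not something the original post prescribes.

```python
import random
import time
from typing import Callable, List


def put_with_backoff(client, stream_name: str, entries: List[dict],
                     max_attempts: int = 5, base_delay: float = 0.1,
                     sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Send entries with PutRecords, retrying only the ones that failed.

    Returns the entries that still failed after `max_attempts` tries.
    """
    pending = list(entries)
    if not pending:
        return []
    for attempt in range(max_attempts):
        response = client.put_records(StreamName=stream_name, Records=pending)
        if response["FailedRecordCount"] == 0:
            return []
        # Results come back in request order; failed entries carry an ErrorCode
        # such as ProvisionedThroughputExceededException.
        pending = [entry for entry, result in zip(pending, response["Records"])
                   if "ErrorCode" in result]
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: wait between 0 and base * 2^attempt seconds
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return pending
```

Retrying only the failed entries avoids writing duplicates of records that already succeeded. If the same records keep failing, backoff alone will not help: that usually points to a hot shard or too few shards, which is what the other two options address.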
