Joanna Wallace

How to Track DevOps Events with AWS Kinesis Data Streams

To run a cloud platform in production, your team needs to know how things are running. There are seemingly endless metrics, measurements, and logs to analyze to ensure the platform is running as it should. Keeping clients satisfied so they continue using your platform is the goal of any cloud company.

Along with the significant amount of data you can collect from your system comes an equally large number of tools to collect it with. On AWS alone, you might make use of Lambda, CloudTrail, CloudWatch, and X-Ray, and each of these tools has its own subset of tools useful for tracking your information. However, the most interesting thing is not the individual data points but the analysis of that data. To be analyzed properly, data needs to be accessible to the same analytical tools. AWS Kinesis Data Streams provides a way to amalgamate data quickly and efficiently for this analysis.

Features of Kinesis Data Streams

Kinesis Data Streams have many features that suit a wide range of use cases. In this article, we highlight the features especially critical for analyzing platform health.

Real-Time Streaming Performance

Kinesis Data Streams allow data to flow through the queue at very high speeds. Each shard can consume 1MB/s of input and provide 2MB/s of output. AWS also limits input by the number of writes: 1,000 PUT records per second per shard. If you require more throughput, add shards; each new shard adds its full capacity to the stream.
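These per-shard limits make capacity planning a simple calculation. As a sketch (the function name and its rounding choices are my own), a stream needs enough shards to cover both the byte rate and the record rate:

```javascript
// Estimate the shard count a stream needs, based on the per-shard
// limits: 1MB/s of input and 1,000 PUT records per second.
function requiredShards(inputMBPerSec, recordsPerSec) {
  const byInput = Math.ceil(inputMBPerSec / 1);      // 1MB/s input per shard
  const byRecords = Math.ceil(recordsPerSec / 1000); // 1,000 records/s per shard
  return Math.max(byInput, byRecords, 1);            // whichever limit binds
}

// A producer writing 3.5MB/s across 1,500 records/s is
// byte-bound and needs 4 shards.
console.log(requiredShards(3.5, 1500));
```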

With this throughput, streaming from a Kinesis Data Stream into a real-time analytics process can produce fast results. For DevOps security, notifications reach users quickly, so teams can address problems early, even while they are still occurring. This speed can significantly shorten the downtime of your platform.

Easily Scale Capacity

Your platform may require different capacity settings based on predicted or spontaneous usage spikes. Kinesis Data Streams can dynamically scale with capacity ranging from the megabytes available with a single shard up to terabytes. The number of PUT requests can also scale up to millions of records per second. This scaling capacity means Kinesis can grow as your platform gains users and requires more throughput. You can stick with the same tool as your business grows and not need to rebuild infrastructure as you scale.
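Resharding an existing stream can be done from the AWS CLI with the real `update-shard-count` command; the stream name below is an example:

```shell
# Scale a hypothetical stream to 4 shards.
# UNIFORM_SCALING splits or merges shards evenly across the key space.
aws kinesis update-shard-count \
  --stream-name devops-events \
  --target-shard-count 4 \
  --scaling-type UNIFORM_SCALING
```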

Resource-Linked Cost

Like many AWS services, Kinesis Data Streams charges for what you use. For each shard created, AWS charges per shard-hour; the actual cost depends on the AWS region, ranging from $0.015/shard-hour in Northern Virginia to $0.03/shard-hour in São Paulo. Users are also charged per million PUT payload units, again at region-dependent rates comparable to the shard-hour cost. AWS charges separately for optional features like encryption and extended data retention.

Security and Encryption

AWS encrypts data in transit by default; encrypting data at rest is optional. Developers can choose between managing their own encryption keys or having AWS apply encryption using AWS KMS. For streaming security data, encryption at rest may well be necessary: data from AWS CloudTrail or private user information should be encrypted to limit what an attacker can extract.
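Enabling encryption at rest on an existing stream is a single CLI call; the example below uses the AWS-managed Kinesis key, though a customer-managed KMS key ARN can be supplied instead (stream name is an example):

```shell
# Turn on server-side encryption at rest for the stream,
# using the AWS-managed KMS key for Kinesis.
aws kinesis start-stream-encryption \
  --stream-name devops-events \
  --encryption-type KMS \
  --key-id alias/aws/kinesis
```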

Kinesis Data Streams Versus SQS

Kinesis Data Streams is AWS-managed infrastructure: when setting up the service, you do not need to think about storage, provisioning, or deployment of the stream. Both Data Streams and SQS are AWS-managed queue services, and each suits different requirements and data flows in your cloud platform. Here, we are discussing the analysis of DevOps data to detect security issues, scalability problems, and bugs in your cloud platform, and the features of Kinesis Data Streams make it the better choice for this purpose.

Kinesis provides record ordering, which is not available with standard SQS queues. Each record written to Kinesis is assigned a sequence number, unique per partition key within a shard, and this value guarantees that records arrive at the consumer in the order they were written.

Kinesis also lets the same or new consumers re-read data in the same order: records remain in the stream for a predetermined retention period even after being read. This differs from SQS, which holds a message only until a consumer processes it. Both Kinesis and SQS offer retries for reading data.

With SQS, multiple consumers can read from the same queue, but each message is delivered to only one of them; SQS load-balances messages across consumers. Kinesis, by contrast, provides the same data to all consumers, with throughput calculated for each consumer. If you need real-time speed and have a significant amount of data, consider the enhanced fan-out setting on Kinesis Data Streams. This setting gives each consumer its own throughput capacity without affecting other connected consumers.
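Enhanced fan-out is enabled per consumer by registering it against the stream; the `register-stream-consumer` command is real, while the ARN and consumer name below are placeholders:

```shell
# Register a dedicated consumer that receives its own 2MB/s
# per-shard throughput, independent of other consumers.
aws kinesis register-stream-consumer \
  --stream-arn arn:aws:kinesis:us-east-1:123456789012:stream/devops-events \
  --consumer-name analytics-consumer
```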

Writing to Kinesis Data Streams

AWS Kinesis data streams can collect data from many sources in AWS. Kinesis can then forward data to different analytics tools like the Coralogix log analytics platform.

AWS Lambda and Kinesis Data Streams

AWS Lambda is a serverless compute service managed by AWS, and Lambda functions are commonly used in cloud computing to run the features of your system. Alternatively, developers may choose to run Fargate tasks or EC2 instances. Each of these can interface with Kinesis using a similar methodology.

Compute functions can send data to Kinesis Data Streams for further analysis. This data may include interactions with APIs, data from outside sources, or results from Lambda itself. To write to your Data Stream from Lambda, use the AWS SDK's `putRecord` call, as laid out below.

```javascript
// Requires the AWS SDK for JavaScript (v2): npm install aws-sdk
const AWS = require('aws-sdk');

const kinesis = new AWS.Kinesis();
kinesis.putRecord({
  Data: JSON.stringify({ event: 'deploy', status: 'success' }), // payload (example)
  PartitionKey: 'service-name', // determines which shard receives the record
  StreamName: 'devops-events',  // the target Data Stream (example name)
  // Optional: SequenceNumberForOrdering enforces strict ordering of
  // writes from a single producer to the same partition key.
}).promise();
```

AWS CloudWatch to Kinesis Data Streams

CloudWatch allows users to configure subscriptions. These subscriptions will automatically send data to different AWS services, including Kinesis Data Streams. Subscriptions include filter configurations that allow developers to limit what data is sent to Kinesis.

Developers can also use these filters to send data to different Data Streams, allowing for different processing to occur based on the data’s content. For example, data needed to process DevOps logs may go to a single stream bound for an analytics engine, while user data may go to a different stream bound for long-term storage.

You can set up the subscription to a Kinesis Data Stream using the AWS CLI. The stream must exist before you assign a subscription to it, and the subscription needs an IAM role that allows CloudWatch Logs to write to your stream. For a complete description of the steps to create a CloudWatch subscription, see the AWS documentation.
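As a sketch of that setup, the two CLI steps might look like the following; the commands and flags are real, but the stream name, log group, and ARNs are placeholders you would replace with your own:

```shell
# 1. Create the destination stream (a single shard to start).
aws kinesis create-stream --stream-name devops-events --shard-count 1

# 2. Subscribe a log group to the stream. The role must allow
#    CloudWatch Logs to call kinesis:PutRecord on this stream, and the
#    filter pattern limits which log events are forwarded.
aws logs put-subscription-filter \
  --log-group-name /aws/lambda/my-function \
  --filter-name devops-to-kinesis \
  --filter-pattern "ERROR" \
  --destination-arn arn:aws:kinesis:us-east-1:123456789012:stream/devops-events \
  --role-arn arn:aws:iam::123456789012:role/CWLtoKinesisRole
```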

AWS CloudTrail to Kinesis Data Streams

CloudTrail can be configured to send data directly to AWS S3 or AWS CloudWatch, but not to AWS Kinesis. Since CloudTrail can write directly to CloudWatch, we can use the CloudWatch-to-Kinesis subscription described above to collect CloudTrail data.

If you are creating your CloudTrail from the console, an option to configure a CloudWatch linkage is available. Once turned on, CloudWatch pricing applies to your integration.

Consuming Kinesis Data Streams

Kinesis Data Streams can send data to different consumers for analysis or storage. Available consumers include AWS Lambda, Fargate, and EC2 instances. You can also configure data streams to send directly to another Kinesis product like Kinesis Analytics. Using AWS compute functions and stored data, you can calculate metrics and store information; however, doing this requires significant manual work and foreknowledge of what to look for in the data. Kinesis Analytics makes computation easier by applying user-built SQL or Apache Flink applications to process your data.

Developers can also configure Kinesis to send data to third-party tools that remove the need to set up analytics yourself. Coralogix provides several tools that can analyze different data to produce essential metrics and notifications for your platform. Its security platform can analyze AWS information streaming from Kinesis to give insights about breaches and retrospective analysis of your security weak points, and Coralogix's log analytics system can take CloudWatch data and notify your team when your platform is not performing optimally.

Summary

Cloud platforms use real-time analytics to ensure systems are functioning optimally and securely. AWS services can stream data to AWS Kinesis, a real-time queue with an extensive range of capabilities that can be tuned to accommodate your platform's needs. Kinesis can be set up to send data to multiple endpoints if different analyses are needed on the same data. Consumers of the Kinesis stream perform the analytics: they may be solutions built by your platform team on AWS compute services like Lambda or Fargate, or semi-manual functions built by your team using Kinesis Analytics tools. The most efficient way to perform analytics is often a third-party tool designed for your needs, like Coralogix's security or log analytics platforms.
