There are so many enterprise logging solutions: Sumo Logic, Splunk, Datadog, and many more. Big, complex, powerful, costly logging solutions. But do you really need them?
If you are a Fortune 500 company, maybe yes. But if you are a startup or smaller team, you may have better options.
There are simpler, less costly alternatives for small to medium-sized sites and services running on AWS.
For large enterprises, logging has several key attributes:
- Ability to capture data from a myriad of sources into a single centralized repository
- Ability to ingest vast quantities of data and index into large powerful databases
- Support for multiple and diverse consumers of the logging and metric data
A typical enterprise scenario would be to capture logs from databases, clusters, servers, containers, lambdas, network flow logs, network hardware, gateways, firewalls, etc. Then, by a multiplicity of means, those logs are routed to an enterprise logging service such as SumoSplunkDog or to a massive Elasticsearch cluster with Logstash and Kibana for queries. Finally, a team of monitoring staff would watch over this infrastructure and help internal customers access the data. There would be a suite of tools to assist with logging, metrics and tracing. The net result: a very expensive and complex solution.
Enterprise Logging has its strengths, but what are the weaknesses, especially for smaller companies?
On its way from cloud resources to the enterprise logging service, the logging data is copied multiple times. If originating in AWS, it is being captured and stored first by AWS CloudWatch, then it will be copied, buffered and passed via Lambdas to be ingested by the enterprise logging service. These multiple copies and transfers all incur additional cloud charges. You are paying multiple times to transfer and store the data. Obviously, this can get very expensive with high volumes of data.
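To see how the copies add up, here is a rough back-of-the-envelope sketch. The hop names and per-GB rates are made-up placeholders, not real AWS or vendor prices; substitute the figures from your own bills.

```python
# Rough cost model: every hop that stores or moves the data bills per GB.
# All rates are hypothetical placeholders (in cents per GB), NOT real prices.
HOPS_CENTS_PER_GB = {
    "cloudwatch_ingest": 50,   # captured and stored by CloudWatch
    "lambda_forwarder": 10,    # copied and buffered by a forwarding Lambda
    "data_transfer_out": 9,    # shipped out of AWS
    "vendor_ingest": 100,      # ingested and indexed by the logging service
}


def monthly_cost_cents(gb_per_month, hops):
    """Sum the per-GB charge of every hop for one month of log volume."""
    return gb_per_month * sum(hops.values())


# Compare the full pipeline with keeping the logs in CloudWatch only.
full_pipeline = monthly_cost_cents(100, HOPS_CENTS_PER_GB)
cloudwatch_only = monthly_cost_cents(
    100, {"cloudwatch_ingest": HOPS_CENTS_PER_GB["cloudwatch_ingest"]}
)
print(f"full pipeline:   ${full_pipeline / 100:.2f}")
print(f"CloudWatch only: ${cloudwatch_only / 100:.2f}")
```

The absolute numbers mean nothing here. The point is that the total grows with every hop you add, multiplied by your log volume.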
Multiple copies also mean delays. Buffering means waiting for more data to accumulate before sending it downstream. In short, it is typical for logging data to be delayed several minutes before being available in an enterprise logging solution. The marketing spiel is:
"The Live Tail feature gives you the ability to see all your log events in near real time"
But a delay of minutes is not the same as "near real time". The spiel is baloney. The reality is you will wait minutes, perhaps more than five minutes, to access your logs.
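The way buffering turns into minutes of delay can be sketched with simple arithmetic. The stage names and flush intervals below are hypothetical, chosen only for illustration; they are not measurements of any particular product.

```python
# Worst-case delay model: a log event can wait up to one full flush
# interval in each buffer before it moves downstream.
# Stage names and intervals (in seconds) are hypothetical.
STAGES_SECONDS = {
    "agent_batch": 5,
    "forwarder_buffer": 60,
    "vendor_ingest_queue": 120,
    "vendor_indexing": 60,
}


def worst_case_delay(stages):
    """Return the total seconds an event may wait across all stages."""
    return sum(stages.values())


def describe(seconds):
    """Format a duration in seconds as minutes and seconds."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs}s"


print(describe(worst_case_delay(STAGES_SECONDS)))
```

Under these assumed intervals, four modest buffers already add up to more than four minutes.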
Before you plonk down serious $$ on an enterprise logging solution, look closely at what AWS has to offer. Smaller sites still need centralized logging, metrics, alarms and a status dashboard. But on a much smaller scale. AWS has such an offering: CloudWatch.
If you are using AWS as your cloud provider, you are already using CloudWatch whether you know it or not. AWS CloudWatch is the unified monitoring service for AWS services, but you can also use it for all your cloud services and applications. AWS CloudWatch collects and stores operational metrics and log files from resources such as EC2 instances, RDS databases, VPCs, Lambda functions and many other services. With CloudWatch, you can monitor your AWS account and resources and generate a stream of events or trigger alarms and actions for specific conditions. To learn more about CloudWatch, read: What is Amazon CloudWatch.
CloudWatch is composed of two key services:
- A Logging service to capture, store and manage service and application logs.
- A unified metrics service to capture and manage resource performance and operational metrics.
Both the metrics service and logging service are solid but basic offerings.
The logging service captures, ingests and stores AWS and custom logs. It manages retention and provides multiple options to export logs to other services.
The metrics service provides essential operational metrics for AWS services. The metrics can trigger alarms, issue notifications, invoke automated actions and be rendered graphically via dashboards.
The key strengths of CloudWatch are:
- Single point of capture, storage and management of logs within a region
- Complete set of metrics, queries, alarms, events and dashboards
Okay, but what does CloudWatch lack?
- Logs are not aggregated
- The viewer is basic and slow
- Fetching log data is manual and not transparent
- All components are basic offerings
CloudWatch stores logs in a three-level hierarchy of regions, log groups and log streams. Unfortunately, these are not aggregated into a unified view. Consequently, finding the right log stream can be difficult, especially when using AWS Lambda, which creates a multitude of streams.
Logs are regional in scope, i.e. they are stored and accessed via the AWS region in which they were captured. There is no global view of a log across all regions.
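To make "aggregated" concrete, here is a minimal sketch of the unified view that is missing: events from several streams interleaved into one timeline. It assumes each event carries a millisecond `timestamp` and a `message`, as CloudWatch Logs events do. The stream names and sample events are invented, and `merge_streams` is a hypothetical helper, not part of any AWS SDK.

```python
import heapq
from typing import Dict, Iterator, List

# Each event has a millisecond "timestamp" and a "message".
# The stream names and events used below are sample data.
Event = Dict[str, object]


def merge_streams(streams: Dict[str, List[Event]]) -> Iterator[Event]:
    """Interleave events from many streams into one timeline, oldest first.

    Assumes each stream's list is already sorted by timestamp.
    """
    # Tag every event with the stream it came from so the origin survives.
    tagged = (
        [dict(event, stream=name) for event in events]
        for name, events in streams.items()
    )
    return heapq.merge(*tagged, key=lambda event: event["timestamp"])


streams = {
    "us-east-1/app/stream-a": [
        {"timestamp": 1000, "message": "request start"},
        {"timestamp": 3000, "message": "request end"},
    ],
    "eu-west-1/app/stream-b": [
        {"timestamp": 2000, "message": "cache miss"},
    ],
}

timeline = list(merge_streams(streams))
for event in timeline:
    print(event["timestamp"], event["stream"], event["message"])
```

The merge itself is cheap. The hard part in practice is fetching the events from every region, log group and stream in the first place.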
The AWS CloudWatch Logs viewer is fairly basic and presents the log data for a single log stream one page at a time. The viewer is slow, and this compounds the lack of log aggregation. It typically takes 2-4 seconds to load a new page of log events. Locating a specific log event when the owning log stream is unknown is a slow, laborious exercise.
CloudWatch Logs Insights offers a compound query capability, but a single query over a moderate amount of log data can take over a minute.
While developing serverless cloud applications at SenseDeep, we became frustrated accessing CloudWatch Logs. We wanted a fast log viewer that supported smooth infinite scrolling and structured log data presentation and we wanted more powerful queries. So we created SenseDeep, a CloudWatch Logs viewer that runs blazingly fast, 100% in your browser. It transparently downloads and stores log events in your browser application cache for immediate viewing without delays. It offers smooth scrolling, live tail and powerful structured queries.
SenseDeep builds upon the solid foundation of CloudWatch's log capture and storage and transforms your ability to quickly gain insights about your apps.
Together, AWS CloudWatch and SenseDeep provide a comprehensive logging solution for small to medium sites and are an effective, simpler alternative to traditional enterprise logging solutions.
Please let us know what you think; we thrive on feedback: firstname.lastname@example.org.