Whether you are a solo full-stack developer or a member of a team, your toolkit needs to have software that monitors your applications, infrastructure, managed services, and third-party dependencies.
Here is a list of 14 monitoring tools you can use to gain insights into your applications’ performance, reliability, and uptime. Some of these are hosted, and some of them can be self-hosted.
Apache SkyWalking
Apache SkyWalking is an open-source APM tool meant for distributed systems. It supports distributed tracing, agents in multiple languages, and an eBPF agent.
SkyWalking has its own native APM database called BanyanDB which can ingest and store telemetry and observability data. It also allows you to parse logs and extract metrics from log entries.
One of the important features of SkyWalking is its ability to ingest data from other sources in well-known formats like OpenTelemetry. It can also forward data to external services like alerting systems. This allows you to plug in SkyWalking without replacing your other tools.
Better Stack
Better Stack is a managed log aggregation system that can ingest logs from your sources, run search queries, and set up alerts on queries. It also comes with hosted status pages.
The alerting feature of Better Stack supports multiple team members as well as integration with third-party tools like PagerDuty and Zendesk. You can also pull data from external cloud services like GCP, AWS, and Azure to create incidents in Better Stack.
In addition, Better Stack also supports website monitoring.
ELK (Elasticsearch/Logstash/Kibana)
This stack consists of three components: Elasticsearch, the search and analytics engine that stores and indexes the data; Logstash, the pipeline that ingests and processes logs; and Kibana, the UI.
Elasticsearch supports advanced log aggregation features with support for indexing, sharding, and clustering. It also comes with a REST API. Elasticsearch and Kibana can work seamlessly together. It's easy to set up this stack with Docker images but it can take considerably more work to install, configure, and maintain a scalable ELK stack.
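As a rough illustration of the REST API, the sketch below builds (but does not send) a search request using Elasticsearch's Query DSL. The host `http://localhost:9200`, the index name `app-logs`, and the `message` and `@timestamp` field names are assumptions made for this example; adjust them to match your own cluster and mappings.

```python
import json
import urllib.request


def build_search_request(base_url, index, text, since="now-15m"):
    """Build a POST request for the Elasticsearch _search endpoint."""
    body = {
        "size": 20,
        # Newest entries first (assumes a date field named @timestamp).
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Full-text match on the (assumed) message field.
                "must": [{"match": {"message": text}}],
                # Restrict results to a recent time window.
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:9200", "app-logs", "timeout")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would run the query against a live cluster; a secured cluster would also need authentication headers, which are left out here.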
As of this writing, Elasticsearch is again open-source.
GlitchTip
GlitchTip is an open-source error, uptime, and performance monitoring tool which also has a managed version.
GlitchTip supports multiple languages and frameworks. Its uptime monitoring includes URL and heartbeat monitoring. It is also compatible with Sentry's client SDKs, so applications already instrumented for Sentry can send their data to GlitchTip instead. It has basic alerting support via email.
The GlitchTip team is also pretty open about their hosted architecture.
Grafana
Grafana is an analytics and data visualization tool that can create dashboards of charts and graphs. It supports many different data sources via an extensive plugin ecosystem, so you can look at and correlate metrics from different systems in the same dashboard.
Grafana is open-source and also has a managed version. You can query both metrics and logs. It has a very active community. You can set up and try Grafana on your local machine easily using Docker.
Grafana's alerting feature supports sending alerts to external services like PagerDuty, OpsGenie, Slack, etc.
IncidentHub
IncidentHub monitors third-party cloud and SaaS services and alerts you when they have an outage. It supports monitoring cloud platforms like GCP, AWS, and DigitalOcean, communication/collaboration tools like Slack, Zoom, and Office 365, payment services like PayPal and Stripe, and dev tooling like GitHub, GitLab, and CircleCI.
IncidentHub periodically checks public data sources like status pages. It can notify you using channels like email, PagerDuty, Discord, Slack, etc.
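To give a feel for what checking a public status page can involve, here is a minimal sketch that interprets a status summary in the JSON shape published by many Atlassian Statuspage-hosted pages (`status.indicator` and `status.description`). This is not IncidentHub's implementation: the sample payload and the `summarize_status` helper are made up for illustration, and other providers publish different formats.

```python
import json

# Indicator values used by Statuspage-style summaries, mapped to a severity rank.
SEVERITY = {"none": 0, "minor": 1, "major": 2, "critical": 3}


def summarize_status(raw_json):
    """Return (is_outage, description) for a Statuspage-style status document."""
    status = json.loads(raw_json).get("status", {})
    indicator = status.get("indicator", "none")
    description = status.get("description", "unknown")
    # Unrecognized indicator values are treated as an outage so they are not ignored.
    is_outage = SEVERITY.get(indicator, 1) > 0
    return is_outage, description


# Hypothetical payload for demonstration only.
sample = '{"status": {"indicator": "major", "description": "Partial System Outage"}}'
print(summarize_status(sample))
```

A real monitor would fetch this document on a schedule and notify only when the result changes, so that a long outage does not produce a stream of duplicate alerts.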
If you are a developer, you can use IncidentHub to monitor your external dependencies like cloud services, CDNs, and CI/CD and deployment platforms. As of this writing, it supports 20 free monitors.
Parseable
Parseable is a managed log analytics solution that also has an open-source version. It's written in Rust. Parseable can use either Parquet or the Arrow format for storage. Both Arrow and Parquet are Apache open-source column-oriented data storage formats.
Parseable supports OpenTelemetry and common log collectors like Fluent Bit and Logstash for ingestion. You can also send logs programmatically. It has built-in support for alerting and can push alerts to webhooks, Prometheus Alertmanager, and Slack.
Parseable also has LLM-based SQL generation for querying logs, Role-based Access Control, and OpenID Connect.
Pinpoint
This is an open-source application performance management (APM) tool that is written in Java. Pinpoint can help understand how components in distributed systems interact with each other. Its UI can show you the topology of your system visually.
Pinpoint works on the agent model. You can integrate with it either by calling its APIs from your application or by using bytecode instrumentation, where an agent hooks into your application at runtime. The second approach does not require you to change any code.
Pinpoint supports common Java software out of the box.
Prometheus
Prometheus is an open-source metrics collection and monitoring tool written in Go. It has a very active developer and user community. Originally developed at SoundCloud, it is now an independently managed CNCF project.
Prometheus supports time series metrics ingestion and has a native query language, PromQL. It works via the pull model, where it collects metrics from "exporters", which in turn collect data from different sources. The list of exporters is extensive, and you can also instrument your application to either expose metrics to be collected or send them directly to Prometheus.
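With the pull model, your application only has to publish its current metric values over HTTP for Prometheus to scrape. The sketch below renders a counter in the Prometheus text exposition format using only the standard library; in practice you would more likely use an official client library. The metric name `app_requests_total`, its labels, and the `render_counter` helper are illustrative assumptions.

```python
def escape_label(value):
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render one counter family; samples is a list of (labels_dict, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


text = render_counter(
    "app_requests_total",
    "Total HTTP requests handled.",
    [
        ({"method": "GET", "status": "200"}, 42),
        ({"method": "POST", "status": "500"}, 3),
    ],
)
print(text, end="")
```

Serving this text from an HTTP endpoint (Prometheus scrapes the `/metrics` path by default) is enough for the application to show up as a scrape target.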
Prometheus has a service discovery feature where it can automatically detect nodes to monitor. It can push metrics data into and read from external data stores.
Using PromQL, you can define alerting rules in your Prometheus configuration. Prometheus evaluates these rules and sends firing alerts to Alertmanager, a companion component that groups, deduplicates, and silences alerts and routes them to third-party systems like Slack and PagerDuty.
Sentry
Sentry is an open-source error tracking and performance monitoring tool that also has a managed version.
Sentry has support for many languages and frameworks. It supports session replay and end-to-end tracing. You can dig into the root cause of slow requests by tracing requests across function calls and services.
Sentry's alerting feature supports both metrics-based checks and URL monitoring.
Sentry also integrates with a lot of popular developer tools.
SigNoz
SigNoz positions itself as an "open-source Datadog alternative". You can host it yourself or use the commercial cloud version.
SigNoz collects metrics, traces, and logs and presents them in one dashboard. It can track external API calls which is useful when your application uses third-party APIs. You can look at common metrics like p95/p99 and trace the root cause of slow requests - whether they are because of external API response times or slow DB queries. SigNoz also lets you filter out traces by tags, service name, errors, and latency.
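If p95 and p99 are unfamiliar, the following sketch shows one common way to compute them, the nearest-rank method. It is a generic illustration rather than SigNoz's own calculation (monitoring backends often interpolate or use approximate sketches), and the latency values are invented.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of data at or below it."""
    if not values:
        raise ValueError("percentile of empty data is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Invented request latencies in milliseconds; two slow outliers among fast requests.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 18, 900]
print("p50:", percentile(latencies_ms, 50))
print("p95:", percentile(latencies_ms, 95))
print("p99:", percentile(latencies_ms, 99))
```

The median stays low while the tail percentiles are dominated by the slow requests, which is why tail latency is usually the number to watch when hunting for slow external calls or queries.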
SigNoz supports OpenTelemetry as its instrumentation library - which means that any language and framework supported by OpenTelemetry is also supported by SigNoz. SigNoz also has built-in alerting.
UptimeRobot
UptimeRobot is a website monitoring service that periodically checks whether your website is accessible and alerts you when it is not.
It supports different types of monitoring, such as HTTP/S, keyword checks, cron jobs, TLS certificate expiry, and domain monitoring. It integrates with services like Slack, PagerDuty, Telegram, email, Zendesk, etc. It also gives you a status page that you can share with your team.
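Conceptually, an HTTP check combined with a keyword check boils down to something like the sketch below. It is a simplified, single-shot illustration and not how UptimeRobot is implemented; the `evaluate_response` and `check_url` helpers and the choice to treat any 2xx or 3xx status as "up" are assumptions.

```python
import urllib.request


def evaluate_response(status_code, body, keyword=None):
    """Decide whether a fetched page counts as 'up'."""
    if not 200 <= status_code < 400:
        return False
    # A keyword check fails when the expected text is missing from the page.
    if keyword is not None and keyword not in body:
        return False
    return True


def check_url(url, keyword=None, timeout=10):
    """Fetch url once and apply evaluate_response; network failures count as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate_response(response.status, body, keyword)
    except OSError:
        # URLError, HTTPError (raised for 4xx/5xx), and timeouts all derive from OSError.
        return False
```

Calling `check_url("https://example.com", keyword="Example Domain")` would perform one check. A hosted service adds what this sketch lacks: scheduling, checks from multiple locations, retries before alerting, and notification routing.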
As of this writing the service supports 50 free monitors, making it useful for solo devs and small teams.
VictoriaMetrics
VictoriaMetrics is a monitoring tool and time series database. It is open-source and has a managed version.
VictoriaMetrics can integrate with other monitoring tools. For example, it can function as a storage backend for Prometheus for long-term data retention. It can ingest data in all well-known formats, including OpenTelemetry.
You can query VictoriaMetrics using either PromQL or its native MetricsQL. It's also straightforward to back up VictoriaMetrics data using its snapshots feature to any cloud storage like Amazon S3 or Google Cloud Storage.
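Because VictoriaMetrics exposes a Prometheus-compatible querying API, a PromQL query can be issued as a plain HTTP request. The sketch below only builds the URL for an instant query; the base address `http://localhost:8428` (the usual default for a single-node setup), the metric name `http_requests_total`, and the `build_query_url` helper are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql, at_time=None):
    """Build a URL for the Prometheus-compatible instant query endpoint."""
    params = {"query": promql}
    if at_time is not None:
        # Optional evaluation timestamp (Unix seconds); omitted means "now".
        params["time"] = str(at_time)
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


url = build_query_url("http://localhost:8428", "rate(http_requests_total[5m])")
print(url)
```

Fetching this URL from a running instance returns a JSON document with the query result; the same helper should work against a Prometheus server by changing only the base address.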
Wireshark
Now we are getting a bit low-level. Wireshark is a network protocol analyzer that has been around for a long time.
Wireshark is ideal if you have to inspect network traffic at the packet level. It supports many protocols and has rich filtering capabilities. You can capture and inspect data live, or do offline analysis.
Wireshark can run on multiple operating systems, including Windows, Linux, FreeBSD, and macOS.
Conclusion
Choosing the right monitoring tool can be daunting with so many options. A checklist for choosing what is right for your needs could be:
- What are your top 5 feature requirements? This list can change over time.
- What is your budget?
- Do you prefer managing your own, or using a hosted solution? As your applications mature and your observability data volume grows, the scalability of your tool becomes important.
- Does your organization have regulatory requirements?
- Does your chosen tool do multiple things well? E.g. Does it handle logs and metrics equally well?
- Does the tool integrate with your existing toolkit?
You might end up with 2-3 or even more tools, each in its specialized niche, and that's ok. In that case, integration features become important. You might choose a distributed tracing tool that sends alerts to another alerting tool. Or you might have an uptime monitor which sends informational alerts to your Slack, and critical ones to PagerDuty. As your project needs change, so will your tools.
This is by no means an exhaustive list, and there are many other tools out there. Try out some of these and let others know what you think in the comments.
Cover photo by Martin Martz on Unsplash