
velprove

Posted on • Originally published at velprove.com

How to Monitor Your REST API Health Endpoint (With Response Validation)

Most teams monitor their APIs the same way: hit the root URL, check
for a 200 OK, and move on. It is better than nothing,
but it gives you false confidence. Your API can return
200 OK while the database is unreachable, the cache
layer is down, or a critical dependency is timing out. The status
code tells you the web server is running. It tells you almost nothing
about whether your API is actually healthy.

A proper health endpoint, and proper monitoring of that
endpoint, gives you real visibility into the state of your
application. This guide covers how to design a health endpoint worth
monitoring and how to set up response validation so you catch real
failures, not just server crashes.

Shallow Ping vs Deep Health Check

There is a meaningful difference between a health endpoint that
returns {"status": "ok"} unconditionally and one
that actually checks your application's dependencies before
responding.

Shallow health check

A shallow health check simply confirms that your API process is
running and can handle HTTP requests. It returns a static response
without checking anything else. This is useful for load balancer
routing. It tells the infrastructure layer that the process is alive.
But it tells you nothing about whether the application can actually
serve real requests.

Deep health check

A deep health check queries your application's critical
dependencies (the database, cache, message queue, external APIs) and
reports their individual status. If the database is unreachable, the
health endpoint returns a degraded or unhealthy status instead of a
cheerful 200 OK. This is the endpoint you want to
monitor.
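
As a rough illustration of the idea, here is a minimal Python sketch of a deep health check. The function name, the probe names, and the choice to return 503 when a dependency fails are assumptions made for this example, not a prescribed design. Real probes would run something like a lightweight database query or a cache ping.

```python
from typing import Callable, Dict, Tuple


def deep_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every dependency probe and derive an HTTP status plus a JSON-ready body."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = "ok" if probe() else "error"
        except Exception:
            # A probe that raises (timeout, refused connection) counts as a failure.
            results[name] = "error"
    healthy = all(state == "ok" for state in results.values())
    body = {"status": "ok" if healthy else "error", "checks": results}
    # Assumption: 503 signals an unhealthy instance. Some teams keep 200
    # and rely on the body alone.
    return (200 if healthy else 503), body


def _unreachable_database() -> bool:
    # Hypothetical stand-in for a probe that cannot reach the database.
    raise ConnectionError("database unreachable")


# Example with stubbed probes: the cache answers, the database does not.
status_code, body = deep_health(
    {"database": _unreachable_database, "cache": lambda: True}
)
```

With these stubs, the overall status becomes "error" and the per-dependency results show that the database is the failing component.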

Why Status Code Monitoring Gives You False Confidence

Here is a scenario that plays out more often than anyone admits. Your
API's database connection pool is exhausted. Every real request
fails with a timeout. But your health endpoint does not check the
database. It just returns 200 OK. Your monitoring
dashboard shows 100% uptime while your users are getting 500 errors
on every request.

Status code monitoring catches one failure mode: the process is not
running. It misses everything else:

  • Database unreachable or connection pool exhausted
  • Redis or cache layer down, causing fallback behavior or errors
  • Third-party API dependency returning errors (payment processor, email service, auth provider)
  • Disk full, preventing writes to local storage or log files
  • Memory pressure causing garbage collection pauses and slow responses

Each of these failures can happen while your server happily returns
200 OK on a shallow health endpoint. You need response
validation to catch them.

Setting Up Response Validation With Velprove

Velprove's API check type lets you validate not just the status
code, but the actual JSON body of the response. Here is how to set
it up for your health endpoint.

Step 1: Create an API check

Sign up for a free Velprove account and
create a new API check. Set the method to GET and enter
your health endpoint URL, for example,
https://api.yourapp.com/health.

Step 2: Add JSON path validation

Configure a JSON path assertion to validate the response body.
Assuming your health endpoint returns a JSON object with a
status field:

  • JSON path: $.status
  • Expected value: ok

Now your check does not just verify that the endpoint responds. It
verifies that the application reports itself as healthy. If the
database goes down and your health endpoint changes its status
to degraded or error, Velprove catches
it immediately.
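
To make the assertion concrete, here is a small Python sketch of what a check like this does conceptually. It is not Velprove's implementation. The helper names are hypothetical, and the resolver only handles simple dotted object paths such as $.status, not the full JSONPath syntax.

```python
import json
from typing import Any


def resolve_path(document: Any, path: str) -> Any:
    """Resolve a simple dotted path such as `$.status` or `$.checks.database`.

    Supports object keys only (no wildcards, filters, or array indexes).
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    for key in path[1:].split("."):
        if not key:
            continue  # skip the empty segment produced by the leading '$.'
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"{path!r} not found in response")
        current = current[key]
    return current


def assert_json_path(raw_body: str, path: str, expected: Any) -> bool:
    """Return True only if the body parses as JSON and the path holds `expected`."""
    try:
        return resolve_path(json.loads(raw_body), path) == expected
    except (ValueError, KeyError):
        # Invalid JSON (for example an HTML error page) or a missing field fails the check.
        return False
```

Note that a body which is not JSON at all, such as an HTML error page from a proxy, fails the assertion as well, which is the behavior you want from a health check.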

Step 3: Set a response time threshold

A health endpoint that takes 3 seconds to respond is a warning sign
even if it returns {"status": "ok"}. Configure a
response time threshold. We recommend 3 seconds as a starting
point, then tighten it once you know your baseline. If your health
endpoint normally responds in 50 milliseconds and suddenly takes
2.5 seconds, something is wrong even though the check passed the JSON
validation and still fits under a 3-second limit.
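
If you want to see what a threshold check amounts to, the following Python sketch times a single request and compares it to a limit. The function names are hypothetical, the 3-second default mirrors the starting point suggested above, and the URL is the placeholder used earlier in this guide.

```python
import time
import urllib.request
from typing import Callable


def timed_check(fetch: Callable[[], object], threshold_seconds: float = 3.0) -> dict:
    """Time one call to `fetch` and report whether it stayed within the threshold.

    Exceptions raised by `fetch` propagate to the caller; a failed request
    should fail the check regardless of how long it took.
    """
    start = time.perf_counter()
    fetch()
    elapsed = time.perf_counter() - start
    return {"elapsed_seconds": elapsed, "within_threshold": elapsed <= threshold_seconds}


def fetch_health(url: str = "https://api.yourapp.com/health") -> bytes:
    # urlopen raises an HTTPError for 4xx/5xx responses and a URLError
    # (or a timeout error) when the host cannot be reached in time.
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()
```

Calling timed_check(fetch_health) against your own endpoint returns the elapsed time together with a pass or fail flag. Once you know your normal latency, pass a tighter threshold_seconds value.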

Step 4: Configure check interval and alerts

On the free plan, your health check runs every 5 minutes with email
alerts, which is solid coverage for most applications. The Starter plan
($19/mo) offers 1-minute intervals with Slack and webhook
notifications for team visibility. For production APIs where every
second of downtime matters, the Pro plan ($49/mo) provides 30-second
check intervals with PagerDuty integration. For a full comparison of
monitoring tools, see our UptimeRobot alternative breakdown.

Designing Your Health Endpoint for Monitorability

If you control the API you are monitoring, you can design the health
endpoint to give your monitoring tool, and your on-call
team, maximum visibility. Here is what a well-designed health
response looks like:

  • Overall status. A top-level status field that returns ok, degraded, or error. This is the field your monitoring tool validates.
  • Application version. Include a version field with the deployed version or commit hash. When you investigate an incident, knowing exactly which version is running saves you from guessing.
  • Dependency checks. Report the health of each critical dependency individually: database, cache, queue, storage. When the overall status is degraded, the dependency breakdown tells you exactly which component failed.
  • Response time. Include the time the health check itself took to run. If your health endpoint normally responds in 20 milliseconds and starts taking 800 milliseconds, a dependency is getting slow.
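
The four fields above could be assembled along these lines. This is a sketch under stated assumptions: the field names checks and response_time_ms, the notion of a "critical" dependency set, and the rule that a failed critical dependency means error while any other failure means degraded are all choices made for this example.

```python
import time
from typing import Dict, Set


def overall_status(dependencies: Dict[str, str], critical: Set[str]) -> str:
    """Collapse per-dependency states into ok, degraded, or error."""
    failed = {name for name, state in dependencies.items() if state != "ok"}
    if failed & critical:
        return "error"  # a critical dependency is down
    return "degraded" if failed else "ok"


def build_health_response(
    version: str,
    dependencies: Dict[str, str],
    critical: Set[str],
    started_at: float,
) -> dict:
    """Build the JSON-ready body; `started_at` comes from time.perf_counter()."""
    return {
        "status": overall_status(dependencies, critical),
        "version": version,
        "checks": dependencies,
        "response_time_ms": round((time.perf_counter() - started_at) * 1000, 1),
    }


# Example: the cache is down but the database, marked critical here, is fine.
started = time.perf_counter()
response = build_health_response(
    version="1.4.2",  # hypothetical version string; a commit hash works too
    dependencies={"database": "ok", "cache": "error", "queue": "ok"},
    critical={"database"},
    started_at=started,
)
```

In this example the top-level status is "degraded", and the checks object points directly at the cache as the failing component.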

Keep the health endpoint unauthenticated if possible. This makes it
easy to monitor from external tools without managing API keys in your
monitoring configuration. If you must restrict access, use a simple
shared secret in a header rather than your application's full
authentication flow.
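
A shared-secret check can be as small as the Python sketch below. The header name X-Health-Token is an assumption chosen for illustration, and the function expects a mapping whose header names have already been normalized by your framework.

```python
import hmac
from typing import Mapping


def is_authorized(
    headers: Mapping[str, str],
    expected_secret: str,
    header_name: str = "X-Health-Token",  # hypothetical header name
) -> bool:
    """Compare the supplied header value with the shared secret."""
    if not expected_secret:
        # Refuse to run with an empty secret, otherwise a missing header would match.
        return False
    supplied = headers.get(header_name, "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())
```

Your monitoring tool then sends the same value in that header with every check, and the secret can be rotated independently of user credentials.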

Response Time Degradation as an Early Warning

Outages rarely happen instantly. Most production incidents follow a
pattern: performance degrades gradually, response times creep up, and
eventually something breaks. If you are only checking for pass or
fail, healthy or not, you miss the entire warning period.

Response time monitoring on your health endpoint gives you a leading
indicator. When your health check that normally responds in 50
milliseconds starts taking 500 milliseconds, something is changing.
Maybe the database is under unusual load. Maybe a dependency is
throttling you. Maybe a memory leak is starting to bite. Whatever it
is, you have a window to investigate and fix it before it becomes a
full outage.
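
One simple way to turn this into an alert is to compare recent latency against a baseline. The sketch below uses medians and a multiplier of 3, both of which are arbitrary assumptions for illustration. Tune the window sizes and the factor to your own traffic.

```python
from statistics import median
from typing import Sequence


def is_degrading(
    baseline_ms: Sequence[float],
    recent_ms: Sequence[float],
    factor: float = 3.0,
) -> bool:
    """Flag when the recent median latency exceeds `factor` times the baseline median."""
    if not baseline_ms or not recent_ms:
        return False  # not enough data to judge
    return median(recent_ms) > factor * median(baseline_ms)
```

For example, a baseline of roughly 50 milliseconds and recent samples around 500 milliseconds would be flagged, while a drift to 60 milliseconds would not. Using the median rather than the mean keeps a single slow request from triggering the alert.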

This is especially valuable if you also monitor your third-party API
dependencies. When both your health endpoint and an external API start
slowing down at the same time, you can quickly identify whether the
problem is on your side or theirs.

Go Beyond the Status Code

A 200 OK is the bare minimum. It tells you your process
is running, nothing more. Real API monitoring validates that your
application is healthy, its dependencies are reachable, and its
response times are within acceptable bounds.

Set up a free API health check with Velprove
and start monitoring what actually matters. If your health endpoint
says everything is fine, make sure your monitoring tool is smart
enough to verify that claim.
