Mahdi BEN RHOUMA

Posted on Apr 23 • Originally published at iloveblogs.blog

How I Fixed My n8n Workflow That Was Failing Silently for Three Weeks

#n8n #debugging #workflowautomation #errors


A client emailed me on a Friday afternoon asking why they hadn't received their weekly report in "a while." I checked my n8n dashboard. The workflow showed green. All executions: successful.

I went to the execution history and clicked the most recent one. The workflow had run at 8 AM that morning, just like it was supposed to. Green checkmark on every single node.

But the report was empty.

Not "there was an error." Not "the email didn't send." The report sent — with zero data in it. For three weeks.

This is the story of what broke, how I found it, how I fixed it, and the monitoring workflow I built afterwards so I could never miss this kind of failure again.

The Workflow That Broke

The workflow was a weekly client report generator. Every Monday at 8 AM it:

Fetched all completed tasks from the client's project in ClickUp (tasks completed in the last 7 days)
Fetched time tracking entries from Toggl for the same period
Merged the two datasets by task name
Generated a formatted HTML email with a summary table
Sent the email to the client
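
Step 3, merging by task name, is the kind of logic that is easy to get subtly wrong, so here is a minimal sketch of one way to do it. The field names (`name` on tasks, `description` and `duration` in seconds on time entries) are assumptions for illustration, not the exact payloads from either API, and `mergeByTaskName` is a hypothetical helper.

```javascript
// Hypothetical field names: `name` on tasks, `description` and `duration`
// (seconds) on time entries. Adjust to the real payloads.
function normalize(name) {
  return String(name ?? "").trim().toLowerCase();
}

function mergeByTaskName(tasks, timeEntries) {
  // Sum tracked seconds per normalized task name.
  const secondsByName = new Map();
  for (const entry of timeEntries) {
    const key = normalize(entry.description);
    secondsByName.set(key, (secondsByName.get(key) ?? 0) + entry.duration);
  }
  // Attach the total to each task; tasks with no tracked time get 0.
  return tasks.map((task) => ({
    name: task.name,
    trackedSeconds: secondsByName.get(normalize(task.name)) ?? 0,
  }));
}

const merged = mergeByTaskName(
  [{ name: "Fix login bug" }, { name: "Write docs" }],
  [
    { description: "fix login bug ", duration: 1800 },
    { description: "Fix login bug", duration: 600 },
  ]
);
console.log(merged);
```

Normalizing case and whitespace before matching keeps a stray space in a time entry from silently dropping its hours from the report.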

It had been running perfectly for four months. Then it quietly stopped working.

Reading the Execution Log Properly

My first instinct was to look at the execution history and confirm the workflow ran. It had. But I was looking at it wrong.

I clicked into one of the "successful" recent executions. Here's what I saw:

Schedule Trigger: ✅ 1 item
ClickUp node (fetch tasks): ✅ 0 items
Merge node: ✅ 0 items
Code node (format report): ✅ 1 item (because my code created an empty template even with no data)
Gmail node (send email): ✅ 1 item

The workflow ran. Every node technically "succeeded." But the ClickUp node returned zero tasks, so the report was empty. The execution log showed all green.

This is what I call a silent failure: no error, no crash, just a workflow that ran to completion and produced nothing useful.
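
The same reading of the log can be done mechanically. The sketch below takes per-node item counts, typed in by hand from the execution above (the array shape and `findEmptyNodes` are made up for illustration, not an n8n export format), and lists the nodes that succeeded with zero items.

```javascript
// Item counts copied by hand from the execution shown above.
const nodeCounts = [
  { node: "Schedule Trigger", items: 1 },
  { node: "ClickUp", items: 0 },
  { node: "Merge", items: 0 },
  { node: "Code", items: 1 },
  { node: "Gmail", items: 1 },
];

// Return the names of nodes that finished without producing any items.
function findEmptyNodes(counts) {
  return counts.filter((c) => c.items === 0).map((c) => c.node);
}

const emptyNodes = findEmptyNodes(nodeCounts);
console.log(emptyNodes); // [ 'ClickUp', 'Merge' ]
```

The first name in that list is usually where to start looking: everything downstream of it is only empty because it was.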

Finding the Root Cause

I clicked into the ClickUp node's output in the execution log to see what it actually returned. It showed: [] — an empty array.

My first thought was that the ClickUp API was down or the credential had expired. I tested the credential manually — it worked fine. I ran the node with a test execution — it returned tasks correctly.

So the credential was fine. The API was working. Why were tasks showing as empty during the scheduled run?

Then I looked more carefully at my ClickUp API query. I was filtering tasks by "date completed" in the last 7 days. The query looked like this:

const sevenDaysAgo = new Date();
sevenDaysAgo.setDate(sevenDaysAgo.getDate() - 7);
const timestamp = sevenDaysAgo.getTime();

This looked right. But I had forgotten one thing: ClickUp's API uses Unix timestamps in seconds, not milliseconds.

JavaScript's Date.getTime() returns milliseconds. I was passing a timestamp like 1743580800000 to an API that expected 1743580800.

For the first four months, this "worked" because ClickUp was silently ignoring the invalid timestamp and returning all tasks. Then at some point, ClickUp tightened their API validation. Invalid timestamps now returned zero results instead of all results.

The workflow didn't break — it just started filtering out everything.

The fix was a one-line change:

// Before (broken)
const timestamp = sevenDaysAgo.getTime();

// After (fixed)
const timestamp = Math.floor(sevenDaysAgo.getTime() / 1000);

I updated the Code node, tested it, and saw all 47 tasks from the last week appear immediately.
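
To keep this class of bug from coming back, the conversion can live in one small helper with a guard next to it. This is a sketch that assumes, as described above, that the endpoint wants seconds; which unit a given API expects is worth confirming in its reference docs. `toUnixSeconds` and `looksLikeMilliseconds` are hypothetical helper names, and the digit-count check is a heuristic, not an API rule.

```javascript
// Convert a Date to whole Unix seconds.
function toUnixSeconds(date) {
  return Math.floor(date.getTime() / 1000);
}

// Heuristic: present-day timestamps have 10 digits in seconds and 13 in
// milliseconds, so anything at or above 1e11 is almost certainly milliseconds.
function looksLikeMilliseconds(timestamp) {
  return Math.abs(timestamp) >= 1e11;
}

// Fixed date (the example timestamp from this post) so the output is repeatable.
const exampleDate = new Date(1743580800000);
const ms = exampleDate.getTime();
const seconds = toUnixSeconds(exampleDate);

console.log(ms, looksLikeMilliseconds(ms)); // 1743580800000 true
console.log(seconds, looksLikeMilliseconds(seconds)); // 1743580800 false
```

Running `looksLikeMilliseconds` on the value just before the API call, and throwing if it returns the wrong answer for that API, turns a unit mix-up into a red execution instead of an empty report.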

The Deeper Problem: I Had No Way to Know

The bug itself was simple once I found it. What bothered me was that it had been silently happening for three weeks before a client noticed. The workflow was "healthy" by every metric I was watching. I had done everything "right" — n8n was running, the workflow was active, executions were succeeding.

I had one gap: I had no validation that the output was actually meaningful.

I thought about how to fix this properly. The answer was a monitoring layer: a set of checks that run after the main logic and alert me if the output looks wrong.

Building the Monitoring Layer

I rebuilt the workflow with three layers of protection:

Layer 1: Output Validation Node

Right after the ClickUp fetch, I added an IF node:

Condition: {{ $items().length }} is greater than 0
True path: Continue as normal
False path: → Slack message → "⚠️ Weekly Report: ClickUp returned 0 tasks. Check the API filter. Workflow stopped."

The true path continues to the report generation. The false path sends me an immediate alert and stops the workflow — no empty report gets sent to the client.

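// In a Code node before sending: sanity check
// $("Node Name").all() returns every item that node produced
const tasks = $("ClickUp").all();
const timeEntries = $("Toggl").all();

// Reporting window, used only to make the error messages useful
const endDate = new Date();
const startDate = new Date(endDate);
startDate.setDate(startDate.getDate() - 7);
const period = `${startDate.toISOString().slice(0, 10)} to ${endDate.toISOString().slice(0, 10)}`;

if (tasks.length === 0) {
  throw new Error(`No tasks found for period ${period}. Check ClickUp filter.`);
}

if (timeEntries.length === 0) {
  throw new Error(`No time entries found for period ${period}. Check Toggl connection.`);
}

// Continue...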

Throwing an error here means the workflow fails loudly instead of silently. It shows up red in my execution log. If I have an Error Workflow set up (see Layer 2), I get a notification.

Layer 2: Error Workflow

Every workflow in n8n can have an "Error Workflow" — a secondary workflow that triggers when the primary one throws an uncaught error.

I created a simple error notification workflow:

Trigger: Error Trigger (receives error data from any workflow that has this one set as its Error Workflow and then fails)
Nodes:

Code node → format a clear error message with workflow name, node that failed, error message, timestamp
Slack node → send to my #automation-alerts channel
Gmail node → email me with full error details

Now, any unhandled error in any of my workflows sends me an alert within minutes.
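
The formatting step could look roughly like the sketch below. It assumes the Error Trigger hands over an object with `workflow.name`, `execution.lastNodeExecuted`, `execution.error.message`, and `execution.url`. Check a real payload in your own instance before relying on those paths, which is why every lookup has a fallback. `formatErrorAlert` is a hypothetical helper name.

```javascript
// Build a plain-text alert from Error Trigger data.
// The field paths are assumptions; each one falls back to a placeholder.
function formatErrorAlert(data, now = new Date()) {
  const workflowName = data?.workflow?.name ?? "unknown workflow";
  const failedNode = data?.execution?.lastNodeExecuted ?? "unknown node";
  const message = data?.execution?.error?.message ?? "no error message";
  const url = data?.execution?.url;

  const lines = [
    `Workflow failed: ${workflowName}`,
    `Node: ${failedNode}`,
    `Error: ${message}`,
    `Time: ${now.toISOString()}`,
  ];
  if (url) lines.push(`Execution: ${url}`);
  return lines.join("\n");
}

// Sample payload with made-up values.
const sample = {
  workflow: { id: "1", name: "Weekly Client Report" },
  execution: {
    id: "231",
    url: "https://n8n.example.com/execution/231",
    lastNodeExecuted: "ClickUp",
    error: { message: "No tasks found for period" },
  },
};

const alertText = formatErrorAlert(sample, new Date("2025-04-07T08:00:05Z"));
console.log(alertText);
```

Inside a Code node, the same function would be called with the first incoming item's JSON and its result passed on to the Slack and Gmail nodes.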

Layer 3: "Heartbeat" Monitoring

The most invisible failure mode is a workflow that simply stops running — maybe n8n crashed, maybe the server ran out of memory, maybe I accidentally deactivated it.

I built a Heartbeat workflow:

Schedule Trigger: Every Sunday at 9 AM
What it does: Checks the execution history of my 5 most critical workflows

// Using n8n's internal API to check execution counts
// Simplified example
const criticalWorkflows = [
  { name: 'Weekly Client Report', id: 'xxx', expectedDayOfWeek: 1 }, // Monday
  { name: 'Invoice Follow-up', id: 'yyy', expectedEveryDays: 1 },
  { name: 'Daily Briefing', id: 'zzz', expectedEveryDays: 1 }
];

// Check if each workflow has run in its expected window
// Flag any that haven't

If a workflow hasn't run in its expected window, I get a Sunday morning Slack message listing which ones look suspicious.
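
The "has it run in its expected window" check can be sketched as a pure function. It assumes you already have the last execution time for each workflow, however you fetch it. The `maxGapDays` field and the one-hour grace period are illustrative choices of mine, simpler than the `expectedDayOfWeek` / `expectedEveryDays` fields in the config above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// maxGapDays is a hypothetical field: the longest acceptable gap between runs.
const criticalWorkflows = [
  { name: "Weekly Client Report", id: "xxx", maxGapDays: 7 },
  { name: "Invoice Follow-up", id: "yyy", maxGapDays: 1 },
  { name: "Daily Briefing", id: "zzz", maxGapDays: 1 },
];

// lastRuns maps workflow id -> Date of the most recent execution.
// A missing entry means no run was found at all.
function findStaleWorkflows(workflows, lastRuns, now, graceMs = 60 * 60 * 1000) {
  return workflows.filter((wf) => {
    const lastRun = lastRuns[wf.id];
    if (!lastRun) return true; // never ran, or no record found
    return now - lastRun > wf.maxGapDays * DAY_MS + graceMs;
  });
}

const now = new Date("2025-04-13T09:00:00Z"); // a Sunday, 9 AM UTC
const lastRuns = {
  xxx: new Date("2025-04-07T08:00:00Z"), // the previous Monday
  yyy: new Date("2025-04-10T08:00:00Z"), // three days earlier
  // zzz has no recorded run
};

const stale = findStaleWorkflows(criticalWorkflows, lastRuns, now);
console.log(stale.map((wf) => wf.name)); // [ 'Invoice Follow-up', 'Daily Briefing' ]
```

Treating a missing record as stale is deliberate: a workflow that was deactivated or deleted should show up in the alert, not vanish from it.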

What Changed in My Approach

After this incident, I made one rule for myself: every automation that matters to a client gets a sanity check before the output is delivered.

That means:

After any data fetch, verify the result isn't empty
After any transformation, spot-check the output looks structurally correct
Before any email to a client, validate required fields are populated
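
The last check on that list can be as small as the sketch below. The report shape (`clientEmail`, `periodStart`, `periodEnd`, `rows` with `name` and `trackedSeconds`) is a hypothetical example, not the structure of my actual report, and `validateReport` is an illustrative helper.

```javascript
// Return a list of human-readable problems; an empty list means "safe to send".
// Hypothetical shape: { clientEmail, periodStart, periodEnd, rows: [{ name, trackedSeconds }] }
function validateReport(report) {
  if (!report || typeof report !== "object") return ["report is missing"];

  const problems = [];
  for (const field of ["clientEmail", "periodStart", "periodEnd"]) {
    if (!report[field]) problems.push(`missing field: ${field}`);
  }

  if (!Array.isArray(report.rows) || report.rows.length === 0) {
    problems.push("report has no rows");
  } else {
    report.rows.forEach((row, i) => {
      if (!row.name) problems.push(`row ${i}: missing task name`);
      if (!Number.isFinite(row.trackedSeconds)) {
        problems.push(`row ${i}: trackedSeconds is not a number`);
      }
    });
  }
  return problems;
}

const emptyReport = {
  clientEmail: "client@example.com",
  periodStart: "2025-03-31",
  periodEnd: "2025-04-06",
  rows: [],
};

const problems = validateReport(emptyReport);
console.log(problems); // [ 'report has no rows' ]
```

Collecting every problem instead of throwing on the first one makes the resulting alert more useful: one message tells you everything that is wrong with the report.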

It adds maybe 20 minutes to building each workflow. It has saved me from at least two similar situations since then — one where a Notion API change caused a query to return data in a different format, and one where a timezone offset issue made a "last 7 days" filter silently exclude all weekend data.

The other change: I now run new workflows manually for the first 2-3 scheduled cycles, watching the execution log actively, before I consider them "trusted." Tedious, but worth it.

The Client's Response

I told the client what had happened. I explained the bug, how long it had been happening, and the new monitoring I'd added to prevent it. I offered to prepare a backfill report covering the three missed weeks.

They appreciated the transparency more than they were bothered by the gap. The backfill took an hour and cost me nothing except time. The trust that came from handling it well was worth more than hiding it would have been.

Clients don't expect perfection. They expect honesty when things go wrong and a plan to prevent it happening again.

Summary of Changes Made

Before	After
No output validation	IF node validates data before proceeding
Errors silently ignored	Error Workflow sends immediate Slack alert
No visibility into "healthy but empty" runs	Code nodes throw errors on unexpected empty results
No cross-workflow monitoring	Weekly heartbeat checks all critical workflows
Trusted "green" execution status	Verify output meaningfulness, not just execution success

These aren't complex changes. They took about 3 hours to implement across all my workflows. They've already paid for themselves.

Frequently Asked Questions

Why does n8n show a workflow as successful when it actually failed?

n8n considers a workflow "successful" if it completes without throwing an error. If a node processes zero items — like a filter matching nothing — the workflow finishes cleanly with no output. This is a "silent failure." Add validation at the end to alert when expected output is missing.

How do I set up error notifications in n8n?

Go to workflow settings (gear icon) → Error Workflow → select a workflow that sends you a Slack message or email. This catches actual errors. For silent failures (no error, just no output), add custom validation logic in your workflow.

What is the difference between a workflow error and a silent failure in n8n?

A workflow error stops execution and logs as "Error." A silent failure completes successfully but produces no useful output — like a filter matching zero records. The execution log shows "Success" but nothing happened. Silent failures require custom validation to detect.

How do I read the n8n execution log?

Go to your workflow → click the clock icon (Executions). Click any execution to open it. You'll see each node with its input and output data. Green nodes succeeded, red nodes errored. Click any node to inspect its exact data output.

Originally published at https://iloveblogs.blog
