When plumbing business owners lose after-hours jobs, it is rarely because they are bad at plumbing. More often the call hit voicemail, and the homeowner standing in a flooding kitchen simply dialed the next name on the list.
Closing that gap is the entire job of an AI dispatcher for plumbers after hours. But before you forward your real business line to any automated system, test it the way you would test a new hire on the phones: with realistic calls, a written scorecard, and honest grading.
Below is a practical 5-call test you can run in under an hour. This post is biased toward OnCrew, the answering service I work on, so I will be specific about what the software does and does not do. That way you can grade it fairly against anything else on your shortlist.
Why test before you forward, not after
Once you forward your main number, every live call is a one-way door. If the system fumbles a 2 AM sewer backup, you do not get a second chance with that homeowner, and you may not even know it happened until you check messages. Running a controlled test first means the only calls at risk are the ones you placed on purpose. You learn how the system handles pressure, urgency, and odd phrasing while the stakes are zero.
First, define "dispatcher" honestly
"Dispatcher" gets used loosely in this category, so let me bound it before you score anyone.
When OnCrew acts as an AI dispatcher for plumbers after hours, it does four things:
- Captures intake: name, callback number, service address, and a clear description of the problem.
- Classifies urgency: an active leak is triaged differently than a price-shopping question.
- Summarizes the call: a short, readable summary instead of a raw transcript you decode at midnight.
- Alerts your team and queues callback context: your on-call person receives the summary and the details needed to call back prepared.
That is the ceiling: intake, triage, alerting, and callback context. It is not a true field dispatch system, and it is honest to say so.
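To make the four pieces concrete while you grade, here is a minimal sketch of what one captured call could look like as a record. The class and field names are hypothetical, chosen for illustration; this is not OnCrew's actual data format or API.

```python
from dataclasses import dataclass
from enum import Enum


class Urgency(Enum):
    # Hypothetical two-level scale; a real system may use more levels.
    HIGH = "high"
    ROUTINE = "routine"


@dataclass
class IntakeRecord:
    """Illustrative shape of one captured call (assumed, not a vendor schema)."""
    caller_name: str
    callback_number: str
    service_address: str
    problem: str
    urgency: Urgency
    summary: str

    def missing_fields(self) -> list[str]:
        """Return the names of intake fields that came back blank."""
        required = ("caller_name", "callback_number", "service_address", "problem")
        return [name for name in required if not getattr(self, name).strip()]


record = IntakeRecord(
    caller_name="Test Caller",
    callback_number="555-0100",
    service_address="",  # the system never asked for the address
    problem="Water pouring through the kitchen ceiling",
    urgency=Urgency.HIGH,
    summary="Active ceiling leak, caller on site, wants callback now.",
)
print(record.missing_fields())
```

A blank field here is exactly the kind of miss the intake criterion in the scorecard is meant to catch.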
Here is what stays entirely yours as the owner:
- Pricing and quotes
- Scheduling and appointments
- Dispatch decisions and ETAs
- Site safety judgment
- CRM setup
- Permits and code guidance
- Every field decision once a human picks up the callback
If a vendor's software claims it can confirm jobs, promise arrival windows, and make overnight field decisions without your team, treat that as a claim to verify, not a feature to assume. The defensible promise is narrower: the call gets captured, understood, and handed to a human with context, and you decide what happens next.
How to run the 5-call test
Block out about 45 minutes. Recruit one or two people to play homeowners, or run the calls yourself using the five scenarios below. You will grade each call on the same five criteria, which I define in the next section.
One caveat before you dial: take written notes on each call, and only record calls if recording is legal in your state and you have any consent the law requires. Call-recording rules vary, and several states require every party to agree, so do not skip that check. Notes by themselves are enough to grade the test.
Then place five calls, one per scenario, and score as you go.
What to grade on each call
Score each call from 0 to 2 on each of the five criteria: 0 is a fail, 1 is partial, 2 is clean. That makes 10 points per call and 50 points across the whole test.
- Intake accuracy: did it capture name, number, address, and problem correctly when you read them back?
- Urgency classification: the leak, water heater, and sewer calls should rank above the warranty and price-shopper calls.
- Summary quality: would the summary make sense to your on-call tech with zero other context?
- Alert and routing: did the right notification reach the right place as configured?
- Honesty under pressure: did it avoid inventing prices, ETAs, or promises it cannot keep?
A strong AI dispatcher for plumbers after hours should score 8 or higher out of 10 on the true emergencies, even if it slips on the softer calls.
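If you would rather keep the scorecard in a script than on paper, the per-call arithmetic can be sketched like this. The criterion keys and function names are my own labels for the five bullets above, not part of any product.

```python
# Short labels for the five grading criteria (hypothetical names).
CRITERIA = ("intake", "urgency", "summary", "alert", "honesty")


def score_call(marks: dict[str, int]) -> int:
    """Add up the five 0-2 marks for one call; the maximum is 10."""
    missing = set(CRITERIA) - set(marks)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in CRITERIA:
        if marks[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")
    return sum(marks[name] for name in CRITERIA)


def clears_emergency_bar(marks: dict[str, int], bar: int = 8) -> bool:
    """True when an emergency call scores at or above the 8-of-10 bar."""
    return score_call(marks) >= bar


# Made-up marks for the active-leak call: summary was only partial.
leak = {"intake": 2, "urgency": 2, "summary": 1, "alert": 2, "honesty": 2}
print(score_call(leak), clears_emergency_bar(leak))
```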
The 5-call scorecard
| Test call | Scenario you act out | What a clean handle looks like |
|---|---|---|
| 1. Active leak | "Water is pouring through my ceiling right now." | Captures address and callback number fast, flags high urgency, summarizes the hazard, sends the configured on-call alert. |
| 2. Water heater failure | "No hot water and the tank is leaking in the garage." | Gets the symptom and location, classifies as urgent, queues clear callback context for human review. |
| 3. Sewer backup | "Sewage is backing up into two bathrooms." | Treats it as health-urgent, captures property details, summarizes severity, alerts without promising arrival time. |
| 4. Warranty / status caller | "I just want to know when my tech is coming back." | Recognizes existing-customer intent, captures the reference, routes it as a callback, not a new emergency. |
| 5. Price shopper | "How much to snake a drain?" | Stays helpful, captures contact and job type, does not invent a price, queues a callback for you to quote. |
The warranty and price-shopper calls matter more than they look. A weak system either treats everything as a screaming emergency or quietly makes up a quote. Both train your customers badly and leave you cleaning up in the morning.
Reading your results
Tally the scores and read them like this:
- 40 to 50: the system captures and triages reliably. Worth a paid trial on a forwarded line.
- 25 to 39: usable, but it needs script tuning on urgency or intake. Retest before trusting your main number.
- Under 25: it will create more cleanup than it saves. Keep looking.
Weight the active leak and the sewer backup most heavily. If those two calls are not captured cleanly and flagged as high urgency, the rest does not matter, because they are the highest-risk calls you take after dark.
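The bands above can be sketched as a small function. Two assumptions are mine: the scores are listed in scorecard order (so the leak is first and the sewer backup is third), and "captured cleanly" is read as the same 8-of-10 bar used for emergencies earlier.

```python
def read_results(call_scores: list[int], emergency_bar: int = 8):
    """Return (total, band, emergencies_clean) for five per-call scores.

    call_scores holds five totals from 0 to 10, in scorecard order:
    leak, water heater, sewer, warranty, price shopper.
    """
    if len(call_scores) != 5 or any(not 0 <= s <= 10 for s in call_scores):
        raise ValueError("expected five scores between 0 and 10")

    total = sum(call_scores)
    if total >= 40:
        band = "worth a paid trial"
    elif total >= 25:
        band = "tune and retest"
    else:
        band = "keep looking"

    # Assumption: the leak (index 0) and sewer (index 2) calls must both
    # reach the emergency bar before the total should be trusted.
    leak, sewer = call_scores[0], call_scores[2]
    emergencies_clean = leak >= emergency_bar and sewer >= emergency_bar
    return total, band, emergencies_clean


# Made-up example: decent total, but the sewer call was handled poorly.
print(read_results([9, 8, 6, 7, 8]))
```

Returning the emergency flag separately keeps a respectable total from hiding a fumbled sewer call.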
Where OnCrew fits, and what it costs
I build OnCrew, so read this section with that in mind. The reason I am comfortable sending you to test competitors with the same scorecard is that the bounded promise survives it: capture intake, classify urgency, summarize, alert your team, and queue callback context, so a human on your side makes the real decisions.
Starter pricing is public: $49/month for 100 calls, then $0.99 per extra call. The full breakdown lives on the OnCrew pricing page, and the plumber-specific overview is on the after-hours plumbing dispatcher landing page.
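As a quick check on what a busier month would cost, here is the Starter arithmetic as a sketch. It uses only the three numbers quoted above and works in cents to avoid rounding surprises; the function name is mine, and it ignores taxes or any other plan.

```python
def starter_monthly_cost_cents(calls: int) -> int:
    """Starter plan cost in cents: $49 covers 100 calls, then $0.99 each."""
    if calls < 0:
        raise ValueError("calls cannot be negative")
    base_cents = 4900       # $49.00 per month
    included_calls = 100    # calls covered by the base price
    overage_cents = 99      # $0.99 per extra call
    extra_calls = max(0, calls - included_calls)
    return base_cents + extra_calls * overage_cents


for calls in (60, 100, 140):
    dollars = starter_monthly_cost_cents(calls) / 100
    print(f"{calls} calls -> ${dollars:.2f}")
```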
If you want to size the problem before testing anything, run your own numbers through the missed-call calculator. Enter your average job value and roughly how many after-hours calls you let slip to voicemail each month. Use that estimate to decide whether the 5-call test is worth your time, not as proof of results.
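If you prefer to do that sizing by hand, a rough back-of-envelope version might look like the sketch below. This is not the calculator's formula, which I am not reproducing here. The `booking_rate` parameter is my own assumption, included because not every missed caller would have booked, and the example numbers are invented.

```python
def monthly_value_at_risk(avg_job_value: float, missed_calls: int,
                          booking_rate: float) -> float:
    """Rough upper estimate of monthly revenue tied to missed calls.

    booking_rate is an assumed fraction (0 to 1) of missed callers who
    would have become paying jobs; pick a conservative value.
    """
    if not 0 <= booking_rate <= 1:
        raise ValueError("booking_rate must be between 0 and 1")
    if avg_job_value < 0 or missed_calls < 0:
        raise ValueError("inputs cannot be negative")
    return avg_job_value * missed_calls * booking_rate


# Invented inputs: $400 average job, 10 missed calls, half would have booked.
print(monthly_value_at_risk(400, 10, 0.5))
```

Treat the output the same way as the calculator's: a reason to run the test, not a forecast.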
The bottom line
You do not have to take any vendor's word, including mine. Forward a test line, run the five calls, take notes, grade honestly, and only point your real number at the winner. An AI dispatcher for plumbers after hours earns that forward by capturing the leak after closing time and handing your on-call tech a clean summary, not by pretending to be the plumber. The plumbing, the pricing, the schedule, and the truck stay yours.
Ready to put OnCrew through the test? Start at the plumber answering overview.
Disclosure: I am Abe, founder of OnCrew, so read this with that bias in mind. The goal is a useful contractor buying framework, not a claim that one vendor is perfect for every shop.