VICIdial records every call. Those recordings sit on disk, consuming storage, and most operations never listen to them unless there's a customer complaint. That's a waste. A systematic QA process — scoring calls, identifying patterns, coaching based on data — is the difference between a call center that guesses at why conversion is flat and one that knows exactly which agents need help on which part of the call.
Recording Setup: Get This Right First
Use ALLFORCE for your campaign recording setting. This records every call and prevents agents from toggling recording off. Use MIXMON as the recording method: it's lighter on CPU than MONITOR because it writes a single mixed file in real time, while MONITOR records each channel to its own file and mixes them after the call.
Set the recording filename to FULLDATE_AGENT_CUSTPHONE — this makes searching recordings by date, agent, or phone number possible without needing database queries for simple lookups.
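Why this matters in practice: with that pattern, a plain grep over the monitor directory answers most "find this call" requests. A quick sketch with hypothetical filenames (the exact separators and `-all` suffix vary by VICIdial version):

```shell
#!/bin/sh
# Sketch: searching by the FULLDATE_AGENT_CUSTPHONE pattern.
# Filenames below are hypothetical examples for illustration.
DIR=$(mktemp -d)
touch "$DIR/20240115-141502_1001_3125550199-all.wav" \
      "$DIR/20240115-150233_1002_2125550123-all.wav"
AGENT_CALLS=$(ls "$DIR" | grep -c "_1001_")      # every call agent 1001 handled
CUST_CALLS=$(ls "$DIR" | grep -c "3125550199")   # every call to one customer number
echo "agent 1001: $AGENT_CALLS call(s), customer 3125550199: $CUST_CALLS call(s)"
rm -rf "$DIR"
```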
In VICIdial Admin, navigate to Campaigns > [Your Campaign] > Detail and set:
| Setting | Value | Why |
|---|---|---|
| Campaign Recording | ALLFORCE | 100% recording, agents cannot toggle off |
| Campaign Rec Exten | sip-vicidial | Default recording extension |
| Recording Filename | FULLDATE_AGENT_CUSTPHONE | Searchable filenames |
Storage Planning
Call recordings consume serious storage. Plan for it before you run out of disk space during a busy month:
| Codec | Per Minute | Per Hour | 25 Agents x 6hr Talk/Day |
|---|---|---|---|
| WAV (16-bit, 8kHz) | 960 KB | 57 MB | 8.5 GB/day |
| MP3 (32 kbps) | 240 KB | 14 MB | 2.1 GB/day |
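The per-day figures follow directly from the codec bitrates. A quick sanity check in integer arithmetic (decimal GB, so results truncate slightly low):

```shell
#!/bin/sh
# Rough storage math behind the table above (decimal GB)
AGENTS=25
TALK_HOURS=6
WAV_KB_PER_MIN=960    # 16-bit 8 kHz mono PCM = 16 KB/s
MP3_KB_PER_MIN=240    # 32 kbps = 4 KB/s
WAV_GB_DAY=$(( WAV_KB_PER_MIN * 60 * TALK_HOURS * AGENTS / 1000000 ))
MP3_GB_DAY=$(( MP3_KB_PER_MIN * 60 * TALK_HOURS * AGENTS / 1000000 ))
echo "WAV: ~${WAV_GB_DAY} GB/day  MP3: ~${MP3_GB_DAY} GB/day"
```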
Convert WAV to MP3 after 7 days (long enough for QA review) with a nightly cron:
#!/bin/bash
# /opt/vicistack/compress_recordings.sh
# Convert WAV recordings older than 7 days to MP3
RECORDING_DIR="/var/spool/asterisk/monitor"
find "${RECORDING_DIR}" -name "*.wav" -mtime +7 -exec sh -c '
  lame --quiet -b 32 "$1" "${1%.wav}.mp3" && rm "$1"
' _ {} \;
Schedule it nightly via cron:
0 3 * * * /opt/vicistack/compress_recordings.sh >> /var/log/recording-compress.log 2>&1
Recordings are stored in /var/spool/asterisk/monitor/ and referenced in the recording_log table. The location field contains the full URL or file path. If you run multiple servers, make recordings accessible from a central location (NFS mount or centralized web server).
A Scorecard That Doesn't Take 20 Minutes Per Call
The biggest QA mistake is building a 50-point checklist. Reviewers burn out. They start checking boxes without actually listening. They resent the process. And the data becomes unreliable because everyone's rushing through it.
Use 5-7 categories with a 1-5 scale:
| Category | Weight | What to Listen For |
|---|---|---|
| Opening | 15% | Identified self, stated purpose, asked permission to continue |
| Discovery | 20% | Asked qualifying questions, identified pain points, actually listened |
| Presentation | 20% | Connected features to pain points, used the customer's language |
| Objection Handling | 20% | Addressed without getting defensive, used approved rebuttals |
| Closing | 15% | Asked for commitment, handled final concerns, set clear next steps |
| Compliance | 10% | Followed script requirements, disclosed required info, no unauthorized promises |
Scoring scale:
- 5 = Exceptional — could use as a training example
- 4 = Proficient — met all requirements with minor gaps
- 3 = Acceptable — met minimum bar but missed opportunities
- 2 = Needs Work — missed key elements, coaching required
- 1 = Unacceptable — failed requirements, immediate intervention
Implementing Scorecards in VICIdial
VICIdial doesn't have a built-in QA scorecard module. For teams reviewing more than 25 calls per day, a custom MySQL table works better than a spreadsheet. Note that MySQL only enforces CHECK constraints from 8.0.16 onward; on older versions the constraints below are parsed but ignored, so validate scores in your entry form as well:
CREATE TABLE vicistack_qa_scores (
score_id INT AUTO_INCREMENT PRIMARY KEY,
recording_id VARCHAR(50) NOT NULL,
lead_id INT NOT NULL,
agent_user VARCHAR(20) NOT NULL,
scorer_user VARCHAR(20) NOT NULL,
score_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
campaign_id VARCHAR(20),
opening_score TINYINT CHECK (opening_score BETWEEN 1 AND 5),
discovery_score TINYINT CHECK (discovery_score BETWEEN 1 AND 5),
presentation_score TINYINT CHECK (presentation_score BETWEEN 1 AND 5),
objection_score TINYINT CHECK (objection_score BETWEEN 1 AND 5),
closing_score TINYINT CHECK (closing_score BETWEEN 1 AND 5),
compliance_score TINYINT CHECK (compliance_score BETWEEN 1 AND 5),
weighted_total DECIMAL(5,2),
comments TEXT,
coaching_notes TEXT,
flagged_for_review TINYINT DEFAULT 0,
INDEX idx_agent (agent_user, score_date),
INDEX idx_campaign (campaign_id, score_date)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Calculate the weighted score automatically with a trigger so scorers don't have to do math (if scores can be edited after entry, add a matching BEFORE UPDATE trigger too):
DELIMITER //
CREATE TRIGGER qa_score_calc BEFORE INSERT ON vicistack_qa_scores
FOR EACH ROW
BEGIN
SET NEW.weighted_total = (
(NEW.opening_score * 0.15) +
(NEW.discovery_score * 0.20) +
(NEW.presentation_score * 0.20) +
(NEW.objection_score * 0.20) +
(NEW.closing_score * 0.15) +
(NEW.compliance_score * 0.10)
);
END//
DELIMITER ;
For smaller teams, a Google Sheet linked to recording URLs works as a starting point. Pull recording metadata daily:
#!/bin/bash
mysql asterisk -e "
SELECT r.recording_id, r.filename, r.start_time, r.length_in_sec,
r.user AS agent, r.lead_id, l.phone_number, l.first_name
FROM recording_log r
JOIN vicidial_list l ON r.lead_id = l.lead_id
WHERE r.start_time >= CURDATE()
AND r.length_in_sec > 30
ORDER BY r.start_time
" | tr '\t' ',' > /tmp/qa_review_$(date +%Y%m%d).csv
# Note: tr-based CSV breaks if a field contains a comma; fine for internal review lists
How Many Calls to Score
For a 25-agent team:
- Minimum: 5 calls per agent per week (125 total)
- Recommended: 10 calls per agent per week (250 total)
- New agents: 5 calls per day for the first 2 weeks
Don't sample purely randomly. Weight toward:
- Calls with extreme durations (very short or very long)
- Agents with declining conversion rates
- Calls flagged by automated systems (covered below)
Review at 1.25x or 1.5x playback speed. A 5-minute call takes 3.3 minutes at 1.5x, and you can still catch everything. Batch reviews by agent rather than hopping between random agents; patterns emerge much faster.
Automated Flagging: Let the Database Do the First Pass
Manual QA is essential but expensive. Let automated queries prioritize which recordings need human attention first.
Duration-Based Flagging
SELECT r.recording_id, r.user, r.start_time, r.length_in_sec,
CASE
WHEN r.length_in_sec < 30 THEN 'SHORT_CALL'
WHEN r.length_in_sec > 1200 THEN 'LONG_CALL'
END AS flag_reason
FROM recording_log r
WHERE r.start_time >= CURDATE()
AND (r.length_in_sec < 30 OR r.length_in_sec > 1200);
Suspicious Dispositions
A SALE disposition on a call under 2 minutes is suspicious: it may have been entered fraudulently to pad numbers:
SELECT r.recording_id, r.user, r.start_time, r.length_in_sec,
v.status, l.phone_number
FROM recording_log r
JOIN vicidial_log v ON r.vicidial_id = v.uniqueid
JOIN vicidial_list l ON r.lead_id = l.lead_id
WHERE v.status = 'SALE'
AND r.length_in_sec < 120
AND r.start_time >= DATE_SUB(CURDATE(), INTERVAL 1 DAY);
Silence Detection
High silence percentage on a call often means a disengaged agent or a customer who's lost interest. If you have SoX installed, this script gives a rough first pass:
#!/bin/bash
# Flag recordings where silence exceeds 40% of the call.
# Note: the sox silence effect below only trims leading/trailing silence,
# so this under-counts mid-call dead air. Treat it as a rough screen.
RECORDING_DIR="/var/spool/asterisk/monitor"
for wav in "${RECORDING_DIR}"/*$(date +%Y%m%d)*.wav; do
    [ -f "$wav" ] || continue
    TOTAL_DURATION=$(soxi -D "$wav" 2>/dev/null)
    # Duration with leading/trailing silence stripped out
    TRIMMED_DURATION=$(sox "$wav" -n silence 1 0.5 0.1% \
        reverse silence 1 0.5 0.1% reverse stat 2>&1 | \
        awk '/Length/ {print $3}')
    if [ -n "$TOTAL_DURATION" ] && [ -n "$TRIMMED_DURATION" ]; then
        # Integer percentage of the call that was silent; multiply before
        # dividing so bc's default scale=0 doesn't truncate the ratio to zero
        SILENCE_PCT=$(echo "(($TOTAL_DURATION - $TRIMMED_DURATION) * 100) / $TOTAL_DURATION" | bc)
        if [ "$SILENCE_PCT" -gt 40 ]; then
            echo "HIGH_SILENCE: $(basename "$wav") - ${SILENCE_PCT}% silence"
        fi
    fi
done
Anything over 40% silence gets a human review.
The Coaching Conversation That Works
QA scores are worthless if they don't drive coaching. Here's the structure:
- Start positive. Play the agent's highest-scored call from the week. Highlight what they did well.
- Pick one focus area. Don't dump 10 things on someone. Identify the category with the lowest average score.
- Play the specific example. Let the agent hear themselves. Ask what they'd do differently.
- Role play. Practice the corrected approach right there.
- Set a measurable goal. "Next week, I want your discovery average to go from 2.3 to 3.0."
Weekly Coaching Reports
SELECT agent_user,
COUNT(*) AS calls_scored,
ROUND(AVG(weighted_total), 2) AS avg_score,
ROUND(AVG(opening_score), 1) AS avg_opening,
ROUND(AVG(discovery_score), 1) AS avg_discovery,
ROUND(AVG(presentation_score), 1) AS avg_presentation,
ROUND(AVG(objection_score), 1) AS avg_objection,
ROUND(AVG(closing_score), 1) AS avg_closing,
ROUND(AVG(compliance_score), 1) AS avg_compliance,
MIN(weighted_total) AS worst_score,
MAX(weighted_total) AS best_score
FROM vicistack_qa_scores
WHERE score_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
GROUP BY agent_user
ORDER BY avg_score ASC;
Track Improvement Over Time
SELECT agent_user,
YEARWEEK(score_date) AS week,
ROUND(AVG(weighted_total), 2) AS avg_score,
COUNT(*) AS calls_scored
FROM vicistack_qa_scores
WHERE score_date >= DATE_SUB(CURDATE(), INTERVAL 28 DAY)
GROUP BY agent_user, YEARWEEK(score_date)
ORDER BY agent_user, week;
If an agent's score isn't improving after 3 coaching cycles focused on the same category, the problem isn't knowledge — it's either motivation or fit.
Calibration: Make Sure Scorers Agree
QA scores are only meaningful if different scorers would give the same recording similar marks. Run monthly calibration sessions: 3 recordings (one good, one average, one poor), all scorers grade independently, then compare. If any category differs by more than 1 point between scorers, discuss and align. Document the decisions as scoring precedents.
Without calibration, your "3.5 average agent" might be a "4.2 agent" or a "2.8 agent" depending on who scored them. The data becomes noise.
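A calibration session produces exactly the kind of small dataset awk handles well. A sketch, using hypothetical scorer data, that flags any category where scorers spread by more than 1 point on the same recording:

```shell
#!/bin/sh
# Flag scorecard categories where scorers disagree by >1 point.
# Input format: scorer category score (hypothetical sample data)
SCORES=$(mktemp)
cat > "$SCORES" <<'EOF'
alice opening 4
bob opening 3
carol opening 5
alice discovery 3
bob discovery 3
carol discovery 4
EOF
FLAGGED=$(awk '{
    if (!($2 in min) || $3 < min[$2]) min[$2] = $3
    if (!($2 in max) || $3 > max[$2]) max[$2] = $3
} END {
    for (c in min) if (max[c] - min[c] > 1) print c, "spread:", max[c] - min[c]
}' "$SCORES")
echo "$FLAGGED"
rm -f "$SCORES"
```

Here "opening" gets flagged (scores of 3, 4, and 5) while "discovery" passes, so the session discussion focuses on the opening category.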
Compliance Recording Requirements
Two-Party Consent
If you dial into two-party consent states (California, Florida, Illinois, Maryland, Massachusetts, and others), play a disclosure at the start of every call. Upload the WAV to /var/lib/asterisk/sounds/ and enable Recording Disclosure in campaign settings.
PCI-DSS: Pause During Payments
If agents collect credit card numbers, pause recording during payment collection. Enable Allow Recording Pause in campaign settings and train agents to click "Pause Recording" before asking for the card and "Resume Recording" after. Audit this regularly — forgetting to pause is a PCI violation.
Retention
| Regulation | Retention Period | Notes |
|---|---|---|
| TCPA | No specific requirement | Maintain records for 4 years (statute of limitations) |
| PCI-DSS | Do NOT record credit card data | Pause recording during payment |
| HIPAA | 6 years minimum | Healthcare-related calls |
The Complete Weekly Workflow
For a 25-agent center:
Daily: Automated flagging runs overnight. QA reviewer scores 50 calls (2 per agent) from flagged + random pool. Critical flags (compliance issues, fraudulent dispositions) escalated to manager immediately.
Weekly: Agent reports generated. Coaching sessions for agents below 3.0 weighted score. Top performers above 4.5 recognized.
Monthly: Calibration session with all scorers. Scorecard review — adjust categories or weights based on what's driving conversions. Compliance audit — verify two-party consent disclosures. Storage management — archive old recordings, verify retention compliance.
QA programs that combine systematic scoring, automated flagging, and structured coaching typically increase agent conversion rates by 15-30% within 60 days. The recordings are already on your disk.
For the complete QA workflow including recording search optimization, keyword spotting, and advanced flagging queries, see ViciStack's QA scoring guide.
Originally published at https://vicistack.com/blog/vicidial-qa-scoring/