Jason Shouldice

Posted on Mar 25 • Originally published at vicistack.com

Your VICIdial Database Will Melt Down in 6 Months. Here's How to Prevent It.

#voip #asterisk #sysadmin #devops

It starts the same way every time. Your VICIdial runs fine for six months. Then one Monday morning, agents can't log in. Real-time reports take 30 seconds to load. The hopper drains faster than it fills. CPU is pegged -- not by Asterisk, but by MySQL grinding through a query on a vicidial_log table with 47 million rows because nobody set up archival.

The database is VICIdial's single most critical component. Every agent action, call event, disposition, real-time report, hopper query, and dial-level calculation flows through one MySQL instance. In a VICIdial cluster, you can add telephony servers for call capacity and web servers for agent connections. You cannot cluster the database. When it slows down, everything slows down.

Why VICIdial's Database Is Different

VICIdial's access patterns are unusual and demanding compared to most web applications:

High-frequency writes. Multiple Perl daemons update agent states, call statuses, and dial metrics across dozens of tables every second. The AST_update.pl process alone performs hundreds of UPDATEs per second in a busy 50-agent campaign.

Constant polling. Real-time reports don't use WebSockets or push notifications. They poll the database every second. Ten managers with reports open = 10 SELECTs per second against vicidial_live_agents and vicidial_auto_calls, on top of everything else.

Mixed workload. Short, fast transactional queries (agent state updates) run simultaneously with long analytical queries (campaign reports spanning millions of records). Transactional queries need low latency. Analytical queries need throughput. MySQL serves both from the same instance.

MyISAM table-level locking. VICIdial uses MyISAM exclusively. A single slow SELECT on vicidial_log blocks every write on that table until it completes. This is the fundamental scaling constraint.

The Table That Eats Your Server

vicidial_manager is the fastest-growing table in most deployments and the one most overlooked. It stores every AMI command VICIdial sends to Asterisk -- Originate, Hangup, Monitor. At 50 agents with a 5:1 dial ratio, that's 250+ Originate commands per cycle, plus hangups, plus monitors. This table hits 100 million rows in under a year if not archived.

Growth rates for a 50-agent outbound operation running 8 hours daily:

Table	Growth/Month	6-Month Size
vicidial_log	~1M rows	6M rows, ~2 GB
vicidial_log_extended	~1M rows	6M rows, ~3 GB
call_log	~1M rows	6M rows, ~2 GB
vicidial_carrier_log	~2-3M rows	15M rows, ~4 GB
vicidial_manager	~5-10M rows	40M+ rows, ~8 GB
recording_log	~500K rows	3M rows, ~1 GB
server_performance	~15K/day/server	2.7M rows/server

MEMORY Tables: VICIdial's Real-Time Engine

Several critical tables run in the MEMORY storage engine -- entirely in RAM, no disk I/O:

Table	Purpose
vicidial_live_agents	State of every logged-in agent
vicidial_live_inbound_agents	Inbound-eligible agents
vicidial_auto_calls	Every active auto-dial call
vicidial_hopper	Leads queued for dialing
vicidial_live_sip_channels	Active SIP channels

MEMORY tables lose all data on MySQL restart. This is by design. Agents log back in, hopper refills automatically. Brief disruption during production hours if MySQL restarts unexpectedly.

The MEMORY table ceiling is max_heap_table_size in my.cnf. If a MEMORY table exceeds this, INSERTs fail -- agents can't log in or calls don't get placed. Set to 64 MB minimum. 128-256 MB for 500+ agents.

my.cnf Tuning: What Actually Matters

Universal Settings (Every Deployment)

[mysqld]
# THE most important single setting
skip-name-resolve

# VICIdial's connection-heavy architecture
max_connections = 2000

# Disable query cache -- more harm than good with VICIdial's write patterns
query_cache_size = 0
query_cache_type = 0

# Allow concurrent inserts during SELECT on MyISAM
concurrent_insert = 2

# VICIdial opens hundreds of tables
table_open_cache = 4096

# Temp table limits
tmp_table_size = 64M
max_heap_table_size = 64M

# Thread cache
thread_cache_size = 128

# Slow query logging -- enable always
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow-query.log
long_query_time = 2

The skip-name-resolve story: Without it, MySQL performs a reverse DNS lookup on every incoming connection. In a VICIdial cluster with telephony servers, web servers, and dozens of cron-spawned Perl scripts maintaining constant connections, those DNS lookups pile up. You get "Too many connections" errors even with max_connections set high enough -- because connections are stuck waiting for DNS. This single line has saved more VICIdial installations than any other my.cnf change.

When you enable skip-name-resolve, use IP addresses (not hostnames) in GRANT statements.

Sizing by Deployment

Agents	key_buffer_size	RAM Budget	Notes
50	512M	16 GB server, ~2 GB MySQL buffers	Leaves 14 GB for OS cache
100	1024M	32 GB server, ~4 GB MySQL buffers	Leaves 28 GB for OS cache
500	4096M	64 GB server, ~10 GB MySQL buffers	Leaves 54 GB for OS cache

Why not allocate more to MySQL? MyISAM stores data in .MYD files read through the OS filesystem cache. key_buffer_size only caches index blocks (.MYI files). The real performance for data reads comes from the OS cache -- which uses whatever RAM MySQL isn't consuming. Rule of thumb: MySQL buffers should use no more than 40-50% of total RAM.

Why set innodb_buffer_pool_size at all? VICIdial uses MyISAM, but MySQL's internal system tables use InnoDB. A small InnoDB buffer pool (256M-2G) prevents internal operations from thrashing disk.

Slow Query Identification

Enable the slow query log always. Overhead is negligible compared to debugging time saved.

Install Percona Toolkit and analyze:

yum install percona-toolkit
pt-query-digest /var/log/mysql/slow-query.log > /tmp/slow-query-report.txt

The usual suspects in VICIdial deployments:

Report queries on vicidial_log -- SELECTs spanning millions of rows, triggered by managers running "last 90 days" reports
Hopper fills on vicidial_list -- Scanning for callable leads on a 10M+ row table with inadequate indexes
Recording search on recording_log -- Millions of rows, date range searches
vicidial_manager cleanup -- Periodic DELETEs on a 50M+ row table that lock it for seconds

When MySQL is actively struggling, check what's running:

SHOW FULL PROCESSLIST;

Look for "Locked" state (MyISAM contention), "Copying to tmp table" (complex queries hitting disk), "Sorting result" (ORDER BY without indexes), and multiple identical queries (report polling storms).

Essential Missing Indexes

VICIdial's install scripts create most indexes, but these are commonly missing and make a measurable difference:

vicidial_log

The default indexes cover lead_id and uniqueid, but report queries filter by call_date:

ALTER TABLE vicidial_log ADD INDEX idx_call_date (call_date);
ALTER TABLE vicidial_log ADD INDEX idx_campaign_call_date (campaign_id, call_date);
ALTER TABLE vicidial_log ADD INDEX idx_user_call_date (user, call_date);

The composite (campaign_id, call_date) index makes campaign-specific reports dramatically faster -- the most common analytical query pattern.

vicidial_list

The hopper fill query scans for callable leads. Default indexes are insufficient for large lists:

ALTER TABLE vicidial_list ADD INDEX idx_list_status (list_id, status);
ALTER TABLE vicidial_list ADD INDEX idx_status_modify (status, modify_date);
ALTER TABLE vicidial_list ADD INDEX idx_phone_number (phone_number);

recording_log

ALTER TABLE recording_log ADD INDEX idx_lead_date (lead_id, start_time);
ALTER TABLE recording_log ADD INDEX idx_start_time (start_time);

Before adding: check with SHOW INDEX FROM tablename to verify the index doesn't already exist. Adding an index to a 10M+ row MyISAM table locks it for the duration. Schedule during off-hours. Periodically run OPTIMIZE TABLE on heavily-written tables to rebuild indexes and reclaim space from deleted rows -- but expect 5-30 minutes of table lock on large tables.

Table Archival: The Maintenance That Saves Everything

If your deployment has run 90+ days without archival, your database carries dead weight that slows every query, every lock, every OPTIMIZE operation.

Table	Archive After	Reason
vicidial_manager	7 days	Fastest growing, almost never queried historically
vicidial_carrier_log	30 days	High volume, low long-term value
call_log	60 days	Raw CDR, rarely accessed after initial reporting
vicidial_log	90 days	Reports rarely go back further
vicidial_agent_log	90 days	Agent session data
recording_log	180 days	Keep metadata longer for compliance

VICIdial includes ADMIN_archive_log_tables.pl. Set up as a daily cron:

0 1 * * * /usr/share/astguiclient/ADMIN_archive_log_tables.pl --daily 2>&1 >> /var/log/astguiclient/archive.log

When deleting from large MyISAM tables, always use LIMIT to avoid locking the table for minutes. A single DELETE of 50 million rows locks the table for the duration. Break it into chunks of 100,000-500,000.

Creating Archive Tables

Before archiving, verify _archive tables exist. If not:

SHOW TABLES LIKE 'vicidial_log_archive';
CREATE TABLE vicidial_log_archive LIKE vicidial_log;
ALTER TABLE vicidial_log_archive ADD INDEX idx_call_date (call_date);

The Nuclear Option

If tables have grown to hundreds of millions of rows and archival takes too long:

Rename the bloated table
Create a fresh table with the same structure
Copy recent data (last 90 days) into the fresh table
Verify data integrity
Drop or rename the old table to archive

This is faster than DELETE because it builds sequentially rather than performing random deletes. Requires enough disk for both tables temporarily, and VICIdial must be stopped during the swap.

MEMORY Table Conversion for High-Scale

Above 150-200 agents, converting high-traffic tables from MyISAM to MEMORY is one of the most impactful single optimizations.

vicidial_live_agents tracks every agent's real-time state and gets read by every report, every dialer process, every AGI script. On MyISAM, table-level locks create contention: a report SELECT blocks the dialer's UPDATE marking an agent available, which blocks an AGI routing a call. MEMORY eliminates disk I/O and drops lock duration from milliseconds to microseconds.

Convert: vicidial_live_agents, vicidial_live_inbound_agents, vicidial_auto_calls, vicidial_hopper, vicidial_live_sip_channels. Never convert vicidial_log or vicidial_list (too large, persistent data).

MEMORY requires fixed-width CHAR fields (no VARCHAR). Recent VICIdial SVN revisions (3800+) include built-in MEMORY table support with automatic conversion on startup. Check your version before doing manual conversions.

The TIMESTAMP Migration

Recent VICIdial SVN updates convert DATETIME columns to TIMESTAMP across 40+ tables. TIMESTAMP uses 4 bytes vs DATETIME's 8 bytes -- half the storage, faster comparisons. But the ALTER TABLE on a 50M-row MyISAM table locks it for 30+ minutes. Plan this during a maintenance window with a full backup first.

Corruption Recovery

MyISAM tables corrupt during power failures, unexpected shutdowns, and full disks. Symptoms: "Table marked as crashed" errors, unexpected query results, MySQL refusing to start.

First check: CHECK TABLE tablename. If issues found: REPAIR TABLE tablename. For severe cases, use myisamchk directly on the .MYI and .MYD files with MySQL stopped.

Replication for Reporting

The single most effective way to eliminate report queries from impacting production is MySQL master-slave replication. Run reports against the replica. Production call operations hit the master only.

Configure binary logging on the master:

[mysqld]
log-bin = mysql-bin
binlog_format = MIXED
expire_logs_days = 7
sync_binlog = 0    # Async for performance

The replica receives all writes via the binary log and stays in sync. Report queries -- the big SELECTs that scan millions of rows -- hit the replica instead of locking tables on the master. For 100+ agent deployments, this is often the single biggest performance improvement after basic my.cnf tuning.

Per-Thread Buffer Warning

Settings like sort_buffer_size, read_buffer_size, and join_buffer_size are allocated per-thread when needed. With max_connections = 6000, the theoretical maximum for sort buffers alone: 6000 x 16 MB = 96 GB. MySQL allocates on-demand, not all at once, but if you see unexpected memory pressure, per-thread buffers are the first place to look.

The PROCESSLIST Emergency Toolkit

When things are actively on fire:

SELECT * FROM information_schema.PROCESSLIST
WHERE COMMAND != 'Sleep' AND TIME > 2
ORDER BY TIME DESC;

What the states mean:

Locked -- MyISAM table-level lock contention. Another query is holding the table lock.
Copying to tmp table -- Complex query creating a temp table on disk. Usually means a missing index or a report query scanning too many rows.
Sorting result -- ORDER BY on a large result set without an appropriate index.
Multiple identical queries -- Real-time report polling storm. Ten managers, one query each per second. Reduce the refresh interval or set up a caching layer.

Kill a runaway query if needed: KILL process_id; -- but understand that killing a MyISAM write can leave the table in a state requiring REPAIR TABLE.

MyISAM vs InnoDB: The VICIdial Position

You might wonder why VICIdial uses MyISAM when InnoDB has row-level locking that would solve the contention problem. Matt Florell has been explicit: "We do not recommend using InnoDB under any circumstances." The codebase is designed and tested against MyISAM behavior. The Perl daemons, keepalive processes, and hopper logic all assume MyISAM's specific behaviors around table-level locking, concurrent inserts, and fulltext indexing.

Converting VICIdial tables to InnoDB would introduce subtle bugs that only manifest under production load -- exactly the worst time to discover them. Stick with MyISAM. Optimize it aggressively. Use MEMORY tables for the hot real-time tables. Set up replication for reporting. And archive religiously.

Connection Management

VICIdial's architecture is connection-heavy. Perl daemons, web servers, cron scripts, agent sessions -- they all maintain database connections. Without proper configuration, you'll see "Too many connections" errors during peak hours.

Set max_connections based on your deployment size. 2000 for 50 agents, 3000 for 100 agents, 6000 for 500 agents. These seem high, but VICIdial's keepalive and update processes create far more connections than the agent count suggests.

wait_timeout and interactive_timeout control how long idle connections persist. Default MySQL values (28800 seconds / 8 hours) are too high for VICIdial. Set both to 600 seconds (10 minutes). This cleans up stale connections from crashed Perl processes that didn't disconnect properly.

thread_cache_size avoids the overhead of creating new threads for frequent connections. Set to 128 for small deployments, 256-512 for large ones. MySQL reuses cached threads instead of spawning new ones, which matters when connection turnover is high.

Disk I/O: The Hidden Bottleneck

On busy VICIdial deployments, disk I/O often matters more than CPU. MyISAM writes data synchronously to .MYD files. Recording 200 concurrent calls writes ~200 MB/minute. When the database and recordings compete for the same disk, everything slows.

Solution: separate disks or partitions. Database files on SSD (preferably NVMe). Recordings on a separate volume. If you only have one disk, at minimum put the MySQL data directory and the recording spool on separate partitions so I/O operations don't interleave.

Monitor with iostat -x 5 during peak hours. If %util is above 80% on your database disk, you need either faster storage or to split the workload.

Maintenance Schedule

Task	Frequency	Impact
Archival (ADMIN_archive_log_tables.pl)	Daily, 1 AM	Low -- runs in background
vicidial_manager cleanup	Daily	Medium -- LIMIT deletes to avoid lock
OPTIMIZE TABLE on vicidial_log	Monthly, maintenance window	High -- locks table 5-30 min
OPTIMIZE TABLE on vicidial_list	Monthly, maintenance window	High -- locks table
pt-query-digest analysis	Weekly	None -- reads log only
Index review	Quarterly	Varies
Full database backup	Daily	Low if using mysqldump during quiet hours

The database is the one part of VICIdial you can't horizontally scale. Treat it accordingly. ViciStack ships deployments with MySQL pre-tuned, MEMORY tables optimized, and archival jobs running from day one.

Originally published at https://vicistack.com/blog/vicidial-mysql-optimization/

DEV Community