The Unsung Hero: Deep Dive into /var on Ubuntu
A recent production incident involving a runaway log rotation issue on a critical database server highlighted a fundamental truth: a deep understanding of /var is no longer optional for modern infrastructure engineers. The incident, triggered by a misconfigured logrotate setup, rapidly filled the /var/log partition, causing application crashes and a significant outage. This wasn’t a simple “disk full” problem; it exposed a lack of visibility into /var’s architecture, its dependencies, and the cascading effects of its misconfiguration. This post aims to provide a comprehensive, no-nonsense guide to /var on Ubuntu, geared towards experienced system administrators, DevOps engineers, and SREs operating in production environments – whether that’s cloud VMs, on-prem servers, or containerized deployments utilizing Ubuntu as a base image. We’ll focus on practical application, not introductory concepts.
What is "/var" in the Ubuntu/Linux context?
/var (variable data) is a directory in the Linux filesystem hierarchy designed to hold files that change frequently during normal system operation. Unlike /usr, which holds static, shareable binaries and libraries, /var is intended for dynamic data. On Ubuntu (and other Debian-based systems), this includes log files, spool directories (for print queues, mail, etc.), temporary files that must survive a reboot (/var/tmp), application state and databases (/var/lib), and package management caches.
Historically, /var was often a separate partition, a practice still recommended for production systems. This isolation prevents a runaway process filling /var from impacting the root filesystem and potentially rendering the system unbootable. Ubuntu's default installer layout places /var on the root filesystem, so the isolation has to be configured deliberately, but the principle remains crucial.
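Whether a given host actually follows this layout is quick to verify. The following is a minimal sketch, assuming mountpoint(1) from util-linux is available (it is on a stock Ubuntu install); the function name is_separate_mount is illustrative and not part of any tool.

```shell
#!/bin/sh
# is_separate_mount [DIR]: succeed if DIR is itself a mount point, i.e. the
# root of its own filesystem (this also matches bind mounts and subvolumes).
is_separate_mount() {
    mountpoint -q "${1:-/var}"
}

# Usage:
#   is_separate_mount /var && echo "/var is its own mount" \
#                          || echo "/var shares its parent filesystem"
```

The same check applies to /var/log or /var/lib when those are split out individually.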
Key system tools and services interacting with /var include:
- systemd: Manages services that write to /var/log and keeps runtime data in /run (on current Ubuntu releases, /var/run is a symlink to /run).
- journald: The systemd journal, which stores logs in /var/log/journal when persistent storage is enabled and in the volatile /run/log/journal otherwise.
- APT: Package manager, caching downloaded packages in /var/cache/apt.
- logrotate: Rotates, compresses, and removes old log files in /var/log.
- rsyslog/syslog-ng: System logging daemons, writing logs to /var/log.
- Database Servers (PostgreSQL, MySQL): Store data files within /var/lib/<database_name> (for example /var/lib/postgresql and /var/lib/mysql).
Use Cases and Scenarios
- High-Volume Logging: A web application generating terabytes of logs daily. Proper logrotate configuration, log shipping to a centralized system (e.g., Elasticsearch, Splunk), and disk space monitoring are critical.
- Containerized Applications: Docker containers often write logs to /var/log within the container filesystem. Managing these logs requires volume mounts or log drivers to prevent container disk exhaustion.
- Database Server Performance: A PostgreSQL database experiencing slow I/O due to insufficient disk space in /var/lib/postgresql. Monitoring disk usage, optimizing database settings (e.g., wal_buffers), and potentially migrating data to faster storage are necessary.
- Security Auditing: Using auditd to track system calls and writing audit logs to /var/log/audit. Analyzing these logs is crucial for identifying security breaches or suspicious activity.
- Cloud Image Customization: Building a custom Ubuntu cloud image. Cleaning /var/cache/apt and removing unnecessary logs reduces image size and improves deployment speed.
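The image-customization case lends itself to a short cleanup step. The sketch below is one possible version, assuming the image's filesystem is mounted (or chrooted) at a directory passed as the first argument. The function name clean_image_var and the choice to truncate logs rather than delete them are assumptions, not something Ubuntu's image tooling prescribes.

```shell
#!/bin/sh
# clean_image_var ROOT: shrink /var inside an image tree mounted at ROOT.
# ROOT is mandatory so the function never touches the running host by accident.
clean_image_var() (
    root="${1:?usage: clean_image_var ROOT}"

    # Cached .deb files; roughly what "apt clean" removes inside the image.
    find "$root/var/cache/apt/archives" -type f -name '*.deb' -exec rm -f {} + 2>/dev/null

    # Downloaded package indexes; "apt update" recreates them.
    find "$root/var/lib/apt/lists" -type f -exec rm -f {} + 2>/dev/null

    # Empty log files in place so ownership and permissions are preserved.
    find "$root/var/log" -type f -exec truncate -s 0 {} + 2>/dev/null
    exit 0
)
```

Truncating keeps the files that daemons expect to find at first boot; deleting them outright is also common but relies on each service recreating its log with the right owner.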
Command-Line Deep Dive
- Disk Space Usage:
df -h /var
du -sh /var/* | sort -hr | head -n 10
- Log Rotation Configuration:
cat /etc/logrotate.d/rsyslog
- Journald Disk Usage:
journalctl --disk-usage
journalctl --vacuum-size=1G # Limit journal size to 1GB
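- APT Cache Cleaning:
apt clean # Remove all cached .deb files from /var/cache/apt/archives
apt autoclean # Remove only cached packages that can no longer be downloaded
apt autoremove # Remove automatically installed packages that are no longer needed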
- Checking logrotate Status:
logrotate -d /etc/logrotate.conf # Dry run
logrotate -f /etc/logrotate.conf # Force rotation
- Systemd Service Status (rsyslog):
systemctl status rsyslog
journalctl -u rsyslog -n 50 # Recent journal entries for the unit
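The disk-usage commands above are interactive; for a cron job or health check, the same information can be turned into an exit status. This is a minimal sketch, assuming POSIX df -P output and a filesystem name without spaces. The function names and the default limit of 80 percent are illustrative choices.

```shell
#!/bin/sh
# usage_percent PATH: print the used-space percentage (integer, no % sign)
# of the filesystem that holds PATH.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_usage [PATH] [LIMIT]: exit 1 with a warning if usage >= LIMIT percent,
# exit 2 if usage cannot be determined, exit 0 otherwise.
check_usage() (
    path="${1:-/var}"
    limit="${2:-80}"
    used=$(usage_percent "$path")
    case "$used" in
        ''|*[!0-9]*)
            echo "cannot determine usage for $path" >&2
            exit 2
            ;;
    esac
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)" >&2
        exit 1
    fi
    exit 0
)
```

A call such as check_usage /var/log 90 fits directly into a monitoring wrapper that alerts on a non-zero exit code.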
System Architecture
graph LR
A[Application] --> B(/var/log);
C[systemd] --> B;
D[rsyslog/syslog-ng] --> B;
E[journald] --> F(/var/log/journal);
G[APT] --> H(/var/cache/apt);
I[Database Server] --> J(/var/lib/<database>);
K[logrotate] --> B;
L[Filesystem] --> B;
L --> F;
L --> H;
L --> J;
style B fill:#f9f,stroke:#333,stroke-width:2px
style F fill:#f9f,stroke:#333,stroke-width:2px
style H fill:#f9f,stroke:#333,stroke-width:2px
style J fill:#f9f,stroke:#333,stroke-width:2px
/var sits at the intersection of numerous critical system services. systemd orchestrates many of these, while journald provides a structured logging mechanism. rsyslog or syslog-ng handle traditional syslog messages. APT relies on /var/cache/apt for package management. Database servers store their data within /var/lib. logrotate is the crucial component for managing the growth of log files, preventing disk exhaustion. The underlying filesystem (ext4, XFS, etc.) provides the storage layer.
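To see how much space each of the storage locations in the diagram currently consumes, a per-directory breakdown is useful. The sketch below assumes a du that supports -s, -k, and -x; the function name var_breakdown is hypothetical. Note that /var/log/journal lives inside /var/log, so its size is also included in the /var/log total.

```shell
#!/bin/sh
# var_breakdown [DIR...]: print "KiB<TAB>path" for each existing directory,
# largest first. -x keeps du from crossing into other mounted filesystems.
var_breakdown() {
    [ "$#" -gt 0 ] || set -- /var/log /var/log/journal /var/cache/apt /var/lib
    for dir in "$@"; do
        [ -d "$dir" ] && du -skx "$dir" 2>/dev/null
    done | sort -rn
}

# Usage (run as root to avoid permission gaps in the totals):
#   var_breakdown
#   var_breakdown /var/lib/postgresql /var/lib/mysql
```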
Performance Considerations
/var’s performance directly impacts application responsiveness. High I/O activity in /var/log can starve other processes. Insufficient disk space can lead to application crashes.
- I/O Monitoring: Use iotop to identify processes generating the most disk I/O.
- CPU Usage: htop can reveal processes consuming excessive CPU while writing to /var.
- Sysctl Tuning:
sysctl vm.swappiness=10 # Reduce swapping
sysctl vm.vfs_cache_pressure=50 # Balance cache pressure
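- Filesystem Choice: XFS often performs better than ext4 for large files and highly parallel I/O workloads, which can make it a good choice for /var on database servers. Benchmark with your own workload before committing.
- Kernel Parameters: Consider tuning the I/O scheduler based on the storage type (SSD vs. HDD). On current kernels it is selected per device through /sys/block/<device>/queue/scheduler; the legacy elevator= boot parameter is ignored by recent kernels. Use perf to profile I/O performance.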
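Before changing any I/O scheduling settings, it helps to know which scheduler each block device is using. The following is a minimal sketch that reads the per-device sysfs files; the function name active_schedulers and its optional sysfs-root argument (handy for testing against a fake tree) are assumptions.

```shell
#!/bin/sh
# active_schedulers [SYSFS_ROOT]: print "device<TAB>scheduler" for each block
# device. The kernel marks the active scheduler with square brackets, e.g.
# "none [mq-deadline] kyber".
active_schedulers() (
    sysfs="${1:-/sys}"
    for f in "$sysfs"/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$sysfs"/block/}
        dev=${dev%%/*}
        active=$(sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$f")
        printf '%s\t%s\n' "$dev" "${active:-unknown}"
    done
)
```

Writing one of the listed names back to the same file as root changes the scheduler until the next reboot; a udev rule is the usual way to make the choice persistent.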
Security and Hardening
/var contains sensitive data, including logs that may contain credentials or other confidential information.
- File Permissions: Ensure appropriate file permissions on all files and directories within /var. On Ubuntu, system logs such as /var/log/syslog are typically owned by syslog:adm with mode 0640, so they are readable by members of the adm group rather than by every user.
- AppArmor/SELinux: Utilize AppArmor or SELinux to restrict access to /var for specific applications.
- UFW/iptables: Firewall rules should restrict network access to services writing to /var.
- Fail2ban: Monitor logs in /var/log for failed login attempts and automatically block malicious IPs.
- Auditd: Enable auditd to track access to sensitive files within /var.
- Log Encryption: Consider encrypting logs at rest to protect sensitive data.
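A permissions review can start with a list of log files that any local user is able to read. This is a minimal sketch, assuming a find that supports -xdev and octal -perm matching; the function name world_readable_files is illustrative. Some files are world-readable by design, so treat the output as a list to review rather than a list of violations.

```shell
#!/bin/sh
# world_readable_files [DIR]: list regular files under DIR whose "other" read
# bit (octal 004) is set. -xdev keeps the search on one filesystem.
world_readable_files() {
    find "${1:-/var/log}" -xdev -type f -perm -004 2>/dev/null
}

# Usage:
#   world_readable_files /var/log
```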
Automation & Scripting
Ansible example to clean APT cache and rotate logs:
---
- hosts: all
  become: true
  tasks:
    - name: Clean APT cache
      apt:
        autoclean: yes
        autoremove: yes
    - name: Force logrotate
      command: logrotate -f /etc/logrotate.conf
      changed_when: false # Logrotate doesn't always return a meaningful change status
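Cloud-init snippet to grow the /var partition and its ext4 filesystem at first boot (cloud-init's built-in filesystem resize only covers the root filesystem, so /var is resized explicitly):
#cloud-config
growpart:
  mode: auto
  devices: ['/var']
runcmd:
  - resize2fs "$(findmnt -n -o SOURCE /var)" # ext4 only; use xfs_growfs /var for XFS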
Logs, Debugging, and Monitoring
- journalctl: The primary tool for viewing systemd journal logs. Use filters to narrow down the results (e.g., journalctl -u rsyslog).
- dmesg: Kernel ring buffer, useful for diagnosing hardware or driver issues.
- netstat/ss: Monitor network connections related to services writing to /var.
- strace: Trace system calls made by a process to understand its interaction with /var.
- lsof: List open files, revealing which processes are accessing files within /var.
- Monitoring Tools: Prometheus, Grafana, Nagios, or similar tools should monitor disk space usage in /var, log file sizes, and I/O performance.
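One debugging case that these tools often uncover is a partition that stays full after log files were deleted, because a process still holds them open. lsof +L1 reports such files; the sketch below does the same by reading /proc directly, so it works where lsof is not installed. It is Linux-only, and the function name deleted_open_files is hypothetical.

```shell
#!/bin/sh
# deleted_open_files [PREFIX]: print "pid<TAB>path" for files under PREFIX
# that were unlinked but are still held open, so their blocks are not freed.
# Relies on /proc/PID/fd symlink targets ending in " (deleted)".
# Run as root to see file descriptors of other users' processes.
deleted_open_files() (
    prefix="${1:-/var}"
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"/*" (deleted)")
                pid=${fd#/proc/}
                printf '%s\t%s\n' "${pid%%/*}" "$target"
                ;;
        esac
    done
)
```

Restarting the listed process, or signalling it to reopen its logs, releases the space.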
Common Mistakes & Anti-Patterns
- No Separate /var Partition: Can lead to root filesystem exhaustion. Correct: Create a dedicated partition for /var during installation.
- Ignoring logrotate Configuration: Results in uncontrolled log growth. Correct: Customize logrotate configurations for each service.
- Insufficient Disk Space Allocation: Causes application crashes. Correct: Allocate sufficient disk space based on anticipated log volume and data growth.
- Using rm -rf /var/log/*: Destroys valuable debugging information and can disrupt logging. Space held by files that a daemon still has open is also not freed until the process closes them. Correct: Use logrotate or journalctl --vacuum-size.
- Storing Sensitive Data in Plaintext Logs: Creates a security vulnerability. Correct: Redact sensitive data from logs or use encryption.
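As an illustration of a per-service logrotate policy, the following sketch writes a conservative stanza to a file. The function name, the example path /var/log/myapp/*.log, and the specific limits (daily, 7 rotations, 100M) are all assumptions to adapt. The directives themselves (daily, maxsize, rotate, compress, delaycompress, missingok, notifempty, copytruncate) are standard logrotate options.

```shell
#!/bin/sh
# write_logrotate_stanza OUTFILE LOGGLOB: write a logrotate policy for LOGGLOB.
# On a real host OUTFILE would live under /etc/logrotate.d/.
write_logrotate_stanza() {
    cat > "$1" <<EOF
$2 {
    daily
    maxsize 100M
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
EOF
}

# Usage:
#   write_logrotate_stanza /etc/logrotate.d/myapp '/var/log/myapp/*.log'
#   logrotate -d /etc/logrotate.d/myapp   # dry run before relying on it
```

maxsize is only evaluated when logrotate actually runs, so a size cap on a fast-growing log also needs logrotate scheduled more often than once a day. copytruncate suits applications that cannot reopen their log file, at the cost of possibly losing a few lines during rotation.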
Best Practices Summary
- Dedicated /var Partition: Always.
- Proactive logrotate Configuration: Tailor configurations to each service.
- Regular Disk Space Monitoring: Alert on low disk space.
- Appropriate File Permissions: Restrict access to sensitive data.
- Utilize AppArmor/SELinux: Enforce least privilege.
- Centralized Logging: Ship logs to a centralized system for analysis.
- Automate Maintenance: Use Ansible or similar tools to automate tasks.
- Filesystem Choice: Consider XFS for high I/O workloads.
- Journald Management: Regularly vacuum the journal to prevent disk exhaustion.
- Regular Audits: Review /var configuration and security settings.
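For journald in particular, a persistent size cap can complement manual vacuuming. The sketch below writes a drop-in that sets SystemMaxUse, which is a standard journald.conf option. The function name, the file name 90-size-limit.conf, and the 1G default are assumptions.

```shell
#!/bin/sh
# write_journald_limit [DIR] [SIZE]: create a journald drop-in that caps the
# disk space used by the persistent journal in /var/log/journal.
write_journald_limit() (
    dir="${1:-/etc/systemd/journald.conf.d}"
    size="${2:-1G}"
    mkdir -p "$dir" || exit 1
    printf '[Journal]\nSystemMaxUse=%s\n' "$size" > "$dir/90-size-limit.conf"
)

# Usage (as root):
#   write_journald_limit
#   systemctl restart systemd-journald
```

With the cap in place, journald removes the oldest archived journal files on its own once the limit is reached.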
Conclusion
/var is a critical component of any Ubuntu-based system. Ignoring its intricacies can lead to performance issues, security vulnerabilities, and even complete system outages. Mastering its architecture, understanding its dependencies, and implementing robust monitoring and automation are essential for building reliable, maintainable, and secure infrastructure. Take the time to audit your existing systems, build automation scripts, monitor /var’s behavior, and document your standards. The investment will pay dividends in the long run.