DEV Community

Wang - C++ Developer

Finding Memory Leaks in Legacy C++ Applications with Valgrind

Legacy C++ services don't crash — they slowly bleed memory until someone restarts them at 3 AM.

If you've inherited a 20‑year‑old codebase with mysterious memory growth, this guide is for you.

You can't fix a leak you can't reproduce.

This is your complete, production‑focused Valgrind investigation playbook.
It's based on real systems, real leaks, and real debugging pain.



The Workflow

[Leak Trigger] → [Static Analysis] → [Compile Debug Build]
        ↓
   [Run Valgrind] → [Interpret Leak Types] → [Stack Trace]
        ↓
     [Regression Test]

Step 1 — Reproduce the Leak

You cannot find a leak if you cannot trigger it.

Measure memory growth

Use the following script to log your application's total mapped memory once a minute.

PID=$(pgrep -o your_service)   # -o: oldest match, in case several processes match
while true; do
  echo "$(date): $(pmap "$PID" | awk '/total/ {print $2}')"
  sleep 60
done

Interpretation:

| Pattern | Meaning |
| --- | --- |
| Linear growth | Per‑operation leak |
| Step‑function growth | Specific trigger |
| No growth | Wrong hypothesis |
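
The table can be turned into a rough automatic check over the logged samples. The sketch below is only an illustration: `classify_growth`, the `noise_kb` default, and the "one interval causes more than half the growth" rule are assumptions made for this example, not part of any tool. It expects plain numbers, so the trailing `K` that pmap prints has to be stripped first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: classify memory samples (in KB, one per interval)
// into the three patterns from the table. Thresholds are illustrative only.
std::string classify_growth(const std::vector<long>& samples_kb,
                            long noise_kb = 1024) {
    assert(noise_kb >= 0);
    if (samples_kb.size() < 2) {
        return "no growth";  // not enough data to see a trend
    }
    const long total = samples_kb.back() - samples_kb.front();
    if (total <= noise_kb) {
        return "no growth";
    }
    // Find the largest increase between two consecutive samples.
    long largest_jump = 0;
    for (std::size_t i = 1; i < samples_kb.size(); ++i) {
        const long delta = samples_kb[i] - samples_kb[i - 1];
        if (delta > largest_jump) {
            largest_jump = delta;
        }
    }
    // One interval responsible for most of the growth suggests a specific trigger.
    if (largest_jump * 2 > total) {
        return "step";
    }
    return "linear";
}
```

With only two or three samples every increase looks like a step, so feed it at least several intervals before trusting the answer.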

Find the minimum trigger

Your goal: reproduce the leak in under 10 minutes.

Why?

  • Valgrind slows execution by 20–50×
  • A 10‑minute trigger becomes 3–8 hours
  • A 1‑hour trigger becomes 20–50 hours (roughly 1–2 days)

Why this matters: If your trigger is too slow, Valgrind becomes unusable.


Step 2 — Static Analysis (5 minutes, zero runtime cost)

Before running anything, let the compiler find the obvious issues.

Clang Static Analyzer

scan-build make

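Look for reports in the "Memory leak" category. The analyzer words them as "Potential leak of memory pointed to by …", so don't dismiss them because of the word "potential".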

clang‑tidy

clang-tidy legacy_file.cpp \
  --checks='-*,clang-analyzer-*,cppcoreguidelines-owning-memory'

Finds:

  • new without delete
  • malloc without free
  • Raw owning pointers

Misses:

  • Cycles
  • Third‑party leaks
  • Runtime‑dependent leaks

Why this matters: Static analysis gives you free wins before you even run the program.


Step 3 — Compile for Valgrind

Without debug symbols, Valgrind's stack traces lack file names and line numbers, so the first thing to do is rebuild the whole application with debug flags.

g++ -g3 -O0 -fno-omit-frame-pointer -o your_service your_service.cpp

| Flag | Purpose |
| --- | --- |
| -g3 | Full debug info |
| -O0 | Clean stack frames |
| -fno-omit-frame-pointer | Reliable backtraces |

Why this matters: Without debug symbols, Valgrind can't show you file/line numbers.


Step 4 — Run Valgrind

Run only the trigger you identified in Step 1.

valgrind --leak-check=full \
         --show-leak-kinds=definite,indirect \
         --track-origins=yes \
         --log-file=valgrind_out.txt \
         ./your_service --run-trigger

For long‑running services

Use vgdb to inspect leaks mid‑run:

valgrind --vgdb=yes --leak-check=full ./your_service

Then, from a second terminal (add --pid=<pid> if more than one Valgrind process is running):

vgdb leak_check full kinds definite,indirect

Why this matters: You don't need to wait hours — you can inspect leaks while running.


Step 5 — Understand Valgrind's Leak Types

After the run, Valgrind writes its leak report to valgrind_out.txt. Example summary:

definitely lost: 1,024 bytes
indirectly lost: 6,144 bytes
possibly lost: 0 bytes
still reachable: 45,000 bytes

What each type means

Valgrind sorts leaked memory into the following types. The type determines what you do next.

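| Type | Meaning | Action |
| --- | --- | --- |
| definitely lost | No pointer to the block remains: a real leak | Fix first |
| indirectly lost | Reachable only through a lost block (its child) | Fix the parent |
| possibly lost | Only a pointer into the middle of the block remains | Investigate |
| still reachable | A pointer still exists at exit (globals, statics) | Ignore unless growing |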
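
To see the four types side by side, it can help to produce each one on purpose. The following is a minimal sketch, not taken from the article's codebase: `Node`, `g_cache`, `g_interior`, and the three functions are made-up names. Built with -O0 and run under Memcheck, it should typically report one block in each category, though the exact classification can vary with compiler and Valgrind version.

```cpp
#include <cassert>
#include <cstdlib>

struct Node {
    Node* next;
    int payload[4];
};

Node* g_cache = nullptr;     // global that keeps a block reachable at exit
char* g_interior = nullptr;  // global that holds only an interior pointer

// The head block loses its last pointer: "definitely lost".
// The second block is reachable only through the lost head: "indirectly lost".
void make_definite_and_indirect() {
    Node* head = new Node{};
    head->next = new Node{};
    head = nullptr;  // last pointer to the head block is gone
}

// A global still points at the block when the program exits: "still reachable".
void make_still_reachable() {
    g_cache = new Node{};
}

// Only a pointer into the middle of the block survives: "possibly lost".
void make_possibly_lost() {
    char* buf = static_cast<char*>(std::malloc(64));
    assert(buf != nullptr);
    g_interior = buf + 16;  // the start address is not stored anywhere
}
```

Call the three functions from main, run the binary with --leak-check=full --show-leak-kinds=all, and compare the summary with the table.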

If "still reachable" grows

Use Massif:

valgrind --tool=massif ./your_trigger
ms_print massif.out.<pid>

Why this matters: "Still reachable" is not a leak — unless it grows.


Step 6 — Capture the Stack Trace

A real leak report looks like the following. With debug symbols, each frame of the stack trace carries a source file and line number, and the top frames show where the memory was allocated. Valgrind reports the allocation site, not the missing free, so you still have to work out why the block was never released. A common cause is that delete runs on only one code path and the trigger exercises another.

1,024 bytes in 1 blocks are definitely lost
at operator new
by DatabaseConnection::ExecuteQuery (db_connection.cpp:67)
by CustomerLoader::FetchCustomer (customer_loader.cpp:89)

Extract only leak blocks:

grep -A10 "definitely lost" valgrind_out.txt

Why this matters: The stack trace is the map that leads you to the leak.


Step 7 — Optional Regression Test

Useful when multiple developers touch the code.

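// get_current_rss() and suspect_function() are placeholders for your own helpers.
TEST(LeakTest, SuspectFunctionDoesNotGrowMemory) {
    const size_t before = get_current_rss();
    for (int i = 0; i < 100; i++) {
        suspect_function();
    }
    const size_t after = get_current_rss();
    // Guard against unsigned underflow when RSS shrinks during the run.
    const size_t growth = after > before ? after - before : 0;
    EXPECT_LT(growth / 100, 1024u);  // less than 1 KiB of growth per call
}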

Why this matters: Regression tests prevent old leaks from returning.
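
The test calls get_current_rss(), which the article does not define. One possible Linux‑only version, offered as a sketch rather than the author's implementation, reads the resident page count from /proc/self/statm and multiplies it by the page size:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <unistd.h>

// Hypothetical helper (not defined in the article): resident set size of the
// current process in bytes, read from /proc/self/statm. Linux only.
// Returns 0 if the file cannot be read.
std::size_t get_current_rss() {
    std::ifstream statm("/proc/self/statm");
    std::size_t total_pages = 0;     // first field: total program size
    std::size_t resident_pages = 0;  // second field: resident set size
    if (!(statm >> total_pages >> resident_pages)) {
        return 0;
    }
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return resident_pages * static_cast<std::size_t>(page_size);
}
```

RSS moves in whole pages and the allocator may hold on to freed memory, so a small leak only shows up after many iterations. Treat a test built on it as a coarse alarm, and confirm any failure with Valgrind.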


Quick Reference

| Task | Command |
| --- | --- |
| Basic leak check | valgrind --leak-check=full ./binary |
| Only real leaks | --show-leak-kinds=definite,indirect |
| Save output | --log-file=leak.log |
| Check running service | vgdb leak_check full kinds definite,indirect |
| Heap profiling | valgrind --tool=massif |
| Extract leak | grep -A10 "definitely lost" |

Real‑World Example

Imagine a legacy service that loads customers from a database and caches them.

The bug

// customer_loader.h
#include <string>

struct Customer {
    int id;
    std::string name;
};

class CustomerRepository {
public:
    Customer* LoadCustomer(int id);
};
// customer_loader.cpp
#include "customer_loader.h"
#include "db_connection.h"

Customer* CustomerRepository::LoadCustomer(int id)
{
    DatabaseConnection* conn = DatabaseConnection::Get(); // singleton
    ResultSet* rs = conn->ExecuteQuery("SELECT id, name FROM customers WHERE id = " + std::to_string(id));

    if (!rs->Next()) {
        return nullptr;
    }

    Customer* c = new Customer{};
    c->id = rs->GetInt(0);
    c->name = rs->GetString(1);

    // BUG: rs is never deleted, neither here nor on the early return above
    // delete rs;  // missing

    return c; // caller owns Customer*
}

Caller code:

void ProcessRequest(int customerId)
{
    CustomerRepository repo;
    Customer* c = repo.LoadCustomer(customerId);

    if (!c) {
        return;
    }

    // ... use c ...

    delete c; // correct
}

At first glance, this looks “fine” because Customer is deleted.

But ResultSet is leaked on every call.


Valgrind report

You run your request handler under Valgrind:

valgrind --leak-check=full \
         --show-leak-kinds=definite,indirect \
         --track-origins=yes \
         --log-file=valgrind_leak.log \
         ./service --handle-request 42

Relevant part of the report:

==12345== 128 bytes in 1 blocks are definitely lost in loss record 3 of 5
==12345==    at 0x4C2F1A3: operator new(unsigned long) (vg_replace_malloc.c:422)
==12345==    by 0x401F8B: ResultSet::ResultSet(DBHandle*) (result_set.cpp:27)
==12345==    by 0x4023D1: DatabaseConnection::ExecuteQuery(std::string const&) (db_connection.cpp:88)
==12345==    by 0x4039A4: CustomerRepository::LoadCustomer(int) (customer_loader.cpp:11)
==12345==    by 0x40412F: ProcessRequest(int) (request_handler.cpp:25)
==12345==    by 0x4043C9: main (main.cpp:17)

Key points:

  • “128 bytes in 1 blocks are definitely lost” → real leak
  • Allocation happens in ResultSet::ResultSet
  • The call chain leads to CustomerRepository::LoadCustomer

You don’t need to know ResultSet internals, only that LoadCustomer received the object from ExecuteQuery and never freed it.


The fix

Customer* CustomerRepository::LoadCustomer(int id)
{
    DatabaseConnection* conn = DatabaseConnection::Get();
    ResultSet* rs = conn->ExecuteQuery("SELECT id, name FROM customers WHERE id = " + std::to_string(id));

    if (!rs->Next()) {
        delete rs;          // ✅ free on early return
        return nullptr;
    }

    Customer* c = new Customer{};
    c->id = rs->GetInt(0);
    c->name = rs->GetString(1);

    delete rs;              // ✅ free after use

    return c;
}

Re‑run Valgrind:

==12345== HEAP SUMMARY:
==12345==     in use at exit: 0 bytes in 0 blocks
==12345==   total heap usage: 1,234 allocs, 1,234 frees, 98,765 bytes allocated
==12345== 
==12345== All heap blocks were freed -- no leaks are possible
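
The manual fix has to repeat delete rs on every return path, which is how this kind of leak tends to come back. If the codebase can use C++11, one alternative is to hand the raw pointer to std::unique_ptr as soon as it is returned. The sketch below is an assumption‑laden illustration: the article does not show ResultSet or DatabaseConnection, so both are replaced by small stand‑in stubs, and LoadCustomer is written as a free function to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in stubs: they only mimic the calls used in the article.
struct ResultSet {
    static int live;  // number of ResultSet objects currently alive
    bool has_row;
    bool consumed = false;
    explicit ResultSet(bool row) : has_row(row) { ++live; }
    ~ResultSet() { --live; }
    bool Next() {
        if (consumed) return false;
        consumed = true;
        return has_row;
    }
    int GetInt(int) const { return 42; }
    std::string GetString(int) const { return "Ada"; }
};
int ResultSet::live = 0;

struct DatabaseConnection {
    static DatabaseConnection* Get() {
        static DatabaseConnection instance;
        return &instance;
    }
    // Returns a heap-allocated ResultSet that the caller must free.
    // Stub rule: queries mentioning a negative id return no rows.
    ResultSet* ExecuteQuery(const std::string& sql) {
        return new ResultSet(sql.find('-') == std::string::npos);
    }
};

struct Customer {
    int id;
    std::string name;
};

Customer* LoadCustomer(int id) {
    DatabaseConnection* conn = DatabaseConnection::Get();
    assert(conn != nullptr);
    // unique_ptr owns the ResultSet from here on; every return path frees it.
    std::unique_ptr<ResultSet> rs(conn->ExecuteQuery(
        "SELECT id, name FROM customers WHERE id = " + std::to_string(id)));
    if (!rs->Next()) {
        return nullptr;  // rs is deleted automatically
    }
    std::unique_ptr<Customer> c(new Customer{});
    c->id = rs->GetInt(0);
    c->name = rs->GetString(1);
    return c.release();  // caller still owns the raw Customer*, as in the article
}
```

The signature and the caller's delete c stay as they are, so the change can be made one function at a time in a legacy codebase.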

The Golden Rule

Never start fixing until you can reproduce the leak in under 10 minutes.

The trigger is your truth.
The stack trace is your map.
