You're on call. A production C++ service just crashed — no logs, no stack trace, just a dead process and maybe a core file.
This guide gives you a clear, repeatable workflow to diagnose any crash, even when you're missing debug symbols or working with a stripped legacy binary. Whether you have a core file, a symbol file, an unstripped build, or nothing at all, you will always know the next step.
Why This Matters
Debugging crashes in legacy C++ systems is notoriously difficult because:
- Deployments often strip symbols
- Core dumps are disabled in production
- Build IDs don’t match
- ASLR shifts memory layouts
- Frame pointers are omitted
- Systemd overrides ulimit settings
This workflow eliminates guesswork and gives you a deterministic path from crash to root cause.
Crash Debugging Decision Map
CRASH
|
v
HAVE CORE FILE?
|-- No --> Enable cores (Path B) → Reproduce crash
|
|-- Yes (Path A)
|
v
HAVE DEBUG SYMBOLS?
|-- Yes --> Debug now (A4)
|
|-- No
|
v
HAVE SYMBOL FILE?
|-- Yes --> Load with -s (A6)
|
|-- No
|
v
CAN REPRODUCE WITH SYMBOLS?
|-- Yes --> Rebuild with -g (A7)
|
|-- No
|
v
HAVE ORIGINAL BUILD?
|-- Yes --> Map addresses (A8)
|
|-- No --> Fallback analysis (A9)
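The map can also be read as a simple lookup. The sketch below is one hypothetical way to encode it as a shell function; the name next_step and its yes/no arguments are illustrative and not part of any tool.

```shell
# next_step CORE SYMBOLS SYMFILE REPRO ORIGBUILD
# Each argument is "yes" or "no" and answers one question from the map,
# in order. Prints the step to go to next. (Hypothetical helper.)
next_step() {
    if [ "$1" != yes ]; then echo "Path B"; return; fi   # no core file yet
    if [ "$2" = yes ]; then echo "A4"; return; fi        # binary has debug symbols
    if [ "$3" = yes ]; then echo "A6"; return; fi        # separate symbol file found
    if [ "$4" = yes ]; then echo "A7"; return; fi        # crash reproduces with a debug build
    if [ "$5" = yes ]; then echo "A8"; return; fi        # original unstripped build exists
    echo "A9"                                            # fallback forensics
}

# Core present, stripped binary, no symbol file, no repro, original build kept:
next_step yes no no no yes
```

The questions are checked in the same order as the map, so the first "yes" after the core file question decides the step.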
Path A — You Have a Core File
A1 — Locate the Core File
Common locations:
ls -la core*
ls -la /var/core/
find / -name "core*" -type f 2>/dev/null
cat /proc/sys/kernel/core_pattern
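If core_pattern starts with a pipe character (|), the kernel hands each core to a helper program such as systemd-coredump or apport instead of writing a file; on systemd-coredump systems, coredumpctl list shows the captured cores.
If you find a core file, continue to A2.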
If not, jump to Path B.
A2 — Identify Which Binary Produced the Core
file core
gdb -c core -batch -ex "info files"
Confirm that the core file belongs to the binary you intend to debug (path, build, version).
A3 — Check Whether the Binary Has Debug Symbols
file ./myapp
nm ./myapp | head
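- If file reports "not stripped" and nm lists symbol names → go to A4. "Not stripped" only guarantees a symbol table; source lines and local variables also need DWARF debug sections.
- If it is stripped (nm reports "no symbols") → go to A5.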
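The check above can be scripted. The following is a minimal sketch that classifies the text printed by file; it assumes a recent version of file that prints "with debug_info" when DWARF sections are present (older versions omit it). The helper name symbol_state is hypothetical.

```shell
# symbol_state reads the output of `file ./myapp` on stdin and prints
# one of: debug-info, symbols-only, stripped, unknown. (Hypothetical helper.)
symbol_state() {
    out=$(cat)
    case "$out" in
        *"with debug_info"*) echo "debug-info" ;;    # DWARF sections present
        *"not stripped"*)    echo "symbols-only" ;;  # symbol table, debug info not reported
        *stripped*)          echo "stripped" ;;      # no symbol table
        *)                   echo "unknown" ;;
    esac
}

# Usage: file ./myapp | symbol_state
```

A result of symbols-only still gives function names in backtraces, but line numbers and locals may be missing.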
A4 — Debugging With Symbols (Best Case)
gdb ./myapp core
Useful GDB commands:
bt full
info threads
thread apply all bt
frame 0
info locals
print var
list
At this point you usually have:
- The crashing function and line
- The call stack
- Local variables and arguments
A5 — Binary Is Stripped: Find the Symbol File
In many production setups, the deployed binary is stripped, but symbol files are archived separately.
A5.1 — Extract Build ID
readelf -n ./myapp | grep "Build ID"
You’ll get something like:
Build ID: 1234567890abcdef...
A5.2 — Locate Matching Symbol File
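Search your symbol store (example path). The Build ID is stored as raw bytes inside an ELF note, so a plain-text grep for the hex string will not match; read each candidate's note with readelf instead:
find /symbols -type f -exec sh -c 'readelf -n "$1" 2>/dev/null | grep -q "1234567890abcdef" && echo "$1"' _ {} \;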
- If you find a matching symbol file → go to A6.
- If not → go to A7.
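If your symbol store follows the .build-id directory layout that GDB searches under its debug file directory, you can compute the expected path directly instead of scanning. This is a sketch under that assumption; the helper name build_id_path and the /symbols root are hypothetical.

```shell
# build_id_path BUILD_ID [ROOT]
# Prints where a debug file for BUILD_ID would live under the .build-id
# layout: ROOT/.build-id/<first 2 hex chars>/<remaining chars>.debug
# ROOT defaults to /usr/lib/debug. (Hypothetical helper.)
build_id_path() {
    id=$1
    root=${2:-/usr/lib/debug}
    rest=${id#??}            # build ID without its first two characters
    first=${id%"$rest"}      # the first two characters
    echo "$root/.build-id/$first/$rest.debug"
}

# Shortened ID used for illustration only; a real Build ID is longer.
build_id_path 1234567890abcdef /symbols
```

With the shortened ID above, this prints /symbols/.build-id/12/34567890abcdef.debug. If that file exists, it is your match for A6.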
A6 — Debug Using Separate Symbol Files
If your symbol file is myapp.dbg:
gdb -s myapp.dbg -e ./myapp -c core
Or recombine into a single unstripped binary:
eu-unstrip ./myapp myapp.dbg -o myapp.full
gdb ./myapp.full core
Now you can use the same commands as in A4.
A7 — No Symbol File: Reproduce With Debug Build
If you can rebuild and reproduce the crash:
g++ -g -O0 -fno-omit-frame-pointer -o myapp_debug ...
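Run the debug build under the same conditions until it crashes and generates a new core file. Then debug that core with full symbols as in A4. Because -O0 changes code generation, a crash that depends on optimization or timing may not reproduce; in that case, rebuild with -g at the production optimization level.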
If you cannot reproduce the crash (e.g., one‑off production incident), continue with A8 or A9.
A8 — Map Raw Addresses Using an Unstripped Build
If you still have the original unstripped build (or can reconstruct it):
- Extract the crash address from the core:
gdb -c core -batch -ex "info registers" | grep rip
- Get the memory map of the process:
gdb -c core -batch -ex "info proc mappings"
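- Compute the offset. For a position-independent executable (PIE), base_address_of_binary is the start address of the binary's first mapping (file offset 0) in the previous output. For a non-PIE binary, skip the subtraction and pass the crash address to addr2line unchanged:
offset = crash_address - base_address_of_binary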
- Map the offset to source:
addr2line -e /path/to/unstripped/myapp -f 0xOFFSET
This gives you the function and line number corresponding to the crash address.
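The subtraction is easy to get wrong by hand with 48-bit hex addresses. A minimal sketch using shell arithmetic follows; the helper name crash_offset and both example addresses are made up, and it assumes a PIE whose first load segment starts at virtual address 0, which is typical.

```shell
# crash_offset CRASH_ADDR BASE_ADDR
# Both arguments are hex addresses with a 0x prefix. Prints the offset to
# pass to addr2line for a position-independent executable.
# (Hypothetical helper.)
crash_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Made-up values: rip from "info registers", and the start of the binary's
# first mapping from "info proc mappings".
crash_offset 0x55d4c3a2b1f6 0x55d4c3a00000
```

This prints 0x2b1f6, which you would then pass as addr2line -e /path/to/unstripped/myapp -f -C 0x2b1f6 (the -C flag demangles C++ names).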
A9 — No Symbols, No Reproduction: Fallback Forensics
Even with no symbols and no way to reproduce, you can still extract useful information.
Inspect Registers
gdb -c core -batch -ex "info registers"
Look for:
- Null pointers (rax, rdi, etc. equal to 0x0)
- Suspicious addresses in your binary's range
Inspect Instructions Around the Crash
gdb -c core -batch -ex 'x/20i $rip-40'
You might see something like:
mov %rax,%rdi
mov (%rdi),%rdx ← crash here
If rdi is 0x0 in the register dump, you can infer a null pointer dereference, even without symbols.
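Scanning a full register dump by eye is error-prone, so a small filter can help. The sketch below assumes the usual three-column layout of GDB's info registers output (name, hex value, natural value); the helper name null_registers is hypothetical.

```shell
# null_registers reads `info registers` output on stdin and prints the
# names of registers whose hex value is exactly 0x0. (Hypothetical helper.)
null_registers() {
    awk '$2 == "0x0" { print $1 }'
}

# Usage: gdb -c core -batch -ex "info registers" | null_registers
```

Segment registers are often zero on x86-64, so treat them as noise and focus on general-purpose registers used as pointers near the crashing instruction.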
Path B — No Core File Generated
B1 — Check Core Dump Settings
ulimit -c
cat /proc/sys/kernel/core_pattern
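If ulimit -c prints 0, core dumps are disabled for your current shell. A running service has its own limit, which you can read from the "Max core file size" row of /proc/<pid>/limits.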
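The core_pattern value also tells you where a core would go once the limit allows it. Here is a minimal sketch that classifies the value; the helper name core_destination is hypothetical, and the systemd-coredump string is only an example of what a piped pattern can look like.

```shell
# core_destination PATTERN
# Classifies a core_pattern value:
#   pipe     - the kernel pipes cores to a helper program
#   absolute - cores are written to a fixed path
#   relative - cores land in the crashing process's working directory
# (Hypothetical helper.)
core_destination() {
    case "$1" in
        "|"*) echo "pipe" ;;
        /*)   echo "absolute" ;;
        *)    echo "relative" ;;
    esac
}

# Usage: core_destination "$(cat /proc/sys/kernel/core_pattern)"
core_destination "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h"
```

A result of pipe means no core file will appear on disk by itself; query the helper (for example, coredumpctl for systemd-coredump) or change the pattern as in B3.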
B2 — Enable Core Dumps
Temporary (current shell):
ulimit -c unlimited
Permanent (system‑wide):
echo "* soft core unlimited" | sudo tee -a /etc/security/limits.conf
echo "* hard core unlimited" | sudo tee -a /etc/security/limits.conf
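You may need to log out and back in for the new limits to apply. limits.conf is read by PAM for login sessions only; services started by systemd ignore it and need LimitCORE=infinity in their unit file instead.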
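For a service managed by systemd, the limit is usually raised with a drop-in file. The sketch below only prints the drop-in contents; the helper name print_core_dropin and the service name myapp.service mentioned afterwards are hypothetical.

```shell
# print_core_dropin prints a systemd drop-in that removes the core size
# limit for one service. (Hypothetical helper.)
print_core_dropin() {
    cat <<'EOF'
[Service]
LimitCORE=infinity
EOF
}

print_core_dropin
```

One way to apply it, assuming a unit called myapp.service: save the output as /etc/systemd/system/myapp.service.d/core.conf, run systemctl daemon-reload, and restart the service.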
B3 — Set Core File Location
Configure a directory for core files:
echo "/var/core/core.%e.%p.%t" | sudo tee /proc/sys/kernel/core_pattern
This pattern includes:
- %e: executable name
- %p: PID
- %t: timestamp
B4 — Fix Permissions
sudo mkdir -p /var/core
sudo chmod 1777 /var/core
This ensures any process can write core files there.
B5 — Test Core Dump Generation
Create a small crash program:
int main() {
int* p = nullptr;
*p = 42;
}
Compile and run it:
g++ -g crash_test.cpp -o crash_test
./crash_test
Verify that a core file appears in /var/core (or your configured directory).
B6 — Rerun the Crashed Application
Now rerun the real application under the same conditions.
When it crashes, it should generate a core file.
Then return to Path A and continue from A2.
Common Pitfalls
- Core dumps disabled in production (ulimit -c 0)
- Stripped binaries deployed without archiving symbol files
- Mismatched Build IDs between binary and symbol file
- ASLR causing incorrect address mapping when computing offsets
- Missing frame pointers (-fomit-frame-pointer) breaking backtraces
- Systemd or other service managers overriding ulimit
- Symbol files not stored or indexed by Build ID
Quick Reference Table
| Task | Command |
|---|---|
| Enable cores | ulimit -c unlimited |
| Find cores | find / -name "core*" |
| Check symbols | file ./myapp |
| Get Build ID | readelf -n ./myapp |
| Debug with symbols | gdb -s myapp.dbg -e myapp -c core |
| Map address | addr2line -e myapp -f 0xOFFSET |
| Check core pattern | cat /proc/sys/kernel/core_pattern |
Pro Tips
- Always compile with -g, then strip separately for release.
- Store symbol files indexed by Build ID in a central, backed-up location.
- Use -fno-omit-frame-pointer for more reliable backtraces.
- Test core dump generation in a staging environment that mirrors production.
- Automate core collection and symbol archiving as part of your deployment pipeline.
Conclusion
This workflow covers every crash scenario — from “no core file” to “no symbols” to “full debug context.”
Bookmark it, share it with your team, and use it as your standard operating procedure for production crash analysis.