🔥 kubectl-prof: Profile Your Kubernetes Apps Without Touching Them
Have you ever needed to debug a performance issue in a production Kubernetes pod and thought: "I wish I could just attach a profiler without restarting anything"?
That's exactly what kubectl-prof solves.
It's a kubectl plugin that lets you profile running pods — generating FlameGraphs, JFR files, heap dumps, thread dumps, memory dumps, and more — without modifying your deployments, without restarting pods, and with minimal overhead.
✨ What Makes It Special?
- 🎯 Zero modifications — attach to any running pod, no sidecar needed
- 🌐 Multi-language — Java, Go, Python, Ruby, Node.js, Rust, Clang/Clang++, PHP, .NET
- 📊 Rich output formats — FlameGraphs, JFR, SpeedScope, thread dumps, heap dumps, GC dumps, memory dumps, memory flamegraphs, allocation summaries, and more
- ⚡ Low overhead — minimal impact on production workloads
- 🔄 Continuous profiling — support for both one-shot and interval-based modes
- 🐳 Multiple runtimes — `containerd` and `CRI-O` supported
🚀 Quick Start
Install via Krew:
```shell
kubectl krew index add kubectl-prof https://github.com/josepdcs/kubectl-prof
kubectl krew install kubectl-prof/prof
```
Profile a Java app for 1 minute and get a FlameGraph:
```shell
kubectl prof my-pod -t 1m -l java
```
Profile a Python app and save the output to /tmp:
```shell
kubectl prof my-pod -t 1m -l python --local-path=/tmp
```
That's it. A profiling Job is spun up on the same node, profiles the target pod, and delivers the result back to your terminal.
🔧 How It Works
When you run kubectl prof, it:
- Identifies the node where your target pod is running
- Launches a Kubernetes Job on that node with the appropriate profiling agent image
- Attaches the agent to the running container process using language-specific tools
- Streams the results back and saves them locally
No changes to your application. No restarts. No sidecars.
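The flow above can be sketched as a hand-written Job manifest. This is a hypothetical illustration of the kind of Job the plugin creates, not its actual manifest: the node name, image name, and security settings here are placeholders.

```shell
# Hypothetical sketch of the kind of Job kubectl-prof creates on the target
# pod's node. All field values are illustrative, not the plugin's actual ones.
NODE=my-node   # discoverable via: kubectl get pod my-pod -o jsonpath='{.spec.nodeName}'
cat > profiling-job.yaml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: prof-my-pod
spec:
  template:
    spec:
      nodeName: $NODE        # pin the agent onto the same node as the target pod
      hostPID: true          # let the agent see host processes so it can attach
      restartPolicy: Never
      containers:
      - name: agent
        image: profiling-agent:latest   # placeholder for the language-specific agent image
        securityContext:
          privileged: true   # attaching a profiler to another process needs elevated rights
EOF
echo "wrote profiling-job.yaml"
```

Pinning via `nodeName` plus `hostPID` is what lets a separate Job reach a process that belongs to another pod's container.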
💻 Language Support
☕ Java (JVM)
kubectl-prof supports both async-profiler and jcmd:
```shell
# FlameGraph (default, uses async-profiler)
kubectl prof mypod -t 5m -l java -o flamegraph

# JFR recording
kubectl prof mypod -t 5m -l java -o jfr

# Thread dump
kubectl prof mypod -l java -o threaddump

# Heap dump
kubectl prof mypod -l java -o heapdump --tool jcmd

# Heap histogram
kubectl prof mypod -l java -o heaphistogram --tool jcmd
```
You can also target specific profiling events with async-profiler:
```shell
# CPU (default: ctimer), memory allocation, or lock contention
kubectl prof mypod -t 5m -l java -e alloc
kubectl prof mypod -t 5m -l java -e lock
```
And pass extra arguments directly to async-profiler:
```shell
# Wall-clock profiling in per-thread mode
kubectl prof mypod -t 5m -l java -e wall --async-profiler-args -t
```
For Alpine-based containers, add --alpine:
```shell
kubectl prof mypod -t 1m -l java -o flamegraph --alpine
```
🐍 Python
Uses py-spy under the hood:
```shell
kubectl prof mypod -t 1m -l python -o flamegraph
kubectl prof mypod -l python -o threaddump
kubectl prof mypod -t 1m -l python -o speedscope
```
🧠 Memory Profiling with Memray
For memory profiling, kubectl-prof now integrates Memray — Bloomberg's powerful Python memory profiler. While py-spy reveals where your CPU time goes, memray reveals where your memory goes: every allocation, every deallocation, tracked in real time.
How it works (zero-downtime, zero code changes):
Memray attaches to the running Python process via GDB injection, enters the target container's network namespace via nsenter, and runs memray attach --aggregate directly against the live process. The agent automatically stages a version-matched memray package into the target container's filesystem — no memray installation is required in your application image.
Requirements:
- `SYS_PTRACE` + `SYS_ADMIN` capabilities — added automatically when `--tool memray` is used
- Python 3.10, 3.11, 3.12, or 3.13 (glibc-based images only)
- ❌ Not supported: Alpine/musl targets or statically-linked Python builds
Output types:
| Output | Flag | Format | Description |
|---|---|---|---|
| Memory flamegraph | `-o flamegraph` | HTML | Interactive flamegraph of allocation call stacks & sizes |
| Allocation summary | `-o summary` | Text | Tabular list of top allocators by total bytes |
```shell
# Interactive HTML memory flamegraph — open in any browser
kubectl prof mypod -t 1m -l python --tool memray -o flamegraph --local-path=/tmp

# Text summary of the biggest allocators
kubectl prof mypod -t 1m -l python --tool memray -o summary --local-path=/tmp

# Long session with custom heartbeat (keeps the connection alive through proxies)
kubectl prof mypod -t 10m -l python --tool memray -o flamegraph --heartbeat-interval=15s

# Target a specific process in a multi-process pod
kubectl prof mypod -t 2m -l python --tool memray -o flamegraph --pid 1234
kubectl prof mypod -t 2m -l python --tool memray -o flamegraph --pgrep my-worker
```
**Note:** `--tool memray` must be set explicitly. The default Python tool remains py-spy.
🐹 Go
Uses eBPF profiling. Two options available:
```shell
# BPF (default) — requires kernel headers
kubectl prof mypod -t 1m -l go -o flamegraph

# BTF (CO-RE) — no kernel headers needed, works on modern kernels
kubectl prof mypod -t 1m -l go --tool btf
```
The BTF option is great for cloud providers like DigitalOcean where kernel headers may not be available.
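To check ahead of time whether a node can take the CO-RE path, you can look for the kernel's BTF type information, which distro kernels built with `CONFIG_DEBUG_INFO_BTF=y` typically expose at `/sys/kernel/btf/vmlinux`. A small sketch (run it on the node itself, not inside the target pod):

```shell
# CO-RE relies on kernel BTF type info instead of kernel headers; check
# whether this kernel exposes it before picking the btf tool.
if [ -r /sys/kernel/btf/vmlinux ]; then
  BTF=yes
  echo "BTF available: --tool btf should work"
else
  BTF=no
  echo "No BTF: use the default tool (requires kernel headers)"
fi
```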
📗 Node.js
```shell
# FlameGraph via eBPF
kubectl prof mypod -t 1m -l node -o flamegraph

# Heap snapshot
kubectl prof mypod -l node -o heapsnapshot
```
**Tip:** Run Node.js with `--perf-basic-prof` for better JavaScript symbol resolution.
💎 Ruby
Uses rbspy:
```shell
kubectl prof mypod -t 1m -l ruby -o flamegraph
kubectl prof mypod -t 1m -l ruby -o speedscope
kubectl prof mypod -t 1m -l ruby -o callgrind
```
🦀 Rust
Uses cargo-flamegraph for Rust-optimized profiling with great symbol resolution:
```shell
kubectl prof mypod -t 1m -l rust -o flamegraph
```
🐘 PHP
Uses phpspy — works with PHP 7+, zero modifications needed:
```shell
kubectl prof mypod -t 1m -l php -o flamegraph
kubectl prof mypod -t 1m -l php -o raw
```
🟣 .NET (Core / .NET 5+)
This is where kubectl-prof really shines. Four tools from the .NET diagnostics suite are fully supported:
| Tool | Output | Use case |
|---|---|---|
| `dotnet-trace` (default) | `.speedscope.json` or `.nettrace` | CPU traces & runtime events |
| `dotnet-gcdump` | `.gcdump` | GC heap snapshot |
| `dotnet-counters` | `.json` | Real-time performance counters |
| `dotnet-dump` | `.dmp` | Full memory dump |
```shell
# CPU trace → open in speedscope.app
kubectl prof mypod -t 30s -l dotnet -o speedscope

# GC heap dump
kubectl prof mypod -l dotnet --tool dotnet-gcdump -o gcdump

# Performance counters (CPU, GC, thread pool, exceptions…)
kubectl prof mypod -t 30s -l dotnet --tool dotnet-counters -o counters

# Full memory dump
kubectl prof mypod -l dotnet --tool dotnet-dump -o dump
```
🎯 Advanced Features
Profile Multiple Pods at Once
```shell
kubectl prof --selector app=myapp -t 5m -l java -o jfr
```
⚠️ Use with caution — this profiles ALL matching pods.
Continuous Profiling
Generate results at regular intervals:
```shell
kubectl prof mypod -l java -t 5m --interval 60s
```
Target a Specific Process
```shell
# By PID
kubectl prof mypod -l java --pid 1234

# By process name
kubectl prof mypod -l java --pgrep java-app-process
```
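The `--pgrep` option presumably matches on the process name or command line, much like the standard `pgrep` utility. A quick local demo of that kind of name-based lookup (the `sleep` process is just a stand-in for your app's process):

```shell
# Start a throwaway background process, then find it by command line the way
# a pgrep-style lookup would inside the target container.
sleep 30 &
BG=$!
MATCHED=$(pgrep -f "sleep 30" | head -n1)
echo "matched PID: $MATCHED"
kill "$BG"
```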
Custom Resource Limits
```shell
kubectl prof mypod -l java -t 5m \
  --cpu-limits=1 \
  --cpu-requests=100m \
  --mem-limits=200Mi \
  --mem-requests=100Mi
```
Cross-Namespace Profiling
```shell
kubectl prof mypod -n profiling \
  --service-account=profiler \
  --target-namespace=my-apps \
  -l go
```
Handle Large Output Files
For heap dumps, memory dumps, and other large files, split them into chunks for easier transfer:
```shell
kubectl prof mypod -l java -o heapdump --tool jcmd --output-split-size=100M
kubectl prof mypod -l dotnet --tool dotnet-dump -o dump --output-split-size=500M
```
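Once the chunks land in your local path, they can be stitched back together with plain `cat`. The `.part-*` naming below is illustrative (check the actual filenames kubectl-prof writes); the sketch simulates the split-and-reassemble round trip locally:

```shell
# Simulate a large artifact split into fixed-size chunks, then reassemble it
# and verify the result is byte-identical to the original.
head -c 1048576 /dev/urandom > heapdump.hprof           # 1 MiB stand-in for a real dump
split -b 262144 -d heapdump.hprof heapdump.hprof.part-  # 256 KiB chunks, numeric suffixes
cat heapdump.hprof.part-* > reassembled.hprof           # the glob sorts chunks in order
cmp heapdump.hprof reassembled.hprof && echo "reassembly OK"
```

Numeric suffixes matter here: the shell glob expands in lexicographic order, so zero-padded chunk names concatenate back in the right sequence.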
Node Tolerations
Profile pods on nodes with taints:
```shell
kubectl prof my-pod -t 5m -l java \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule \
  --tolerations=dedicated=profiling:PreferNoSchedule
```
📦 Installation Options
Krew (Recommended)
```shell
kubectl krew index add kubectl-prof https://github.com/josepdcs/kubectl-prof
kubectl krew install kubectl-prof/prof
```
Pre-built Binaries
```shell
# Linux x86_64
wget https://github.com/josepdcs/kubectl-prof/releases/download/1.11.1/kubectl-prof_1.11.1_linux_amd64.tar.gz
tar xvfz kubectl-prof_1.11.1_linux_amd64.tar.gz
sudo install kubectl-prof /usr/local/bin/
```
Build from Source
```shell
go get -d github.com/josepdcs/kubectl-prof
cd $GOPATH/src/github.com/josepdcs/kubectl-prof
make install-deps
make build
```
🧠 When Should You Use kubectl-prof?
kubectl-prof is the tool you want when:
- 🔥 You're hitting unexpected CPU spikes in production and need a FlameGraph now
- 💾 You suspect a memory leak and want a heap dump without restarting
- 🐌 Your app is slow and you need to identify the bottleneck across any language
- 🔄 You need continuous profiling over time with interval-based snapshots
- 🚫 You can't modify the running workload (no sidecars, no redeploys)
🤝 Contributing
The project is open source (Apache 2.0) and welcomes contributions:
- 🐛 Bug reports and fixes
- 💡 Feature requests
- 📝 Documentation improvements
- 🔧 Pull requests
Check the Contributing guide and give the repo a ⭐ if you find it useful!
👉 GitHub: https://github.com/josepdcs/kubectl-prof
Have you tried kubectl-prof? What language or profiling scenario would you like to see covered next? Drop a comment below! 👇