DEV Community

# cuda

Posts

Why CUDA kernels silently corrupt memory and how to catch the bug

5 min read
CUDA Out of Memory at 60% Utilization: Tracing PyTorch GPU Memory Fragmentation

4 min read
How I optimized a Solana vanity address grinder to 44M keys/sec on GPU

2 min read
From Black Magic to Science: The Evolution of the CUDA Optimization Skill

11 min read
Learning Resources Tech

1 min read
512MiB ≠ 512MB — the silent trtexec bug

2 min read
Memory Coalescing: Same computation, 6x Performance Difference

6 min read
Setting Up NVIDIA Drivers and CUDA for ML/DL on Ubuntu 22.04

3 min read
Achieving Neuro‑Sama‑Tier Speech‑to‑Text for Your Local AI Companion (Whisper + CUDA + LivinGrimoire)

5 min read
CUDA Graphs: The 8-Year Overnight Success and the Observability Gap

9 min read
124x Slower: What PyTorch DataLoader Actually Does at the Kernel Level

5 min read
Tracing a 13x PyTorch Slowdown to a Hidden NumPy Synchronization

4 min read
Installing NVIDIA Drivers Without CUDA

7 min read
AMD ROCm on Consumer GPUs: The Open-Source CUDA Alternative That Actually Works Now [2026 Guide]

7 min read
I built the first open-source FP8 linear solver in Python — 2-3x faster than cuBLAS

3 min read