DEV Community

Vinodh Thiagarajan

The Vocabulary of GPUs for Gen AI Engineers

Conversations about GPUs in Gen AI circles often jump straight to "just rent an H100" without explaining why.

I wrote a visual guide covering the vocabulary that actually matters:

πŸ”Ή Why GPUs over CPUs (it's not just "more cores")
πŸ”Ή HBM vs GDDR β€” why your RTX 4090 can't run Llama 405B
πŸ”Ή FLOPs, TFLOPS, and what those spec sheets actually mean
πŸ”Ή Precision formats: FP32 β†’ FP16 β†’ BF16 β†’ FP8
πŸ”Ή The memory formula: Parameters Γ— Bytes = VRAM needed
πŸ”Ή How inference actually works β€” from prompt to prediction
πŸ”Ή Temperature: the inference-time knob everyone uses but few explain
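The memory formula in the list (Parameters Γ— Bytes = VRAM needed) can be sketched in a few lines. This is a back-of-the-envelope lower bound for the weights alone; KV cache, activations, and framework overhead add more on top:

```python
# Rough VRAM needed just to hold model weights:
# parameter count * bytes per parameter, by precision format.
PRECISION_BYTES = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}

def weight_vram_gb(num_params: float, precision: str) -> float:
    """Lower-bound VRAM (GB) to load the weights at a given precision."""
    return num_params * PRECISION_BYTES[precision] / 1e9

# Llama 405B in FP16: 405e9 params * 2 bytes = 810 GB of weights,
# far beyond a 24 GB RTX 4090 β€” hence multi-GPU HBM setups.
print(weight_vram_gb(405e9, "fp16"))  # 810.0
```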

This isn't about which GPU to buy.

It's about building the mental model so you can read a spec sheet, estimate memory requirements, and have informed conversations about infrastructure.
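As a quick taste of the temperature knob mentioned in the list, here's a minimal, illustrative sketch (not code from the article): temperature divides the logits before softmax, so T < 1 sharpens the token distribution and T > 1 flattens it.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Scale logits by 1/T, then softmax. Lower T -> more deterministic,
    higher T -> more random next-token sampling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: probabilities spread out
```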

Part 1 of a 3-part series - https://medium.com/@vinodh.thiagarajan/the-vocabulary-of-gpus-for-ml-budding-gen-ai-engineers-7a693b53b74b
