Vinodh Thiagarajan

The Vocabulary of GPUs for Gen AI Engineers

The conversation around GPUs in Gen AI often jumps straight to "just rent an H100" without explaining why.

I wrote a visual guide covering the vocabulary that actually matters:

🔹 Why GPUs over CPUs (it's not just "more cores")
🔹 HBM vs GDDR: why your RTX 4090 can't run Llama 405B
🔹 FLOPs, TFLOPS, and what those spec sheets actually mean
🔹 Precision formats: FP32 → FP16 → BF16 → FP8
🔹 The memory formula: Parameters × Bytes = VRAM needed
🔹 How inference actually works, from prompt to prediction
🔹 Temperature: the inference-time knob everyone uses but few explain
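The memory formula above fits in a few lines of Python. The bytes-per-parameter values are the standard widths of each precision format (FP32 = 4 bytes, FP16/BF16 = 2, FP8 = 1); note this counts weights only, not activations or KV cache:

```python
# Rough VRAM estimate for model weights: Parameters x Bytes per parameter.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "BF16": 2, "FP8": 1}

def weights_vram_gb(num_params: float, precision: str) -> float:
    """VRAM needed just to hold the weights, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Llama 405B in FP16: 405e9 params x 2 bytes = 810 GB of weights alone,
# far beyond a single 24 GB RTX 4090.
print(weights_vram_gb(405e9, "FP16"))  # 810.0
print(weights_vram_gb(7e9, "FP8"))     # 7.0
```

This is why a 405B-parameter model is a multi-GPU, HBM-class problem even before you account for inference overhead.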
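Temperature in particular is easy to demo: it divides the model's logits before the softmax, sharpening (T < 1) or flattening (T > 1) the next-token distribution. A minimal sketch, with made-up logit values for illustration:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then softmax: T < 1 sharpens, T > 1 flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical next-token scores
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: closer to uniform
```

Same logits, very different sampling behavior, and no retraining involved: that's why it's the knob everyone reaches for at inference time.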

This isn't about which GPU to buy.

It's about building the mental model so you can read a spec sheet, estimate memory requirements, and have informed conversations about infrastructure.

Part 1 of a 3-part series - https://medium.com/@vinodh.thiagarajan/the-vocabulary-of-gpus-for-ml-budding-gen-ai-engineers-7a693b53b74b
