DEV Community

# nvidia

Posts

FlashAttention CUDA Speedup, RTX 5090 LLM Performance, & NVIDIA Blackwell GPU Launch

Comments
3 min read
Stop Guessing Your Next GPU: I Built a GPU Upgrade Value Calculator

2
Comments
4 min read
RTX 4090 Cooling, LLM KV Cache Quantization, & Deepseek V4 Flash Models

Comments
3 min read
Photonic NPU Chips: The Light-Based Tech That Could Make NVIDIA GPUs Obsolete [2026]

Comments
8 min read
Deepseek TileKernels, RTX 3090 LLM Benchmarks & Nvidia Inference Dashboard

Comments
3 min read
Google TPU 8 vs Nvidia: 8t and 8i Specs Explained

Comments
9 min read
CUDA Triton Optimization, RTX Remix VFX Update, and VSR Benchmarks

Comments
4 min read
Tesla P40 in a Homelab: 24GB of Inference on a Budget

Comments 2
6 min read
NVIDIA Pushes GPU Tech: DLSS 4.5, Streamline 2.11.1 SDKs & RTX Remix Updates

Comments
3 min read
When Tokens Cost 12 Cents Per Million, The Bottleneck Isn't Cost. It's Control.

Comments
4 min read
Local LLM on NVIDIA GPU vs Cloud API: A Real Cost Analysis

Comments
5 min read
NVIDIA Vera Rubin 192GB SOCAMM2 Memory, SASS Reverse Engineering, & CUDA Kernel Dev

Comments
3 min read
Personal token factory: OpenClaw in AWS but Nvidia GB10 at home

Comments
18 min read
CUDA Kernels in Python, GDDR7 Memory Breakthrough, and Radeon RX 9060 XT Launch

Comments
4 min read
Diffusion Language Models: How NVIDIA Nemotron-Labs Diffusion Shatters the Autoregressive Speed Ceiling

Comments
18 min read