DEV Community

# nvidia

Posts

Intel Xe3P Leaks 160GB LPDDR5X; FlashAttention-2 in CuTe & Custom CUDA GPT-2 Engine

Comments
3 min read
GPU Bottleneck Analyzer, NVIDIA Rubin VRAM Demands, and Qwen VRAM Optimization

1
Comments
4 min read
GPU Hardware & Driver Update: RTX 5090 Benchmarks, llama.cpp MTP, Windows 11 Fix

Comments
3 min read
CUDA Cutile-rs Beta, AMD FSR 4.1 Release, & Forza Horizon 6 GPU Benchmarks

Comments
3 min read
One Open Source Project a Day (No. 66): NVIDIA Video Search and Summarization - Building GPU-Accelerated Vision Agents

Comments
4 min read
AMD RDNA 4 & AI PRO GPUs Launch, FSR 4.1 Benchmarks, DGX Water Cooling

Comments
3 min read
RTX 5080 Launched, Rust for CUDA, & LLM GPU Scheduling Deep Dive

Comments
3 min read
Run NVIDIA NIM on Your Own GPU — Same API, Different Endpoint

1
Comments
5 min read
Custom CUDA Kernels, Modded RTX 4090 48GB VRAM, & DLSS DLL Manager

1
Comments
3 min read
DeepSeek-V4-Flash Benchmarks, FlashRT CUDA Runtime, & V100 LLM Performance

Comments
3 min read
RTX 5090, LLaMA.cpp TurboQuant, & Blackwell CUDA Scheduling Boosts GPU Performance

1
Comments
3 min read
CUDA-Oxide 0.1, RTX 5070 Launch, & BeeLlama.cpp Boost 3090 Inference

Comments
3 min read
CUDA-Oxide 0.1 Lands; RTX 5090 Launches with 32GB & Hits 600 Tok/s

Comments
3 min read
AMD MI350P, CUDA WarpReduction, & Adrenalin 26.5.1 Driver Updates

Comments
3 min read
A Framework for Building My First Multi-Agent System

Comments
2 min read