# gpu
Why I Self-Host 7 RTX 5090 GPUs Instead of Using Cloud AI
Biricik Biricik · Apr 4 · #ai #gpu #selfhosted #infrastructure · 6 min read
Hopper/Blackwell Tensor Core Optimization, llama.cpp VRAM Fix & 4W NPU Inference
soy · Apr 5 · #gpu #nvidia #hardware · 3 min read
Why I Self-Host 7 RTX 5090 GPUs Instead of Using AWS
Biricik Biricik · Apr 4 · #gpu #selfhosted #ai #startup · 6 min read
8-Bit Quantization Destroyed 92% of Code Generation — The Culprit Wasn't Bit Count
plasmon · Apr 4 · #ai #llm #machinelearning #gpu · 5 min read
I Couldn't Build a Local LLM PC for $1,300 — Budget Tiers and the VRAM Cliffs Between Them
plasmon · Apr 4 · #llm #gpu #localllm #vram · 6 min read
From one model to seven — what it took to make TurboQuant model-portable
Alberto Nieto · Apr 1 · #python #vllm #gpu #triton · 3 min read
GPU Power Tools & CUDA Deep Dives for Local LLM Builders
soy · Apr 4 · #gpu #nvidia #hardware · 3 min read
Parameter Count Is the Worst Way to Pick a Model on 8GB VRAM
plasmon · Apr 2 · #llm #locallm #gpu #llamacpp · 5 min read
How Much GPU Memory Does Your LLM Actually Need?
Vishal Vishwakarma · Apr 2 · #ai #llm #gpu #machinelearning · 2 min read
I Couldn’t Debug My AI/ML GPU Incident - So I Built gpuxray
Vu Nguyen · Apr 2 · #ai #gpu #opensource #linux · 3 min read
What do you want to know about hardware acceleration? Ask the Google team!
Jess Lee for The DEV Team · Apr 3 · #discuss #datascience #analytics #gpu · 8 reactions · 1 comment · 1 min read
MoE Beat Dense 27B by 2.4x on 8GB VRAM — The 35B-A3B Benchmark Nobody Expected
plasmon · Mar 31 · #llm #machinelearning #ai #gpu · 5 min read
The Memory Bandwidth Gap Is 49x and Growing — Why Local LLMs Hit a Ceiling
plasmon · Mar 31 · #hardware #ai #machinelearning #gpu · 7 min read
PyRadiomics Inefficiency in Large-Scale Studies Addressed by GPU Acceleration for Faster Processing
Roman Dubrovin · Mar 31 · #radiomics #gpu #pytorch #medicalimaging · 8 min read
I Tested TurboQuant KV Cache Compression on Consumer GPUs. Here's What Actually Happened.
Christopher Maher · Mar 30 · #llm #kubernetes #gpu #ai · 6 min read