DEV Community
#llm
Prefix caching in vLLM under multi-tenant agent traffic
Marcus Chen
May 26
#llm #mlops #infrastructure #pytorch
1 comment
4 min read
I used LLMs to rewrite meta descriptions for 1,600 articles — honest results
Ayi NEDJIMI
May 21
#ai #seo #webdev #llm
5 min read
Using Qwen 3.6 Plus: Great but a Bit Expensive
hummingbirdLabs
May 25
#ai #csharp #llm #programming
2 min read
Our AI Inference Bill Dropped 65% After We Stopped Treating Every Query the Same
Karthik S
May 21
#ai #javascript #llm #backend
5 min read
Qwen 3.6 & llama.cpp Push Local Inference Limits on Consumer GPUs
soy
May 21
#ai #llm #selfhosted
3 min read
AI Weekly — 2026-05-15 to 2026-05-22 | The Agentic Inflection Is Real, But the Enterprise Gap Is Wider Than Ever
Yang Goufang
May 22
#ai #machinelearning #tech #llm
4 min read
I tested cheap vs expensive LLMs across 3 real agent tasks. The cheap model won every time.
Aimilios Vikatos
May 21
#ai #python #llm #opensource
4 min read
Routing Event-Camera Pipelines Through an LLM Gateway: A Field Report
Marco Rinaldi
May 21
#machinelearning #computervision #mlops #llm
4 min read
Measuring AI Gateway Failover: 30 Days of Production Data
Marcus Chen
May 21
#mlops #llm #infrastructure #devops
3 min read
Routing diffusion inference traffic across three providers
Elise Moreau
May 21
#machinelearning #llm #mlops #infrastructure
4 min read
ToolRouter: Switch AI Coding Tools Freely Without Losing Context
Nilofer 🚀
May 22
#machinelearning #llm #python #opensource
2 reactions
6 min read
Beyond the Stateless Prompt: Building an Auditable Product Intelligence Pipeline with Cascadeflow and Hindsight
Ritu.R
May 21
#ai #architecture #dataengineering #llm
5 min read
Putting an LLM Gateway in Front of Our Build Agents
claire nguyen
May 21
#infrastructure #devops #sre #llm
4 min read
You Probably Don't Need 8-Bit Quantization
Billy Bob Gurr
May 21
#ai #llm #opensource #hardware
2 min read
The README Was a Protocol. The Entrypoint Was Still Optional.
Kwansub Yun
May 21
#llm #contextengineering #architecture #ai
8 min read