DEV Community

Christopher Maher

Husband, dad, and software engineering leader. Passionate about automation, AI, emerging tech, and ham radio (N7CPM).

TurboQuant on a MacBook Pro: two findings the upstream discussion missed

7 min read
62.2% on Aider Polyglot from a MacBook Pro. Then the other model we tried scored 4%. Here's what actually happened, with a working cost loop attached.

16 min read
We ran Qwen3.6-27B on $800 of consumer GPUs, day one: llama.cpp vs vLLM

15 min read
LLMKube Now Deploys Any Inference Engine, Not Just llama.cpp

3 min read
I tested speculative decoding on my home GPU cluster. Here's why it didn't help.

5 min read
Google Released Gemma 4 Yesterday. I Had It Fixing Real Bugs by Lunch.

5 min read
I Tested TurboQuant KV Cache Compression on Consumer GPUs. Here's What Actually Happened.

6 min read
The $0 Problem: Why Every Tool Says Your On-Prem Inference is Free

4 min read
llama.cpp on Kubernetes: The Guide I Wish Existed

9 min read