DEV Community

Christopher Maher

Husband, dad, and software engineering leader. Passionate about automation, AI, emerging tech, and ham radio (N7CPM).

A local model opened 41 of our pull requests in five weeks. The model is the least interesting part.

10 min read

A 27B model on an AMD mini-PC fixed a bug in our operator. Then it overreached.

5 min read
Trust the harness, not the model: a weekend of local agents building their own guardrails

7 min read
Making a fleet of self-hosted LLM agents trustworthy

6 min read
TurboQuant on a MacBook Pro, part 2: perplexity, KL divergence, and asymmetric K/V on M5 Max

8 min read
TurboQuant on a MacBook Pro: two findings the upstream discussion missed

7 min read
62.2% on Aider Polyglot from a MacBook Pro. Then the other model we tried scored 4%. Here's what actually happened, with a working cost loop attached.

16 min read
We ran Qwen3.6-27B on $800 of consumer GPUs, day one: llama.cpp vs vLLM

15 min read
LLMKube Now Deploys Any Inference Engine, Not Just llama.cpp

3 min read
I tested speculative decoding on my home GPU cluster. Here's why it didn't help.

5 min read
Google Released Gemma 4 Yesterday. I Had It Fixing Real Bugs by Lunch.

5 min read
I Tested TurboQuant KV Cache Compression on Consumer GPUs. Here's What Actually Happened.

6 min read
The $0 Problem: Why Every Tool Says Your On-Prem Inference is Free

4 min read
llama.cpp on Kubernetes: The Guide I Wish Existed

9 min read