DEV Community

Discussion on: How to Run Gemma 4 Locally With Ollama, llama.cpp, and vLLM

GroverTek

I found that Ollama defaults to a context window of 4096 tokens. Because I'm working with opencode and larger development projects, that just isn't enough. I had to save a version of the Gemma 4 model with a 32K context window explicitly (see dev.to/grovertek/running-gemma-4-l... for details). Calling it via opencode then worked as expected, aside from the periodic tool-call issues you mentioned.
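For anyone wanting to do the same, here is a minimal sketch of how a larger-context variant can be saved with an Ollama Modelfile. The base model tag below is an assumption; substitute whatever tag you actually pulled:

```
# Modelfile — bake a 32K context window into a saved model variant
FROM gemma4          # assumed tag; use the Gemma 4 tag you pulled
PARAMETER num_ctx 32768
```

Then build the variant with `ollama create gemma4-32k -f Modelfile` and point opencode at `gemma4-32k` instead of the base model.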