Simon Massey

GenAI Test File Summarisation In Chunks With Ollama Or Cloud

✨ Sometimes you just want to throw a large file at a model and ask:

“Summarise this without losing the good bits.”

Then reality appears.

🫠 Local models often do not have giant context windows.

🌱 Smaller, cheaper, more eco-friendly cloud models also often do not have giant context windows.

So instead of pretending one huge file will fit cleanly, this little toolkit does the sensible thing:

  1. ✂️ split the file into overlapping chunks
  2. 🤖 summarise each chunk with either Ollama or cloud models
  3. 🧵 stitch the chunk summaries back together

Simple. Reusable. No drama.

The Code 🚀

gist.github.com/simbo1905/053b482...

Fetch it with:

for f in $(gh gist view 053b48269b1e95800500b85190adf427 --files); do gh gist view 053b48269b1e95800500b85190adf427 -f "$f" > "$f"; done && chmod +x *.py *.awk

What’s in here? 📦

  • chunk_text.awk
  • process_chunks_cloud.py
  • process_chunks_ollama.py

Why chunk at all? 🧠

Because smaller models are not magic.

If your source is too big, they either:

  • miss details
  • flatten nuance
  • hallucinate structure
  • or just do a bad job

So chunk_text.awk creates overlapping chunks.

Default settings:

  • size: 11000 characters
  • step: 10000 characters
  • overlap: 1000 characters

That overlap is there on purpose so ideas near a chunk boundary do not get chopped in half and quietly vanish into the void. ☠️
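The chunking arithmetic is easy to picture in Python. This is a sketch of what chunk_text.awk does, not its actual code: chunks of size characters that advance by step, so consecutive chunks share size - step characters.

```python
# Illustrative Python equivalent of chunk_text.awk's chunking logic.
# Function and parameter names here are hypothetical, not from the gist.
def chunk(text: str, size: int = 11000, step: int = 10000) -> list[str]:
    """Return chunks of `size` characters, advancing by `step`, so
    consecutive chunks overlap by `size - step` characters."""
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With the defaults, a 25,000-character file yields three chunks, and the last 1,000 characters of each chunk reappear at the start of the next one.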

Chunk a file ✂️

awk -v size=11000 -v step=10000 -f chunk_text.awk notes.md

You’ll get files like:

  • notes00.md
  • notes01.md
  • notes02.md

Summarise with Ollama 🦙

uv run process_chunks_ollama.py example_chunks transcript summary

Default local model:

  • gemma4:26b

It accepts chunk files in either style:

  • transcript00.md
  • transcript00.log

Summarise with cloud models ☁️

uv run process_chunks_cloud.py example_chunks transcript summary

The script checks for API keys in this order:

  1. shell MISTRAL_API_KEY
  2. shell GROQ_API_KEY
  3. repo-root .env MISTRAL_API_KEY
  4. repo-root .env GROQ_API_KEY

Adding a key for another provider is trivial, but these defaults already give you small, fast, open models.
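That lookup order can be sketched in a few lines of Python. The function names and the tiny .env parser here are illustrative; the gist's internals may differ, but the precedence matches the list above: shell variables beat .env, and Mistral beats Groq within each source.

```python
# Hypothetical sketch of the API key resolution order described above.
def parse_dotenv(text: str) -> dict:
    """Minimal .env parser: KEY=value lines, '#' comments ignored."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

def resolve_api_key(shell_env: dict, dotenv: dict):
    """Return (provider, key), checking shell env before .env,
    and MISTRAL_API_KEY before GROQ_API_KEY within each source."""
    for source in (shell_env, dotenv):
        for name, provider in (("MISTRAL_API_KEY", "mistral"),
                               ("GROQ_API_KEY", "groq")):
            if source.get(name):
                return provider, source[name]
    return None, None
```

So a GROQ_API_KEY exported in the shell wins over a MISTRAL_API_KEY sitting in the repo-root .env.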

Change the prompt 🛠️

Yes, obviously, you can change the prompt. That is the whole point.

Both summariser scripts support -p/--prompt.

Example:

uv run process_chunks_cloud.py example_chunks transcript summary -p "Summarise the argument, key evidence, and open questions."

Transcript-style prompt shape used in testing:

This is a transcript of an online video. The title is: ${title}. The transcript format is 'Speaker Name, timestamp, what they said'. Summarise the content in a terse, business-like, action-oriented way. Preserve substantive points, facts, figures, citations, and practical recommendations. Do not be chatty.

Outputs 🧾

For each input chunk, the scripts write:

  • summary_transcript00.md
  • thinking_summary_transcript00.md

The thinking_ file keeps the raw model output. The clean summary file strips thinking blocks where possible.
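Stripping a thinking block usually comes down to one regex. This is a hypothetical sketch, assuming the model wraps its reasoning in <think>...</think> tags; the gist's actual tag names and logic may differ.

```python
import re

# Assumes reasoning is wrapped in <think>...</think> tags; other models
# may use different markers, so this pattern is illustrative only.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(raw: str) -> str:
    """Remove thinking blocks from raw model output, leaving the summary."""
    return THINK_RE.sub("", raw).strip()
```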

Stitch the summaries back together 🧵

cat summary_transcript*.md > all_summary.md

Expect a little repetition where the overlapping chunks meet; that is the price of making sure split-up ideas are not lost at the boundaries. YMMV.

Want to add another provider later? 🔌

Update process_chunks_cloud.py in these places:

  • add the key lookup and its priority order
  • add the approved model names / aliases
  • add the provider request function
  • route the models to that provider in call_model
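The shape of that routing can be sketched like this. call_model is named in the article, but the lookup tables, the stub request functions, and the model ids below are all illustrative stand-ins, not the gist's code.

```python
# Hypothetical provider routing: map model names to providers, then
# providers to request functions. All names below are illustrative.
MODEL_PROVIDERS = {
    "mistral-small-latest": "mistral",
    "llama-3.1-8b-instant": "groq",
}

def mistral_request(model: str, prompt: str) -> str:
    return f"[mistral:{model}] response would go here"

def groq_request(model: str, prompt: str) -> str:
    return f"[groq:{model}] response would go here"

REQUEST_FUNCS = {"mistral": mistral_request, "groq": groq_request}

def call_model(model: str, prompt: str) -> str:
    """Dispatch a prompt to the provider that owns the given model."""
    provider = MODEL_PROVIDERS.get(model)
    if provider is None:
        raise ValueError(f"unknown model: {model}")
    return REQUEST_FUNCS[provider](model, prompt)
```

A new provider then only touches the two tables and adds one request function.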

Happy chunking. Happy summarising. May your small models punch above their weight. 🚀
