Simon Massey

GenAI Test File Summarisation In Chunks With Ollama Or Cloud

✨ Sometimes you just want to throw a large file at a model and ask:

“Summarise this without losing the good bits.”

Then reality appears.

🫠 Local models often do not have giant context windows.

🌱 Smaller, cheaper, more eco-friendly cloud models also often do not have giant context windows.

So instead of pretending one huge file will fit cleanly, this little toolkit does the sensible thing:

  1. ✂️ split the file into overlapping chunks
  2. 🤖 summarise each chunk with either Ollama or cloud models
  3. 🧵 stitch the chunk summaries back together

Simple. Reusable. No drama.

The Code 🚀

gist.github.com/simbo1905/053b482...

Fetch it with:

for f in $(gh gist view 053b48269b1e95800500b85190adf427 --files); do gh gist view 053b48269b1e95800500b85190adf427 -f "$f" > "$f"; done && chmod +x *.py *.awk

What’s in here? 📦

  • chunk_text.awk
  • process_chunks_cloud.py
  • process_chunks_ollama.py

Why chunk at all? 🧠

Because smaller models are not magic.

If your source is too big, they either:

  • miss details
  • flatten nuance
  • hallucinate structure
  • or just do a bad job

So chunk_text.awk creates overlapping chunks.

Default settings:

  • size: 11000 characters
  • step: 10000 characters
  • overlap: 1000 characters

That overlap is there on purpose so ideas near a chunk boundary do not get chopped in half and quietly vanish into the void. ☠️
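The chunking arithmetic is easy to picture in Python. This is a sketch of what chunk_text.awk does, not its actual code: chunks of size characters that advance by step, so consecutive chunks share size - step characters.

```python
# Illustrative Python equivalent of chunk_text.awk's chunking logic.
# Function and parameter names here are hypothetical, not from the gist.
def chunk(text: str, size: int = 11000, step: int = 10000) -> list[str]:
    """Return chunks of `size` characters, advancing by `step`, so
    consecutive chunks overlap by `size - step` characters."""
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With the defaults, a 25,000-character file yields three chunks, and the last 1,000 characters of each chunk reappear at the start of the next one.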

Chunk a file ✂️

awk -v size=11000 -v step=10000 -f chunk_text.awk notes.md

You’ll get files like:

  • notes00.md
  • notes01.md
  • notes02.md

Summarise with Ollama 🦙

uv run process_chunks_ollama.py example_chunks transcript summary

Default local model:

  • gemma4:26b

It accepts chunk files in either style:

  • transcript00.md
  • transcript00.log

Summarise with cloud models ☁️

uv run process_chunks_cloud.py example_chunks transcript summary

The script checks for API keys in this order:

  1. shell MISTRAL_API_KEY
  2. shell GROQ_API_KEY
  3. repo-root .env MISTRAL_API_KEY
  4. repo-root .env GROQ_API_KEY

Adding a key for another provider is trivial, but these defaults already give you small, fast, open models.
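That lookup order can be sketched in a few lines of Python. The function names and the tiny .env parser here are illustrative; the gist's internals may differ, but the precedence matches the list above: shell variables beat .env, and Mistral beats Groq within each source.

```python
# Hypothetical sketch of the API key resolution order described above.
def parse_dotenv(text: str) -> dict:
    """Minimal .env parser: KEY=value lines, '#' comments ignored."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

def resolve_api_key(shell_env: dict, dotenv: dict):
    """Return (provider, key), checking shell env before .env,
    and MISTRAL_API_KEY before GROQ_API_KEY within each source."""
    for source in (shell_env, dotenv):
        for name, provider in (("MISTRAL_API_KEY", "mistral"),
                               ("GROQ_API_KEY", "groq")):
            if source.get(name):
                return provider, source[name]
    return None, None
```

So a GROQ_API_KEY exported in the shell wins over a MISTRAL_API_KEY sitting in the repo-root .env.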

Change the prompt 🛠️

Yes, obviously, you can change the prompt. That is the whole point.

Both summariser scripts support -p/--prompt.

Example:

uv run process_chunks_cloud.py example_chunks transcript summary -p "Summarise the argument, key evidence, and open questions."

Transcript-style prompt shape used in testing:

This is a transcript of an online video. The title is: ${title}. The transcript format is 'Speaker Name, timestamp, what they said'. Summarise the content in a terse, business-like, action-oriented way. Preserve substantive points, facts, figures, citations, and practical recommendations. Do not be chatty.

Outputs 🧾

For each input chunk, the scripts write:

  • summary_transcript00.md
  • thinking_summary_transcript00.md

The thinking_ file keeps the raw model output. The clean summary file strips thinking blocks where possible.
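Stripping a thinking block usually comes down to one regex. This is a hypothetical sketch, assuming the model wraps its reasoning in <think>...</think> tags; the gist's actual tag names and logic may differ.

```python
import re

# Assumes reasoning is wrapped in <think>...</think> tags; other models
# may use different markers, so this pattern is illustrative only.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(raw: str) -> str:
    """Remove thinking blocks from raw model output, leaving the summary."""
    return THINK_RE.sub("", raw).strip()
```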

Stitch the summaries back together 🧵

cat summary_transcript*.md > all_summary.md

Expect a little repetition where the overlapping chunks meet; that is the price of making sure split-up ideas are not lost at the boundaries. YMMV.

Want to add another provider later? 🔌

Update process_chunks_cloud.py in these places:

  • add the key lookup and its priority order
  • add the approved model names / aliases
  • add the provider request function
  • route the models to that provider in call_model
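The shape of that routing can be sketched like this. call_model is named in the article, but the lookup tables, the stub request functions, and the model ids below are all illustrative stand-ins, not the gist's code.

```python
# Hypothetical provider routing: map model names to providers, then
# providers to request functions. All names below are illustrative.
MODEL_PROVIDERS = {
    "mistral-small-latest": "mistral",
    "llama-3.1-8b-instant": "groq",
}

def mistral_request(model: str, prompt: str) -> str:
    return f"[mistral:{model}] response would go here"

def groq_request(model: str, prompt: str) -> str:
    return f"[groq:{model}] response would go here"

REQUEST_FUNCS = {"mistral": mistral_request, "groq": groq_request}

def call_model(model: str, prompt: str) -> str:
    """Dispatch a prompt to the provider that owns the given model."""
    provider = MODEL_PROVIDERS.get(model)
    if provider is None:
        raise ValueError(f"unknown model: {model}")
    return REQUEST_FUNCS[provider](model, prompt)
```

A new provider then only touches the two tables and adds one request function.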

Happy chunking. Happy summarising. May your small models punch above their weight. 🚀
