✨ Sometimes you just want to throw a large file at a model and ask:
“Summarise this without losing the good bits.”
Then reality appears.
🫠 Local models often do not have giant context windows.
🌱 Smaller, cheaper, more eco-friendly cloud models also often do not have giant context windows.
So instead of pretending one huge file will fit cleanly, this little toolkit does the sensible thing:
- ✂️ split the file into overlapping chunks
- 🤖 summarise each chunk with either Ollama or cloud models
- 🧵 stitch the chunk summaries back together
Simple. Reusable. No drama.
In Action 🥷
The Code 🚀
gist.github.com/simbo1905/053b482...
Fetch it with:
for f in $(gh gist view 053b48269b1e95800500b85190adf427 --files); do gh gist view 053b48269b1e95800500b85190adf427 -f "$f" > "$f"; done && chmod +x *.py *.awk
What’s in here? 📦
- chunk_text.awk
- process_chunks_cloud.py
- process_chunks_ollama.py
Why chunk at all? 🧠
Because smaller models are not magic.
If your source is too big, they either:
- miss details
- flatten nuance
- hallucinate structure
- or just do a bad job
So chunk_text.awk creates overlapping chunks.
Default settings:
- size: 11000 characters
- step: 10000 characters
- overlap: 1000 characters
That overlap is there on purpose so ideas near a chunk boundary do not get chopped in half and quietly vanish into the void. ☠️
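The sliding-window arithmetic can be sketched in a few lines of Python (a toy model of the chunking logic, not the actual awk code):

```python
# Sliding-window chunking: each chunk starts `step` characters after the
# previous one, so consecutive chunks share `size - step` characters.
def chunk_spans(text_len, size=11000, step=10000):
    spans = []
    start = 0
    while start < text_len:
        spans.append((start, min(start + size, text_len)))
        start += step
    return spans

# With the defaults, adjacent chunks overlap by 1000 characters.
print(chunk_spans(25000))  # [(0, 11000), (10000, 21000), (20000, 25000)]
```

With size 11000 and step 10000, any idea sitting near a boundary appears whole in at least one of the two adjacent chunks.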
Chunk a file ✂️
awk -v size=11000 -v step=10000 -f chunk_text.awk notes.md
You’ll get files like:
- notes00.md
- notes01.md
- notes02.md
Summarise with Ollama 🦙
uv run process_chunks_ollama.py example_chunks transcript summary
Default local model:
gemma4:26b
It accepts chunk files in either style:
- transcript00.md
- transcript00.log
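Per chunk, the call to a local model boils down to one request against Ollama's HTTP API. A minimal sketch (the payload shape follows Ollama's documented /api/generate endpoint; the default model name is the post's, and the actual script's flags may differ):

```python
import json
import urllib.request

def build_request(text, model="gemma4:26b", prompt="Summarise this chunk."):
    # "stream": False asks Ollama for a single JSON response
    # instead of a token stream.
    return {"model": model, "prompt": f"{prompt}\n\n{text}", "stream": False}

def summarise_chunk(text, **kwargs):
    payload = json.dumps(build_request(text, **kwargs)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default local port
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```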
Summarise with cloud models ☁️
uv run process_chunks_cloud.py example_chunks transcript summary
The script checks for API keys in this order:
- shell MISTRAL_API_KEY
- shell GROQ_API_KEY
- repo-root .env MISTRAL_API_KEY
- repo-root .env GROQ_API_KEY
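That lookup order can be sketched like this (a hypothetical re-implementation of the priority described above, not the script's actual code): shell environment wins over the .env file, and Mistral wins over Groq at each level.

```python
import os
from pathlib import Path

def resolve_api_key(env=os.environ, dotenv_path=Path(".env")):
    # Parse a simple KEY=value .env file, ignoring blank and comment lines.
    dotenv = {}
    if dotenv_path.exists():
        for line in dotenv_path.read_text().splitlines():
            if "=" in line and not line.lstrip().startswith("#"):
                k, _, v = line.partition("=")
                dotenv[k.strip()] = v.strip()
    # Shell environment first, then .env; Mistral before Groq in each source.
    for source in (env, dotenv):
        for name in ("MISTRAL_API_KEY", "GROQ_API_KEY"):
            if source.get(name):
                return name, source[name]
    return None, None
```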
It is trivial to add your own API key, and both defaults serve small, fast, open models.
Change the prompt 🛠️
Yes, obviously, you can change the prompt. That is the whole point.
Both summariser scripts support -p/--prompt.
Example:
uv run process_chunks_cloud.py example_chunks transcript summary -p "Summarise the argument, key evidence, and open questions."
Transcript-style prompt shape used in testing:
This is a transcript of an online video. The title is: ${title}. The transcript format is 'Speaker Name, timestamp, what they said'. Summarise the content in a terse, business-like, action-oriented way. Preserve substantive points, facts, figures, citations, and practical recommendations. Do not be chatty.
Outputs 🧾
For each input chunk, the scripts write:
- summary_transcript00.md
- thinking_summary_transcript00.md
The thinking_ file keeps the raw model output. The clean summary file strips thinking blocks where possible.
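One plausible way to strip thinking blocks, assuming the model wraps them in <think>…</think> tags as some reasoning models do (the real scripts may match different markers):

```python
import re

# Remove <think>...</think> blocks, including any trailing whitespace,
# across multiple lines.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def split_outputs(raw):
    """Return (raw, clean): the untouched model output and a copy
    with thinking blocks removed."""
    clean = THINK_RE.sub("", raw).strip()
    return raw, clean

raw, clean = split_outputs("<think>plan the summary</think>Final summary.")
print(clean)  # Final summary.
```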
Stitch the summaries back together 🧵
cat summary_transcript*.md > all_summary.md
There may be a bit of repetition where the overlapping chunks meet; that is the price of making sure split-up ideas are not lost. YMMV.
Want to add another provider later? 🔌
Update process_chunks_cloud.py in these places:
- add the key lookup and its priority order
- add the approved model names / aliases
- add the provider request function
- route the models to that provider in
call_model
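The routing step could look roughly like this (model names and function names here are illustrative placeholders, not the script's actual identifiers):

```python
# Hypothetical provider request functions; each would call its
# provider's chat API and return the completion text.
def mistral_request(model, prompt):
    raise NotImplementedError("call the Mistral API here")

def groq_request(model, prompt):
    raise NotImplementedError("call the Groq API here")

# Map approved model names to their provider's request function.
PROVIDERS = {
    "mistral-small-latest": mistral_request,
    "llama-3.1-8b-instant": groq_request,
}

def call_model(model, prompt):
    request_fn = PROVIDERS.get(model)
    if request_fn is None:
        raise ValueError(f"Unknown model: {model}")
    return request_fn(model, prompt)
```

Adding a provider is then just one new request function plus one or more entries in the routing table.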
Happy chunking. Happy summarising. May your small models punch above their weight. 🚀