Your LLM call isn't atomic; it's a conversation paused mid-sentence.
It's late. I'm staring at a 38KB system prompt I've read forty times this week. The agent just called "stock_report" when I asked it to list my products, and I'm scrolling for whatever sentence misled it. Again.
My eyes unfocus.
I think: I could just ask it.
I paste the request into a new tab and add one line: "which prompt sentence steered you toward stock_report?" The model quotes the exact paragraph in under a second. I remove one line, add one sentence. The chain that took six rounds yesterday takes four today.
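The trick generalizes: take the captured request, append the model's answer as an assistant turn, then ask your follow-up as a new user turn. A minimal sketch, assuming an OpenAI-style `messages` format (the prompt text, model name, and follow-up wording here are illustrative, not from the tool):

```python
def build_debrief(captured_request, captured_response_text, follow_up):
    """Build a 'debrief' conversation from a captured request/response pair.

    The model sees its own answer as context, so the follow-up question
    is answered about that exact turn, not a fresh roll of the dice.
    """
    messages = list(captured_request["messages"])  # original system/user turns
    messages.append({"role": "assistant", "content": captured_response_text})
    messages.append({"role": "user", "content": follow_up})
    return messages

# Hypothetical capture, mirroring the story above.
captured = {
    "model": "gpt-4o",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are an inventory agent. ..."},
        {"role": "user", "content": "List my products."},
    ],
}

debrief = build_debrief(
    captured,
    'Called tool "stock_report".',
    "Which prompt sentence steered you toward stock_report?",
)
# `debrief` is now ready to send back through any OpenAI-compatible
# chat endpoint with the same model the original request used.
```

The point of reusing the original turns verbatim is that the model answers about the exact context that produced the weird behavior, rather than speculating from a paraphrase.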
I've been debugging LLM prompts like API logs for a month. The whole time, the thing could talk.
So I built midsentence, a proxy you run locally, dropped between your app and any OpenAI-compatible LLM API. Your app points at midsentence; it forwards your calls to whichever provider you use (OpenAI, OpenRouter, Anthropic, local vLLM), and captures every request and response on the way. When a response looks weird, you click the capture, type a follow-up, and the same model answers, often quoting the exact sentences in your own prompt that steered the choice.
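Wiring a proxy like this in is typically a one-line change in the client: point the base URL at the proxy instead of the provider. A sketch using only the standard library, assuming the proxy listens on `localhost:8080` with a `/v1` path (the port, path, and env var name are my assumptions, not documented defaults):

```python
import json
import os
import urllib.request

# Point any OpenAI-compatible client at the local proxy instead of the
# provider. The proxy forwards the call upstream and captures both sides.
BASE_URL = os.environ.get("MIDSENTENCE_URL", "http://localhost:8080/v1")


def chat(messages, model="gpt-4o"):
    """Send a chat completion through the proxy and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the proxy speaks the same wire format as the provider, nothing else in the app changes; switching it off is just switching the URL back.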
Debug prompts by subtraction, not addition.
It's a debug tool, not a magic fix. Each debrief is still a separate model call, so answer quality depends on the model's introspective strength: bigger models give sharp citations, smaller ones sometimes flail. It's especially useful for vibe-debugging.
Open source: https://lnkd.in/dbzAAjN2
If you're still grep-and-guessing through prompts, this can save you the month I lost.