Hi there,
I recently updated my project and switched the model from Sonnet 3.7 to Sonnet 4. Unfortunately, Sonnet 4 doesn't seem to follow my main prompt and my system prompt as well, and the quality of its thinking output is noticeably worse than what I was getting from Sonnet 3.7.
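For context, the only change on my side was the model string. A stripped-down version of the call looks roughly like this (I'm showing the Python SDK here, and the prompts and thinking budget are placeholders, not my real values):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM_PROMPT = "..."  # placeholder for my actual system prompt
MAIN_PROMPT = "..."    # placeholder for my actual main prompt

response = client.messages.create(
    model="claude-sonnet-4-20250514",   # previously "claude-3-7-sonnet-20250219"
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},  # extended thinking on
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": MAIN_PROMPT}],
)

# The thinking blocks and the final answer both come back in response.content
for block in response.content:
    print(block.type)
```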
Is there anything I might be doing wrong?
Has anyone else experienced this problem?
Is Sonnet 3.7 better than Sonnet 4?
Top comments (2)
I've noticed that the models are trained just differently enough that the same set of instructions isn't optimal for both. And it's not just these two: the same is true of Sonnet 4 vs. Opus, Gemini 2.0 vs. 2.5, and GPT-4 vs. GPT-4.1. I like Sonnet 4 better than 3.7 for my use case, but I invested a couple of hours just playing with the instructions before it started returning something useful again.
I have found a couple of prompt-optimizer plug-ins for VSCode that I want to try. I'm working on something similar for my team now, but I can't decide on an approach. I'm thinking something really simple, just a couple of files and a dev container, should be plenty (rough sketch below). So if it ends up being something they'll let me share, I will.
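To give an idea of the "really simple" end of what I mean: the core could just be one script that runs the same system/user prompt pair against both models and prints the outputs side by side for comparison. The model IDs, file names, and SDK usage below are my own assumptions for the sketch, not a finished tool:

```python
import anthropic

# Candidate models to compare; swap in whichever IDs you're testing.
MODELS = ["claude-3-7-sonnet-20250219", "claude-sonnet-4-20250514"]

def run(model: str, system_prompt: str, user_prompt: str) -> str:
    """Send the same prompt pair to one model and return its text output."""
    client = anthropic.Anthropic()
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": user_prompt}],
    )
    return "".join(block.text for block in response.content if block.type == "text")

if __name__ == "__main__":
    system_prompt = open("system_prompt.txt").read()  # placeholder file names
    user_prompt = open("main_prompt.txt").read()
    for model in MODELS:
        print(f"===== {model} =====")
        print(run(model, system_prompt, user_prompt))
```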
I feel the same.