By now we have all heard the term Prompt Engineering, but have you ever wondered why we need it? The answer lies in the fact that current LLMs pose a number of challenges that make reliable and consistent completions hard to achieve without putting effort into prompt construction and optimization.
Currently, LLMs face two big challenges: stochastic responses and fabrications.
Model responses are stochastic:
This means that if you ask the model a question with the same text input, there is no guarantee it will give you the same response every time. There is also no guarantee that the response will be correct, or that it will be what you expected.
For example, the prompt: “Tell me about the element Calcium”
gives different responses in different models.
With GPT-4o:
GPT-4o mini:
Clearly the responses are not the same: different models can produce different responses for the same prompt, and even the same model can return different completions when asked twice.
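To see where this randomness comes from, here is a minimal sketch of how a model picks its next token. The vocabulary and probabilities below are made up for illustration; a real LLM samples from a distribution it computes over its whole vocabulary, which is why two calls with an identical prompt can diverge.

```python
import random

# Toy next-token distribution a model might assign after the prompt
# "Tell me about the element Calcium" (probabilities are invented).
next_token_probs = {"Calcium": 0.5, "It": 0.3, "The": 0.2}

def complete(prompt, rng):
    """Sample one next token, the way a decoder samples at each step."""
    tokens = list(next_token_probs)
    weights = list(next_token_probs.values())
    # Sampling, not argmax: identical prompts can yield different tokens.
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random()
first = complete("Tell me about the element Calcium", rng)
second = complete("Tell me about the element Calcium", rng)
print(first, second)  # may or may not match: the draw is stochastic
```

Because each step is a weighted random draw, entire completions can differ run to run even though the prompt, the model, and its weights are all unchanged.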
Models can fabricate responses:
LLMs are trained on massive text datasets, but they don't understand the meaning of the words or tokens in the prompt; they just recognize patterns they can "complete" with their next prediction.
We use the term Fabrication to refer to the phenomenon where LLMs sometimes generate factually incorrect information due to limitations in their training or other constraints. This means they can generate realistic responses that are not factually accurate.
For example, with GPT-4o, the prompt:
“Who won the Player of the Match award for the ICC Men's T20 World Cup final 2026?”
As we can see, the two responses to the same prompt vary. The first is close to accurate, but the second fabricates an award for an event that hasn't happened yet.
Prompt engineering techniques like metaprompting and temperature configuration may reduce model fabrications to some extent.