Let’s be honest: coding with an LLM often feels like working with a junior developer, but with wild inconsistency. It can generate a great starter version of your code and then immediately hallucinate a dependency that doesn’t even exist. We want a senior partner, but in reality it turns into a trial-and-error game. That gap only closes when you apply proper prompt engineering best practices every time you ask the model to do something.
First, Stop Treating the LLM Like a Magic Crystal Ball
Before you even write the prompt, you need to define the problem precisely. LLMs can only work well with what you give them. If the input is confusing, generic, or too large, the output will follow the same pattern: broad, not very helpful, and often incorrect.
Define a Clear and Specific Goal
It sounds obvious, but this is where we fail the most. We ask for too many things at once.
- Bad: "Write a user authentication microservice."
- Good: "Write a function in Go using the Gin framework. It should be a POST endpoint at /login. It must parse a JSON body with the fields email and password, hash the provided password with bcrypt, and compare it with the stored hash retrieved from a PostgreSQL database."
The second prompt defines the language, framework, endpoint, method, data format, and core logic. There is far less room for the LLM to go off the rails.
Adjust the Scope of the Task
LLMs are great at small, well-defined tasks. Problems appear when you try to make them manage long plans full of steps and dependencies. Do not ask them to build the entire car at once. Ask them to assemble the carburetor, then the spark plugs, then the engine block, one piece at a time. That is how they work best.
Break the work down. For example, instead of "Refactor this entire class," try a sequence:
- "Identify methods in this Java class that exceed 30 lines or have cyclomatic complexity greater than 5."
- "Alright, for the method
calculateUserPermissionsyou found, suggest a refactoring strategy to split it into smaller private helper methods." - "Now write the code for the new helper method
getUserRolesyou suggested."
This step-by-step approach creates a snowball effect: you keep adding context little by little and keep the LLM focused on one solvable problem at a time.
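If you drive this flow through an API instead of a chat window, the same idea holds: each answer goes back into the conversation as context for the next step. Here is a minimal Python sketch; the send() helper is a stand-in for whatever LLM client you actually use, and the file name is illustrative.

```python
# Sketch of the step-by-step refactoring flow as a multi-turn conversation.
# send() is a placeholder for whatever LLM client or SDK you actually use;
# the Java file path below is illustrative.

def send(messages: list[dict]) -> str:
    """Call your LLM provider here and return the assistant's reply."""
    raise NotImplementedError

java_class_source = open("UserPermissionsService.java").read()  # illustrative path

messages = [
    {"role": "system", "content": "You are a senior Java engineer reviewing legacy code."},
    {"role": "user", "content": (
        "Identify methods in this Java class that exceed 30 lines or have "
        "cyclomatic complexity greater than 5:\n\n" + java_class_source
    )},
]

# Step 1: find the problem methods.
messages.append({"role": "assistant", "content": send(messages)})

# Step 2: ask for a strategy for one specific method the model flagged.
messages.append({"role": "user", "content": (
    "For the method calculateUserPermissions you found, suggest a refactoring "
    "strategy to split it into smaller private helper methods."
)})
messages.append({"role": "assistant", "content": send(messages)})

# Step 3: only now ask for code, scoped to a single helper method.
messages.append({"role": "user", "content": (
    "Now write the code for the new helper method getUserRoles you suggested."
)})
helper_code = send(messages)
```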
How to Build a Prompt That Actually Works
A good prompt is not just a question; it is a specification. I have found that a layered structure works best, giving the model everything it needs to succeed. Think of it like a well-written Jira ticket.
Prompt Engineering Best Practices for Structure
A good prompt usually has four layers. You will not use all of them every time, but understanding these pieces helps a lot when assembling a request that actually works.
- Persona and Context: Tell the LLM who it is and provide the background information.
- The Specific Task: Clearly state what you want it to do.
- Constraints and Rules: Define boundaries and non-negotiable points.
- Examples (Few Shot): Show what “good” looks like.
Use delimiters or XML tags to keep these sections organized. The model handles structured formats well.
This structure drastically reduces the margin of error. You define the persona, pass in the relevant snippet, explain exactly which refactoring techniques to use, and still control the output format. Compared to a simple “make this component faster” prompt, the difference in reliability is night and day.
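Here is a minimal sketch of what those four layers can look like once assembled. The XML-style tag names and the refactoring task are arbitrary choices for illustration; what matters is that each layer sits inside clear delimiters.

```python
# Sketch: assembling a four-layer prompt (persona/context, task, constraints, example).
# The tag names are arbitrary; they just give the model clear section boundaries.

def build_prompt(code_snippet: str) -> str:
    return f"""
<persona>
You are a senior Python engineer who reviews code for readability and performance.
</persona>

<task>
Refactor the function inside <code> so that it avoids repeated work inside the loop,
without changing its behavior or its public signature.
</task>

<constraints>
- Standard library only.
- Return only the refactored function, no explanations.
</constraints>

<example>
Input: a function that calls len(items) on every loop iteration.
Expected output: the same function with len(items) hoisted out of the loop.
</example>

<code>
{code_snippet}
</code>
""".strip()
```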
Making the LLM Think Before Writing
One of the biggest advances in prompting is forcing the model to think before responding. This is the famous Chain of Thought (CoT), and today it is one of the best defenses against wrong answers delivered with total confidence.
The logic is simple: you ask the model to explain its reasoning step by step before reaching the conclusion. This is how a senior engineer works: they do not start writing code immediately. They first describe the plan and then move on to implementation.
Applying Chain of Thought in Code
Imagine you are debugging a complicated SQL query. Instead of asking for a direct fix, guide the reasoning.
A bad prompt:
"Fix this SQL query. It is slow."
A good prompt with CoT:
"I need you to analyze and optimize the following SQL query. Follow these steps in your response:
- Analyze the Query: First, explain what the query is trying to do in plain English. Identify the tables, joins, and filtering conditions.
- Identify Potential Bottlenecks: Based on your analysis, list the most likely causes for poor performance. Consider things like missing indexes, full table scans, or inefficient join types.
- Propose an Optimized Query: Finally, write the rewritten, optimized SQL query.
Here is the original query:
```sql
-- [Paste SQL query here]
```
"
This forces the LLM to make the “how” explicit. And that changes everything. Now you can inspect the reasoning: did it understand the problem? If the analysis in step 2 is off, you already know the code in step 3 will also come out wrong. It is better to correct the reasoning than waste time testing broken code later.
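And because you dictated the shape of the answer, you can check it mechanically before reading a single line of SQL. A minimal sketch, assuming the reply reuses the step titles from the prompt and wraps the final query in a fenced sql block; both are assumptions you enforce through the prompt, not guarantees.

```python
import re

# Sketch: sanity-check that the model actually produced the three requested steps
# before trusting the optimized query. Assumes the reply reuses the step titles
# from the prompt and puts the final query in a fenced sql code block.
REQUIRED_STEPS = (
    "Analyze the Query",
    "Identify Potential Bottlenecks",
    "Propose an Optimized Query",
)

def extract_optimized_query(response: str) -> str | None:
    if not all(step in response for step in REQUIRED_STEPS):
        # Reasoning is missing or incomplete: fix the prompt or re-ask,
        # do not run whatever SQL happens to be in the reply.
        return None
    match = re.search(r"```sql\s*(.*?)```", response, re.DOTALL)
    return match.group(1).strip() if match else None
```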
Debugging Your AI Pair: Common Bugs and How to Fix Them
Even with good prompts, LLMs make mistakes. The key is recognizing failure patterns, the classic signs that the AI is slipping, and having a clear way to handle each one.
Hallucinations and how to avoid them
Hallucination is when the model invents something with absolute confidence. The best way to reduce this is simple: put the real information inside the prompt itself.
- Do not ask like this: "How do I use AcmeCorp’s new analytics library to track a user_login event?"
- Ask like this: "Given the documentation below for the AcmeCorp library, write a Python snippet that tracks a user_login event with a user_id parameter."
And then you paste the real documentation.
The point is: you are not testing the model’s memory. You are testing its ability to understand and synthesize the information you provided. This removes the burden of remembering and greatly reduces the chance of inventing methods, endpoints, or parameters that do not exist.
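In code, grounding is just string assembly: pull the real documentation from wherever it lives and place it between clear delimiters. A minimal Python sketch; the docs path and tag names are illustrative.

```python
from pathlib import Path

# Sketch: ground the prompt in real documentation instead of relying on the
# model's memory. The docs path and the tag names are illustrative.
docs = Path("docs/acme_analytics.md").read_text()

prompt = f"""
<documentation>
{docs}
</documentation>

Using only the documentation above, write a Python snippet that tracks a
user_login event with a user_id parameter. If the documentation does not
cover something, say so instead of guessing.
"""
```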
Recognizing AI error patterns
Pay attention to these common behaviors:
Autocompletion on steroids
When the model loops or just “completes” the code in an obvious and useless way. This almost always means the prompt is too vague. Add more constraints, concrete context, or a clear example of the output you expect.
Anchoring bias
The LLM latches onto the first solution or example and insists on it, even when you ask for alternatives. If this starts happening, open a new conversation and reframe the problem from scratch, more objectively.
Contaminated context
A long conversation where an incorrect piece of information from the beginning starts leaking into every response. When you feel like things have gone off the rails, do not try to save the thread. It is faster to start a new one with a clean and well structured summary of the problem.
In the end, the rule is pretty simple: trust the model, but always validate. Your engineering practices are still what actually guarantees quality.
Turning This Into Habit
If you use LLMs for repetitive tasks, you can’t treat prompts as loose chat messages. They become part of the engineering workflow and need to be handled like any other technical artifact.
Version Your Prompts
If a prompt generates your API client or drives a refactor that follows the company’s style guide, it’s no longer something incidental. It becomes part of your engineering workflow. And it should be in Git alongside the rest of the code.
Why?
- Reproducibility: You know exactly which version of the prompt generated that file and can reproduce the result whenever needed.
- Collaboration: When the prompt is in Git, the team can review, adjust, and improve it through pull requests like any other part of the codebase.
- History: If a new model version breaks a prompt that always worked, you have a clear trail to understand what changed.
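In practice this can be as simple as a prompts/ directory versioned next to the code it generates, with the variable parts filled in at run time. A minimal sketch with illustrative paths and placeholder names:

```python
from pathlib import Path
from string import Template

# Sketch: the prompt template lives in the repo (illustrative path) and is
# reviewed in pull requests like any other file. Placeholders such as
# $openapi_spec and $language are filled in at run time.
template = Template(Path("prompts/generate_api_client.md").read_text())

prompt = template.substitute(
    openapi_spec=Path("api/openapi.yaml").read_text(),
    language="Python",
)
```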
Create a Feedback Loop
Test prompts systematically. If you find a prompt that works well for a specific task, save it.
Document what works and what doesn’t. Keep a simple doc or a directory with a “recipe book” of prompts that consistently produce good results. This prevents a lot of rework and keeps the team from reinventing the same thing every week.
Define what “good” means. Align with the team on how to measure whether a prompt is delivering: code quality, fewer back-and-forths, fewer fixes, less noise in PRs. Whatever makes sense in your workflow.
When you treat prompts as part of the engineering process, everything becomes more predictable. You move away from trial-and-error and toward a clear path to a reliable result.
There’s no trick here. It’s just process.
And as LLMs become more embedded in your team’s day-to-day work, this process stops being “nice to have” and becomes essential.
