How to use LLMs effectively in your daily work — a practical tutorial
1. Core principles for dev work
- LLMs are best at transforming and iterating on artifacts (requirements → design, design → code, code → tests, code → docs).
- You get better results from small, focused prompts than from “build my whole system” requests.
- Always run outputs through a structured review (alignment, accuracy, completeness, risk) rather than trusting “smart‑sounding” text.
Exercise:
Next time you’re stuck, instead of “write this whole feature,” ask: “Which artifact is missing here (requirements/design/code/tests/docs)?” Then ask the model for that artifact only.
2. Prompt patterns that actually help
Several prompt patterns are particularly useful for software engineering.
a) “You are X” role prompts
Give the model a role, constraints, and an output format.
Pattern
You are a senior {language/framework} engineer in a {domain} team.
Goal: {one clear goal}.
Constraints: {standards, stack, limits}.
Output: {bullet list, code block, checklist, etc.}.
Example (backend)
You are a senior TypeScript engineer on a fintech backend.
Goal: suggest a safe design for retrying failed payments.
Constraints: Node 20, PostgreSQL, no external queues.
Output: brief architecture description, then pseudocode for the retry logic.
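To make the pattern reusable, the four slots can be kept as structured data and rendered on demand. The following is a minimal sketch, not a prescribed tool: the `RolePrompt` class and its field names are hypothetical and only mirror the pattern above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePrompt:
    """The four slots of the role-prompt pattern (hypothetical helper)."""
    role: str
    goal: str
    constraints: list[str]
    output: str

    def render(self) -> str:
        # Join constraints on one line so the prompt keeps its four-line shape.
        return "\n".join([
            f"You are {self.role}.",
            f"Goal: {self.goal}.",
            f"Constraints: {', '.join(self.constraints)}.",
            f"Output: {self.output}.",
        ])


prompt = RolePrompt(
    role="a senior TypeScript engineer on a fintech backend",
    goal="suggest a safe design for retrying failed payments",
    constraints=["Node 20", "PostgreSQL", "no external queues"],
    output="brief architecture description, then pseudocode for the retry logic",
)
print(prompt.render())
```

Keeping the slots separate makes it easy to vary one of them (for example the constraints) while holding the rest fixed, which helps when comparing outputs.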
Exercise:
Take a current task and rewrite your next AI question into this pattern. Compare output quality with your usual “casual” prompt.
b) Atomized (single‑purpose) prompts
Break work into tiny steps: one prompt per sub‑task.
Instead of:
Build a full auth system with JWTs in NestJS and write tests.
Use a chain like:
- “Given this app description, list the auth concerns I must solve.”
- “Design the data model and endpoints for auth.”
- “Generate NestJS code just for the login endpoint using my style guide: …”
- “Generate unit tests for this login endpoint using Jest.”
This approach is usually called “task decomposition” or “decomposed prompting.”
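The chain above can be sketched as a small loop that sends one prompt at a time and passes each answer forward as context. This is an illustrative sketch only: `run_chain`, the `Ask` type, and `fake_ask` are hypothetical names, and `fake_ask` stands in for whatever model client you actually use.

```python
from typing import Callable

# Hypothetical stand-in for a real model call: takes a prompt, returns text.
Ask = Callable[[str], str]


def run_chain(steps: list[str], ask: Ask) -> list[str]:
    """Send each step as its own prompt, attaching the previous answer as context."""
    answers: list[str] = []
    previous = ""
    for step in steps:
        if previous:
            prompt = f"{step}\n\nPrevious step's output:\n{previous}"
        else:
            prompt = step
        previous = ask(prompt)
        answers.append(previous)
    return answers


def fake_ask(prompt: str) -> str:
    # Placeholder that echoes the first line of the prompt so the sketch runs offline.
    return f"[answer to: {prompt.splitlines()[0]}]"


steps = [
    "Given this app description, list the auth concerns I must solve.",
    "Design the data model and endpoints for auth.",
    "Generate NestJS code just for the login endpoint using my style guide: …",
    "Generate unit tests for this login endpoint using Jest.",
]
for answer in run_chain(steps, fake_ask):
    print(answer)
```

In practice you would review and edit each answer before it becomes context for the next step; the loop only shows the order of operations.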
Exercise:
Pick one real task you have this week. Write a list of 5-8 prompts that would get you from problem → working, tested code. Use them sequentially instead of one mega‑prompt.
c) “Critic” and “referee” prompts
Ask the model to attack or review something, not to create from scratch.
Patterns
- Code review: “Act as a strict code reviewer. Given this code and these standards, list specific issues and suggested changes.”
- Design critique: “Act as a system architect. Here is my design. Identify scaling risks, failure scenarios, and unclear responsibilities.”
GitHub’s Copilot guidance is essentially this: have AI review snippets and propose improvements; you still decide what to accept.
Exercise:
Paste one of your recent PRs (or a simplified version) and ask:
“Act as a strict reviewer. What are the top 5 issues (correctness, readability, performance, security)?”
Then decide which comments you’d actually address.
d) Self‑check / reflection prompts
Use the model on its own output:
Re‑read your last answer. Identify at least 3 potential errors, missing edge cases, or unsafe assumptions. Propose concrete fixes.
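This uses an “introspective” (self‑review) prompt pattern: a second pass often catches errors the first answer missed, but verify what it reports, because the model can also overlook or invent problems.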
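As a rough sketch, the self‑check can be wired in as a second turn of the same conversation, so the model sees its own first answer. The `Chat` callable below is a hypothetical stand-in for a real chat client, and the (role, text) tuple format is an assumption, not any particular vendor's API.

```python
from typing import Callable

SELF_CHECK = (
    "Re-read your last answer. Identify at least 3 potential errors, "
    "missing edge cases, or unsafe assumptions. Propose concrete fixes."
)

# A turn is a (role, text) pair. `Chat` is a hypothetical model call that
# receives the whole conversation so far and returns the next reply.
Turn = tuple[str, str]
Chat = Callable[[list[Turn]], str]


def answer_with_self_check(question: str, chat: Chat) -> tuple[str, str]:
    """Return (first_answer, critique); the critique request includes the first answer."""
    history: list[Turn] = [("user", question)]
    first = chat(history)
    history += [("assistant", first), ("user", SELF_CHECK)]
    critique = chat(history)
    return first, critique


def fake_chat(history: list[Turn]) -> str:
    # Placeholder reply so the sketch runs without a network call.
    return f"reply after {len(history)} turn(s)"


first, critique = answer_with_self_check("How should I retry failed payments?", fake_chat)
print(first)
print(critique)
```

Returning both texts keeps the comparison step explicit: you read the critique against the original answer and decide which fixes are real.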
Exercise:
After a long answer you like, follow up with a self‑check prompt like the one above, then compare the two answers.
3. Task decomposition: when and how
Trying to have an LLM build a whole feature usually produces fragile, untestable code. Instead, decompose tasks along artifacts and complexity.
Good decomposition axes
- By artifact: requirements → API spec → data model → endpoint stubs → implementation → tests → docs.
- By complexity: pure functions first, then integration, then UX / wiring.
- By risk: ask the model to explore risky areas (concurrency, auth, failure modes) separately before coding.
A blog on task decomposition shows exactly this: instead of “fix this whole feature,” it breaks debugging into focused questions like “What causes this error message?” and “What needs to change in auth.js?”
Example sequence (bugfix)
- “Here is the failing test and error message. Summarize what’s going wrong.”
- “Here is auth.js. Point out lines likely related to this failure.”
- “Propose a minimal fix that passes the test without changing behavior elsewhere.”
- “Generate a new test that would catch regressions for this bug.”
Exercise:
Take a real bug. Before you touch it, use those four prompts. Compare your own diagnosis to the model’s; note where it helped or confused things.
4. Choosing between ChatGPT, Claude, and Gemini
You can treat different models as tools in a belt, picking per task.
Common usage patterns reported by practitioners:
- One model for heavy coding and refactoring (often Claude or a code‑tuned ChatGPT).
- One model for deep research and reasoning (often ChatGPT’s higher “thinking” modes).
- One model for multi‑modal or tooling‑heavy tasks like using APIs, browsing, or working with code in context (often Gemini or ChatGPT with plugins).
A typical “toolbelt” approach is to start with the fastest model to brainstorm or generate ideas, then move to a slower, more deliberate one to structure and refine.
Practitioner example:
One architect uses Gemini wired into Claude so Claude can work over a large codebase, then uses Claude itself for heavier coding tasks and iterative refinement, but still doesn’t ask it to “design the complete product.”
Exercise:
Write a small “LLM playbook” for yourself like:
- “Use ChatGPT for: …”
- “Use Claude for: …”
- “Use Gemini for: …”
Then force yourself to follow it for a week and adjust based on what actually worked.
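If you prefer to keep the playbook next to your code, one possible shape is a lookup table plus a tally of how each choice worked out. Everything here is hypothetical: the `Playbook` class, the task categories, the model assignments, and the 0.5 threshold are placeholders to replace with your own.

```python
from collections import defaultdict


class Playbook:
    """Maps task categories to a model and tracks how each choice worked out."""

    def __init__(self, rules: dict[str, str]) -> None:
        self.rules = dict(rules)
        # outcomes[category] holds True/False for "did this go well?"
        self.outcomes: dict[str, list[bool]] = defaultdict(list)

    def pick(self, category: str) -> str:
        # Raises KeyError for categories the playbook does not cover yet.
        return self.rules[category]

    def record(self, category: str, worked: bool) -> None:
        self.outcomes[category].append(worked)

    def needs_review(self, threshold: float = 0.5) -> list[str]:
        """Categories whose success rate is below the (arbitrary) threshold."""
        return sorted(
            category
            for category, results in self.outcomes.items()
            if sum(results) / len(results) < threshold
        )


# Example assignments only; they are not recommendations.
playbook = Playbook({"refactoring": "Claude", "research": "ChatGPT", "multimodal": "Gemini"})
playbook.record("refactoring", True)
playbook.record("research", False)
playbook.record("research", False)
print(playbook.pick("refactoring"))  # Claude
print(playbook.needs_review())       # ['research']
```

At the end of the week, the categories returned by `needs_review` are the ones where the assignment is worth reconsidering.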
5. Reviewing AI‑generated code like a professional
Think of AI as a junior dev who can type very fast but is overconfident. Your job is to review.
Review checklist (adapted from published guidance)
GitHub suggests validating that AI code compiles, runs tests, and matches your requirements and patterns. A broader “reviewer‑mode” checklist adds alignment, accuracy, completeness, and risk.
Ask yourself:
- Alignment: Does this solve the problem I actually asked? Or just something nearby?
- Accuracy: Are library calls, types, and APIs real and correctly used? (Check docs as needed.)
- Completeness: Are important edge cases, errors, and validations covered?
- Risk: Could this introduce security holes, data loss, race conditions, or performance issues?
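One way to make sure none of the four dimensions gets skipped is to record comments per dimension and check which ones are still empty. This is a small sketch under assumptions: the `Review` class is hypothetical, and the example comments are invented for illustration.

```python
from dataclasses import dataclass, field

DIMENSIONS = ("alignment", "accuracy", "completeness", "risk")


@dataclass
class Review:
    """PR-style comments on one AI-generated change, grouped by dimension."""
    comments: dict[str, list[str]] = field(
        default_factory=lambda: {d: [] for d in DIMENSIONS}
    )

    def add(self, dimension: str, comment: str) -> None:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        self.comments[dimension].append(comment)

    def missing(self) -> list[str]:
        """Dimensions that have no comment yet, in checklist order."""
        return [d for d in DIMENSIONS if not self.comments[d]]


review = Review()
# Invented example comments, written the way you would in a PR.
review.add("alignment", "Solves retries, but the ticket also asked for idempotency.")
review.add("risk", "No upper bound on retries; could hammer the payment provider.")
print(review.missing())  # ['accuracy', 'completeness']
```

An empty `missing()` list does not mean the code is good; it only means each dimension was looked at before deciding what to keep.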
Exercise:
Take one AI‑generated file. For each of the four dimensions above, write 1-2 concrete comments (like you would in a PR). Only then decide what to keep.
6. Keeping your judgment sharp
The risk with heavy AI use is “outsourcing thinking.” You can avoid that by explicitly practising evaluation.
A 2026 article on “reviewer‑mode” suggests: strip away tone and just inspect the logic and assumptions. Ask:
- What assumptions is this making? Are they stated or hidden?
- If the same structure were used with different facts, could it still look convincing but be wrong?
- Would I accept this from a junior colleague without edits? If not, what would I ask them to change?
Concrete habits
- Always do one alignment check, one accuracy check, and one risk check on any non‑trivial AI suggestion.
- Periodically solve a problem without AI first, then compare with AI’s approach to calibrate your own skills.
- Use AI to explain concepts in your own code: “Explain this function line by line and how I might break it.” This can surface gaps in your understanding without surrendering agency.
Exercise:
Pick one AI answer this week and spend 10 minutes writing a short “review” of it in a markdown file: what’s good, what’s wrong, what you’d change, what you learned. That’s how you turn AI usage into deliberate practice.
7. Concrete usage examples in real projects
Real‑world reports show successful patterns like:
- Drafting code and then iterating with reviews: using LLMs to suggest code snippets, debug small issues, and refactor, while the human controls architecture and final decisions.
- Using AI to write tests and docs: generating unit tests for new code and creating API docs from comments, so humans can focus on tricky logic and design.
- Architecture feedback: presenting a solution to ChatGPT or Claude and asking for an architect‑style critique before implementation.
Mini‑project exercise:
Take a small feature (e.g., “add rate limiting to this API”) and intentionally use AI in each phase:
- Requirements clarification (LLM helps list user and system constraints).
- Design critique (LLM attacks your design).
- Skeleton code generation.
- Test generation.
- Documentation draft.
At each step, you review and edit as if a junior wrote it.
8. Practice exercises you can repeat
Here’s a compact set you can cycle through:
- Decomposition drill:
  - Pick any ticket. Write a 5-10 step LLM interaction plan (what you’ll ask and in what order).
  - Run it; after, write 3 notes: what helped, what wasted time, what you’d change.
- Prompt‑pattern drill:
  - For one task, try: role prompt → atomized prompts → critic prompt → self‑check prompt.
  - Compare the effect of adding each layer.
- Model‑choice drill:
  - Solve the same problem with ChatGPT, Claude, and Gemini, using identical initial prompts.
  - Evaluate each output with the review checklist; decide which model you prefer for that category of task.
- Red‑team drill:
  - Ask an LLM for a solution you know is tricky.
  - Then explicitly ask it to find flaws in its own solution.
  - See which issues you can spot that it missed.
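For the model‑choice drill, the "identical initial prompts" requirement is easiest to keep if one function fans the same prompt out to every model. The sketch below assumes each model is wrapped in a plain prompt-in, text-out callable; the lambdas are offline placeholders, not real clients.

```python
from typing import Callable

# Hypothetical wrapper type: each model client is reduced to prompt in, text out.
Ask = Callable[[str], str]


def compare_models(prompt: str, models: dict[str, Ask]) -> dict[str, str]:
    """Send the identical prompt to every model and collect the answers by name."""
    return {name: ask(prompt) for name, ask in models.items()}


# Placeholder clients so the sketch runs offline; swap in real calls yourself.
models: dict[str, Ask] = {
    "ChatGPT": lambda p: f"ChatGPT draft for: {p}",
    "Claude": lambda p: f"Claude draft for: {p}",
    "Gemini": lambda p: f"Gemini draft for: {p}",
}

results = compare_models("Add rate limiting to this API.", models)
for name, answer in results.items():
    print(f"{name}: {answer}")
```

The evaluation itself stays manual: read each answer against the review checklist before deciding which model suits that category of task.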
If you tell me your main stack (e.g., “TypeScript/React”, “Java/Spring”, “Python/data”), I can turn this into a short, concrete “training plan” with stack‑specific example prompts and exercises.
What tech stack and kind of work (feature dev, backend APIs, data, DevOps, etc.) are you mostly using LLMs for right now?
Rizwan Saleem — https://rizwansaleem.co