Using AI Intentionally: A Guide to Token-Aware Development

#ai #programming #productivity #tutorial

There's been a noticeable trend recently with AI models: they are getting smarter, easier to use and increasingly capable, but that capability comes at a cost.

We are now paying for performance in a way that we weren't a few weeks or months ago and because of that, the way we utilise AI tools is shifting. Understanding what consumes tokens and how to use them effectively has become an important skill for anyone working with AI tools.

At the same time, it's important to remember that token optimisation isn't about using fewer tokens to save money; it's about spending tokens where they add the most value and avoiding waste where they don't.

What is eating up your tokens

At the start of your session:
There are a few tools that I know a lot of people use without realising that they can take up a large number of tokens before any prompt has even been given.

MCP tool definitions and schemas, which may be included in the context provided to the model
Skill definitions, which are often included in the session's context window and can end up costing tokens, especially if there's a large list of skills to be brought in
Large instruction files (CLAUDE.md/copilot-instructions.md)
Accumulated context when resuming a previous session

These are all present at the start of your session. Even if you never use the MCP tools or need everything in your instruction files, tokens may still be spent loading them and setting up your session.
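To get a feel for how much of the context window is gone before the first prompt, a rough estimate can help. The sketch below is only an approximation: it assumes about four characters per token, which is a common rule of thumb for English text and not what any particular tokenizer does, and the source names and contents are made-up placeholders rather than real files.

```python
# Rough heuristic: about 4 characters per token for English text.
# This is an assumption for estimation only; real tokenizers differ by model.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


def startup_overhead(sources: dict[str, str]) -> dict[str, int]:
    """Estimate tokens per named context source, largest first."""
    estimates = {name: estimate_tokens(text) for name, text in sources.items()}
    return dict(sorted(estimates.items(), key=lambda item: item[1], reverse=True))


# Hypothetical content standing in for real instruction files and schemas.
example_sources = {
    "CLAUDE.md": "Use TypeScript strict mode. " * 50,
    "mcp_tool_schemas": '{"name": "search", "description": "..."}' * 100,
    "skill_definitions": "Skill: write release notes. " * 20,
}

for name, tokens in startup_overhead(example_sources).items():
    print(f"{name}: ~{tokens} tokens")
```

In practice you would read the real files from disk and, where your tool exposes it, use its own token counts instead of the character heuristic.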

During your session:
How high your token usage is during a session will depend on a number of things, including:

Large, verbose model responses
Images and screenshots included in a prompt
Unnecessary files referenced in prompts
Long prompts containing information that is irrelevant to the current task
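These costs compound over a session. The sketch below assumes that the tool resends the whole conversation history as input on every turn, which is how chat-style sessions commonly behave; caching, compaction and tool-specific behaviour would change the numbers. The token figures are illustrative, and the comparison ignores any context you would carry over into the new session.

```python
def cumulative_input_tokens(turn_sizes: list[int], startup_tokens: int = 0) -> int:
    """Total input tokens across a session if every turn resends the history so far.

    turn_sizes holds the tokens each turn adds to the history
    (the new prompt plus the model's response), as a simplification.
    """
    total = 0
    history = startup_tokens
    for size in turn_sizes:
        history += size   # this turn's content joins the history
        total += history  # the whole history is sent as input this turn
    return total


# Illustrative numbers: 2,000 tokens of startup context, 500 tokens added per turn.
one_long = cumulative_input_tokens([500] * 10, startup_tokens=2000)
two_short = 2 * cumulative_input_tokens([500] * 5, startup_tokens=2000)

print(f"One 10-turn session:  {one_long} input tokens")
print(f"Two 5-turn sessions:  {two_short} input tokens")
```

Under these assumptions the single long session costs 47,500 input tokens and the two shorter sessions cost 35,000, even though the startup context is paid for twice.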

How to be more token conscious

I feel like a lot of what's mentioned above has become common practice, but there are some ways to make your sessions more efficient while also improving the output.

Keep instruction files short and precise! Instruction files should have the bare minimum to enable your coding tool to understand file structure, coding standards and business-specific detail. Instruction files can be read multiple times through a session, especially if you're compacting your session, so every additional line has a token cost.
Initiate MCPs only when required: MCPs can be disabled by default. This ensures that they are only set up when they are actually required.
Be specific in your prompts: Prompts should be specific and to the point. Short and vague prompts can result in poor or incorrect responses, which then need a lot more back and forth to get the correct result.
Choose models wisely! Don't reach for a high-performing model when a low-cost model can perform just as well.
Utilise tools: Commands like /compact or /handoff can help manage context and memory when rejoining sessions.
Know when to start a new session! Sessions can get long and complex quickly. Knowing when to stop using an existing session and start over can help minimise context rot and hallucinations, and can provide better overall results.
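The "choose models wisely" point can be sketched as a simple rule: pick the cheapest model that is capable enough for the task. Everything in this example is hypothetical, including the model names, the 1 to 3 capability scale and the prices; it is meant to show the decision, not real models or real pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical label, not a real model ID
    capability: int            # made-up scale: 1 = simple tasks, 3 = complex reasoning
    cost_per_1k_tokens: float  # illustrative price, not real pricing


MODELS = [
    ModelOption("small-model", capability=1, cost_per_1k_tokens=0.001),
    ModelOption("medium-model", capability=2, cost_per_1k_tokens=0.005),
    ModelOption("large-model", capability=3, cost_per_1k_tokens=0.03),
]


def pick_model(required_capability: int, models: list[ModelOption] = MODELS) -> ModelOption:
    """Return the cheapest model whose capability meets the requirement."""
    candidates = [m for m in models if m.capability >= required_capability]
    if not candidates:
        raise ValueError(f"No model meets capability level {required_capability}")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


# Renaming a variable does not need the most expensive model.
print(pick_model(required_capability=1).name)
# A cross-module refactor might.
print(pick_model(required_capability=3).name)
```

The hard part in practice is judging the required capability for a task, which this sketch leaves to you.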

Recommended tools

There are a number of tools I recommend to help keep an eye on your usage and minimise token spending:

ccusage tracks token usage so you can see where your tokens are being spent
caveman reduces long, unnecessary text in responses
handoff helps move important context between sessions without bringing in unrelated work
copilot-skills-budget shows how much of your context window is being taken up by skills

As we start to utilise AI tools more for our everyday coding tasks, it can be easy to fall back into bad practices. I'm sure we've all been guilty of defaulting to the same model for every task when a cheaper model could actually have completed the task just as well. Or forgetting to disable MCPs and then wondering why our token usage this session is so high when we haven't actually utilised any external tools.

It's also very easy to stay in the one session because all our context is there, even though all that context is probably not necessary for the next question we are going to be asking.

While these may seem to make life easier in the short term, in the long term the cost really adds up. Many of us (myself definitely included) can sometimes over-rely on AI tools because they're there, they do the work faster and the outcome is often better than if we had written the code ourselves.

What we really should be asking, before we even start on token optimisation, is: should AI be doing this, or is it something I could do as effectively on my own? Token optimisation starts with deciding when AI is the right tool for the job. Choosing a model or trimming an instruction file is just the next step in the process.

DEV Community
