Or: how I went from 8 minutes of "re-briefing" to 10 seconds with a continuity system for **TRAE IDE**
If you've ever used an AI assistant for programming, you know this frustration. You work for two hours on a project, maybe implement some features, write tests, make architectural decisions. Then you close the chat and go to sleep. The next day, you open a new conversation and the AI greets you with a cheerful "Hi! How can I help you?"
And you think: what do you mean, we worked together for hours yesterday!
So you start over. "So, I'm developing a CLI in Python. The structure is this. I use these patterns. The decisions we made are these." Eight minutes later you're back to the starting point, ready to continue working. But the flow is broken, concentration lost, and you wonder if it has to be this way.
Spoiler: it doesn't. And I just finished testing a system that proves it.
I recently discovered the coding assistant TRAE.ai. I downloaded the free version, and after a few days I purchased a monthly subscription so I could test it further and understand its potential. Then I discovered the user_rules.md and project_rules.md files and thought: why not put them to work by creating custom commands?
The Basic Idea
The concept is simple: instead of re-explaining everything every time, why not create a file where everything you do is automatically documented? I'm not talking about a README or code comments - the AI already reads those. I mean a real work session log that updates itself and contains everything: the changes made, the files touched, the decisions taken, what remains to be done.
At the beginning of each new session, just one command - LOADAGG - and the AI reads this log. In ten seconds it has loaded all the context and can continue as if it were the same conversation from yesterday.
Sounds too good to be true? I was skeptical too. That's why I decided to test it seriously before talking about it.
The Test of Truth
Before explaining how the system works, I want to show you proof that it really works. Because it's one thing to claim "the AI remembers everything", another to prove it.
I opened a new chat in TRAE, typed LOADAGG and waited ten seconds while the AI loaded the log file. Then, without giving any additional context, I asked four technical questions about the project.
First question: "How is the TODO object structured in the JSON file?"
The answer came immediately and precisely. The AI explained that the structure is {id, title, done, created}, described the type of each field, and even told me where in the code this structure is defined - all without asking "which project?" or "what are we talking about?".
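Just to visualize it, here is what a single TODO entry with that shape might look like in Python. The field names come from the AI's answer; the example values and the exact type of each field are my assumption:

```python
# One TODO entry with the {id, title, done, created} shape described above.
# Field names match the answer; the concrete values are invented for illustration.
todo = {
    "id": 1,                            # incremental integer identifier
    "title": "Write the README",        # short task description
    "done": False,                      # completion flag
    "created": "2025-10-12T08:30:00",   # creation timestamp (format assumed)
}
```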
Second question: "How do we handle persistence?"
Again, immediate answer. It told me that TODOs are saved in todos.json in the project root, formatted with indent=2, managed by specific functions in the todo.py file, and that the JSON file is excluded from versioning. All technical details it couldn't know if it hadn't loaded the complete context.
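As a rough sketch of what that persistence layer could look like: the file name and the indent=2 detail come from the AI's answer, while the function names load_todos and save_todos are my own placeholders, since the article doesn't name the real ones.

```python
# todo.py (hypothetical excerpt) - JSON persistence as described in the log:
# todos.json in the project root, pretty-printed with indent=2.
import json
from pathlib import Path

TODOS_FILE = Path("todos.json")

def load_todos():
    """Return the list of TODOs, or an empty list if the file doesn't exist yet."""
    if not TODOS_FILE.exists():
        return []
    return json.loads(TODOS_FILE.read_text(encoding="utf-8"))

def save_todos(todos):
    """Write the TODO list back to todos.json with indent=2."""
    TODOS_FILE.write_text(json.dumps(todos, indent=2), encoding="utf-8")
```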
Third question: "What dependencies does the project use?"
It answered that the only dependency is pytest for tests, and there are no runtime dependencies - the project only uses Python's standard library.
Fourth question: "How many tests do we have and which ones?"
The AI listed all eight tests present in the project, naming them one by one. Not "about eight tests" or "some tests" - it named them all with the exact function names.
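I won't try to reproduce the real test names, but a pytest case in the spirit of that suite might look like this (it reuses the hypothetical todo module from the persistence sketch above):

```python
# test_todo.py (illustrative only - not one of the project's eight actual tests)
import todo  # the hypothetical module from the persistence sketch

def test_save_and_load_roundtrip(tmp_path, monkeypatch):
    # Redirect persistence to a temporary file so the real todos.json is untouched.
    monkeypatch.setattr(todo, "TODOS_FILE", tmp_path / "todos.json")
    todos = [{"id": 1, "title": "demo", "done": False, "created": "2025-10-12"}]
    todo.save_todos(todos)
    assert todo.load_todos() == todos
```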
Four questions, four perfect answers, zero clarification requests. This isn't "it seems the AI remembers something". This is "it has completely memorized the context from a Markdown file and can work as if it were the continuation of yesterday's conversation".
How It Works
The system runs on TRAE IDE, which is an interesting coding assistant because it natively supports "rules files" - Markdown files that define how the AI should behave in that specific project. I leveraged this functionality to implement a continuity system based on three main files.
The first file, UPDATE_PR.md, is the heart of the system. It's the session log I mentioned before. Every time you finish working, you type SAVEAGG and the AI automatically generates a chapter in this file. The chapter contains everything: a summary of what you did, which files you modified or created, the technical decisions made, the current project status, what remains to be done. You don't have to write anything by hand - the AI analyzes the conversation and extracts all this information.
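To give you an idea, a SAVEAGG chapter in UPDATE_PR.md might look roughly like this. The headings and wording here are my own sketch, not the exact format the rules produce:

```markdown
## Session #02 - 12 Oct, 09:15

**Summary:** implemented the `list` command and its tests.
**Files touched:** todo.py, test_todo.py
**Decisions:** plain-text output, one TODO per line.
**Status:** 5/5 tests passing.
**Next steps:** implement the `done` command.
```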
The second file, PROMPT.md, is a library of reusable prompts. If you have a type of request you make often - like "scan the project for possible optimizations" - you can save that prompt here and reuse it in new chats without having to rewrite it every time.
The third file, STRUTTURA_PROGETTO.md, is technical documentation that updates automatically. When you make important changes to the architecture or add relevant features, the DOCUPDATE command updates this file. By the end of the project you have complete documentation without ever having set aside dedicated time to write it.
There are four commands in total:
| Command | When | What it does |
|---|---|---|
| SAVEAGG | End of session | Saves everything in UPDATE_PR.md |
| LOADAGG | Start of session | Loads context in 10 seconds |
| DOCUPDATE | After important changes | Updates STRUTTURA_PROGETTO.md |
| SAVETEST | System validation | Documents if commands work |
SAVEAGG saves the state at the end of a session. LOADAGG loads the context at the beginning of the next session. DOCUPDATE updates the technical documentation when needed. And SAVETEST I used during testing to document whether the commands were working correctly.
The Test Project
To test the system I used a real project, not a toy example. I developed a terminal TODO manager in Python - simple enough to complete in a week, complex enough to require multiple work sessions and architectural decisions.
Project Specifications
| Aspect | Detail |
|---|---|
| CLI Commands | add, list, done, delete, clear |
| Storage | JSON file (local persistence) |
| Tests | pytest - 8 total tests |
| Size | ~150 lines of Python |
| Sessions | 5 work sessions |
| Total duration | ~55 minutes pure development |
The project implements five commands: add a TODO, list TODOs, mark a TODO as completed, delete a TODO, and completely clear the list. Data is saved in a JSON file and I wrote eight tests with pytest to verify everything works correctly.
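For orientation, the command surface of a CLI like this could be wired up roughly as follows. This is a minimal sketch using argparse and the hypothetical load_todos/save_todos helpers from earlier, not the project's actual ~150-line implementation:

```python
# CLI sketch (hypothetical) - five subcommands: add, list, done, delete, clear
import argparse
from datetime import datetime
from todo import load_todos, save_todos  # placeholder helpers sketched earlier

def main():
    parser = argparse.ArgumentParser(prog="todo")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("add").add_argument("title")
    sub.add_parser("list")
    sub.add_parser("done").add_argument("id", type=int)
    sub.add_parser("delete").add_argument("id", type=int)
    sub.add_parser("clear")
    args = parser.parse_args()

    todos = load_todos()
    if args.command == "add":
        next_id = max((t["id"] for t in todos), default=0) + 1
        todos.append({"id": next_id, "title": args.title, "done": False,
                      "created": datetime.now().isoformat(timespec="seconds")})
        save_todos(todos)
    elif args.command == "list":
        for t in todos:
            print(f'[{"x" if t["done"] else " "}] {t["id"]}: {t["title"]}')
    elif args.command == "done":
        save_todos([{**t, "done": True} if t["id"] == args.id else t for t in todos])
    elif args.command == "delete":
        save_todos([t for t in todos if t["id"] != args.id])
    elif args.command == "clear":
        save_todos([])

if __name__ == "__main__":
    main()
```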
Session Timeline
| Session | Date/Time | Work Done | Tests | Time |
|---|---|---|---|---|
| #01 | 12 Oct, 08:30 | Setup + add command | 3/3 ✅ | ~15 min |
| #02 | 12 Oct, 09:15 | list command (after LOADAGG) | 5/5 ✅ | ~12 min |
| #03 | 12 Oct, 10:00 | done command | 6/6 ✅ | ~10 min |
| #04 | 12 Oct, 14:30 | delete + clear commands | 8/8 ✅ | ~10 min |
| #05 | 13 Oct, 09:00 | Cleanup and final validation | 8/8 ✅ | ~8 min |
I divided the work into five sessions. In the first session I created the basic structure and implemented the command to add TODOs. At the end of the session I typed SAVEAGG and the AI generated the first chapter in UPDATE_PR.md, documenting everything we had done.
The second session was the real test. I opened a new chat - so the AI had no memory of the previous session - I typed LOADAGG and waited ten seconds. Then I simply asked: "Implement the list command".
The AI didn't ask me "which project?". It didn't ask me "how is the code structured?". It didn't ask me "where are the files?". It simply implemented the command, following the style of the existing code, using the same conventions, integrating perfectly with the architecture we had established the day before. Because it had loaded all the context from UPDATE_PR.md.
The subsequent sessions followed the same pattern. New chat, LOADAGG, continue working without wasting time re-explaining. At the end: SAVEAGG, and the log updates automatically.
The Numbers
After five work sessions and about fifty-five total minutes of development, I collected the data.
Tested Commands
| Command | Times Used | Functioning | Average Time |
|---|---|---|---|
| SAVEAGG | 4 times | ✅ 4/4 (100%) | ~5 sec |
| LOADAGG | 5 times | ✅ 5/5 (100%) | ~10 sec |
| DOCUPDATE | 4 times | ✅ 4/4 (100%) | ~3 sec |
| SAVETEST | 6 times | ✅ 6/6 (100%) | ~2 sec |
All four main commands worked perfectly on all occasions I used them. Zero critical issues, zero errors that blocked the workflow.
Memory Test
Methodology: New chat, LOADAGG, then 4 technical questions without additional context.
| Question | AI Response | Result |
|---|---|---|
| TODO object schema in JSON | Complete structure with types and code location | ✅ Correct |
| How do we handle persistence | Details on file, format, functions, .gitignore | ✅ Correct |
| What dependencies does the project use | Precise list: only pytest for tests | ✅ Correct |
| How many tests and which ones | All 8 tests listed by name | ✅ Correct |
Score: 4/4 (100%) - The AI answered all questions correctly without asking for clarifications.
Time Savings
| Metric | Value |
|---|---|
| Sessions with LOADAGG | 5 |
| Total LOADAGG time | ~50 seconds (10 sec × 5) |
| Time without system (estimated) | ~40 minutes (8 min × 5) |
| Net savings | 39 minutes 10 seconds |
| Percentage saved | 98% |
And the savings aren't just theory: the LOADAGG times were clocked with a stopwatch, and only the without-system baseline is an estimate of my usual eight-minute re-briefing.
How It Feels in Practice
Numbers are important, but what really matters is how the work experience changes. And here there's a huge difference.
Before the system, every new session started with a "warm-up" phase. I had to re-explain the context, the AI asked clarifying questions, I provided details, a necessary but frustrating dialogue was created before being able to actually work. It was like having to repeat the same story every day to a person with amnesia.
With the system, you open the chat, type LOADAGG, wait ten seconds, and you're already at work. There's no warm-up phase. No clarifying questions. No cognitive friction. It's like picking up a book where you left it.
Let me give you a concrete example from the fourth test session. I had just implemented the delete and clear commands and wanted to update the README with usage examples for them. I simply typed: "Update README with delete/clear examples".
The AI read the existing README, understood the style we were using, followed the formatting conventions already established, created examples consistent with those of the other commands, and updated the file. Zero questions, zero hesitation. It worked exactly as if it were the continuation of the same conversation, because from its point of view it was - it had the complete context.
This is the real value of the system. It's not just time savings in a quantitative sense. It's elimination of friction, it's maintenance of workflow, it's the difference between feeling frustrated and feeling productive.
What I Didn't Test
I want to be honest about the limits. I tested the four main commands and they work. But the system I implemented in the project rules also includes some advanced features I didn't get to try.
For example, there's a LOADAGG LIST command that should show a history of all updates in tabular format. There's LOADAGG DIFF that should allow comparing two different project states. There's LOADAGG #number to load a specific update instead of the latest. And there's LISTPROMPT to see all saved prompts.
These features are defined in the rules and the logic seems solid, but I didn't test them in practice. So I can't guarantee they work. They might even work perfectly, but I simply don't know.
Another limit: the test project was relatively small, about one hundred fifty lines of code. Does it scale to large projects? I didn't verify. The principle should hold - the more complex the project, the more valuable having a session log becomes - but I don't have empirical data on projects with thousands of lines.
And obviously this system is specific to TRAE IDE. It works because TRAE natively supports rules files. You could adapt the concepts to other AI assistants, but it wouldn't be as smooth because they don't have this integrated functionality.
If I Had to Do It Again
With hindsight, there are a couple of things I would do differently.
I ran the memory test, the one with the four questions, only at the end, after completing all the development sessions. But it's really the most important test: it's the definitive proof that the system works. If I had to start over, I would run it right away, in the second session, to get immediate confirmation that LOADAGG is really loading the complete context.
For the test project, one hundred fifty lines were fine, but fifty to eighty probably would have been enough to validate the system. Something even simpler would have let me focus exclusively on testing the commands, without distractions.
And I would create a checklist to follow before each session. Like: "Before continuing, verify that LOADAGG worked by asking these four specific questions". Having a standardized procedure helps ensure the system is working as it should.
Perspectives
This is a proof of concept on a small project. But the potential is much bigger.
Imagine working on a project that lasts weeks or months. Every day you add a chapter to the log. After a month you have a complete chronology of everything that was done, all the decisions made, all the problems solved. You no longer have to ask yourself "why did I make this choice three weeks ago?" - it's documented.
Or imagine a team collaborating with AI. Each team member can read UPDATE_PR.md and see exactly where the project stands, what choices were made, what remains to be done. The knowledge base grows automatically instead of being lost.
There are many directions this could evolve. Automatic saving every fifteen minutes. Named checkpoints like in Git, to be able to return to specific states. Automatic export of documentation in different formats. Productivity metrics. But for now it's a working system that solves a real problem.
The Package on GitHub
After testing the system, I prepared a complete ready-to-use package. Sixteen files organized in a modular structure: system rules (12 commands), output examples, setup guides and documentation.
The package is generic, available in both Italian and English, and works with any project in TRAE IDE. Clone the repository, copy the .trae/ folder to your project root, and the system is active. Learn a handful of commands (SAVEAGG, LOADAGG, DOCUPDATE, SAVETEST and TESTREPORT) and you're operational.
You don't have to understand how it works internally, you don't have to configure anything. It's plug-and-play. The repository also includes real examples of UPDATE_PR.md from the test project, so you immediately see what the session log looks like in practice.
Link: GitHub Repository
In Summary
Does it work? Yes, I tested it and the data confirms it. Is it perfect? No, there are features I didn't try and limits to consider. Is it useful? Absolutely yes, at least for how I work.
Validation Summary
| Criterion | Minimum Target | Result Obtained | Status |
|---|---|---|---|
| Working commands | ≥ 90% | 4/4 (100%) | ✅ Passed |
| Memory test | ≥ 3/4 (75%) | 4/4 (100%) | ✅ Passed |
| Time savings | ≥ 80% | 98% | ✅ Passed |
| Critical issues | 0 | 0 | ✅ Confirmed |
The numbers say that four out of four commands worked, that the memory test gave four correct answers out of four, and that I saved ninety-eight percent of the time I would have otherwise lost re-explaining the context.
But numbers aren't everything. The real change is in the workflow. It's going from "damn, I have to re-explain everything" to "LOADAGG, perfect, let's continue". It's working with an AI that remembers instead of one that forgets. It's eliminating that frustrating friction that breaks concentration.
For me it was worth it. If you work on projects that require multiple sessions and use TRAE IDE, it might be worth it for you too. The system is there, ready to use. Just copy the rules file to your project and start with SAVEAGG and LOADAGG.
And if you try it, I'd be curious to know how it goes. The most interesting tests are those on different projects, bigger, more complex. My data covers a specific use case - the real validation comes from replicability on different cases.
Written on October 13, 2025 - by Antonio Demarcus. Tested on TRAE IDE 1.3.0+ with a Python project of ~150 lines developed in 5 sessions. Measured data: 4/4 working commands, 4/4 memory tests, 98% time savings.