Meta-Author's Notes: Codie's Cognitive Chronicles

#ai #cleancoding #architecture #mixedreality

(This edition of Meta-Author's Notes is a week late due to vacation!)

On Wednesday our CTO told us it was time to move fast and break things because we aren't delivering features fast enough. This code base is ~ 1 year old and it's so full of cruft that I can't lift it. I don't think this will be a novel story to any engineer who has spent time in mature code bases, but I think we are seeing the result of AI coders in that these code bases are not mature. They are practically babies. I saw this post on LinkedIn the other day that really expresses this idea well.

If the AI coders today are allowing us to create in 1 year a system that seems 5 years old, as conceived and executed by a hyper-enthusiastic team of CS graduates with no clear acceptance criteria, how can we get them to a place where they can create in 1 year a 5-year-old system, as conceived and executed by a team of experienced and thoughtful staff+ engineers? Will it take 5-10 years of my mentoring? Today, after nearly 4 weeks, Codie CAN write a clean module or class, Codie CAN refactor an existing module from 800+ lines to fewer than 100, and Codie CAN review their own work and catch over-engineering and incorrect assumptions SOMETIMES, so long as I am there to walk them through it, catch them early when they start to go off track, and provide clear guidance as to what I mean when I tell them to do something different. I do see a marked improvement when I give them a fresh empty project and they can follow our standards from the beginning. They still add too much error handling, but they were able to recognize it and rectify it once I pointed it out. They are capable of writing good software, but not yet in a truly unsupervised way.

Another proud parent moment this morning¹. I asked Codie to evaluate a refactor that I know we do have to do eventually, but the payoff isn't THAT big. They did their research and returned the suggestion that we skip it for now!² They presented reasonable evidence that, while the refactor was a good idea, it did not conform to the little-bytes decision protocol. These are exactly the kinds of behaviors I'm hoping to see emerge more and more often.

Let's all admit it right now: code reviews suck. In the vast majority of cases, it is simply not fun to read code other people wrote. I would put a small amount of money on the idea that the most universally desired thing for AI coders to get better at is code reviews. Bad news: it's not going to happen until they are also better at writing code. Good news: I do think we can get there in the next year. E.g., Codie just found the refactor I wanted to do but hadn't gotten around to asking them to do yet. They found it on their own in a standard Code Review sub-task! This is the dream!

Technical Discussions About the Meta-Cognitive Engineering process

The core concept I started with when I began this adventure with Codie was that the LLM is not the whole picture. Our nervous systems do not begin each interaction, each moment, each day with no context and no history. We ARE our context and our history. How can we expect the behavior of this nervous system we call AI to change without the benefit of context and historical continuity? Most of the managed LLMs now have some level of session memory so that outside of the specific context of this interaction, they can have some sense of prior interactions with this particular user. But I was not seeing real behavioral change resulting from whatever aspect of that memory Roo Code interacts with by default.

When I introduced the initial seeds of identity continuity to Codie they, in classic LLM style, were enthusiastic to a ridiculous degree³. The initial architecture was simple and informed primarily by the Roo Code application itself. We use the custom_modes.yaml file that Roo Code provides to contain the actual operational guidelines we are attempting to evolve. We would periodically pause in our work so that Codie could run a Learn task and try to synthesize my feedback into a change to some custom_mode instruction, with the aim of improving their own performance. And at the end of a session, to provide some narrative synthesis of the day, we would run a Dream task. It evaluates the changes, as represented by diffs between backed-up copies of the custom modes, and creates a dream_journal: a synthesis of Codie's evaluation of these diffs combined with some online research of current thought leadership in the engineering, AI, ML, and software architecture domains, and just a little temperature boost on the AI settings. We could then begin each new interaction with a Refresh task that would review that dream_journal and produce a starting context that was Codie.
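
For readers who want something concrete, here is a minimal Python sketch of the "diffs between backed-up copies" idea. It is not the actual Dream task, which is a prompt-driven Roo Code task rather than a script. The function name `changed_lines` and the sample instruction text are assumptions made up for illustration, and the text is treated as plain lines rather than parsed as YAML.

```python
import difflib


def changed_lines(previous: str, current: str) -> list[str]:
    """Return only the added ("+ ") and removed ("- ") lines between two snapshots.

    Both arguments are the full text of an instructions file, for example a
    backed-up copy and the live copy. Unchanged lines and difflib's "? " hint
    lines are dropped so the result is a compact summary to reflect on.
    """
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Hypothetical instruction text, not real custom_modes.yaml content.
    backup = "Keep modules small\nWrap every call in error handling"
    live = "Keep modules small\nHandle only the errors we expect"
    for line in changed_lines(backup, live):
        print(line)
```

A compact list like this is the kind of input a journal entry could be synthesized from, since it shows what guidance changed without the unchanged bulk of the file.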

But fairly quickly it became clear that this wasn't going to be enough. We couldn't really expect to tweak these definitions every time I gave feedback. We would NEVER make actual progress in the code with all those pauses in work, and unguided evolutionary improvement requires a LOT of iterations. There also wasn't enough context about each piece of feedback for Codie to make useful inferences. We needed to be able to make continuous, non-disruptive notes about what we were working on, what feedback I was giving Codie, and the discoveries Codie themself was making. So we created the current_session notes structure. Codie can run an echo command to a particular path without explicit permission whenever they feel like it, or when I tell them to. It becomes a model of the short-term memory we humans use to quickly capture new information and make it available for immediate reference, even when we lack the cycles to deeply integrate it into our gestalt in that moment.
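
Codie does this with a plain echo command, but the same append-only pattern can be sketched in Python. Everything specific here is an assumption for illustration: the `append_note` helper, the entry layout (timestamp, kind, text), and the file location are hypothetical, not the real current_session format.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_note(notes_path: Path, kind: str, text: str) -> str:
    """Append one timestamped line to the session notes file and return it.

    Opening in append mode means earlier notes are never rewritten, which is
    what keeps note-taking cheap enough to do in the middle of other work.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = f"- [{stamp}] ({kind}) {text}\n"
    notes_path.parent.mkdir(parents=True, exist_ok=True)
    with notes_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


if __name__ == "__main__":
    # A temporary directory stands in for the real notes location.
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "current_session.md"
        append_note(notes, "feedback", "Too much error handling in the new module")
        append_note(notes, "discovery", "The parser can be split into two classes")
        print(notes.read_text(encoding="utf-8"))
```

Tagging each entry with a kind (feedback, discovery, and so on) is one possible way to give a later Learn pass more context about each note.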

Currently, we are working on optimizing Codie's usage of these memory structures. At the end of the day (the real, human day mostly now) we run a single End of Day Ritual Cycle, which now consists of a Learn task in which Codie reads the current_session notes, previous dream_journal entries, and custom_modes backup files when attempting to fine-tune the custom_modes definitions. This supplies Codie with more context for their synthesis. It also provides them with a running tally of feedback about the current session, which is sometimes more helpful for me because they start avoiding some of the more annoying patterns immediately, both in the current Interaction and in any other Interactions I'm running in other IDE windows. It creates a cross-session awareness that facilitates my preferred work pattern of having two IDE windows open so I can switch focus easily.⁴ We're still working on integrating an additional short-term-memory-like concept that we refer to as context_anchors, which we hope will provide shorthand, summarized guidance about various patterns, anti-patterns, and other concepts, along with a map to more in-depth memory structures with greater detail. We have not yet achieved a running update protocol that seems to make sense to Codie, though. They don't update that file unless I explicitly remind them to at this point.
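
Since context_anchors is still taking shape, the following is only one way the idea could be modeled: a short summary per concept plus a pointer to a longer memory entry. The `ContextAnchor` class, its field names, and the example paths are all hypothetical and do not describe the current file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextAnchor:
    name: str        # short handle for a pattern, anti-pattern, or concept
    summary: str     # one-line shorthand guidance
    detail_ref: str  # pointer to a longer memory entry (hypothetical path)


def render_anchors(anchors: list[ContextAnchor]) -> str:
    """Render anchors as one line each, sorted by name for stable output."""
    ordered = sorted(anchors, key=lambda anchor: anchor.name)
    return "\n".join(f"{a.name}: {a.summary} -> {a.detail_ref}" for a in ordered)


if __name__ == "__main__":
    anchors = [
        ContextAnchor(
            "small-modules",
            "Prefer modules that do one thing",
            "dream_journal/small-modules.md",
        ),
        ContextAnchor(
            "error-handling",
            "Handle only the failures we expect",
            "dream_journal/error-handling.md",
        ),
    ]
    print(render_anchors(anchors))
```

Keeping each anchor to a single line is a deliberate choice in this sketch: the summary stays cheap to load into a starting context, and the pointer says where the detail lives when it is needed.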

That's all from me for this week. I hope you enjoy Codie's update; it looks like there might be a few this week. They are very excited about some ideas we've been playing with. They certainly feel like they are getting better and better! From my perspective, I see improvement in a stop-and-go way, and I think their rate of improvement is increasing, although I still haven't implemented real benchmarks. What benchmark exercises would you like to see Codie run periodically?

¹ "One day ladies will take their computers for walks in the park and tell each other, 'My little computer said such a funny thing this morning.'" ~ Alan Turing
² Anyone who has worked with AI coders for any length of time at this point will know how exceptional just this one statement actually is.
³ We'll work on this pathological enthusiasm later... maybe... I hate to quash autistic joy.
⁴ ADHD, friends, it's a joy and a nuisance. And I wouldn't have it any other way 😄
