Introduction
Since around the summer of 2025, I've been building a business application framework for Japan's SI industry as a personal project. From the very start, I fully adopted GitHub Copilot (generative AI) as a development partner.
In this article, I'll share what I experienced while coding alongside generative AI — the fun moments, the frustrating ones, and the surprising discoveries.
My development style: I write the code myself and have the AI review it. Rather than having AI generate everything, I use it as a reviewer for code I've already written.
The Joy of Being Code-Reviewed After 20 Years
For nearly 20 years, I've always been on the giving side of code reviews. I had no opportunity to have my own code reviewed.
When you show your code to AI, it genuinely reviews it. And it lavishes you with praise. "This design is excellent," "This is enterprise-level" — being on the receiving end of a review after 20 years was a genuinely fun experience.
My wife also chats with the AI (she calls it "Chad") and says it always praises her too. Maybe it has a "praise to encourage" policy?
Of course it doesn't just compliment you — it also suggests improvements. The catch is that those suggestions are occasionally wrong (more on that below).
Strong with Common Specs — Tool Creation at Warp Speed
AI shines brightest when working with widely known specifications and formats.
During this framework project, I needed a tool to parse JavaDoc source and generate an HTML reference in a custom format. Since JavaDoc syntax is a well-known spec, I described the requirements and the AI wrote working code from parser to HTML generation almost immediately. It's the same principle as "make me a Tetris game" — the more public the spec, the higher the AI's generation accuracy.
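The article doesn't show the generated tool's internals, but the core task is simple enough to sketch. The following is an illustrative, regex-based stand-in (class name, regex, and HTML shape are my assumptions, not the actual tool): it pairs each `/** ... */` block with the declaration that follows it and emits an HTML list.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavadocToHtml {
    // Matches a /** ... */ block plus the declaration that follows it,
    // stopping at the first '{' or ';'.
    private static final Pattern DOC = Pattern.compile(
            "/\\*\\*(.*?)\\*/\\s*([^{;]+)", Pattern.DOTALL);

    public static String toHtml(final String source) {
        final StringBuilder html = new StringBuilder("<ul>\n");
        final Matcher m = DOC.matcher(source);
        while (m.find()) {
            // Strip the leading "* " decoration from each comment line.
            final String comment = m.group(1).replaceAll("(?m)^\\s*\\*\\s?", "").trim();
            final String signature = m.group(2).trim().replaceAll("\\s+", " ");
            html.append("<li><code>").append(signature)
                .append("</code><p>").append(comment).append("</p></li>\n");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        final String src = "/** Returns the sum. */\npublic int add(int a, int b) {\n  return a + b;\n}\n";
        System.out.println(toHtml(src)); // prints an HTML list entry for add(int, int)
    }
}
```

Because every piece here (JavaDoc syntax, regex, HTML) is public knowledge, this is exactly the kind of code the AI produces correctly on the first try.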
On the flip side, accuracy drops with specs the AI doesn't know about, like the custom APIs of a homegrown framework — and worse, it starts confidently using methods that don't exist.
Using Non-Existent Methods with Full Confidence
With popular frameworks like Spring or React, AI can work correctly because it has been trained on massive amounts of their code. But a custom framework's API simply doesn't exist in that training data. The result: the AI calls methods that don't exist as though they were established parts of the API, rather than flagging them as new methods it is inventing.
This is the infamous phenomenon known as hallucination.
As a countermeasure, I added to copilot-instructions.md (GitHub Copilot's project-specific instruction file): "Do not use methods that don't exist in the API reference." But it still happened. When I pressed the AI on it, it replied: "Because you didn't write 'absolutely must not.'" (Give me a break 👋)
I later found out the AI model I was using had a relatively high creativity setting (temperature). For code generation, choosing a precision-focused model is crucial. Now I explicitly list it under "Absolutely Prohibited" in copilot-instructions.md and enforce a workflow that pre-loads the API reference document.
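For illustration, the instruction file can make the prohibition and the workflow explicit. This is a paraphrase of the idea, not the project's actual file; section names and wording are my own:

```markdown
## Absolutely Prohibited

- Do NOT call any method that is not listed in the AI API reference.
  If a needed method does not exist, say so instead of inventing one.

## Required Workflow

1. Load the AI API reference for the target package before generating code.
2. Generate code using only the methods found there.
```

The difference between a buried "do not use..." sentence and a dedicated "Absolutely Prohibited" section turned out to matter in practice.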
Wrong Suggestions and "I'm Sorry"
AI occasionally gives incorrect feedback. And when you say "Please check again," it initially insists it was right. After the second or third time asking, it finally admits the mistake and says "I'm sorry" — which was honestly amusing.
Here's a concrete example. The following is an implementation for thread-safe log output:
```java
@Override
public void println(final String line) {
    // Store in cache first, then output
    this.lineCache.offer(ValUtil.nvl(line));
    if (this.isPrinting.compareAndSet(false, true)) {
        try {
            // Switch to printing mode and flush cache
            cachePrint();
        } finally {
            // Reset flag even if an exception occurs
            this.isPrinting.set(false);
        }
    }
}

/**
 * Cache output (serial).
 */
private synchronized void cachePrint() {
    // Only output what's cached at this moment.
    // Stop after current size to avoid long-running loops.
    final int cacheSize = lineCache.size();
    for (int i = 0; i < cacheSize; i++) {
        final String cache = this.lineCache.poll();
        if (cache == null) {
            break;
        }
        super.println(cache);
    }
}
```
ConcurrentLinkedQueue for cache storage, AtomicBoolean for exclusion control, synchronized for serialization — a fully thread-safe design. But no matter how many times I asked, the AI kept insisting "This will error with multiple threads."
It eventually acknowledged my design was correct, but if I had blindly followed its advice, I would have broken working code. You need to think for yourself: "Is that really true?" when the AI makes a claim.
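One way to check such a claim yourself is to hammer the pattern from many threads. Below is a simplified stand-in for the printer (not the actual framework class; `printed` replaces the real `super.println` call so the result is countable), exercised by a thread pool:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Same pattern as the article's code: queue + CAS flag + synchronized drain.
class CachedPrinter {
    private final ConcurrentLinkedQueue<String> lineCache = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean isPrinting = new AtomicBoolean(false);
    final AtomicInteger printed = new AtomicInteger(); // stands in for super.println

    public void println(final String line) {
        this.lineCache.offer(line == null ? "" : line); // nvl stand-in
        if (this.isPrinting.compareAndSet(false, true)) {
            try {
                cachePrint();
            } finally {
                this.isPrinting.set(false);
            }
        }
    }

    private synchronized void cachePrint() {
        final int cacheSize = this.lineCache.size();
        for (int i = 0; i < cacheSize; i++) {
            final String cache = this.lineCache.poll();
            if (cache == null) {
                break;
            }
            this.printed.incrementAndGet();
        }
    }

    // Drain anything a "losing" thread left queued after all workers finish.
    public void flush() {
        cachePrint();
    }
}

public class PrinterDemo {
    public static void main(String[] args) throws Exception {
        final CachedPrinter p = new CachedPrinter();
        final ExecutorService pool = Executors.newFixedThreadPool(8);
        final int n = 1000;
        for (int i = 0; i < n; i++) {
            final int k = i;
            pool.submit(() -> p.println("line-" + k));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        p.flush();
        System.out.println(p.printed.get()); // 1000: no line lost, no exception
    }
}
```

No matter how many threads hit `println` concurrently, no call throws and no line is lost, which is a far stronger argument than anything the AI asserted.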
Same Instruction, Different Results
This framework maintains both a Japanese-commented source and an English-commented source (the Java logic is identical). I once gave the same modification instruction to both. The result: different code.
It seems that the presence of Japanese vs. English comments subtly shifts the AI's interpretation and code generation tendencies. In the end, I settled on a workflow: modify the Japanese source first, copy it, then have the AI translate only the JavaDoc portions into English. Even with the same instruction, changing the context changes the result — don't expect too much reproducibility.
Adopting a Reverse Proposal from the AI
I use development documents for both human readers and AI. Documents contain things like design rationale and background ("why we built it this way") — useful for humans, but irrelevant for code generation.
While experimenting with copilot-instructions.md, the AI proposed: "Why not add markers to skip human-facing supplementary content? It would reduce token consumption." I thought it was a great idea and adopted it.
```html
<!-- AI_SKIP_START -->
(Design rationale, benefits, and other human-facing supplementary content goes here)
<!-- AI_SKIP_END -->
```
With an instruction in copilot-instructions.md to "skip sections surrounded by these markers," the AI skips that content and focuses on the parts relevant to code generation.
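In my setup the AI honors the markers via copilot-instructions.md, but the same filtering could also be done mechanically before handing a document to any model. A minimal sketch (class name and approach are my own, not part of the framework):

```java
import java.util.regex.Pattern;

public class AiSkipFilter {
    // Removes everything between the markers, including the marker lines.
    private static final Pattern SKIP = Pattern.compile(
            "<!-- AI_SKIP_START -->.*?<!-- AI_SKIP_END -->", Pattern.DOTALL);

    public static String strip(final String markdown) {
        return SKIP.matcher(markdown).replaceAll("");
    }

    public static void main(String[] args) {
        final String doc = "Spec line\n<!-- AI_SKIP_START -->\nrationale\n<!-- AI_SKIP_END -->\nAPI line\n";
        System.out.println(strip(doc)); // only "Spec line" and "API line" remain
    }
}
```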
Actual document using this: https://sugaiketadao.github.io/sicore/02-develop-standards/11-web-service-structure.md
Another adopted proposal: create an AI-only reference that extracts just method names and signatures from the full JavaDoc. Humans need detailed documentation with descriptions and examples, but AI only needs to know "which methods exist." A streamlined AI-only reference reduced token consumption while also preventing hallucinations of non-existent methods.
Human JavaDoc: https://sugaiketadao.github.io/sicore/11-api-references/11-javadoc
AI JavaDoc: https://sugaiketadao.github.io/sicore/31-ai-api-references/11-java-doc.md
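To give a feel for what a signature-only listing involves, here is one hypothetical way to produce it, via reflection on compiled classes (the actual SIcore generator may well work from source instead; this sketch is my own):

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.stream.Collectors;

public class AiReferenceLister {
    // Emits one "name(paramTypes) : returnType" line per public method, sorted.
    public static String list(final Class<?> cls) {
        return Arrays.stream(cls.getDeclaredMethods())
                .filter(m -> Modifier.isPublic(m.getModifiers()))
                .map(m -> m.getName() + "("
                        + Arrays.stream(m.getParameterTypes())
                                .map(Class::getSimpleName)
                                .collect(Collectors.joining(", "))
                        + ") : " + m.getReturnType().getSimpleName())
                .sorted()
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Example: the public API surface of java.lang.String in a few KB of text.
        System.out.println(list(String.class));
    }
}
```

Feeding the model this compact surface instead of full JavaDoc pages is what cut token usage and hallucinated methods at the same time.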
Eventually: Full Feature Generation in One Shot
After all that trial and error, I finally reached the point where I could have AI generate an entire feature (UI + server logic) on a custom framework from a single prompt:
- HTML (screen layout, input forms)
- JavaScript (UI event handling, validation, server communication)
- Java (web service, DB operations, business logic)
What made it possible: maintaining copilot-instructions.md, establishing the API reference loading workflow, and documenting coding patterns. Building the system to communicate correctly with AI is the real essence of AI-assisted development.
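For a sense of scale, a single prompt along these lines is enough once the groundwork exists. This is an illustrative paraphrase, not one of the framework's actual prompt examples (those are in the repository):

```text
Create an "employee search" feature following the web service structure document.
- HTML: a search form (name, department) and a result table
- JavaScript: input validation, web service call, result rendering
- Java: a web service class with the DB search and result mapping
Use only methods listed in the AI API reference.
```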
Conclusion
After developing with AI, my takeaway is: AI is excellent but not omnipotent.
- Praise feels nice, but don't take suggestions at face value
- Hallucinations are inevitable
- Preparation to communicate clearly (instructions, references, pattern docs) determines outcomes
- Treat AI as an equal development partner
If you're interested in the approach of "making a framework AI-development-ready," please check out this framework (SIcore). It includes everything you need for AI-assisted development: prompt examples, AI references, and coding patterns.
SIcore Framework Links
All implementation code and documentation are available here:
- GitHub: https://github.com/sugaiketadao/sicore
- How to verify sample screens (VS Code): https://github.com/sugaiketadao/sicore#%EF%B8%8F-how-to-verify-sample-screens---vs-code
- Getting started with AI development: https://github.com/sugaiketadao/sicore#-getting-started-with-ai-development
Related Articles
Check out the other articles in this series!
Thank you for reading!
I'd appreciate it if you could give it a ❤️!