You shipped a Lovable app. It works. The buttons click, the forms submit, the data flows.
Then you open it on your phone and the hero section overlaps the nav. Or you squint at a button that's definitely not your brand blue. Or a screen reader announces nothing useful.
This isn't a "you" problem. Testing by Jason Arbon found approximately 160 issues per AI-generated app, and the majority aren't functional bugs. They're visual: layout, spacing, accessibility, color.
Here are the 7 categories that show up in every AI-generated codebase, regardless of which tool built it.
1. Spacing Drift
AI defaults to Tailwind utilities that approximate your design. A 24px spec becomes gap-4 (16px) in one place and gap-8 (32px) elsewhere.
The model picks a plausible class, not the correct value, even though Tailwind's default scale has gap-6 at exactly 24px. Multiply this across 50+ components and your layout feels "off" without any single element being obviously wrong.
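One way to surface drift is to compare the class in the code against the px value in the spec. This is a minimal sketch, assuming Tailwind's default spacing scale (1 unit = 4px at a 16px root); `spacingDrift` is a hypothetical helper, not part of Tailwind, and it only understands plain `gap-*` classes.

```typescript
// Assumes Tailwind's default spacing scale: 1 unit = 0.25rem = 4px at a 16px root font size.
const PX_PER_UNIT = 4;

// Hypothetical helper: how many px a class like "gap-4" is off from the spec value.
// Returns null for classes it cannot parse (arbitrary values, custom scales, other utilities).
function spacingDrift(className: string, specPx: number): number | null {
  const match = /^gap(?:-[xy])?-(\d+(?:\.\d+)?)$/.exec(className);
  if (!match) return null;
  return Number(match[1]) * PX_PER_UNIT - specPx;
}

console.log(spacingDrift("gap-4", 24)); // -8 (8px too tight)
console.log(spacingDrift("gap-8", 24)); // 8 (8px too loose)
console.log(spacingDrift("gap-6", 24)); // 0 (matches the spec)
```

Anything that returns a non-zero number is a candidate for the QA list.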
2. Color Inconsistency
Brand colors outside standard palettes get substituted with nearest matches. Your brand blue (#2563EB) appears as three different shades across buttons, links, and nav elements.
This happens because models infer color from context rather than referencing a single source of truth. Each generation pass introduces fresh approximations.
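A rough way to find these substitutions is to scan the generated source for hex colors that sit close to the brand color without matching it. The sketch below is an assumption-heavy illustration: `nearMisses` is a hypothetical helper, it only reads six-digit hex values, and the distance threshold is a guess you would tune per project.

```typescript
// Parse "#RRGGBB" into [r, g, b]; returns null for anything else.
function parseHex(hex: string): [number, number, number] | null {
  const m = /^#([0-9a-f]{6})$/i.exec(hex);
  if (!m) return null;
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// Hypothetical check: list six-digit hex colors in `source` that are near the brand
// color in RGB space but not equal to it. maxDistance is an assumed threshold.
function nearMisses(source: string, brand: string, maxDistance = 60): string[] {
  const target = parseHex(brand);
  if (!target) throw new Error(`Expected #RRGGBB, got ${brand}`);
  const found: string[] = [];
  for (const raw of source.match(/#[0-9a-fA-F]{6}\b/g) || []) {
    const hex = raw.toUpperCase();
    const rgb = parseHex(hex);
    if (!rgb) continue;
    const distance = Math.hypot(rgb[0] - target[0], rgb[1] - target[1], rgb[2] - target[2]);
    // distance 0 is the brand color itself; only report the look-alikes, once each.
    if (distance > 0 && distance <= maxDistance && found.indexOf(hex) === -1) found.push(hex);
  }
  return found;
}

const css = "a { color: #3B82F6 } .btn { background: #2563eb } nav { color: #1d4ed8 } body { color: #ffffff }";
console.log(nearMisses(css, "#2563EB")); // [ '#3B82F6', '#1D4ED8' ]
```

Plain RGB distance is a crude stand-in for perceptual difference, but it is enough to flag shades that should have been the brand token.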
3. Missing Responsive Breakpoints
AI uses Tailwind's default breakpoints: 640px, 768px, 1024px, 1280px, and 1536px. Layouts can break at intermediate widths like 834px (iPad portrait) or 900px (common laptop-with-sidebar viewport).
Unless you explicitly prompt for custom breakpoints, you'll only discover these gaps when a real user hits them.
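To find those gaps before a user does, you can test at widths that fall between breakpoints. The following is a small sketch, assuming Tailwind's default min-width breakpoints; `activeBreakpoint` and `gapWidths` are hypothetical helper names.

```typescript
// Tailwind's default min-width breakpoints in px (sm, md, lg, xl, 2xl).
const BREAKPOINTS = [640, 768, 1024, 1280, 1536];

// The largest breakpoint that applies at a given viewport width, or 0 below the first one.
function activeBreakpoint(width: number, breakpoints: number[] = BREAKPOINTS): number {
  let active = 0;
  for (const bp of breakpoints.slice().sort((a, b) => a - b)) {
    if (width >= bp) active = bp;
  }
  return active;
}

// Hypothetical helper: midpoints between neighbouring breakpoints, as extra QA viewport widths.
function gapWidths(breakpoints: number[] = BREAKPOINTS): number[] {
  const sorted = breakpoints.slice().sort((a, b) => a - b);
  const widths: number[] = [];
  for (let i = 0; i < sorted.length - 1; i++) {
    widths.push(Math.round((sorted[i] + sorted[i + 1]) / 2));
  }
  return widths;
}

console.log(activeBreakpoint(834)); // 768
console.log(activeBreakpoint(900)); // 768
console.log(gapWidths()); // [ 704, 896, 1152, 1408 ]
```

Both 834px and 900px still run on the 768px rules, which is why a layout tuned only at the breakpoints themselves can fall apart there.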
4. Accessibility Failures
The 2024 WebAIM Million study found that 95.9% of home pages had detectable WCAG failures, averaging 56.8 errors per page. AI-generated code tends to underperform even that baseline because models deprioritize semantic HTML and ARIA unless explicitly prompted.
Missing alt text, insufficient color contrast, unlabeled form inputs, broken focus order. These aren't edge cases. They're the default output.
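Color contrast is the easiest of these to check mechanically. Here is a minimal sketch of the WCAG 2.x contrast-ratio formula for six-digit hex colors; the function names are my own, and it ignores alpha and large-text thresholds.

```typescript
// Relative luminance per the WCAG 2.x definition, for a "#RRGGBB" color.
function luminance(hex: string): number {
  const m = /^#([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error(`Expected #RRGGBB, got ${hex}`);
  const n = parseInt(m[1], 16);
  const [r, g, b] = [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff].map((c) => {
    const s = c / 255;
    // Linearize the sRGB channel value.
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const [lighter, darker] = [luminance(a), luminance(b)].sort((x, y) => y - x);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA asks for at least 4.5:1 for normal-size text.
function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}

console.log(passesAA("#999999", "#FFFFFF")); // false
console.log(passesAA("#2563EB", "#FFFFFF")); // true
```

Missing alt text, unlabeled inputs, and focus order need a DOM-level audit; this only covers the contrast item.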
5. Typography Mismatches
AI generates text-base (16px/24px) but ignores letter-spacing, font weights, or custom font imports. Your design might spec Inter at 500 weight with -0.02em tracking. You'll get system font at 400 weight with default tracking.
6. Hover and Focus State Gaps
AI produces default component states but frequently omits hover, focus, active, and disabled variants. Buttons that don't respond to hover feel broken. Missing focus states leave keyboard users with no way to see which element is focused.
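A quick lint over class strings can flag components with no styling for these states. This is a sketch under assumptions: it expects Tailwind-style variant prefixes, `missingStates` is a hypothetical helper, and it treats `focus-visible` as covering focus.

```typescript
// The four states named above, mapped to the Tailwind variant names that style them.
const REQUIRED_STATES: Record<string, string[]> = {
  hover: ["hover"],
  focus: ["focus", "focus-visible"],
  active: ["active"],
  disabled: ["disabled"],
};

// Hypothetical lint: which states have no variant at all in a class string?
function missingStates(classNames: string): string[] {
  const seen: Record<string, boolean> = {};
  for (const cls of classNames.split(/\s+/)) {
    const parts = cls.split(":");
    // Everything before the last ":" is a variant, e.g. "md:hover:bg-blue-700" -> ["md", "hover"].
    for (const variant of parts.slice(0, -1)) seen[variant] = true;
  }
  return Object.keys(REQUIRED_STATES).filter(
    (state) => !REQUIRED_STATES[state].some((variant) => seen[variant] === true),
  );
}

console.log(missingStates("bg-blue-600 text-white px-4 py-2 hover:bg-blue-700"));
// [ 'focus', 'active', 'disabled' ]
```

It only proves a variant exists, not that the resulting style is visible, so treat an empty result as "worth a manual check" rather than a pass.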
7. Z-Index Chaos
Z-index values lack a global stacking strategy. Modals render behind navbars. Tooltips clip behind adjacent sections. Dropdowns disappear under hero images.
Every component gets an arbitrary z-index instead of a coordinated system.
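A coordinated system can be as small as one shared map of named layers. The sketch below is illustrative only: the layer names, their order, and the values are assumptions to adapt to your own app.

```typescript
// Hypothetical stacking scale. Steps of 10 leave room to add a layer later without renumbering.
const Z_LAYERS = {
  base: 0,
  dropdown: 10,
  navbar: 20,
  modal: 30,
  tooltip: 40,
} as const;

type Layer = keyof typeof Z_LAYERS;

// Components ask for a layer by name instead of hard-coding a number.
function zIndex(layer: Layer): number {
  return Z_LAYERS[layer];
}

console.log(zIndex("modal") > zIndex("navbar")); // true
```

Because every component reads from the same map, "modal above navbar" is decided once instead of being re-guessed in each file.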
Why This Happens Across ALL Tools
Arbon's testing showed a p-value of 0.7199 between Bolt.new and Lovable bug counts, meaning no statistically significant difference between the two. Neither tool meaningfully outperformed the other, and the likely reason is an architectural limitation they share with every other generator: models work with tokens, not pixels.
They can't render output in a browser. They can't compare against a design file. They optimize for syntactic validity, not visual fidelity.
The Fix: Batch, Don't Iterate
Iterative prompting ("fix this, now fix that") costs 3-5 million tokens per cycle and introduces new regressions with each pass.
Instead:
- Do a complete QA pass on the full app (15-30 min for a 5-10 page app)
- Batch issues by file rather than fixing one at a time
- Fix in single targeted prompts with specific CSS values, not vague descriptions
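The batching step above can be sketched as a simple group-by over your QA notes. The `Issue` shape, field names, and prompt wording here are all assumptions for illustration, not a format any tool requires.

```typescript
// Hypothetical shape for one QA finding.
interface Issue {
  file: string;
  selector: string;
  fix: string; // a specific CSS value, e.g. "gap: 24px", not "tighten the spacing"
}

// Group findings by file so each file gets a single prompt.
function batchByFile(issues: Issue[]): Record<string, Issue[]> {
  const batches: Record<string, Issue[]> = {};
  for (const issue of issues) {
    (batches[issue.file] = batches[issue.file] || []).push(issue);
  }
  return batches;
}

// Render one targeted prompt per file.
function toPrompt(file: string, issues: Issue[]): string {
  const lines = issues.map((i) => `- ${i.selector}: set ${i.fix}`);
  return `In ${file}, make only these changes:\n${lines.join("\n")}`;
}

const findings: Issue[] = [
  { file: "Hero.tsx", selector: ".hero-grid", fix: "gap: 24px" },
  { file: "Nav.tsx", selector: ".nav-link", fix: "color: #2563EB" },
  { file: "Hero.tsx", selector: ".hero-title", fix: "letter-spacing: -0.02em" },
];

const batches = batchByFile(findings);
for (const file of Object.keys(batches)) {
  console.log(toPrompt(file, batches[file]));
}
```

Three findings become two prompts, and each prompt names exact values so the model has nothing to approximate.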
A SmartBear survey found 68% of teams say faster AI-assisted development creates testing bottlenecks. The bottleneck isn't the building. It's verifying that what was built matches what was designed.
What visual bugs have you hit with AI code generators? Curious if others are seeing the same patterns.