Every browser agent starts the same way.
It downloads HTML.
Builds a DOM.
Searches for CSS selectors.
Finds buttons.
Waits for JavaScript.
Clicks something.
Reads more HTML.
Repeats.
We keep making LLMs dramatically smarter...
...yet we're still asking them to reason over one of the lowest-level representations on the web.
That felt wrong.
So I started asking a different question:
What if websites could be compiled into semantic interfaces instead of being rediscovered every time an AI agent visits them?
That question eventually became an open-source project called Shiny Fishstick.
Yes.
That's actually the name.
The Problem
Imagine asking an AI agent to buy a laptop.
Today, its internal reasoning looks something like this:
Find login button.
Click login.
Wait.
Find email field.
Fill email.
Find password field.
Fill password.
Click submit.
Wait for navigation.
Search for "Laptop".
Find Add to Cart.
Click Add to Cart.
Now imagine doing that...
Every.
Single.
Time.
The website hasn't changed.
The workflow hasn't changed.
Yet the agent keeps rediscovering it.
HTML Is A Great Human Interface
HTML was designed for browsers.
It tells browsers:
where text goes
what buttons exist
how pages should render
An AI agent doesn't care about any of that.
It doesn't care whether the button is blue.
It doesn't care whether the developer changed one HTML element to another.
It cares about actions.
Login
Search
Checkout
Upload File
Those are semantic concepts.
HTML doesn't represent them very well.
Thinking Like A Compiler
Compiler design has an interesting idea.
Compilers rarely turn source code straight into machine code.
They first transform it into an Intermediate Representation (IR).
Everything else builds from there.
I wondered if websites could work the same way.
Instead of repeatedly reasoning over HTML...
Compile the website once.
Generate a reusable semantic representation.
Enter Preflight
The compiler produces a specification called preflight.yaml.
A simplified example:
    version: 1.0.0
    actions:
      login:
        action_type: browser
        parameters:
          - email
          - password
      add_to_cart:
        action_type: api
        api:
          method: POST
          url: /api/cart/add
Notice something.
There's no HTML.
No XPath.
No brittle selectors.
Only actions.
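To make that concrete, here is a minimal sketch of how a tool might load the example above into typed objects. It is not the project's actual loader: the `Action` class and `load_actions` function are hypothetical names, and the spec is written as a Python dict so the sketch runs without a YAML parser.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The example spec, mirrored as a dict (assumption: a real tool would
# parse this from preflight.yaml instead).
SPEC = {
    "version": "1.0.0",
    "actions": {
        "login": {"action_type": "browser", "parameters": ["email", "password"]},
        "add_to_cart": {
            "action_type": "api",
            "api": {"method": "POST", "url": "/api/cart/add"},
        },
    },
}


@dataclass
class Action:
    """Hypothetical in-memory form of one entry under `actions:`."""
    name: str
    action_type: str  # "browser" or "api" in the example
    parameters: List[str] = field(default_factory=list)
    method: Optional[str] = None  # only set for API-backed actions
    url: Optional[str] = None     # only set for API-backed actions


def load_actions(spec: dict) -> Dict[str, Action]:
    """Turn the raw spec into a name -> Action mapping."""
    actions = {}
    for name, body in spec["actions"].items():
        api = body.get("api", {})
        actions[name] = Action(
            name=name,
            action_type=body["action_type"],
            parameters=list(body.get("parameters", [])),
            method=api.get("method"),
            url=api.get("url"),
        )
    return actions


ACTIONS = load_actions(SPEC)
```

An agent working from `ACTIONS` sees names and parameters, never selectors.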
API Disco
One of my favorite modules is called API Disco.
(Yes, that's the real filename.)
While the crawler performs browser interactions, it watches every network request.
If it discovers that an action is actually backed by a reusable API...
The compiler upgrades that action automatically.
Instead of generating browser automation...
It generates an API-backed SDK method.
If no API exists?
No problem.
The generated SDK simply falls back to resilient browser execution.
The developer never has to think about the difference.
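The upgrade step could look roughly like this. This is a sketch under my own assumptions, not API Disco's real logic: `ObservedRequest`, `pick_api_candidate`, and `upgrade_action` are hypothetical names, and the heuristic (a successful write request to a path starting with `/api/`) is deliberately naive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ObservedRequest:
    """One network request captured while a browser action ran (hypothetical shape)."""
    method: str
    url: str
    status: int


def pick_api_candidate(requests: List[ObservedRequest]) -> Optional[ObservedRequest]:
    """Return the first successful state-changing request to an /api/ path, if any."""
    for req in requests:
        is_write = req.method in {"POST", "PUT", "PATCH", "DELETE"}
        is_ok = 200 <= req.status < 300
        if is_write and is_ok and req.url.startswith("/api/"):
            return req
    return None


def upgrade_action(action: dict, requests: List[ObservedRequest]) -> dict:
    """Return an API-backed copy of the action, or an unchanged copy if no API was seen."""
    candidate = pick_api_candidate(requests)
    if candidate is None:
        return dict(action)  # no API found: the action stays browser-backed
    upgraded = dict(action)
    upgraded["action_type"] = "api"
    upgraded["api"] = {"method": candidate.method, "url": candidate.url}
    return upgraded
```

With traffic containing a `POST /api/cart/add` that returned 200, a browser action comes back with `action_type: api`. With only static asset requests, it comes back untouched.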
One Specification, Multiple Outputs
Once the compiler has generated preflight.yaml, everything else becomes code generation.
Today it produces:
Python SDKs
TypeScript SDKs
Rust SDKs
MCP Servers
Tomorrow it could just as easily generate:
Go SDKs
Java SDKs
C# SDKs
The compiler doesn't change.
Only the backend generator does.
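A toy version of that split, assuming nothing about the real generators: each backend is a function from the spec to source text, and adding a target means adding one entry to a registry. The names `generate_python`, `generate_typescript`, `BACKENDS`, and `SiteClient` are all hypothetical, and the emitted stubs only show signatures.

```python
# The example spec as a dict (assumption: normally parsed from preflight.yaml).
SPEC = {
    "version": "1.0.0",
    "actions": {
        "login": {"action_type": "browser", "parameters": ["email", "password"]},
        "add_to_cart": {
            "action_type": "api",
            "api": {"method": "POST", "url": "/api/cart/add"},
        },
    },
}


def generate_python(spec: dict) -> str:
    """Emit a Python class with one stub method per action."""
    lines = ["class SiteClient:"]
    for name, body in spec["actions"].items():
        signature = ", ".join(["self"] + body.get("parameters", []))
        lines.append(f"    def {name}({signature}):")
        lines.append(f"        raise NotImplementedError('{body['action_type']} action')")
    return "\n".join(lines) + "\n"


def generate_typescript(spec: dict) -> str:
    """Emit a TypeScript interface with one method per action."""
    lines = ["export interface SiteClient {"]
    for name, body in spec["actions"].items():
        params = ", ".join(f"{p}: string" for p in body.get("parameters", []))
        lines.append(f"  {name}({params}): Promise<void>;")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Adding a new target means adding one entry here; the spec is untouched.
BACKENDS = {"python": generate_python, "typescript": generate_typescript}


def generate(spec: dict, target: str) -> str:
    """Dispatch to the backend registered for `target`."""
    return BACKENDS[target](spec)
```

Calling `generate(SPEC, "python")` and `generate(SPEC, "typescript")` produces two different files from the same input.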
Why I Think This Is Interesting
The thing I'm most excited about isn't actually the compiler.
It's the specification.
Imagine a future where multiple tools understand the same semantic website format.
Different compilers.
Different validators.
Different SDK generators.
Different execution engines.
All sharing the same representation.
That feels much more powerful than another browser automation library.
Benchmarks
I also wanted to avoid hand-wavy performance claims.
So the repository includes a public benchmark methodology measuring:
Token reduction
Execution speed
Reliability
Memory usage
Self-healing capability
Developer implementation effort
The goal wasn't to "win benchmarks."
The goal was to make every claim reproducible.
Is This A Browser Automation Replacement?
No.
Some websites simply don't expose reusable APIs.
Some authentication flows require browser interaction.
Some workflows are inherently visual.
Browser automation isn't going away.
The idea is to separate:
What an agent wants to do
from
How that action gets executed
Sometimes that's an API.
Sometimes it's a browser.
The interface stays the same.
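Here is one way that separation might be expressed in code. Treat it as an illustration rather than the generated SDK: the `Site` class and both executors are hypothetical, and the executors return a description string instead of touching a network or a browser.

```python
from typing import Callable, Dict

# An executor receives the action name, its spec entry, and the call arguments.
Executor = Callable[[str, dict, dict], str]


def api_executor(name: str, action: dict, args: dict) -> str:
    """Stand-in for an HTTP call; only describes what it would send."""
    api = action["api"]
    return f"api: {api['method']} {api['url']}"


def browser_executor(name: str, action: dict, args: dict) -> str:
    """Stand-in for browser automation; only describes what it would drive."""
    return f"browser: {name}({', '.join(sorted(args))})"


class Site:
    """Callers name the action; the spec decides how it runs."""

    def __init__(self, actions: Dict[str, dict], executors: Dict[str, Executor]):
        self._actions = actions
        self._executors = executors

    def perform(self, name: str, **args) -> str:
        action = self._actions[name]
        executor = self._executors[action["action_type"]]
        return executor(name, action, args)


site = Site(
    actions={
        "login": {"action_type": "browser", "parameters": ["email", "password"]},
        "add_to_cart": {
            "action_type": "api",
            "api": {"method": "POST", "url": "/api/cart/add"},
        },
    },
    executors={"api": api_executor, "browser": browser_executor},
)
```

The caller writes `site.perform("login", ...)` and `site.perform("add_to_cart")` the same way. If `login` later gains an API, only its spec entry changes.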
What's Next?
The roadmap currently includes:
More SDK targets (Go & Java)
Better API discovery
Plugin architecture
Visual regression support
Improved compatibility across modern web frameworks
But more importantly...
I'd like to hear what other developers think.
Am I solving the wrong problem?
Is there a better abstraction?
Could something like preflight.yaml actually become useful outside this project?
I'd genuinely love the discussion.
Links
🌐 Website
https://adityapdixit.me/shiny-fishstick/
⭐ GitHub
https://github.com/Hootsworth/shiny-fishstick
If nothing else, I hope the name made you curious enough to click.
