Bob Oner

Posted on May 29

AI-Assisted Development Is Not Autopilot

#ai #productivity #programming #webdev

AI can make coding faster. It can also make messy code faster.

That is the part of AI-assisted development that does not get discussed enough. A model can generate a route handler, a browser userscript, a README section, or a test idea in seconds. But speed is not the same as engineering progress. If the output is not scoped, tested, reviewed, and documented, the project may become harder to maintain even though it was faster to start.

My working rule is simple:

AI can draft, but engineering must decide.

I have been using this rule while building two small developer tools:

FastAPI CSV Quality API, a minimal FastAPI service that accepts CSV uploads and returns a structured data quality report.
ChatGPT Long Conversation Helper, a privacy-first Tampermonkey userscript for collapsing and navigating long ChatGPT conversations locally in the browser.

These are intentionally small projects. That is the point. Small projects are useful for learning where AI helps, where it needs limits, and how to turn generated drafts into reviewable engineering work.

This article is not a prompt guide. It is also not a benchmark, a productivity claim, or a recipe for replacing code review. It is a practical reflection on how I keep AI-assisted development useful without giving up scope control, testing, documentation, and human review.

AI can draft, but engineering must decide

I do not treat AI-generated code as finished code. I treat it as a draft that needs to pass through normal engineering gates.

For small tools, those gates do not need to be heavy. They can be simple:

Is the scope clear?
Is the interface small?
Is the behavior testable?
Are errors handled consistently?
Is the privacy boundary explicit?
Can another developer reproduce the project from the README?
Can I explain what the code does without relying on the original prompt?

That last question matters. If I cannot explain the code after reading it, I do not own the implementation yet.

In my workflow, AI is helpful during exploration. I may ask it to compare approaches, list edge cases, suggest a project structure, draft a README, or propose test cases. But I do not let it decide the final shape of the project. That decision belongs to the developer, because the developer is responsible for the behavior that gets published.

This is the difference between using AI as a drafting tool and treating AI as an autopilot.

Start with a small interface, not a large prompt

The biggest mistake I see in AI-assisted coding is starting with a large prompt that asks for an entire application.

That often produces code, but not necessarily a design.

For the CSV quality API, the useful boundary was not:

Build a complete data quality platform.

That would have been too broad. The useful boundary was much smaller:

A user uploads a CSV file. The API returns a structured JSON report.

That boundary made the project reviewable. It forced the implementation to answer concrete questions:

What is the endpoint?
What does the response model contain?
What happens with an empty file?
What happens with a non-CSV file?
What does a duplicate row count mean?
How should missing values be represented?
Which errors should be structured?

AI could help draft the FastAPI route and suggest pandas checks, but the important engineering work was defining the contract. Once the response shape was clear, the code had something to obey.
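As a rough sketch of what "defining the contract" can look like, the block below writes the response shape down as a type and makes three of the questions above into explicit decisions. It is not the project's actual code. The field names are borrowed from the simplified test shown later in this article, `QualityReport` and `analyze_csv_text` are hypothetical names, and the standard-library `csv` module stands in for pandas so the example runs without dependencies. In a FastAPI service the dataclass would more likely be a Pydantic response model.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualityReport:
    """Hypothetical response contract for the CSV quality report."""

    row_count: int
    missing_values_by_column: dict[str, int]
    duplicate_row_count: int


def analyze_csv_text(text: str) -> QualityReport:
    """Build a report from CSV text, treating the first row as the header."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    fieldnames = reader.fieldnames
    if not fieldnames:
        # Decision: an empty upload is an error, not an empty report.
        raise ValueError("empty CSV: no header row found")

    missing = {name: 0 for name in fieldnames}
    seen = set()
    row_count = 0
    duplicates = 0

    for row in reader:
        row_count += 1
        values = tuple((row.get(name) or "").strip() for name in fieldnames)
        # Decision: a value is missing when it is empty or whitespace only.
        for name, value in zip(fieldnames, values):
            if value == "":
                missing[name] += 1
        # Decision: a duplicate is a row that exactly repeats an earlier row.
        if values in seen:
            duplicates += 1
        else:
            seen.add(values)

    return QualityReport(row_count, missing, duplicates)


if __name__ == "__main__":
    print(asdict(analyze_csv_text("id,name\n1,Ann\n2,\n1,Ann\n")))
```

The comments marked "Decision" are the part a model cannot settle on its own. A different project could reasonably count the first occurrence of a repeated row as a duplicate too, or treat the string "NA" as missing.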

The browser userscript had a different kind of interface. It was not an HTTP API. It was a local UI boundary inside the browser.

The useful boundary was:

Add local collapse and expand controls to long conversation messages without sending, uploading, exporting, or scraping conversation content.

That boundary was just as important as an API contract. It prevented the project from drifting into a more sensitive tool. It also made implementation choices easier. The script could use DOM selectors, CSS, MutationObserver, and localStorage, but it should not use external requests, analytics, backend sync, or API calls.

In both projects, the small interface came before the implementation. That gave AI a box to work inside.

Tests turn AI output into reviewable code

AI-generated code becomes safer when it is forced to satisfy tests.

For the FastAPI CSV Quality API, automated tests were the main review tool. The tests were not only checking whether the app started. They were checking behavior that mattered to the API contract:

health endpoint behavior
valid CSV upload behavior
missing value reporting
duplicate row reporting
expected column validation
invalid file handling
empty upload handling

This matters because an API can look correct while silently changing its response shape. A field can be renamed. A ratio can be calculated differently. An error response can become inconsistent. Without tests, those changes are easy to miss.

A simplified test might look like this:

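def test_analyze_csv_returns_quality_report(client, csv_file):
    # `client` and `csv_file` are assumed to be pytest fixtures defined
    # elsewhere: a test client for the app and a file-like sample CSV.
    response = client.post(
        "/analyze",
        files={"file": ("sample.csv", csv_file, "text/csv")},
    )

    # Pin the response shape, not just the status code.
    assert response.status_code == 200
    data = response.json()
    assert "row_count" in data
    assert "missing_values_by_column" in data
    assert "duplicate_row_count" in data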

The exact test is less important than the habit. The test says: this is the behavior I expect, and future changes must respect it.
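The "silently renamed field" risk can be checked more strictly than with `in` assertions. The sketch below compares the full set of keys against the contract, so an extra or renamed field is reported as well as a missing one. The helper name `report_shape_problems` is hypothetical, and the expected field set contains only the three fields named in the test above; a real report may have more.

```python
# Assumed contract: only the three fields named in the article's test.
EXPECTED_REPORT_FIELDS = frozenset(
    {"row_count", "missing_values_by_column", "duplicate_row_count"}
)


def report_shape_problems(data: dict) -> list[str]:
    """Return readable differences between a payload and the contract."""
    present = set(data)
    problems = []
    for name in sorted(EXPECTED_REPORT_FIELDS - present):
        problems.append(f"missing field: {name}")
    for name in sorted(present - EXPECTED_REPORT_FIELDS):
        problems.append(f"unexpected field: {name}")
    return problems
```

Inside a pytest test this could be used as `assert report_shape_problems(response.json()) == []`, which prints the list of differences when the shape drifts.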

The userscript needed a different testing strategy. Browser UI behavior is harder to protect with a quick pytest suite, especially when the page is dynamic and not controlled by the project. So I used a manual test checklist instead.

That checklist covered installation, single-message collapse and expand, global controls, dynamic messages, refresh behavior, localStorage state, and privacy checks. It also included cases that are easy to forget: code blocks, long lines, Markdown tables, streaming replies, and messages added after the initial page load.

This is still testing. It is just the right level of testing for the project.

The point is not that every small tool needs a full CI pipeline. The point is that AI output needs a review mechanism. For an API, that mechanism may be automated tests. For a browser userscript, it may start with a disciplined manual checklist.

Logs are part of the review surface

Logs are often treated as an afterthought in small tools. In AI-assisted development, I think they matter more than usual, because the developer did not write every line by hand.

When a project uses generated code, logs help answer a basic question:

What is the code actually doing at runtime?

For the API project, logs are useful when checking upload handling, parsing failures, and unexpected errors. For a small FastAPI service, I do not need a complex observability stack. But I do need error paths that are visible and understandable.

For the userscript, console warnings are useful when selectors fail or expected message containers are not found. This is especially important because DOM-based tools depend on a page structure that may change. If the script silently stops working, debugging becomes frustrating. A small, clear warning is better than silent failure.

Logs should not leak sensitive content. That is especially important for a tool that runs on conversation pages. Logging message text would violate the project’s own privacy boundary. Logging a generic warning such as “message container not found” is enough.

Good logs do not make the project bigger. They make it easier to review.
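The same rule applies on the API side. Here is a minimal sketch, assuming uploads arrive as raw bytes and are expected to be UTF-8. It logs the size, the row count, and the failure reason, but never the file's contents. The logger name `csv_quality` and the function `parse_upload` are hypothetical, not taken from the project.

```python
import csv
import io
import logging

logger = logging.getLogger("csv_quality")  # hypothetical logger name


def parse_upload(raw: bytes) -> list[dict]:
    """Parse uploaded bytes as UTF-8 CSV, logging metadata but never content."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Log the size and where decoding failed, not the bytes themselves.
        logger.warning(
            "upload rejected: size=%d reason=not_utf8 offset=%d",
            len(raw),
            exc.start,
        )
        raise ValueError("file is not valid UTF-8 text") from exc

    rows = list(csv.DictReader(io.StringIO(text, newline="")))
    logger.info("upload parsed: size=%d rows=%d", len(raw), len(rows))
    return rows
```

The failure path is visible in the log and the caller still gets a clear exception, yet nothing a user uploaded ends up in a log file.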

Privacy and permission boundaries must be explicit

AI is good at suggesting features. That is exactly why the developer needs to be able to say no.

For the long conversation helper, it would be easy to add more features:

export conversations
summarize previous replies
sync state across devices
send content to an API
add search over all messages
collect usage analytics

Some of those features may be useful in other products, but they do not belong in this MVP.

The project is intentionally local. It modifies the browser view. It stores local UI state. It does not transmit conversation content. It does not call the ChatGPT API. It does not automate message sending. It does not export conversations.

That is not only a privacy statement. It is an engineering constraint.

Once the privacy boundary is explicit, every new feature can be reviewed against it:

Does this feature require external requests?
Does it store message text?
Does it touch cookies, tokens, or account data?
Does it turn a local UI helper into a data extraction tool?

If the answer to any of these questions is yes, the feature is out of scope.
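Part of that review can be turned into a cheap tripwire. The sketch below scans userscript source text for a few browser and userscript-manager APIs that would cross a local-only boundary. The deny-list and the function name `find_boundary_violations` are assumptions made for illustration. A text scan is only a heuristic: it can flag a harmless comment and it can miss obfuscated calls, so it supplements reading the code and does not replace it.

```python
import re

# Hypothetical deny-list; extend or trim it to match the real project.
FORBIDDEN_PATTERNS = {
    "network request": re.compile(
        r"\b(?:fetch\s*\(|XMLHttpRequest|GM_xmlhttpRequest|WebSocket|sendBeacon)"
    ),
    "cookie access": re.compile(r"\bdocument\.cookie\b"),
}


def find_boundary_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, label) pairs for lines that match the deny-list."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in FORBIDDEN_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings
```

Run against the script before each release, an empty result is weak evidence that the README's privacy claims still hold, and a non-empty one is a prompt to look at that line by hand.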

This is where AI needs the most supervision. A model may suggest a technically possible feature without understanding the product boundary. The developer must decide whether the feature should exist at all.

Ask AI for alternatives, not final decisions

I get better results when I ask AI for options rather than final answers.

For example, in the API project, AI can suggest several response model shapes. But I still need to choose the one that is easiest to understand, test, and document.

In the userscript project, AI can suggest several selector strategies. But selector choice requires judgment. Deep class-name chains may work today and break tomorrow. Shallow role-based selectors may be more stable, but they still need manual testing. There is no perfect answer, only a trade-off that should be documented.

The same applies to error handling, README structure, release notes, and limitations. AI can produce a draft quickly. The developer decides what is accurate.

A useful AI prompt is not:

Build the final solution.

A better prompt is:

Give me three implementation options, their risks, and how I should test each one.

That kind of prompt keeps the developer in control. It turns AI into an adviser that lays out options, not an autopilot.

Documentation is part of development

For small projects, documentation is often postponed until the code is done. I try to do the opposite.

A README is not only a marketing page. It is a reproducibility contract. It should tell a reader what the project does, what it does not do, how to run it, how to test it, and where the limitations are.

For the CSV API, documentation needed to explain the endpoint, the response fields, sample data, test commands, Docker usage, and screenshots. Without that, the project would be much harder to evaluate from the outside.

For the userscript, documentation needed to explain installation, privacy, manual testing, troubleshooting, limitations, and release scope. That documentation is part of the engineering work because the tool runs in a sensitive context: a user’s browser session.

AI is useful for documentation drafts. It can help organize sections and turn rough notes into readable text. But documentation still needs technical review.

If the README says the project does not send external requests, the code must support that statement. If the limitations section says selectors may break when the page changes, the implementation should be structured so selectors are easy to update.

Documentation should not make the project sound bigger than it is. Good documentation reduces uncertainty. It does not inflate the project.

Stop before the project becomes too big

AI makes scope creep easier.

Once the first version works, it is tempting to ask for more: a dashboard, a Chrome extension, cloud sync, user accounts, analytics, background jobs, advanced configuration, and so on.

For portfolio projects, that can be dangerous. A small finished tool is often more convincing than a large unfinished platform.

The CSV API did not need to become a full data quality platform. It needed to show a clean API boundary, structured output, meaningful checks, tests, Docker packaging, and documentation.

The conversation helper did not need to become a full browser extension or AI workspace. It needed to solve one local navigation problem with a privacy-first boundary.

Stopping is an engineering skill.

A clear MVP makes the project easier to review. It also makes the writing stronger. Instead of explaining a large incomplete system, I can explain the trade-offs behind a small complete one.

My lightweight AI-assisted development loop

After building these two small tools, I now prefer a simple loop:

Define the smallest useful interface.
Ask AI for options and risks.
Write or review the first implementation.
Add tests or a manual test checklist.
Add logs where debugging would otherwise be unclear.
Document usage, limitations, and non-goals.
Review the code against the original boundary.
Stop before the project becomes unnecessarily large.

This is not a heavy process. It is a lightweight review loop for small tools.

The order matters. I do not want to start with a large prompt and then search for a structure afterward. I want the structure first, then use AI inside that structure.

For the CSV API, that structure was the upload endpoint and response contract. For the userscript, it was the local-only browser interaction model and the privacy boundary. In both cases, the AI-assisted parts were useful because the project already had a small reviewable shape.

What I learned from two small tools

The two projects are different, but they taught me the same lesson:

AI-assisted development works best when the surrounding process is disciplined.

From the FastAPI CSV Quality API, I learned that AI is helpful for turning a rough script idea into an API draft. But the real value comes from defining the response contract and protecting it with tests.

From the ChatGPT Long Conversation Helper, I learned that AI is helpful for exploring DOM logic and browser APIs. But the real value comes from privacy boundaries, manual testing, selector judgment, and clear limitations.

In both cases, the workflow mattered more than the initial code generation.

Conclusion

AI-assisted development is not autopilot.

It is useful when it helps developers move faster through drafts, alternatives, edge cases, and documentation. It becomes risky when generated code bypasses scope control, testing, privacy review, and human judgment.

For me, the practical answer is not to avoid AI. It is to wrap AI inside an engineering workflow.

Small interfaces keep the project understandable. Tests protect behavior. Logs make runtime behavior visible. Documentation makes the project reproducible. Limitations prevent overclaiming. Human review keeps the final responsibility where it belongs.

AI can help write code.

Engineering decides what code is worth keeping.
