
guanjiawei

Posted on • Originally published at guanjiawei.ai

AI Is Not a Wishing Well: Two Things I Recently Couldn't Solve

Previously when talking about AI coding, most of the discussion was about what it can do and how beautifully it does it. Today I'll look at the other side of the coin and record two things I recently couldn't solve: one barely made it to the finish line, the other was shelved outright.

1. Claude Code Couldn't Fix Its Own Plugin

Claude Code has an official plugin for Chrome that lets the agent directly control the browser. I rely on this feature quite a bit—for experiments, research, and inspecting visual interactions.

Recently it suddenly stopped connecting. It failed every time I tried to launch it.

My first reaction was—doesn't this belong to the same family of products? Just let it fix itself. Opus 4.6, effort cranked up to high, let it figure something out.

And so began three to four hours of spinning in circles.

Every so often it would tell me "I've discovered an important clue," modify a bunch of code, then tell me "at this point you need to restart the session." I'd restart, open it up, still broken. Another round of hypotheses, another round of changes, another round of "this restart should do it." At one point it even asked me to feed back the browser plugin's console logs—once I did, it started circling around a few irrelevant warnings in the logs, drifting further and further off course.

I started finding it a bit amusing. It was fixing its own CLI and its own plugin—a pure software problem with no external variables. In theory, this is exactly the scenario it should excel at. But it just kept going in circles, each loop looking much like the last.

Finally I stopped it once and said, don't keep guessing blindly on your own—go search GitHub to see if anyone else has run into the same problem. It looked around and quickly found the answer: both Claude Desktop and Claude Code's CLI had registered a native messaging host under the same extension ID; Desktop won, and the CLI could never get the connection back. Following the community's approach, I wrapped it up in twenty minutes.
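The collision it found is a known failure mode of Chrome's native messaging: each host is registered via a small JSON manifest (with a `name`, a `path` to the host binary, and `allowed_origins` listing extension IDs), and when two installers register hosts serving the same extension ID but pointing at different binaries, only one wins. Below is a minimal diagnostic sketch of that pattern. The manifest directory paths come from Chrome's documentation; the host names in the usage example are made up, and this illustrates the collision, not the actual community fix:

```python
import json
from pathlib import Path
from collections import defaultdict

# Typical per-user Chrome native-messaging manifest directories
# (per Chrome's native messaging documentation).
MANIFEST_DIRS = [
    Path.home() / ".config/google-chrome/NativeMessagingHosts",  # Linux
    Path.home() / "Library/Application Support/Google/Chrome/NativeMessagingHosts",  # macOS
]

def find_conflicts(manifests):
    """Flag extension IDs claimed by hosts that point at different binaries.

    `manifests` maps a host name to its parsed manifest dict.
    Returns {extension_id: [(host_name, binary_path), ...]} for conflicts only.
    """
    by_ext = defaultdict(list)
    for name, m in manifests.items():
        for origin in m.get("allowed_origins", []):
            # Origins look like "chrome-extension://<id>/"
            ext_id = origin.rstrip("/").rsplit("/", 1)[-1]
            by_ext[ext_id].append((name, m.get("path", "")))
    return {
        ext: hosts
        for ext, hosts in by_ext.items()
        if len({path for _, path in hosts}) > 1  # more than one distinct binary
    }

def load_manifests():
    """Read every host manifest Chrome would see for this user."""
    manifests = {}
    for d in MANIFEST_DIRS:
        if not d.is_dir():
            continue
        for f in d.glob("*.json"):
            try:
                manifests[f.stem] = json.loads(f.read_text())
            except (OSError, json.JSONDecodeError):
                continue
    return manifests

if __name__ == "__main__":
    for ext, hosts in find_conflicts(load_manifests()).items():
        print(f"extension {ext} is claimed by multiple hosts:")
        for name, path in hosts:
            print(f"  {name} -> {path}")
```

Run against a setup like the one described, this would show two host entries (hypothetically, one from the desktop app and one from the CLI) both claiming the same extension ID, which is exactly the symptom the agent spent hours hypothesizing around and GitHub issues surfaced in minutes.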

What surprised me about this wasn't that "it couldn't solve it"—that's perfectly normal. It's that from start to finish, it never considered that "this problem might not be solvable behind closed doors." It assumed by default that it could derive the answer from first principles, and when it couldn't, it just had me restart. If I hadn't proactively told it to search the community, it probably would have burned another half day of my time.

I later thought about why. An agent is isolated within a session, and it doesn't easily admit to itself that it's stuck. People do: when we hit the same pitfall repeatedly, our first instinct is to Google it, check the issues, or ask a colleague. With AI, you have to nudge it.

2. Getting Soft-Throttled on Xiaohongshu

The other matter was even more straightforward: completely unsolvable, and I'm not planning to continue.

For the past few weeks I've been wanting to redistribute existing blog posts to Xiaohongshu (a Chinese lifestyle content platform). I manually posted a few myself, and the traffic was zero. So I thought, why not have Claude Code help me design an experiment—carefully running through topic selection, titles, cover images, and posting rhythm. It had to be more systematic than me just posting blindly. The result was still zero.

After repeated investigation, there was basically only one reason: new accounts get soft-throttled. The platform doesn't tell you there's a problem with your account; it just ruthlessly suppresses your exposure to under one or two hundred views, and you can't even be found through search. You think you're publishing content, but you're actually talking to an invisible wall.

There's logic to this. Xiaohongshu has low barriers for registration and posting, so bots and marketing accounts are a perennial headache. The 2026 "Community Guidelines 2.0" also added a new throttling rule targeting pure AI-generated unlabeled content. From the platform's perspective, conservative traffic distribution during a new account's cold-start phase is reasonable self-protection. But for someone like me who just wants to redistribute existing thoughts, it's an insurmountable wall—unless you nurture the account, unless you play by their rhythm and take it slow.

I didn't keep fussing with it. The ROI was too poor. At the same time, it reaffirmed something I'd already known deep down:

Starting from a personal website was the right call.

A website exists in the open internet environment; nobody there throttles me. If an article is somewhat interesting, friends will read and share it, and occasionally strangers will even email me after reading it. The articles accumulated over the past half year have become a repository of ideas that can be distributed in different directions—a piece I moved to Zhihu recently did quite well, with many people bookmarking and discussing it. That was a side benefit of redistribution, not something written specifically for that platform from the start.

Platforms like Xiaohongshu and Douyin are a different game entirely. You're not writing in an open internet; you're competing in a closed arena where the recommendation algorithm holds the power to judge. The rules are set by them, and the rules are extremely unfriendly to new players. If you have a choice, establishing your own territory first before going to fight monsters on someone else's field is a much healthier order than the reverse.

Failure Is the Main Theme of Work

I'm telling these two examples not to say AI doesn't work. Quite the opposite.

It's easy to think of AI as some kind of miracle worker: you give it a task and it just snaps out a finished product. That expectation is more like making a wish than using a tool. Real work has never been like that. In real work, most of your time is spent failing. You try one direction, it doesn't work; you switch to another angle, still doesn't work; after a couple of loops you occasionally stumble upon something that actually runs. Success is the exception; failure is the daily routine.

AI hasn't changed this fundamental nature. What it changes are two things:

  • The surface area of attempts has widened. In the same amount of time, I can run three or four parallel threads to try things out, even if every single one might fail.
  • Feedback has gotten faster. Without Claude Code, that Chrome plugin issue might have had me stuck on the initial wrong hypothesis all evening. Instead, it helped me walk through several possibilities within a few hours, forcing me toward the realization that "this isn't a problem I should be trying to solve in isolation."

In other words, AI isn't a success machine—it's a failure accelerator. It doesn't sound as sexy, but that's precisely where it's most valuable. The cycle of failure shrinks from days to hours. Over the course of a year, the number of experiments you run is ten times or more what it used to be. With more samples, you naturally have better odds of stumbling into success.

Model Progress Over the Past Three Months Is Worth Noting Too

These hands-on impressions are inseparable from the pace of model iteration over the past three months. My timeline looks like this:

  • Late December 2025: Started using Claude Code seriously, Sonnet 4.5, but stopped after a few days due to some account issues.
  • Late January 2026: Moonshot released Kimi K2.5, and I temporarily switched over. Surprisingly good—subjectively better than Codex 5.2 at the time. I used it for a few small projects; expectations weren't high, but they mostly worked out.
  • Early February: Anthropic released Opus 4.6. There was a small demo I'd been wrestling with using K2.5 for two or three days, always getting stuck on the final step; switching to Opus 4.6, it worked on the first try. It's in moments like these that you viscerally feel that so-called "model improvement" isn't about a few more points on some benchmark—it's about things you previously couldn't solve suddenly becoming solvable.
  • Mid-to-late February: GLM-5 and MiniMax M2.5 came out one after another. I used them for a stretch while traveling on business—decent value for the cost.
  • Early March: OpenAI released GPT-5.4 and simultaneously updated Codex. There was a small plugin problem that Opus 4.6 had been circling for ages; I threw it at the new Codex, let it run for three hours, and it was solved.
  • Now (mid-April): The next generation of models is already on the way.

The overall rhythm gives me the feeling that the problems you're stuck on today often stop being problems after a month or two, once the next generation of models arrives. This isn't to say you should sit back and wait for the next model; it's to say don't get discouraged and give up just because you can't solve something today. You can let it run longer, or set it aside for a while and come back later.

In Closing

What I want to get across is actually a more honest set of expectations.

If you expect AI to transform your work from "mostly failure" to "mostly success," you'll be disappointed. No tool can do that. The fundamental nature of work is failure.

But if you treat it as a partner that lets you fail faster while testing multiple threads in parallel, you'll be much more at peace with it. It has limitations, and so do you, but together you can accomplish far more than either could alone.

The truly valuable thing about AI is that it accelerates failure. It sounds counterintuitive, but use it for a while and you'll understand.


Original article: https://guanjiawei.ai/en/blog/ai-coding-failures
