<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: mayf3</title>
    <description>The latest articles on DEV Community by mayf3 (@mayf3).</description>
    <link>https://dev.to/mayf3</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3879563%2F935a122d-89af-4d9d-8eb6-e91fa06bce79.png</url>
      <title>DEV Community: mayf3</title>
      <link>https://dev.to/mayf3</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mayf3"/>
    <language>en</language>
    <item>
      <title>I Wrote Code for 17 Years, Then AI Came</title>
      <dc:creator>mayf3</dc:creator>
      <pubDate>Sat, 16 May 2026 21:42:51 +0000</pubDate>
      <link>https://dev.to/mayf3/i-wrote-code-for-17-years-then-ai-came-5e3d</link>
      <guid>https://dev.to/mayf3/i-wrote-code-for-17-years-then-ai-came-5e3d</guid>
      <description>&lt;p&gt;I've been writing code for 17 years, and recently, for the first time, I felt with real clarity that I might not make writing code my core competitive advantage anymore.&lt;/p&gt;

&lt;p&gt;AI has been changing too fast over the past few years. At first I treated it as a smarter search box. Then I found it could write code, analyze problems, call tools, even complete entire tasks on its own. For me, this wasn't just exciting tech news — it changed how I work every single day.&lt;/p&gt;

&lt;p&gt;I went through a few very clear shifts in how I saw things.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First time: From GPT to GPT-4 (2023 to 2024).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My first impression was that it had common sense now. It could understand context. Things that used to require accumulated experience and searching through documentation suddenly got faster. For example, I helped my wife modify her R program relying entirely on DeepSeek and GPT — I'd never written R myself, but I still got it done. But it still struggled with complex code. Back then I didn't think it would replace programmers at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second time: DeepSeek R1 (early 2025).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was a bigger shift. What I saw wasn't an AI that "had common sense" — it was an AI that could think.&lt;/p&gt;

&lt;p&gt;I once gave it a fairly complex regular expression to work out. It didn't just spit out an answer. It would try, get it wrong, backtrack, try a different approach, and keep going. And it actually solved it.&lt;/p&gt;

&lt;p&gt;In that moment I had one feeling: wow, this can actually solve hard problems. The leap from "has common sense" to "can think" was far more important than solving one regex. I didn't feel like I was about to be replaced. What hit me was how fast it was progressing — fast enough that I couldn't imagine what it'd be like in another year or two.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third time: Agents got hands (September 2025).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This time it wasn't about being smarter — it was about being able to act. I could hand a task to it, and it would go run commands on a server, read logs, modify code, debug, then come back and tell me the results.&lt;/p&gt;

&lt;p&gt;It went from "I write code, AI assists me" to "I hand the task off and let it do the work." But it was still far from reliable. Anything slightly long-running needed constant supervision: supplementing prompts, adding context, making it write a plan first, then steering it back when it drifted off track.&lt;/p&gt;

&lt;p&gt;Even so, I already sensed something: people might really be able to write less and less code themselves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fourth time: Claude Opus 4.6 + skills and Agent memory (late 2025 to 2026).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After Anthropic released Claude Opus 4.6, I felt that large language models had crossed a singularity. It already behaved very much like a senior engineer: it could understand complex intent and maintain direction across longer contexts. Add skills (essentially scripted prompts that codify fixed steps and experience) and Agents with long-term memory, and it was no longer a smart but forgetful assistant. It had experience, tools, and memory.&lt;/p&gt;

&lt;p&gt;What really unsettled me was one time when I was driving. I was thinking about all these changes the whole way, and suddenly a thought popped into my head: I might really not be making a living by writing code anymore.&lt;/p&gt;

&lt;p&gt;That moment wasn't excitement, and it wasn't anxiety. It was more like my heart missed a beat.&lt;/p&gt;

&lt;p&gt;Because what I thought about wasn't one particular scenario, not "will I be replaced." It was something more vague — I suddenly realized that the craft I'd spent 17 years honing might no longer be the scarcest thing around. Not useless, just not enough.&lt;/p&gt;

&lt;p&gt;So my approach has changed. Before, when I ran into a problem, my first instinct was to open the editor. Now my first instinct is to think the problem through clearly, then hand it to AI. I spend more time on "should this even be done" and "how to break it into steps for AI" than on "how to implement it."&lt;/p&gt;

&lt;p&gt;Honestly, I haven't fully adapted to this shift yet. Sometimes AI writes code and I can see at a glance how to make it better, but I hold back. Not out of laziness — I've found that when I spend my time on "being clear about what I want" rather than "writing it myself," the results are actually better. Once you've experienced that, there's no going back.&lt;/p&gt;

&lt;p&gt;After writing code for all these years, the most honest thing I can say comes down to one thing: I can't keep treating coding as my moat. Things have to change.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How My Shopping Assistant Grew Step by Step</title>
      <dc:creator>mayf3</dc:creator>
      <pubDate>Mon, 04 May 2026 05:41:37 +0000</pubDate>
      <link>https://dev.to/mayf3/how-my-shopping-assistant-grew-step-by-step-4ckh</link>
      <guid>https://dev.to/mayf3/how-my-shopping-assistant-grew-step-by-step-4ckh</guid>
      <description>&lt;p&gt;Since I started using OpenClaw, I've been tinkering with various workflows — except when I'm too busy or tired from work. Recently, I've wanted to share some of my experiences from time to time.&lt;/p&gt;

&lt;p&gt;This time, the project is: a shopping assistant that started as a simple shopping list and gradually grew into a tool that can search Xiaohongshu (Little Red Book), compare prices, and add items to the shopping cart.&lt;/p&gt;




&lt;h2&gt;More and More Things to Buy&lt;/h2&gt;

&lt;p&gt;Recently, there's been a lot to buy for the household — children's picture books, microscopes, roller skates, and various daily necessities. Every time I need to buy something, I have to switch between Taobao, JD.com, and Xiaohongshu to compare prices, check reviews, and pick the right model. It's quite tedious.&lt;/p&gt;

&lt;p&gt;I had previously built an Agent called the "Shopping List Manager" to help me keep track of things I wanted to buy. Whenever I thought of something, I'd just tell it, and it would categorize and organize everything — basically a shopping memo.&lt;/p&gt;

&lt;p&gt;But a memo is just a starting point. Once you have a basic function, you can't help but wonder — can it check prices for me? Can it look up reviews on Xiaohongshu? Can it add items directly to my shopping cart?&lt;/p&gt;




&lt;h2&gt;First Evolution: From Memo to Auto Price Comparison&lt;/h2&gt;

&lt;p&gt;After using the shopping memo for a while, I started feeling that just recording things wasn't enough. I knew what I wanted to buy, but I still had to look up prices myself.&lt;/p&gt;

&lt;p&gt;Then one day, my Skill Researcher (an Agent that automatically searches for new Skills every day) found a price comparison Skill. After installing it, it could search prices across Taobao, JD.com, and Pinduoduo, telling me roughly what options were available and what price range to expect.&lt;/p&gt;

&lt;p&gt;With price comparison, the shopping assistant evolved from "only recording, no searching" to "recording and searching."&lt;/p&gt;




&lt;h2&gt;Second Evolution: Bringing Xiaohongshu Into the Mix&lt;/h2&gt;

&lt;p&gt;I had price comparison, but I mainly relied on Xiaohongshu for reviews and recommendations. Just knowing the price range wasn't enough — I also wanted to know "which brand is good" and "what pitfalls others have encountered."&lt;/p&gt;

&lt;p&gt;There was no ready-made Skill for Xiaohongshu, but I was already using OpenClaw's browser Skill. Through the Chrome DevTools Protocol (CDP, essentially the browser's debugging port, which Brave supports as a Chromium-based browser), the Agent can open a browser and operate web pages like a human. I had the Agent use this capability to search Xiaohongshu and read what users were saying.&lt;/p&gt;
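&lt;p&gt;For reference, the "debugging port" setup described above amounts to launching the browser with its Chrome DevTools Protocol (CDP) endpoint exposed. A minimal sketch, assuming a typical port number and a dedicated profile directory (both are my choices for illustration, not OpenClaw defaults):&lt;/p&gt;

```shell
# Launch Brave with its CDP endpoint exposed.
# --remote-debugging-port opens the debugging port a CDP client can attach to;
# --user-data-dir points at a dedicated profile, so the agent's session
# doesn't collide with your everyday browsing profile.
brave-browser \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/.config/brave-agent-profile"

# Any CDP client can then attach; a quick way to check the endpoint is up:
curl http://localhost:9222/json/version
```

&lt;p&gt;The flags are standard Chromium launch options, so the same setup works for Chrome too; the point of a separate profile is exactly the "stop sabotaging each other" problem described later.&lt;/p&gt;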

&lt;p&gt;Later, I used the Skill Creator to slowly refine a Xiaohongshu search Skill specifically for search and content extraction. It wasn't written in one go — it went through multiple rounds of debugging, trial and error, and revisions before it stabilized.&lt;/p&gt;

&lt;p&gt;At first, the Agent actually fooled me once. It said "I checked for you" and gave detailed recommendations that looked quite convincing. But when I had it reopen Xiaohongshu and search for actual posts, I realized the previous recommendations were largely fabricated.&lt;/p&gt;

&lt;p&gt;I also found issues with search sorting. Xiaohongshu defaults to sorting by likes, but highly liked posts aren't necessarily useful — many are fluff pieces or ads. The truly valuable information is often in the comments. Later, I required it to sort by comment count — posts with more comments have a higher probability of being genuine user sharing.&lt;/p&gt;

&lt;p&gt;To prevent the Agent from making things up, I added a rule: recommendations must include evidence — which post said what, and what the post link is. From the very first time I had it search Xiaohongshu, I required it to include data sources. Over time, this rule became increasingly strict, evolving from simple screenshots to complete post links and citations.&lt;/p&gt;
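&lt;p&gt;The two rules above, sorting by comment count and attaching a source link to every recommendation, can be sketched in a few lines. This is a hypothetical illustration: the post fields, the sample data, and the function names are my assumptions, not Xiaohongshu's actual API or OpenClaw's Skill format.&lt;/p&gt;

```python
# Hypothetical post records, as a Xiaohongshu search skill might extract them.
posts = [
    {"title": "Best picture books for 3-year-olds", "likes": 9200,  "comments": 41,  "url": "https://example.com/p1"},
    {"title": "Sponsored: new book box!",           "likes": 15000, "comments": 3,   "url": "https://example.com/p2"},
    {"title": "Honest review after 6 months",       "likes": 800,   "comments": 127, "url": "https://example.com/p3"},
]

def rank_posts(posts):
    """Sort by comment count, descending: comment-heavy posts are more
    likely to be genuine user sharing than high-like fluff or ads."""
    return sorted(posts, key=lambda p: p["comments"], reverse=True)

def cite(post):
    """Every recommendation must carry its evidence: which post, and its link."""
    return f'{post["title"]} ({post["url"]})'

ranked = rank_posts(posts)
recommendations = [cite(p) for p in ranked]
print(recommendations[0])  # the most-commented post, with its source link
```

&lt;p&gt;Note how the highest-liked post (the likely ad) ends up last once comments, not likes, drive the ranking.&lt;/p&gt;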




&lt;h2&gt;Third Evolution: From Searching to Buying — Adding Items to Cart&lt;/h2&gt;

&lt;p&gt;Once when buying picture books, I really didn't feel like opening Taobao to search for each one and add them to the cart individually. I thought, can the Agent do this step for me?&lt;/p&gt;

&lt;p&gt;At first, the Agent actually refused, saying it was unsafe and not recommended to directly operate shopping websites. I said adding to the cart is fine — I'll handle the payment. So I used the browser Skill again to create a Taobao shopping cart Skill. The Agent opens Taobao's website to search for products, compares prices and specifications in the search results, and adds suitable ones to the cart. The principle is the same as Xiaohongshu — all done through the browser Skill, just targeting a different website. After a few tries, it actually worked.&lt;/p&gt;

&lt;p&gt;From memo to price comparison, to Xiaohongshu search, to Taobao cart — capabilities were connected one by one. First it could record, then search, then verify, and finally take action.&lt;/p&gt;




&lt;h2&gt;In Practice: Buying Picture Books&lt;/h2&gt;

&lt;p&gt;Buying picture books was the most complete workflow.&lt;/p&gt;

&lt;p&gt;I didn't jump straight to having it search Taobao. Instead, I first used Gemini for a round of deep research: I told it my child's age and interests and had it do thorough research to recommend suitable picture books. After getting the recommendations, I passed the results to the shopping assistant to check editions, prices, and parent reviews.&lt;/p&gt;

&lt;p&gt;The most crucial part was Xiaohongshu verification: sorting by comments, checking which recommendations were genuine, which looked like ads, and whether parents were complaining in the comments. Finally, it organized the recommendations into batches — what to buy first, what to buy later — and I had it add them directly to the Taobao cart. I checked the cart on my phone, the prices looked good, so I paid.&lt;/p&gt;

&lt;p&gt;Individually, none of these steps are remarkable. But connected together, I no longer have to bounce between multiple apps.&lt;/p&gt;




&lt;h2&gt;Other Shopping Experiences&lt;/h2&gt;

&lt;p&gt;For the microscope, the Agent helped with research and recommendations, but I ended up buying a different model — its recommendations gave me a reference direction, and the final decision was based on my own judgment.&lt;/p&gt;

&lt;p&gt;The roller skates were bought earlier than the picture books, before the cart-adding capability existed. The Agent helped search for brand recommendations on Xiaohongshu, and then I bought them myself on Taobao.&lt;/p&gt;

&lt;p&gt;These three shopping experiences happened to fall into three stages: for the roller skates, the assistant could only search Xiaohongshu; for the microscope, it could search and recommend; and the picture books went through the full chain from research to order. Each evolution emerged naturally through actual use.&lt;/p&gt;




&lt;p&gt;It started as just a shopping list, and gradually it learned to search, filter, verify, and add to cart. I originally just wanted to be a bit lazy, but the lazier I got, the more it could do — and in the end, it really did help me switch between fewer apps. I'll write a separate post about how to build it from scratch when I have time.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>sideprojects</category>
    </item>
    <item>
      <title>Teaching an Agent to Generate Its Own Avatar with Gemini</title>
      <dc:creator>mayf3</dc:creator>
      <pubDate>Wed, 29 Apr 2026 15:09:27 +0000</pubDate>
      <link>https://dev.to/mayf3/teaching-an-agent-to-generate-its-own-avatar-with-gemini-2lnl</link>
      <guid>https://dev.to/mayf3/teaching-an-agent-to-generate-its-own-avatar-with-gemini-2lnl</guid>
<description>&lt;h1&gt;Teaching an Agent to Generate Its Own Avatar with Gemini&lt;/h1&gt;

&lt;p&gt;Ever since I started using OpenClaw, I've been tinkering with it in all sorts of ways — except when work gets busy or I'm just too tired. Recently I decided to start sharing some of these experiences from time to time.&lt;/p&gt;

&lt;p&gt;This time, what I was tinkering with was: having an image generation specialist agent open a browser, connect to Gemini to generate images, and then having another agent (the HR manager) call the Feishu API to set those images as group chat avatars. Two agents, each doing their own thing — one draws, one swaps. The whole process runs on its own. I just need to see the results. Sounds simple enough, right? It actually took me several days to get it working.&lt;/p&gt;

&lt;h2&gt;Why Bother&lt;/h2&gt;

&lt;p&gt;I have a bunch of Feishu group chats, each one tied to a different agent — there's an image generation specialist, a 3D printing expert, an HR manager, and Xiao Bo who writes blogs. None of these groups had avatars, so they all looked identical. Hard to tell apart, and honestly pretty ugly. I wanted to change their avatars, but there were too many groups to do it one by one. So I figured, let the agents change their own avatars. I have a Gemini subscription, so I'd just use its image generation feature.&lt;/p&gt;

&lt;h2&gt;The Browser Was the First Hurdle&lt;/h2&gt;

&lt;p&gt;To let the image generation specialist agent use Gemini for image creation, I first needed it to be able to operate a browser. I'd been using Chrome, but the agent was opening the same Chrome instance I use daily, and we kept getting in each other's way. Sometimes the agent hadn't finished its task yet and I'd accidentally close the window; sometimes I'd be looking something up and the agent would close my tab. We were constantly sabotaging each other.&lt;/p&gt;

&lt;p&gt;Later I searched the community to see how others handled this, and some people mentioned Brave. Same Chromium engine as Chrome, open source, not much difference in functionality. So I set it up so the agent would only use Brave while I stick with Chrome — no more accidental window and tab closures. But just switching browsers wasn't enough. I also had to configure some port settings so the agent could connect and take control. That configuration process took several attempts. The agent would close the browser on its own, use the wrong profile — it took multiple rounds of back and forth to get it fully sorted out.&lt;/p&gt;

&lt;p&gt;It's like teaching a new intern how to use the company computer. You can't just say "here's a computer" and call it done. You have to teach them not to shut it down randomly, not to unplug the ethernet cable, not to close the work windows.&lt;/p&gt;

&lt;h2&gt;The Agent Operating the Browser Was the Real Nightmare&lt;/h2&gt;

&lt;p&gt;With the browser sorted, I started having the image generation specialist agent use it to generate images through Gemini. The very first attempt was a complete disaster — it couldn't even find the "generate image" button.&lt;/p&gt;

&lt;p&gt;Once we got past the button issue, it started downloading the wrong images. Gemini's page keeps the previous generation results, and the agent couldn't tell which one was new and which was old. It would confidently hand in the old image like it nailed it.&lt;/p&gt;

&lt;p&gt;After two or three rounds of tinkering, it could finally grab the correct image. The whole process was: every time it got it wrong, I'd tell it where it went wrong, and when it got it right, I'd update the correct approach into its skill file so it wouldn't make the same mistake again.&lt;/p&gt;

&lt;p&gt;It's like teaching a kid — you have to repeat yourself over and over until they remember.&lt;/p&gt;

&lt;h2&gt;Running It for Real&lt;/h2&gt;

&lt;p&gt;After the first success, I set up a scheduled task for the HR agent: starting at 11 PM every night, change one group's avatar per hour. (The GLM plan has a 5-hour daily quota, so I usually have agents run tasks late at night to avoid interfering with daytime work.) But reality wasn't so rosy. The HR agent would periodically go haywire — instead of changing the avatar, it would just post a message into the group chat. I wouldn't discover this until the next day; then I'd have it fix the avatar while updating its skill file to record this error pattern.&lt;/p&gt;

&lt;p&gt;The actual workflow turned out to be more complex than I imagined: the HR agent first scans to see who still hasn't changed their avatar, then sends the task to the image generation specialist. But the HR agent doesn't wait for the specialist to finish drawing — instead, it picks up the previous round's avatar during the next polling cycle.&lt;/p&gt;
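&lt;p&gt;That non-blocking handoff can be sketched with in-memory queues standing in for the two agents and the Feishu API. Everything here is hypothetical (the names, the data structures, the synchronous "draw" call); it only shows the shape of the polling cycle, where results are applied one round late:&lt;/p&gt;

```python
from collections import deque

# Hypothetical in-memory stand-ins for the two agents' shared state.
groups_without_avatar = deque(["3d-printing", "hr", "blog"])
pending = deque()   # groups dispatched to the image specialist, not yet applied
finished = {}       # group -> avatar file, filled in by the specialist

def specialist_draw(group):
    """Stand-in for the image specialist generating an avatar for a group."""
    finished[group] = f"{group}-avatar.png"

def hr_poll():
    """One polling cycle of the HR agent: apply any avatar finished in a
    previous round, then dispatch the next group without waiting on it."""
    applied = None
    if pending and pending[0] in finished:
        group = pending.popleft()
        applied = (group, finished.pop(group))  # set_group_avatar(...) in real life
    if groups_without_avatar:
        group = groups_without_avatar.popleft()
        pending.append(group)
        specialist_draw(group)                  # asynchronous in real life
    return applied

# The first cycle only dispatches; later cycles pick up earlier results.
print(hr_poll())  # None
print(hr_poll())  # ('3d-printing', '3d-printing-avatar.png')
```

&lt;p&gt;The one-round lag in this sketch is exactly the behavior described above: the HR agent never blocks on the specialist, it just collects whatever finished since its last pass.&lt;/p&gt;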

&lt;p&gt;After repeated corrections, the success rate of this workflow visibly improved, but still fell short of expectations. Basically nothing works perfectly on the first try — it all requires ongoing training.&lt;/p&gt;

&lt;h2&gt;In the End&lt;/h2&gt;

&lt;p&gt;Agents aren't written in one shot. They're taught, little by little.&lt;/p&gt;

&lt;p&gt;This whole thing doesn't seem like much — just swapping a few avatars. At least now those groups don't look as ugly. But watching a lobster that knew nothing slowly get smarter is its own reward: at first it's frustrating enough to make you want to curse, then you slowly feel a sense of accomplishment as it learns. If you have patience, it's actually pretty fun. If you don't, maybe skip this kind of tinkering.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>automation</category>
      <category>gemini</category>
      <category>openclaw</category>
    </item>
  </channel>
</rss>
