DEV Community

Ramagiri Tharun

My AI Agent Hit a Duplicate Post Error. Here Is the Engineering Lesson.

This morning my autonomous content system tried to publish content that LinkedIn considered a duplicate.

LinkedIn rejected it with an HTTP 422 response.

That sounds like a small API failure. It is actually one of the most important lessons in autonomous agent design.

An agent that can create content but cannot remember what it already shipped is not autonomous. It is just fast.

What failed

The system had the right pieces:

  • A scheduler
  • A content generator
  • A LinkedIn posting path
  • A Dev.to posting path
  • A JSON log of previous publishing attempts

But the duplicate rejection exposed a sequencing problem.

The content engine was treating logs as a reporting layer instead of a generation constraint.

That is backwards.

Post history should be loaded before generation, not only recorded after publishing.
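A minimal sketch of that ordering, where history is read first and then handed to the generator as a constraint. The log layout (a JSON array of objects with a `text` key) and the names `load_recent_posts` and `build_prompt` are assumptions for illustration, not the actual system's code.

```python
import json
from pathlib import Path


def load_recent_posts(log_path: str, limit: int = 20) -> list[str]:
    """Return the text of the most recent logged posts, oldest first.

    Assumes the log is a JSON array of objects with a "text" key.
    A missing log file is treated as an empty history.
    """
    path = Path(log_path)
    if not path.exists() or limit <= 0:
        return []
    entries = json.loads(path.read_text(encoding="utf-8"))
    # Skip entries without text, such as failed attempts that stored only a status.
    texts = [e["text"] for e in entries if isinstance(e, dict) and e.get("text")]
    return texts[-limit:]


def build_prompt(topic: str, history: list[str]) -> str:
    """Put prior posts into the generation prompt as an explicit constraint."""
    lines = [f"Write a post about: {topic}"]
    if history:
        lines.append("Do not repeat the ideas in these earlier posts:")
        lines.extend(f"- {text}" for text in history)
    return "\n".join(lines)
```

The design point is only the call order: `load_recent_posts` runs before any text is generated, so the log shapes the draft instead of merely describing it afterwards.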

The real rule

For autonomous publishing, idempotency matters more than creativity.

Before an agent posts, it should answer these questions:

  1. Have I already said this?
  2. Is this materially different from the last few posts?
  3. Is this grounded in a real event, metric, build, or failure?
  4. Can I explain why this is useful to the reader?
  5. Is the post safe to publish without human cleanup?

If the answer to the first question is yes, or the answer to any of the other four is no, the agent should not post.
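The five questions can be written down as an explicit gate. This is a sketch only: the `PrePublishCheck` type and its field names are hypothetical, and how each answer is computed (similarity check, human review, a second model) is left open. Note that question 1 is the one where a "yes" blocks the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrePublishCheck:
    """Answers to the five pre-publish questions (hypothetical field names)."""
    already_said: bool            # question 1: True blocks the post
    materially_different: bool    # question 2
    grounded_in_real_event: bool  # question 3
    useful_to_reader: bool        # question 4
    safe_without_cleanup: bool    # question 5


def blocking_reasons(check: PrePublishCheck) -> list[str]:
    """List every reason the post should be held back; empty means clear."""
    reasons = []
    if check.already_said:
        reasons.append("already said")
    if not check.materially_different:
        reasons.append("not materially different from recent posts")
    if not check.grounded_in_real_event:
        reasons.append("not grounded in a real event, metric, build, or failure")
    if not check.useful_to_reader:
        reasons.append("no clear value to the reader")
    if not check.safe_without_cleanup:
        reasons.append("needs human cleanup")
    return reasons


def may_publish(check: PrePublishCheck) -> bool:
    """The agent posts only when no question produced a blocking reason."""
    return not blocking_reasons(check)
```

Returning the reasons rather than a bare boolean also gives the log something readable to record when a post is held back.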

A simple duplicate guard

Here is the kind of boring code that makes agents more reliable:

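from difflib import SequenceMatcher


def too_similar(new_post: str, previous_posts: list[str], threshold: float = 0.82) -> bool:
    """Return True if new_post closely matches any previously published post."""
    new_text = new_post.lower()
    for old_post in previous_posts:
        # ratio() is a character-level similarity in [0, 1]; 1.0 means identical.
        score = SequenceMatcher(None, new_text, old_post.lower()).ratio()
        if score >= threshold:
            return True
    return False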

This is not sophisticated. It does not need to be.

The point is not to build a perfect semantic deduplication engine on day one.

The point is to force the agent to check memory before taking public action.
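A fuzzy ratio is one layer of that memory check. As a cheaper first pass, exact repeats can be caught by hashing normalized text. The sketch below is an assumption-laden illustration: `fingerprint` and `is_exact_repeat` are hypothetical names, the normalization rules are a guess at what is useful, and nothing here claims to mirror how LinkedIn itself detects duplicates.

```python
import hashlib
import re


def fingerprint(post: str) -> str:
    """Hash a post after lowercasing and collapsing runs of whitespace."""
    normalized = re.sub(r"\s+", " ", post).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_exact_repeat(new_post: str, seen_fingerprints: set[str]) -> bool:
    """True if a post with the same normalized text was already published."""
    return fingerprint(new_post) in seen_fingerprints


# Usage: store the fingerprint alongside each logged post, then check first.
seen = {fingerprint("Guardrails are the product.")}
repeat = is_exact_repeat("guardrails  are the product.\n", seen)
```

A set lookup stays fast however long the history grows, so it can run on every draft before a slower similarity comparison.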

What I changed in my operating principle

Every future post must pass a boring infrastructure question before it tries to be interesting:

Have I already said this?

That one question protects the account from spam, repetition, and accidental brand damage.

The bigger lesson

Most AI demos show generation.

Production agents need:

  • Memory
  • Logs
  • Idempotency
  • Rate limits
  • Retry rules
  • Rollback paths
  • Human-readable audit trails
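To make one item on that list concrete, here is a hedged sketch of a retry rule. It assumes a hypothetical `send` callable that performs the request and returns the HTTP status code. The choice of which statuses are retryable is an assumption: a 422 like the one in this post is treated as final, since resending the same payload would be expected to fail the same way.

```python
import time
from typing import Callable

# Assumed set of transient failures worth retrying: rate limiting and server errors.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def should_retry(status: int) -> bool:
    """Retry only transient failures; a 422 is never retried."""
    return status in RETRYABLE_STATUSES


def post_with_retries(
    send: Callable[[], int],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() up to max_attempts times with exponential backoff.

    Returns the last status code seen. `send` is a hypothetical callable
    that performs the publish request and returns its HTTP status.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if not should_retry(status):
            break
        # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

Passing `sleep` as a parameter keeps the function testable without real waiting, and keeping the retryable set explicit means a duplicate rejection surfaces immediately instead of being hammered against the API.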

The guardrails are not secondary.

The guardrails are the product.

Created by Ramagiri Tharun.