DEV Community

Cover image for How to Humanize AI Text Without Sounding Robotic
Rishabh Poddar
Rishabh Poddar

Posted on • Originally published at teamcopilot.ai

How to Humanize AI Text Without Sounding Robotic

AI text usually sounds robotic for the same boring reasons. The sentences are too even. The transitions are too polished. The wording is technically fine, but it never quite sounds like a person.

So the fix is not just "make this sound human." You need a method that checks the draft, spots the AI tells, and rewrites it in passes until it reads naturally.

That is what the code below does.

What "humanized" actually means

Humanized text is not slangy or messy. It keeps the meaning intact, but it sounds like a real person wrote it once instead of a model polishing it three times.

In practice, that means:

  • sentence lengths vary
  • ideas do not all arrive in the same shape
  • the tone matches the audience
  • the wording is specific instead of generic
  • the draft does not lean on obvious AI crutches like repetitive transitions, fake nuance, or overclean symmetry

The code at a glance

The code takes a long input string and runs it through an iterative loop.

  1. It splits the text into markdown chunks using top-level ## headings.
  2. It runs a detector prompt on each chunk in parallel.
  3. It collects the strongest AI-writing signals.
  4. It rewrites the full original text using those findings.
  5. It repeats until the text looks clean or it hits the max iteration count.

The code does not guess from vibes. It asks the model to point at concrete snippets, explain why they sound AI-like, and say what should change.

The full workflow code

This is the actual implementation.

import argparse
import json
import os
import sys
import re
import uuid
from typing import Any, Dict, List
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

from google import genai
from google.genai import types
from pydantic import BaseModel, Field


MODEL = "gemini-3.5-flash"
INPUT_COST_PER_1M = 1.50
OUTPUT_COST_PER_1M = 9.00


def load_local_env_var(key_name: str) -> str:
    env_path = os.path.join(os.path.dirname(__file__), ".env")
    if not os.path.exists(env_path):
        raise KeyError(key_name)

    with open(env_path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, value = line.split("=", 1)
            if name.strip() == key_name:
                value = value.strip().strip('"').strip("'")
                if value:
                    return value

    raise KeyError(key_name)


if "GEMINI_API_KEY" not in os.environ:
    os.environ["GEMINI_API_KEY"] = load_local_env_var("GEMINI_API_KEY")

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
OUTPUT_DIR = Path(__file__).resolve().parent / "data"


class Snippet(BaseModel):
    snippet: str = Field(description="Short excerpt from the text")
    reason: str = Field(description="Why the excerpt reads AI-like")
    fix: str = Field(description="What to change")


class DetectionResult(BaseModel):
    has_issues: bool = Field(description="Whether the text has AI-writing issues")
    snippets: List[Snippet] = Field(default_factory=list, max_length=5)


class RewriteResult(BaseModel):
    rewritten_text: str = Field(description="The full rewritten text")


def get_usage_count(usage: Any, names: List[str]) -> int:
    for name in names:
        value = getattr(usage, name, None)
        if value is not None:
            return int(value)
    return 0


def estimate_cost(prompt_tokens: int, output_tokens: int) -> float:
    return (prompt_tokens / 1_000_000.0) * INPUT_COST_PER_1M + (output_tokens / 1_000_000.0) * OUTPUT_COST_PER_1M


def trim_to_first_words(text: str, word_count: int = 6) -> str:
    words = text.split()
    if len(words) <= word_count:
        return text
    return " ".join(words[:word_count])


def split_into_chunks(text: str) -> List[str]:
    chunks: List[str] = []
    current: List[str] = []

    for line in text.splitlines():
        if re.match(r"^##\s+", line):
            if current:
                chunk = "\n".join(current).strip()
                if chunk:
                    chunks.append(chunk)
                current = []
            current.append(line)
        else:
            current.append(line)

    if current:
        chunk = "\n".join(current).strip()
        if chunk:
            chunks.append(chunk)

    return chunks or [text]


def call_gemini(prompt: str, *, response_model: type[BaseModel]) -> tuple[str, BaseModel, int, int, float]:
    response = client.models.generate_content(
        model=MODEL,
        contents=prompt,
        config=types.GenerateContentConfig(
            temperature=0.2,
            top_p=0.95,
            response_mime_type="application/json",
            response_schema=response_model,
        ),
    )

    raw_text = (response.text or "").strip()
    if not raw_text:
        raise RuntimeError("Gemini returned empty text")

    parsed = response_model.model_validate_json(raw_text)
    usage = getattr(response, "usage_metadata", None)
    prompt_tokens = get_usage_count(usage, ["prompt_token_count", "input_token_count"])
    output_tokens = get_usage_count(usage, ["candidates_token_count", "output_token_count", "response_token_count"])
    cost = estimate_cost(prompt_tokens, output_tokens)
    return raw_text, parsed, prompt_tokens, output_tokens, cost


def _detect_single_chunk(index: int, chunk_text: str) -> tuple[int, str, List[Dict[str, Any]], float, int, int, int]:
    detection_raw, detection, detection_prompt_tokens, detection_output_tokens, detection_cost = call_gemini(
        build_detection_prompt(chunk_text),
        response_model=DetectionResult,
    )

    chunk_snippets = [
        {**snippet.model_dump(), "snippet": trim_to_first_words(snippet.snippet)}
        for snippet in detection.snippets
    ]
    return index, detection_raw, chunk_snippets, detection_cost, detection_prompt_tokens, detection_output_tokens, len(chunk_snippets)


def detect_issues_in_chunks(chunks: List[str]) -> tuple[List[Dict[str, Any]], float, int]:
    return detect_issues_in_chunks_parallel(chunks, verbose=True)


def detect_issues_in_chunks_silent(chunks: List[str]) -> tuple[List[Dict[str, Any]], float, int]:
    return detect_issues_in_chunks_parallel(chunks, verbose=False)


def detect_issues_in_chunks_parallel(chunks: List[str], verbose: bool) -> tuple[List[Dict[str, Any]], float, int]:
    all_snippets: List[Dict[str, Any]] = []
    seen = set()
    total_cost = 0.0
    total_chunks = 0

    max_workers = max(1, min(8, len(chunks)))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(_detect_single_chunk, index, chunk_text) for index, chunk_text in enumerate(chunks, start=1)]
        chunk_results = [future.result() for future in as_completed(futures)]

    for index, detection_raw, chunk_snippets, detection_cost, detection_prompt_tokens, detection_output_tokens, snippet_count in sorted(chunk_results, key=lambda row: row[0]):
        if verbose:
            print(f"\n=== Chunk {index} Detection Result ===")
            print(detection_raw)
            print(f"Chunk {index} detection cost: ${detection_cost:.6f} ({detection_prompt_tokens} input tokens, {detection_output_tokens} output tokens)")
            if chunk_snippets:
                print(json.dumps(chunk_snippets, ensure_ascii=False, indent=2))

        for snippet in chunk_snippets:
            key = (snippet.get("snippet"), snippet.get("reason"), snippet.get("fix"))
            if key in seen:
                continue
            seen.add(key)
            all_snippets.append(snippet)

        total_cost += detection_cost
        total_chunks += 1

    return all_snippets, total_cost, total_chunks


def process_text(input_text: str, max_iterations: int = 10, rewrite_style: str = "", verbose: bool = True) -> dict:
    current_text = input_text
    total_cost = 0.0
    iteration = 0
    first_iteration_issues: List[Dict[str, Any]] = []
    forced_rewrite_done = False

    for iteration in range(1, max_iterations + 1):
        chunks = split_into_chunks(current_text)
        if verbose:
            print(f"Iteration {iteration}: {len(chunks)} chunk(s)")
        all_snippets, detection_cost, _ = detect_issues_in_chunks_parallel(chunks, verbose=verbose)

        if iteration == 1:
            first_iteration_issues = all_snippets

        has_issues = bool(all_snippets)
        if verbose:
            print(f"Iteration {iteration}: {len(all_snippets)} issue(s) found")

        iteration_cost = detection_cost

        should_force_rewrite = bool(rewrite_style.strip()) and not forced_rewrite_done
        should_rewrite = has_issues or should_force_rewrite

        if not should_rewrite:
            total_cost += iteration_cost
            if verbose:
                print(f"Iteration {iteration} total cost: ${iteration_cost:.6f}")
            break

        rewrite_raw, rewrite_output, rewrite_prompt_tokens, rewrite_output_tokens, rewrite_cost = call_gemini(
            build_rewrite_prompt(input_text, all_snippets, rewrite_style=rewrite_style),
            response_model=RewriteResult,
        )
        current_text = rewrite_output.rewritten_text
        forced_rewrite_done = True
        iteration_cost += rewrite_cost
        total_cost += iteration_cost
        if verbose:
            print(f"Iteration {iteration}: rewrite applied")

    return {
        "final_output": current_text,
        "total_cost": round(total_cost, 6),
        "iterations": iteration,
        "model": MODEL,
        "first_iteration_issues": first_iteration_issues,
    }




def build_detection_prompt(text: str) -> str:
    return f"""Audit the text for AI-writing tells.

Focus on concrete signals, not vague vibes:

- Em dash overuse: repeated `—`, lists built around em dashes, em dashes used as a default stylistic crutch, or multiple paragraphs that lean on em dashes for rhythm.
- Contrast formulas: "it's X, not Y", "not only X, but also Y", "this isn't just X, it's Y", "the real answer isn't X".
- Meta-openers: "what usually gets skipped here is...", "what people often miss is...", "the part people forget is...", "what matters most is...", "what gets overlooked is...", "this is exactly the sort of...", "this is basically why...", "this is the kind of...".
- One-line paragraph patterns: lots of short single-sentence paragraphs in a row.
- Paragraph stubs: a paragraph that starts with a short setup sentence and then immediately follows with a tiny paragraph of 2-5 words, or a paragraph that is itself only 2-5 words long.
- Outline scaffolding: numbered headings or label-only sections like `### 1. Security`, `### 2. Reliability`, or repeated short section labels that read like an outline instead of prose.
- Choppy fragmentation: paragraphs with 2-4 very short sentences that all carry pieces of one idea and could naturally become a single sentence. Prioritize this signal even if the paragraph also contains a meta-opener.
- Template transitions: "that works, until it doesn't", "in today's world", "at the end of the day", "ultimately", "overall".
- Symmetry: repeated sentence openings, repeated clause shapes, evenly balanced comparisons, or neat bullet/list structures that feel engineered.
- Generic polish: buzzwords, hedging, fake nuance, or conclusion sentences that sound like a marketing page.
- Sycophancy: excessive praise toward the user, flattering language, "you nailed it", "great question", "this is exactly the right approach", or over-agreeable validation that adds no substance.
- Over-polished cadence: text that is technically clean but too symmetrical, too tidy, too templated, or too optimized for scannability.
- Repetitive feature-list formatting: bold labels, em-dash lists, or repeated "X — Y" constructions that create a slide-deck feel.

Return JSON only in this exact shape:
{{
  "has_issues": true|false,
  "snippets": [
    {{"snippet": "short identifying excerpt", "reason": "short reason", "fix": "short fix"}}
  ]
}}

Rules:
- Return at most 5 snippets.
- Snippets must be short. Use only the first few words needed to identify the line.
- Keep `reason` and `fix` concise.
- If the text is natural, return `has_issues: false` and `snippets: []`.
- Prefer the strongest signals only.
- If a paragraph splits one idea across multiple short sentences, flag that fragmentation even if each sentence is individually understandable.
- If a paragraph is just 2-5 words, always flag it.

Text:
{text}"""


def build_rewrite_prompt(original_text: str, issues: List[Dict[str, Any]], rewrite_style: str = "") -> str:
    issues_json = json.dumps(issues, ensure_ascii=False, indent=2)
    rewrite_style_block = f"\nAdditional rewrite style guidance:\n{rewrite_style}\n" if rewrite_style.strip() else ""
    return f"""Rewrite the text so it sounds human and natural while preserving meaning, facts, proper nouns, and formatting that still helps the content.

Use these rules:
- Direct over ornate.
- Specific over vague.
- Mix short and long sentences.
- Prefer simple verbs and concrete nouns.
- Delete fluff instead of disguising it.
- Avoid robotic symmetry, generic closings, and over-polished transitions.
- Keep the level of formality appropriate to the original content.

Address these flagged issues:
{issues_json}
{rewrite_style_block}

Return valid JSON only with this schema:
{{
  "rewritten_text": "the full rewritten text"
}}

Original text:
{original_text}"""


def main() -> int:
    parser = argparse.ArgumentParser(description="Humanise text with iterative Gemini rewrites.")
    parser.add_argument("--input_text", required=True, help="The text to humanise")
    parser.add_argument("--max_iterations", type=int, default=10, help="Maximum rewrite/detection loops to run")
    parser.add_argument("--rewrite_style", default="", help="Additional style guidance for the rewrite prompt")
    args = parser.parse_args()

    summary = process_text(args.input_text, max_iterations=args.max_iterations, rewrite_style=args.rewrite_style, verbose=True)
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    output_path = OUTPUT_DIR / f"final_rewrite_{uuid.uuid4().hex}.json"
    output_path.write_text(summary["final_output"], encoding="utf-8")
    console_summary = {
        "output_path": str(output_path),
        "iterations": summary["iterations"],
        "total_cost": summary["total_cost"],
        "model": summary["model"],
    }
    print(json.dumps(console_summary, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    try:
        raise SystemExit(main())
    except Exception as exc:
        print(f"Error: {exc}", file=sys.stderr)
        raise SystemExit(1)
Enter fullscreen mode Exit fullscreen mode

How the code works

The code looks long, but the logic is simple.

1. It loads Gemini securely

The workflow expects GEMINI_API_KEY to be present in the runtime environment. If it is not there, it tries to read a local .env file inside the workflow folder.

That matters because TeamCopilot workflows are meant to run with secrets injected safely, not pasted into prompts or hardcoded into source.

2. It uses structured outputs

Both the detector and the rewrite step use Pydantic models.

That means the model cannot just return a messy blob of text. It has to return valid JSON in the shape the workflow expects:

  • DetectionResult gives has_issues plus up to five flagged snippets
  • RewriteResult gives the full rewritten text

It makes the workflow predictable, which is what you want if other people are going to reuse the output.

3. It scans in chunks

The workflow splits the input on top-level ## headings so it can inspect each section separately.

That helps because long blog drafts usually mix clean sections with sections that still sound synthetic. Chunking keeps the detector focused instead of asking it to judge a giant wall of text all at once.

4. It runs detection in parallel

Each chunk is checked in parallel with a thread pool, which keeps the process moving on longer drafts.

The detector is not just looking for one thing. It flags patterns like:

  • em dash overuse
  • contrast formulas like "it's X, not Y"
  • meta-openers like "what people often miss is"
  • overclean section scaffolding
  • repetitive sentence shapes
  • generic closing lines

5. It rewrites only when needed

If the detector finds issues, the workflow rewrites the full original text using those findings.

If it finds nothing, it stops early.

There is one exception. If you pass a non-empty rewrite_style, the workflow forces at least one rewrite pass even if the detector says the text is clean. That is useful when you want a specific voice or house style layered in.

6. It writes the final result to disk

When the loop is done, the workflow writes the final rewrite to a file in data/ and prints a small JSON summary to stdout.

That summary includes:

  • output_path
  • iterations
  • total_cost
  • model

How to use it

In TeamCopilot, the workflow takes these inputs:

  • input_text: the text you want humanized
  • max_iterations: how many detect and rewrite loops to run, default 10
  • rewrite_style: optional style guidance that forces at least one rewrite pass

A simple run looks like this.

{
  "input_text": "AI tools are changing how teams work. We need to use these tools to make our workflows faster and more efficient, which ultimately helps us deliver better results.",
  "max_iterations": 10,
  "rewrite_style": "Keep it warm, practical, and a little conversational."
}
Enter fullscreen mode Exit fullscreen mode

The workflow might return a summary like this:

{
  "output_path": "data/final_rewrite_7a1d8f2c3d9e4b31b4d2d1ef9c1c5e5f.json",
  "iterations": 2,
  "total_cost": 0.008412,
  "model": "gemini-3.5-flash"
}
Enter fullscreen mode Exit fullscreen mode

The file at output_path contains the final rewritten text.

Example inputs and outputs

Here are simple before and after examples.

Example 1

Input

You are exactly right! The code indeed has a bug.
Enter fullscreen mode Exit fullscreen mode

Output

The code has a bug.
Enter fullscreen mode Exit fullscreen mode

Here is another example with a style guide added.

Example 2

Input

That's exactly the right approach. The way to solve this problem is to use an iterative loop with a detector and a rewrite step.
Enter fullscreen mode Exit fullscreen mode

rewrite_style

Write it as a friendly reddit comment in first person style.
Enter fullscreen mode Exit fullscreen mode

Output

I've had the best luck tackling this with an iterative loop. You basically just need a detector and a rewrite step to get it done.
Enter fullscreen mode Exit fullscreen mode

Why this is better than a single rewrite prompt

A single prompt can help, but it usually stops at surface cleanup.

This workflow is stronger because it separates the problem into two jobs:

  1. find the AI tells
  2. rewrite around those tells

That is how editors work too. First they spot the bad habits. Then they fix them. Then they read it again.

FAQ

What does the code actually do?

It audits text for AI-writing patterns, flags the strongest issues, rewrites the full draft, and repeats until the text reads naturally or the maximum iteration count is reached.

Does it change the meaning of the text?

It is designed not to. The rewrite prompt tells the model to preserve meaning, facts, proper nouns, and useful formatting.

Why does it split text into chunks?

Long drafts are easier to inspect section by section. Chunking also lets the detector focus on one section at a time instead of making one big judgment on the entire post.

Why use structured output instead of plain text?

Structured output makes the workflow reliable. The detector must return a clear JSON object with snippets, reasons, and fixes. The rewrite step must return the final text in a known shape.

What if the detector finds nothing?

The workflow stops early and returns the current text. That saves time and cost.

What does rewrite_style do?

It adds extra guidance for tone or voice. If you pass a non-empty value, the workflow runs at least one rewrite pass even when the detector does not find obvious issues.

Can I use this on blog posts only?

No. It can be used on blog posts, docs, support text, internal notes, landing pages, and other long-form writing.

What kind of AI tells does the code look for?

It looks for things like repetitive transitions, overly symmetrical sentence shapes, one-line paragraph patterns, vague buzzwords, fake nuance, em dash overuse, and contrast formulas that read like template writing.

What is the best input format?

Use a full draft with headings and normal paragraph flow. The code is built to work on substantial text, not just a sentence or two.

How many iterations should I use?

The default is 10. In practice, most drafts should not need that many. If the text is already decent, the code stops early.

Is the output deterministic?

No. It uses a model, so two runs can produce slightly different rewrites. The structure of the code stays the same, though, which is what matters for repeatability.

Does this replace human editing?

No. It gets the draft much closer, but the best results still come from a human pass at the end.

How is this different from a generic AI humanizer?

Most generic tools just rewrite surface phrasing. This code does the more useful thing: it detects the patterns first, rewrites in passes, and gives you a predictable process you can reuse.

Top comments (0)