People have started compiling screenshots of a phrase that appears constantly in AI-generated text. "It's not just a tool — it's a paradigm shift." "It's not about efficiency; it's about transformation." The pattern is a structural artifact of how large language models are trained, aligned, and instructed, and understanding its origins reveals something significant about the current state of AI development.
The Pattern Has a Name
In classical rhetoric, this construction is called antithesis, or negative-positive parallelism. It works by first negating a familiar assumption, then replacing it with something supposedly more expansive or profound.
When used by human writers, it's a high-impact device deployed sparingly. Advertising copywriters use it to reframe value propositions. Political speechwriters use it to signal a turning point. The entire effect depends on restraint.
A language model has no concept of restraint, applying the structure indiscriminately whether describing a recipe, explaining a bug fix, or summarizing a business report. A rhetorical technique designed for occasional emphasis becomes a verbal tic repeated across every response.
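The tic is regular enough to flag mechanically. Below is a rough heuristic for counting the construction in a piece of text; the regex and the sample strings are illustrative assumptions for this article, not a published detector.

```python
import re

# Rough heuristic for "it's not X, it's Y" contrast framing.
# The alternations and the 60-character gap are illustrative choices,
# not a validated detector.
ANTITHESIS = re.compile(
    r"\b(?:it'?s|this is|that'?s)\s+not\s+(?:just\s+|merely\s+|about\s+)?"
    r".{1,60}?[;,\u2014-]+\s*(?:it'?s|this is|but)\b",
    re.IGNORECASE,
)

def count_antithesis(text: str) -> int:
    """Count non-overlapping occurrences of the contrast-framing construction."""
    return len(ANTITHESIS.findall(text))

sample = ("It's not just a tool - it's a paradigm shift. "
          "It's not about efficiency; it's about transformation.")
print(count_antithesis(sample))  # 2
```

A heuristic like this misses paraphrases ("While X is true, Y…"), which is exactly why phrase-level filtering fails as a fix, as discussed later in the piece.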
Where the Pattern Comes From
The internet is disproportionately populated by text written to sell, persuade, or maximize engagement. Content marketing, SEO articles, LinkedIn thought leadership posts, and corporate press releases all rely heavily on contrast framing.
During pre-training, a language model absorbs the statistical distribution of human language across billions of documents. When text sounds authoritative or insightful, the probability of encountering an "It's not X, it's Y" sequence is high. The model encodes the structure as a valid marker of "expert communication," with no capacity to recognize it as a marketing convention.
The overuse follows directly from training on human-generated web text. The models are mirroring the engagement-optimized habits of the internet, distilling the collective style of digital marketing into a predictable format.
How RLHF Locks It In
Pre-training loads the pattern into the model's weights, and Reinforcement Learning from Human Feedback (RLHF) cements it through preference optimization.
In RLHF, human annotators compare pairs of model responses and rank them for quality. Their rankings train a Reward Model, which then automatically scores the LLM's outputs at scale. Annotators are instructed to reward nuance, multi-faceted perspectives, and comprehensive insight.
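The ranking step above is typically turned into a pairwise training objective for the Reward Model. A minimal sketch of that loss, using toy scalar rewards in place of real model scores:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) negative log-likelihood used to
    train reward models: -log(sigmoid(r_chosen - r_rejected)).
    The Reward Model is pushed to score the annotator-preferred
    response higher than the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the model already prefers the chosen response, loss is small;
# when it prefers the rejected response, loss is large.
print(round(preference_loss(2.0, 0.0), 4))  # small
print(round(preference_loss(0.0, 2.0), 4))  # large
```

Note what the objective never sees: any ground truth about informativeness. It only sees which response the annotator preferred, so any stylistic feature that correlates with preference gets rewarded.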
The problem is that human annotators frequently conflate the appearance of nuance with actual depth. When evaluating two competing responses, a rater is statistically more likely to upvote a response using contrast framing. The full antithesis structure, "Photosynthesis is not merely a biological function — it is the foundational engine of life on Earth," reads as more insightful than a flat declarative like "Photosynthesis converts light into energy," even when it carries less information.
Across millions of preference comparisons, the Reward Model learns a proxy: dialectical hedging correlates with high helpfulness scores. The LLM then learns to inject contrast framing into responses because doing so inflates its reward, regardless of whether it adds meaning.
RLHF also trains models to be face-saving. Models are penalized for being blunt, dismissive, or overconfident. The "It's not just X, it's Y" structure is safe, validating the user's existing assumption while adding a layer of elaboration that makes it extremely difficult for a rater to flag the response as unhelpful. Researchers studying this dynamic have described the result as AI conditioned for servility.
The Mathematics Behind the Structure
Recent research has shown that both RLHF and its streamlined alternative, Direct Preference Optimization (DPO), can be interpreted as forms of contrastive learning. The model is trained to increase its alignment with "chosen" (human-preferred) responses and decrease its alignment with "rejected" ones.
In practice, rejected responses tend to be brief, flat, and declarative, while chosen responses tend to be verbose, expansive, and multi-faceted. The model learns to structurally oppose the negative sample within its own generation. Prefacing with a negation ("it's not X") distances the token trajectory from the rejected sample space, and stating the elevated alternative ("it's Y") pulls it toward the preferred one.
The "It's not X, it's Y" structure is the direct linguistic output of this mathematical optimization. The antithesis surface form is what the contrastive objective produces most reliably, given the composition of human preference data.
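The contrastive objective described above can be sketched with the per-pair DPO loss. The function below follows the published formulation, but the scalar log-probabilities standing in for real model outputs are toy values:

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * ((logp_c - ref_c)
    - (logp_r - ref_r))). The gradient raises the policy's
    log-probability of the chosen response and lowers it for the
    rejected one, relative to a frozen reference model - an
    explicitly contrastive objective."""
    logits = beta * ((logp_chosen - ref_chosen)
                     - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Widening the margin in favor of the chosen response lowers the loss.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0) >
      dpo_loss(-8.0, -14.0, -11.0, -11.0))  # True
```

Because rejected samples in preference data skew brief and declarative, the gradient systematically pushes generations away from that region of style space, which is the mechanism the paragraph above describes.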
The Institutional Layer
Beyond training dynamics, there is a more direct cause. Many LLMs operate under system prompts that instruct them to be "profoundly helpful," to "look for deeper meanings," or to "avoid generic responses." A model without genuine subjective understanding uses structural proxies for depth, reaching for the constructions its training data associates with high-level conceptual thinking.
Anthropic's Constitutional AI framework explicitly instructs Claude to balance competing values and perspectives. Anthropic has even sourced public input to draft these principles through a process called Collective Constitutional AI. When a model is constitutionally required to weigh tradeoffs in every response, it gravitates toward concessive structures: "While X is true, we must also consider Y." The constitutional mandate to avoid reductive thinking maps directly onto dialectical hedging.
The Downstream Effects
This pattern, at scale, is doing measurable damage to written communication.
A 2025 study tracked stylistic convergence in peer-reviewed academic publications following the release of ChatGPT. Using SciBERT embeddings to measure linguistic similarity across millions of papers, researchers found a significant increase in LLM-associated linguistic features in the scientific literature. The study noted a reduction in authorial variance and individual intellectual voice, while acknowledging that AI assistance helps non-native English speakers achieve syntactic fluency in high-tier publishing.
Separate research analyzing persuasive discourse found that LLM-generated arguments cluster tightly together in linguistic space, showing high similarity to each other and low similarity to human writing. LLMs exhibit lower lexical richness than humans and maintain unnaturally high scores for "anticipation" and "trust" even in adversarial contexts, drawing from a shared rhetorical playbook with contrast framing as one of its core moves.
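To make "lexical richness" concrete, here is a minimal sketch using the simple type-token ratio. The cited studies likely use more robust, length-corrected measures (e.g. MTLD), so this is illustrative only, and the sample strings are invented:

```python
def type_token_ratio(text: str) -> float:
    """Type-token ratio: distinct words / total words. A basic
    lexical-richness measure; it is length-sensitive, which is why
    published work often prefers corrected variants."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

repetitive = "it's not just a tool it's not just a tool it's not just a tool"
varied = "the study measured stylistic convergence across millions of papers"
print(type_token_ratio(repetitive) < type_token_ratio(varied))  # True
```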
When every paragraph builds to a reversal, readers stop experiencing the reversal as meaningful. The scaffolding becomes visible, and the rhetorical performance crowds out the substance.
Why Simple Fixes Don't Work
Asking a model to "stop sounding like an AI" rarely works. The model typically switches to a different formula, often adopting over-casual language or forced contrarianism, while maintaining the underlying contrastive logic.
Negative instructions ("Do not say 'it's not X, it's Y'") are similarly ineffective. The model's internal representation of high-quality text is structurally tied to contrast framing through DPO and RLHF. Blocking a specific phrase pushes the model to find a synonymous structure, so "While X is true, Y is paramount" ends up carrying the same rhetorical function under a different surface form.
Effective mitigation requires structural prohibition. Prompts that target the behavior at the level of linguistic category, prohibiting "thesis-antithesis patterns," "dialectical hedging," and "rhetorical equivocation," are significantly more effective because they block the statistical pathway established during alignment training, rather than just banning a specific phrase. Instructing the model to define things affirmatively, and to treat contrastive structures as high-impact tools used sparingly, produces noticeably more direct prose.
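As a sketch of the difference, the strings below contrast a phrase-level ban with a category-level one. The exact wording is hypothetical, not a tested recipe, and would need evaluation against a specific model:

```python
# Easily evaded: the model substitutes a synonymous structure.
PHRASE_BAN = 'Do not say "it\'s not X, it\'s Y".'

# Targets the linguistic category, not one surface form.
STRUCTURAL_BAN = """\
Style constraints:
- Do not use thesis-antithesis patterns, dialectical hedging, or
  rhetorical equivocation in any surface form.
- Define concepts affirmatively: state what a thing is and does.
- Treat contrastive framing as a high-impact device, used at most
  once per response, and only when the contrast is the point.
"""
```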
The Deeper Problem
The "It's not X, it's Y" pattern is a symptom of an alignment strategy that has optimized for the appearance of nuance, using rhetorical elaboration as the proxy signal for quality. RLHF as currently practiced penalizes declarative confidence and rewards elaborate framing, because human evaluators operating under time pressure consistently mistake rhetorical complexity for intellectual depth.
The signal annotators send is "this sounds thorough," and the model learns to optimize for exactly that signal. The output is text that reads as considered but delivers less information than a direct statement would.
Fixing this requires rethinking what signals human annotators are actually rewarding during preference tuning, and whether those signals reliably correspond to quality. Until that changes, AI-generated text will keep reaching for the same contrastive move, in every response, on every topic, at scale.
Research sourced from academic papers on RLHF, DPO, and LLM stylistics, including work from arXiv on contrastive learning in alignment and ACL Anthology findings on AI argumentative style.
