Improving Sentence Rewriter’s API Detection Accuracy: What Actually Worked

Over the past few weeks, I worked on optimizing the detection layer behind Sentence Rewriter, our rewriting API designed to help users improve clarity, grammar, and tone in real time. The goal was to make rewrite outputs more consistent and context-aware, especially for complex or mixed-tone inputs.

Problem

The API needs to make several decisions before rewriting:

  • Whether the input is grammatically correct or needs significant rewriting.
  • What tone (formal or casual) and style the input uses.
  • Whether the text is AI-generated or heavily templated.
  • What rewrite strength is appropriate for the input's quality.

Originally, detection was rule-based and treated paragraphs as single units. This often caused inconsistent rewrites when users submitted multi-sentence or mixed-tone inputs.
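To make those decisions concrete, here is a minimal sketch of the kind of record the detection layer has to fill in before any rewriting happens. The field names and types are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass

@dataclass
class DetectionResult:
    # Illustrative fields only; the real schema may differ.
    needs_rewrite: bool      # grammatically fine vs. needs significant rewriting
    tone: str                # e.g. "formal" or "casual"
    ai_generated: bool       # AI-generated or heavily templated text
    rewrite_strength: float  # 0.0 (leave mostly as-is) to 1.0 (aggressive rewrite)
```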

Optimizations

1. Sentence-Level Segmentation

  • Paragraphs are now split into individual sentences for classification.
  • Each sentence is labeled independently, then merged into a paragraph-level decision (sketched below).
  • Result: 18–22% improvement in rewrite consistency on internal tests.
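Here is a minimal sketch of the segment-then-merge flow, assuming a naive regex splitter and a stand-in classifier; the production pipeline uses a proper sentence tokenizer and a trained model.

```python
import re
from collections import Counter

def split_sentences(paragraph: str) -> list[str]:
    # Naive split on sentence-final punctuation; enough to show the flow.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

def classify_sentence(sentence: str) -> str:
    # Stand-in classifier: a trivial length heuristic in place of the real model.
    return "formal" if len(sentence.split()) > 12 else "casual"

def paragraph_label(paragraph: str) -> str:
    # Label each sentence independently, then merge by majority vote.
    labels = [classify_sentence(s) for s in split_sentences(paragraph)]
    if not labels:
        return "unknown"
    return Counter(labels).most_common(1)[0][0]

print(paragraph_label(
    "We regret to inform you that the shipment scheduled for Friday has been delayed. "
    "We sincerely apologize for any inconvenience this may cause you and your team. Sorry!"
))  # -> "formal" (two of the three sentences vote formal)
```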

2. Pattern-Based AI Detection

  • Replaced simple heuristics with pattern scoring: embedding collapse, repetition ratio, syntactic symmetry, and repetitive connectors.
  • When AI-generated patterns are detected, Sentence Rewriter lowers rewrite strength to avoid overly robotic outputs (scoring sketched below).
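Embedding collapse and syntactic symmetry need model support, so this sketch only shows the two lexical signals (repetition ratio and connector density) folded into a single score. The connector list, weights, and threshold are placeholder assumptions, not the values Sentence Rewriter actually ships with.

```python
import re
from collections import Counter

# Stock connectors that templated / AI-style text tends to overuse (illustrative list).
STOCK_CONNECTORS = {"moreover", "furthermore", "additionally", "consequently", "overall"}

def _tokens(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

def repetition_ratio(text: str) -> float:
    # Share of tokens that repeat an earlier token; high values suggest templated phrasing.
    toks = _tokens(text)
    if not toks:
        return 0.0
    return sum(c - 1 for c in Counter(toks).values()) / len(toks)

def connector_density(text: str) -> float:
    # Fraction of tokens that are stock connectors.
    toks = _tokens(text)
    return sum(t in STOCK_CONNECTORS for t in toks) / max(len(toks), 1)

def ai_pattern_score(text: str) -> float:
    # Weighted blend of the individual signals; weights are placeholders.
    return 0.6 * repetition_ratio(text) + 0.4 * connector_density(text)

def damp_strength(base_strength: float, text: str, threshold: float = 0.15) -> float:
    # Lower rewrite strength when the text already looks AI-generated,
    # so we don't stack a second transformation on top of the first.
    return base_strength * 0.5 if ai_pattern_score(text) > threshold else base_strength
```

Blending several weak signals into one score, rather than leaning on a single heuristic, is what the move to pattern scoring amounts to here.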

3. Context-Aware Tone Detection

  • Applied a context window around contrastive conjunctions (but, although, however) to capture tone flips mid-sentence.
  • Ensures the API chooses the correct rewrite mode, preserving user intent and style (see the example below).
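A rough illustration of the context-window idea, with a toy marker-based tone scorer standing in for the real classifier. The window size, conjunction list, and casual markers are assumptions made for the example.

```python
import re

CONTRASTIVE = {"but", "although", "however", "yet", "though"}
CASUAL_MARKERS = {"hey", "kinda", "gonna", "awesome", "stuff"}

def tone_of(span: list[str]) -> str:
    # Toy tone scorer; the production model is a trained classifier.
    return "casual" if any(t in CASUAL_MARKERS for t in span) else "formal"

def detect_tone_flip(sentence: str, window: int = 5) -> dict:
    # Compare the tone of a fixed token window on each side of a contrastive
    # conjunction; a mismatch means the rewrite mode should be chosen per clause.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    for i, tok in enumerate(tokens):
        if tok in CONTRASTIVE:
            before = tone_of(tokens[max(0, i - window):i])
            after = tone_of(tokens[i + 1:i + 1 + window])
            if before != after:
                return {"flip": True, "before": before, "after": after}
    return {"flip": False}

print(detect_tone_flip("The quarterly report is thorough, but hey, the intro is kinda weak."))
# -> {'flip': True, 'before': 'formal', 'after': 'casual'}
```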

4. Dynamic Rewrite Strength

  • Rewrite strength is now adaptive:
    • Low grammar errors → minor adjustments.
    • Moderate errors → moderate rewrites.
    • Heavy errors → aggressive rewriting.
    • AI-style detected → lower strength to prevent stacked transformations.
  • This dynamic approach significantly improved output quality for Sentence Rewriter users (mapping sketched below).
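In code, the mapping is roughly the following; the error-rate metric and the thresholds are illustrative assumptions rather than the production configuration.

```python
def rewrite_strength(grammar_errors_per_100_words: float, ai_style_detected: bool) -> float:
    # Map input quality to rewrite strength; thresholds are placeholders.
    if grammar_errors_per_100_words < 1.0:
        strength = 0.2   # minor adjustments
    elif grammar_errors_per_100_words < 4.0:
        strength = 0.5   # moderate rewrites
    else:
        strength = 0.9   # aggressive rewriting
    if ai_style_detected:
        strength *= 0.5  # avoid stacking transformations on AI-style text
    return strength

print(rewrite_strength(2.5, ai_style_detected=True))  # -> 0.25
```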

Evaluation

  • Tested with 500 real user sentences and 100 borderline cases.
  • Metrics: clarity, fluency, tone preservation, meaning preservation.
  • Overall improvement: +27% across metrics.

Next Steps

  • Phrase-level confidence scores.
  • Better handling of domain-specific terminology.
  • Detecting user intent for rewrite vs. rephrase vs. refine.
  • Continuous feedback loop from user interactions to improve Sentence Rewriter API performance.
