Day 4 of my 21-day API challenge is done.
Yesterday I built a VAT Number Validator. Today I built a Text Readability Scorer API — analyze any text and get back Flesch-Kincaid scores, grade level, reading time, vocabulary analysis and improvement suggestions.
Previous days:
- Day 1 — Invoice & Receipt Parser API
- Day 2 — Password Strength & Security Scorer API
- Day 3 — VAT Number Validator API
The Problem
Every content platform, SEO tool and email marketing app needs to know how readable its text is. Developers keep asking the same questions:
- Is this blog post too complex for a general audience?
- What grade level is this email written at?
- How long will it take to read this article?
- Is this text full of passive voice and hedge words?
Building these algorithms from scratch takes time and the formulas are fiddly to get right. So I packaged all of it into one clean API.
What I Built
The Text Readability Scorer API — send it any text, get back a full readability report.
Input:
```json
{
  "text": "The quick brown fox jumps over the lazy dog. This is a simple sentence. Reading should be easy and enjoyable for everyone. Short sentences help readers understand content faster."
}
```
Output:
```json
{
  "success": true,
  "data": {
    "scores": {
      "flesch_reading_ease": 72.6,
      "flesch_kincaid_grade": 5.2,
      "gunning_fog_index": 6.8,
      "smog_index": 5.1,
      "coleman_liau_index": 7.3,
      "average_grade_level": 7
    },
    "interpretation": {
      "reading_ease_label": "Fairly Easy — readable by 7th grade",
      "grade_level_label": "Middle School",
      "reading_time": "10 seconds",
      "target_audience": "General public"
    },
    "style": {
      "passive_voice_count": 0,
      "passive_voice_percent": 0,
      "complex_words": 2,
      "complex_word_percent": 5.6
    },
    "suggestions": []
  }
}
```
The 6 Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /analyze | Full analysis, all fields |
| POST | /analyze/scores | Readability scores only |
| POST | /analyze/stats | Text statistics only |
| POST | /analyze/vocabulary | Vocabulary analysis only |
| POST | /compare | Compare two texts side by side |
The lightweight endpoints are useful for high-volume pipelines. If you only need the reading ease score, call /analyze/scores instead of the full /analyze endpoint.
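As a rough sketch, a call to the lightweight endpoint could look like the following. The host name here is a placeholder and not the real one; the `X-RapidAPI-Key` and `X-RapidAPI-Host` headers are the ones RapidAPI normally expects, and the request body reuses the `text` field from the input example above. The helper names are made up for illustration.

```javascript
// Placeholder host (assumption): use the value shown on the API's RapidAPI page.
const API_HOST = "text-readability-scorer.example.com";

// Builds the URL and fetch options for POST /analyze/scores.
function buildScoresRequest(text, apiKey) {
  return {
    url: `https://${API_HOST}/analyze/scores`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-RapidAPI-Key": apiKey,
        "X-RapidAPI-Host": API_HOST,
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Sends the request and returns the parsed JSON response.
async function fetchScores(text, apiKey) {
  const { url, options } = buildScoresRequest(text, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

Keeping the request builder separate from the network call makes it easy to swap `/analyze/scores` for `/analyze` later without touching the rest of a pipeline.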
The Algorithms
1. Flesch Reading Ease (0-100)
The most widely used readability formula. Higher = easier to read.
```javascript
const readingEase = 206.835
  - (1.015 * avgWordsPerSentence)
  - (84.6 * avgSyllablesPerWord);
```
| Score | Level |
|---|---|
| 90-100 | Very Easy |
| 80-90 | Easy |
| 70-80 | Fairly Easy |
| 60-70 | Standard |
| 50-60 | Fairly Difficult |
| 30-50 | Difficult |
| 0-30 | Very Difficult |
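Turning a score into a label is a simple lookup. Here is a minimal sketch using the standard Flesch bands, including 80-90 (Easy) and 50-60 (Fairly Difficult). The function name is hypothetical, and two things are assumptions: a score that lands exactly on a boundary goes to the easier band, and scores are clamped because the raw formula can fall outside 0-100.

```javascript
// Maps a Flesch Reading Ease score to its standard label.
function fleschLabel(score) {
  // Clamp first: very short or very dense text can score below 0 or above 100.
  const s = Math.min(100, Math.max(0, score));
  if (s >= 90) return "Very Easy";
  if (s >= 80) return "Easy";
  if (s >= 70) return "Fairly Easy";
  if (s >= 60) return "Standard";
  if (s >= 50) return "Fairly Difficult";
  if (s >= 30) return "Difficult";
  return "Very Difficult";
}

console.log(fleschLabel(72.6)); // "Fairly Easy"
console.log(fleschLabel(18.2)); // "Very Difficult"
```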
2. Flesch-Kincaid Grade Level
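Uses the same two inputs as Reading Ease (average words per sentence and average syllables per word) but weights them differently to produce a US school grade level directly.
```javascript
const gradeLevel = (0.39 * avgWordsPerSentence)
  + (11.8 * avgSyllablesPerWord)
  - 15.59;
```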
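To see how the two Flesch formulas relate, here is a small worked example that computes both from raw counts. The function name and the input object shape are illustrative, and returning `null` for empty text is an assumption about how to handle division by zero.

```javascript
// Computes Flesch Reading Ease and Flesch-Kincaid Grade Level from raw counts.
function fleschScores({ words, sentences, syllables }) {
  if (words === 0 || sentences === 0) return null; // nothing to score
  const avgWordsPerSentence = words / sentences;
  const avgSyllablesPerWord = syllables / words;
  return {
    readingEase: 206.835 - 1.015 * avgWordsPerSentence - 84.6 * avgSyllablesPerWord,
    gradeLevel: 0.39 * avgWordsPerSentence + 11.8 * avgSyllablesPerWord - 15.59,
  };
}

// 100 words, 5 sentences, 150 syllables: 20 words per sentence, 1.5 syllables per word.
const sample = fleschScores({ words: 100, sentences: 5, syllables: 150 });
console.log(sample.readingEase.toFixed(1)); // "59.6"
console.log(sample.gradeLevel.toFixed(1));  // "9.9"
```

Longer sentences and longer words push Reading Ease down and Grade Level up, which is why the two scores move in opposite directions.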
3. Gunning Fog Index
Estimates the years of formal education needed to understand the text on first reading.
```javascript
const fog = 0.4 * (avgWordsPerSentence + percentComplexWords);
```
Complex words are defined as words with 3 or more syllables.
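Note that `percentComplexWords` is a percentage (0-100), not a fraction. A minimal sketch with made-up counts shows the arithmetic; the function name and input shape are illustrative only.

```javascript
// Gunning Fog from raw counts. complexWords = words with 3 or more syllables.
function gunningFog({ words, sentences, complexWords }) {
  if (words === 0 || sentences === 0) return null; // nothing to score
  const avgWordsPerSentence = words / sentences;
  const percentComplexWords = (complexWords / words) * 100;
  return 0.4 * (avgWordsPerSentence + percentComplexWords);
}

// 100 words, 5 sentences, 10 complex words: 0.4 * (20 + 10)
console.log(gunningFog({ words: 100, sentences: 5, complexWords: 10 }).toFixed(1)); // "12.0"
```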
4. Coleman-Liau Index
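Unlike the others, this formula uses character counts instead of syllables, so it sidesteps syllable-counting heuristics entirely and is easier to compute reliably.
```javascript
// L = average characters per 100 words, S = average sentences per 100 words
const L = (charsNoSpaces / words) * 100;
const S = (sentences / words) * 100;
const cli = (0.0588 * L) - (0.296 * S) - 15.8;
```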
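For a self-contained version that starts from raw text, something like the sketch below works. It makes two assumptions that may differ from the API: it counts letters only (the original formula is defined on letters, while the snippet above uses all non-space characters), and the word and sentence regexes are simple illustrative tokenizers.

```javascript
// Coleman-Liau Index computed directly from text.
function colemanLiau(text) {
  const words = (text.match(/[A-Za-z0-9']+/g) || []).length;
  if (words === 0) return null; // nothing to score
  // Treat each run of ., ! or ? as one sentence end; assume at least one sentence.
  const sentences = (text.match(/[.!?]+/g) || []).length || 1;
  const letters = (text.match(/[A-Za-z]/g) || []).length;
  const L = (letters / words) * 100;   // letters per 100 words
  const S = (sentences / words) * 100; // sentences per 100 words
  return 0.0588 * L - 0.296 * S - 15.8;
}

// 6 words, 2 sentences, 18 letters
console.log(colemanLiau("The cat sat. The dog ran.").toFixed(2)); // "-8.03"
```

Very short, simple text can produce a negative grade, so clamping the result at 0 before display is worth considering.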
Syllable Counting
The trickiest part of the whole API. English syllable counting is notoriously difficult because of all the exceptions.
My approach:
```javascript
function countSyllables(word) {
  word = word.toLowerCase().replace(/[^a-z]/g, "");
  if (word.length <= 2) return 1;
  // Remove silent endings
  word = word.replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, "");
  word = word.replace(/^y/, "");
  // Count vowel groups
  const matches = word.match(/[aeiouy]{1,2}/g);
  return matches ? matches.length : 1;
}
```
It handles the most common cases well — not perfect for every word in English but accurate enough for readability scoring purposes.
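To make "not perfect" concrete, here is the same function run on a handful of words. The function body is copied from above; the word list is just an illustration. The last two rows show typical misses: a pronounced "-ed" ending gets stripped, and a three-vowel run is split into two groups.

```javascript
function countSyllables(word) {
  word = word.toLowerCase().replace(/[^a-z]/g, "");
  if (word.length <= 2) return 1;
  // Remove silent endings
  word = word.replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, "");
  word = word.replace(/^y/, "");
  // Count vowel groups
  const matches = word.match(/[aeiouy]{1,2}/g);
  return matches ? matches.length : 1;
}

console.log(countSyllables("cat"));       // 1 (correct)
console.log(countSyllables("reading"));   // 2 (correct)
console.log(countSyllables("enjoyable")); // 4 (correct)
console.log(countSyllables("jumped"));    // 1 (correct)
console.log(countSyllables("wanted"));    // 1 (actual: 2)
console.log(countSyllables("beautiful")); // 4 (actual: 3)
```

Errors like these tend to matter less over a long text than a short one, but they can shift the scores for a single sentence.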
Passive Voice Detection
```javascript
function countPassiveVoice(text) {
  // Look for "to be" verb + past participle pattern
  const passivePattern = /\b(was|were|been|being|is|are|am|be)\s+\w+ed\b/gi;
  const matches = text.match(passivePattern) || [];
  return matches.length;
}
```
The API flags passive voice because it's one of the most common writing problems. Active voice is usually clearer and more engaging.
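The pattern above only recognises participles ending in "-ed", so it misses irregular ones ("was written") and anything with an adverb in between ("was quickly reviewed"). It can also flag adjectives such as "is red". One possible extension, which is not what the API does, adds an optional "-ly" adverb and a short list of irregular participles. The list and the function name are hypothetical and far from complete.

```javascript
// Hypothetical, deliberately short list of irregular past participles.
const IRREGULAR_PARTICIPLES = [
  "written", "taken", "given", "made", "done",
  "seen", "known", "built", "sent", "held",
];

function countPassiveVoiceExtended(text) {
  const pattern = new RegExp(
    "\\b(?:was|were|been|being|is|are|am|be)\\s+" + // "to be" verb
      "(?:\\w+ly\\s+)?" +                            // optional adverb
      "(?:\\w+ed|" + IRREGULAR_PARTICIPLES.join("|") + ")\\b",
    "gi"
  );
  return (text.match(pattern) || []).length;
}

console.log(countPassiveVoiceExtended("The letter was written by Ana."));   // 1
console.log(countPassiveVoiceExtended("The report was quickly reviewed.")); // 1
console.log(countPassiveVoiceExtended("The team reviewed the report."));    // 0
console.log(countPassiveVoiceExtended("The sky is red."));                  // 1 (false positive)
```

The false positive in the last line remains: separating adjectives from participles needs part-of-speech information that a regex cannot provide.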
Vocabulary Analysis
The /analyze/vocabulary endpoint returns:
```json
{
  "unique_words": 28,
  "lexical_diversity": 0.78,
  "long_words": 3,
  "short_words": 12,
  "transition_words": 2,
  "hedge_words": 1,
  "top_words": [
    { "word": "the", "count": 4 },
    { "word": "is", "count": 2 }
  ]
}
```
Lexical diversity is the ratio of unique words to total words. A score close to 1.0 means very varied vocabulary. A score close to 0 means lots of repetition.
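The ratio itself is a one-liner once the text is tokenized. A minimal sketch, assuming words are lowercased before comparison and split with a simple letters-and-apostrophes regex (both choices are assumptions, not necessarily what the API does):

```javascript
// Ratio of unique words to total words (a type-token ratio).
function lexicalDiversity(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  if (words.length === 0) return 0; // avoid dividing by zero on empty text
  return new Set(words).size / words.length;
}

console.log(lexicalDiversity("The cat and the dog")); // 0.8 (4 unique / 5 total)
console.log(lexicalDiversity("a a a a"));             // 0.25
```

Longer texts repeat common words more often, so this ratio tends to fall as length grows. It is most meaningful when comparing texts of similar length.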
The Compare Endpoint
My favourite endpoint. Send two versions of the same text and find out which is more readable:
```json
{
  "text_a": "The utilization of sophisticated vocabulary and convoluted sentence structures creates significant comprehension difficulties.",
  "text_b": "Using big words and long sentences makes text hard to read."
}
```
Response:
```json
{
  "text_a": { "flesch_reading_ease": 18.2, "grade_level": 16 },
  "text_b": { "flesch_reading_ease": 74.1, "grade_level": 7 },
  "more_readable": "text_b",
  "ease_difference": 55.9
}
```
Useful for A/B testing email subject lines, landing page copy or article introductions.
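The comparison step can be reproduced from two score objects. This sketch is a guess at the logic rather than the endpoint's actual code: it picks the text with the higher Flesch Reading Ease and rounds the gap to one decimal. The `"tie"` value for equal scores is an assumption, since the response above does not show that case.

```javascript
// Compares two score objects shaped like the text_a / text_b entries above.
function compareScores(scoresA, scoresB) {
  const diff = scoresA.flesch_reading_ease - scoresB.flesch_reading_ease;
  return {
    // Higher reading ease means more readable.
    more_readable: diff === 0 ? "tie" : diff > 0 ? "text_a" : "text_b",
    // Round to one decimal to hide floating-point noise.
    ease_difference: Math.round(Math.abs(diff) * 10) / 10,
  };
}

console.log(
  compareScores(
    { flesch_reading_ease: 18.2, grade_level: 16 },
    { flesch_reading_ease: 74.1, grade_level: 7 }
  )
); // { more_readable: 'text_b', ease_difference: 55.9 }
```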
Tech Stack
- Runtime: Node.js + Express
- Hosting: Railway (free tier)
- Marketplace: RapidAPI
- Dependencies: express, cors, helmet, morgan, express-rate-limit
Zero paid APIs. Zero AI. Pure math — costs almost nothing to run.
Try It
Live on RapidAPI — search Text Readability Scorer or find me at rapidapi.com/user/ruanmul04.
Free tier: 10 requests/month. No credit card required.
What's Next
Day 5 tomorrow — Email Validator & Disposable Email Checker API.
Every signup form needs email validation. I'll build format checking, disposable email detection and domain validation in one clean API.
Follow me here on dev.to to catch every day of the challenge. 🇿🇦
21 APIs in 21 days. Built in South Africa. Sold globally.