Ruan Muller

I Built a Text Readability Scorer API with Flesch-Kincaid, Gunning Fog and More — Day 4 of 21

Day 4 of my 21-day API challenge is done.

Yesterday I built a VAT Number Validator. Today I built a Text Readability Scorer API — analyze any text and get back Flesch-Kincaid scores, grade level, reading time, vocabulary analysis and improvement suggestions.


The Problem

Every content platform, SEO tool and email marketing app needs to know how readable its text is. The questions developers keep asking:

  • Is this blog post too complex for a general audience?
  • What grade level is this email written at?
  • How long will it take to read this article?
  • Is this text full of passive voice and hedge words?

Building these algorithms from scratch takes time and the formulas are fiddly to get right. So I packaged all of it into one clean API.


What I Built

The Text Readability Scorer API — send it any text, get back a full readability report.

Input:

{
  "text": "The quick brown fox jumps over the lazy dog. This is a simple sentence. Reading should be easy and enjoyable for everyone. Short sentences help readers understand content faster."
}

Output:

{
  "success": true,
  "data": {
    "scores": {
      "flesch_reading_ease": 72.6,
      "flesch_kincaid_grade": 5.2,
      "gunning_fog_index": 6.8,
      "smog_index": 5.1,
      "coleman_liau_index": 7.3,
      "average_grade_level": 7
    },
    "interpretation": {
      "reading_ease_label": "Fairly Easy — readable by 7th grade",
      "grade_level_label": "Middle School",
      "reading_time": "10 seconds",
      "target_audience": "General public"
    },
    "style": {
      "passive_voice_count": 0,
      "passive_voice_percent": 0,
      "complex_words": 2,
      "complex_word_percent": 5.6
    },
    "suggestions": []
  }
}

The 6 Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /health | Health check |
| POST | /analyze | Full analysis, all fields |
| POST | /analyze/scores | Readability scores only |
| POST | /analyze/stats | Text statistics only |
| POST | /analyze/vocabulary | Vocabulary analysis only |
| POST | /compare | Compare two texts side by side |

The lightweight endpoints are useful for high-volume pipelines. If you only need the reading ease score, call /analyze/scores instead of the full /analyze endpoint.
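As a rough sketch of what a call to the lightweight endpoint might look like, the helper below builds (but does not send) the request. The base URL and key are placeholders, `buildScoresRequest` is a hypothetical name, and the body shape is assumed to match the `{ "text": ... }` input shown earlier; the real host and required headers come from the RapidAPI listing.

```javascript
// Hypothetical helper: builds a request for /analyze/scores without sending it.
// baseUrl and apiKey are placeholders, not real values.
function buildScoresRequest(baseUrl, apiKey, text) {
  return {
    url: `${baseUrl}/analyze/scores`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-RapidAPI-Key": apiKey, // RapidAPI's standard key header
      },
      body: JSON.stringify({ text }), // assumed to match the /analyze input
    },
  };
}

const req = buildScoresRequest(
  "https://example.invalid",
  "YOUR_KEY",
  "Short sentences help."
);
console.log(req.url); // https://example.invalid/analyze/scores
// To send it: const res = await fetch(req.url, req.options);
```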


The Algorithms

1. Flesch Reading Ease (0-100)

The most widely used readability formula. Higher = easier to read.

const readingEase = 206.835
  - (1.015  * avgWordsPerSentence)
  - (84.6   * avgSyllablesPerWord);
Scores map to these bands:

| Score | Level |
|-------|-------|
| 90-100 | Very Easy |
| 80-90 | Easy |
| 70-80 | Fairly Easy |
| 60-70 | Standard |
| 50-60 | Fairly Difficult |
| 30-50 | Difficult |
| 0-30 | Very Difficult |
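A small worked example of the formula, assuming the word, sentence and syllable counts are already available. The function names and the band lookup are illustrative, not the API's actual code; the band names follow the standard Flesch scale.

```javascript
// Flesch Reading Ease from raw counts (the counts are inputs here).
function fleschReadingEase(wordCount, sentenceCount, syllableCount) {
  const avgWordsPerSentence = wordCount / sentenceCount;
  const avgSyllablesPerWord = syllableCount / wordCount;
  return 206.835 - 1.015 * avgWordsPerSentence - 84.6 * avgSyllablesPerWord;
}

// Standard Flesch bands as [lower bound, label], highest first.
const EASE_BANDS = [
  [90, "Very Easy"],
  [80, "Easy"],
  [70, "Fairly Easy"],
  [60, "Standard"],
  [50, "Fairly Difficult"],
  [30, "Difficult"],
  [0, "Very Difficult"],
];

function easeLabel(score) {
  const band = EASE_BANDS.find(([min]) => score >= min);
  // The formula can go below 0 for very dense text.
  return band ? band[1] : "Very Difficult";
}

// 100 words, 5 sentences, 150 syllables:
// 206.835 - 1.015 * 20 - 84.6 * 1.5 = 59.635
const score = fleschReadingEase(100, 5, 150);
console.log(score.toFixed(1), easeLabel(score)); // 59.6 Fairly Difficult
```

Note that the formula is not clamped: very short, simple text can score above 100 and very dense text below 0, so a lookup like this needs a fallback at both ends.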

2. Flesch-Kincaid Grade Level

Maps the same two inputs as Flesch Reading Ease (average sentence length and average syllables per word) onto a US school grade level, using different weights.

const gradeLevel = (0.39  * avgWordsPerSentence)
                 + (11.8  * avgSyllablesPerWord)
                 - 15.59;
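To make the arithmetic concrete, here is a sketch that applies the formula to raw counts. `fleschKincaidGrade` is an illustrative name and the counts are made up for the example.

```javascript
// Flesch-Kincaid Grade Level from raw counts.
function fleschKincaidGrade(wordCount, sentenceCount, syllableCount) {
  const avgWordsPerSentence = wordCount / sentenceCount;
  const avgSyllablesPerWord = syllableCount / wordCount;
  return 0.39 * avgWordsPerSentence + 11.8 * avgSyllablesPerWord - 15.59;
}

// Same 100 words and 150 syllables, split into 5 vs 4 sentences.
const fiveSentences = fleschKincaidGrade(100, 5, 150); // 7.8 + 17.7 - 15.59
const fourSentences = fleschKincaidGrade(100, 4, 150); // 9.75 + 17.7 - 15.59
console.log(fiveSentences.toFixed(1)); // 9.9
console.log(fourSentences.toFixed(1)); // 11.9
```

Merging the same words into fewer, longer sentences raises the grade by about two levels here, which is why splitting long sentences is usually the cheapest readability fix.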

3. Gunning Fog Index

Estimates the years of formal education needed to understand the text on first reading.

const fog = 0.4 * (avgWordsPerSentence + percentComplexWords);

Complex words are defined as words with 3 or more syllables.
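A minimal sketch of the calculation from raw counts, assuming the complex-word count (words with 3 or more syllables) has already been computed. `gunningFog` is an illustrative name.

```javascript
// Gunning Fog from raw counts.
// complexWordCount = number of words with 3 or more syllables.
function gunningFog(wordCount, sentenceCount, complexWordCount) {
  const avgWordsPerSentence = wordCount / sentenceCount;
  // Must be a percentage (0-100), not a fraction (0-1).
  const percentComplexWords = (complexWordCount / wordCount) * 100;
  return 0.4 * (avgWordsPerSentence + percentComplexWords);
}

// 100 words, 5 sentences, 10 complex words: 0.4 * (20 + 10)
const fogScore = gunningFog(100, 5, 10);
console.log(fogScore.toFixed(1)); // 12.0
```

The easy mistake here is passing the complex-word ratio as a fraction; with 0.1 instead of 10 the same text would score about 8 instead of 12.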


4. Coleman-Liau Index

Unlike the others, this formula uses characters instead of syllables. Characters can be counted exactly, so the score does not depend on a syllable-counting heuristic.

const L   = (charsNoSpaces / words) * 100;
const S   = (sentences / words) * 100;
const cli = (0.0588 * L) - (0.296 * S) - 15.8;
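A sketch of how the three inputs might be derived from raw text. The tokenization here (split on whitespace, count runs of `.`, `!` or `?` as sentence ends) is an assumption, not necessarily what the API does. It keeps the post's `charsNoSpaces`, which also counts punctuation; the published formula counts letters only, so this variant reads slightly high.

```javascript
// Coleman-Liau from raw text, under assumed tokenization rules.
function colemanLiau(text) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  if (words === 0) return 0; // avoid dividing by zero on empty input

  const charsNoSpaces = text.replace(/\s/g, "").length;
  // Treat each run of ., ! or ? as one sentence end; at least 1 sentence.
  const sentences = (text.match(/[.!?]+/g) || []).length || 1;

  const L = (charsNoSpaces / words) * 100; // characters per 100 words
  const S = (sentences / words) * 100;     // sentences per 100 words
  return 0.0588 * L - 0.296 * S - 15.8;
}

const cli = colemanLiau("Short sentences help readers understand content faster.");
console.log(cli.toFixed(2));
```

On very short inputs like this one the result swings widely, because a single long word moves the per-100-words average a lot. The index is more meaningful on a paragraph or more.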

Syllable Counting

The trickiest part of the whole API. English syllable counting is notoriously difficult because of all the exceptions.

My approach:

function countSyllables(word) {
  word = word.toLowerCase().replace(/[^a-z]/g, "");
  if (word.length <= 2) return 1;

  // Remove silent endings
  word = word.replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, "");
  word = word.replace(/^y/, "");

  // Count vowel groups
  const matches = word.match(/[aeiouy]{1,2}/g);
  return matches ? matches.length : 1;
}

It handles the most common cases well — not perfect for every word in English but accurate enough for readability scoring purposes.
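To show where the heuristic holds and where it slips, the snippet below repeats the function unchanged (so it runs on its own) and feeds it a few words. The word list is just an illustration.

```javascript
// Same function as above, repeated so this snippet is self-contained.
function countSyllables(word) {
  word = word.toLowerCase().replace(/[^a-z]/g, "");
  if (word.length <= 2) return 1;
  word = word.replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, "");
  word = word.replace(/^y/, "");
  const matches = word.match(/[aeiouy]{1,2}/g);
  return matches ? matches.length : 1;
}

console.log(countSyllables("readability")); // 5 (correct)
console.log(countSyllables("simple"));      // 2 (correct)
console.log(countSyllables("jumped"));      // 1 (correct)
console.log(countSyllables("wanted"));      // 1 (actual: 2, "-ed" is voiced after t)
console.log(countSyllables("beautiful"));   // 4 (actual: 3, "eau" is split into "ea" + "u")
```

The two misses come from the rules themselves: every trailing "ed" is stripped as silent, and `{1,2}` breaks a run of three vowels into two groups. Across a full paragraph these errors tend to be small relative to the total, which is why the averages stay usable.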


Passive Voice Detection

function countPassiveVoice(text) {
  // Look for "to be" verb + past participle pattern
  const passivePattern = /\b(was|were|been|being|is|are|am|be)\s+\w+ed\b/gi;
  const matches = text.match(passivePattern) || [];
  return matches.length;
}

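The API flags passive voice because active voice is usually clearer and more direct. The pattern is a heuristic: it only matches a form of "to be" followed directly by a word ending in -ed, so irregular participles such as "written" are missed.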
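A quick check of the pattern against three sentences, with the function repeated so the snippet runs on its own. The sentences are made up for illustration.

```javascript
// Same function as above, repeated so this snippet is self-contained.
function countPassiveVoice(text) {
  const passivePattern = /\b(was|were|been|being|is|are|am|be)\s+\w+ed\b/gi;
  const matches = text.match(passivePattern) || [];
  return matches.length;
}

// Regular participle: caught ("was kicked").
console.log(countPassiveVoice("The ball was kicked by John."));        // 1
// Irregular participle: missed ("written" does not end in -ed).
console.log(countPassiveVoice("The report was written by the team.")); // 0
// False positive: "is indeed" matches, but nothing here is passive.
console.log(countPassiveVoice("He is indeed happy."));                 // 1
```

So the count is best read as a rough signal for a reviewer rather than an exact tally.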


Vocabulary Analysis

The /analyze/vocabulary endpoint returns:

{
  "unique_words": 28,
  "lexical_diversity": 0.78,
  "long_words": 3,
  "short_words": 12,
  "transition_words": 2,
  "hedge_words": 1,
  "top_words": [
    { "word": "the", "count": 4 },
    { "word": "is", "count": 2 }
  ]
}

Lexical diversity is the ratio of unique words to total words. A score close to 1.0 means very varied vocabulary. A score close to 0 means lots of repetition.
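A minimal sketch of that ratio, assuming words are lowercased and split on anything that is not a letter or apostrophe. The API's actual tokenization may differ, and `lexicalDiversity` is an illustrative name.

```javascript
// Lexical diversity = unique words / total words.
function lexicalDiversity(text) {
  // Assumed tokenization: lowercase runs of letters and apostrophes.
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  if (words.length === 0) return 0;
  return new Set(words).size / words.length;
}

// 6 words, 5 unique ("the" appears twice).
console.log(lexicalDiversity("The cat sat on the mat").toFixed(2)); // 0.83
// 4 words, 1 unique.
console.log(lexicalDiversity("a a a a").toFixed(2));                // 0.25
```

The ratio naturally drops as text gets longer, because common words like "the" keep repeating, so it is only fair to compare texts of similar length.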


The Compare Endpoint

My favourite endpoint. Send two versions of the same text and find out which is more readable:

{
  "text_a": "The utilization of sophisticated vocabulary and convoluted sentence structures creates significant comprehension difficulties.",
  "text_b": "Using big words and long sentences makes text hard to read."
}

Response:

{
  "text_a": { "flesch_reading_ease": 18.2, "grade_level": 16 },
  "text_b": { "flesch_reading_ease": 74.1, "grade_level": 7  },
  "more_readable": "text_b",
  "ease_difference": 55.9
}

Useful for A/B testing email subject lines, landing page copy or article introductions.
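The comparison fields in that response can be derived from the two reading ease scores alone. The sketch below is one plausible way to do it, not the API's actual code: rounding to one decimal and returning "tie" for equal scores are assumptions.

```javascript
// Derive the comparison fields from two Flesch Reading Ease scores.
function compareScores(easeA, easeB) {
  // Round to one decimal to hide floating-point noise.
  const easeDifference = Math.round(Math.abs(easeA - easeB) * 10) / 10;
  let moreReadable = "tie"; // assumed value when scores are equal
  if (easeA > easeB) moreReadable = "text_a";
  if (easeB > easeA) moreReadable = "text_b";
  return { more_readable: moreReadable, ease_difference: easeDifference };
}

const result = compareScores(18.2, 74.1);
console.log(result); // { more_readable: 'text_b', ease_difference: 55.9 }
```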


Tech Stack

  • Runtime: Node.js + Express
  • Hosting: Railway (free tier)
  • Marketplace: RapidAPI
  • Dependencies: express, cors, helmet, morgan, express-rate-limit

Zero paid APIs. Zero AI. Pure math — costs almost nothing to run.


Try It

Live on RapidAPI — search Text Readability Scorer or find me at rapidapi.com/user/ruanmul04.

Free tier: 10 requests/month. No credit card required.


What's Next

Day 5 tomorrow — Email Validator & Disposable Email Checker API.

Every signup form needs email validation. I'll build format checking, disposable email detection and domain validation in one clean API.

Follow me here on dev.to to catch every day of the challenge. 🇿🇦


21 APIs in 21 days. Built in South Africa. Sold globally.
