What Is the Automated Readability Index (ARI)? A Developer's Guide


If you've ever needed to score the readability of a piece of text programmatically, you've probably encountered the Flesch-Kincaid formula. But there's a lesser-known formula that's actually faster to compute and produces similarly accurate results: the Automated Readability Index, or ARI.

This post explains what ARI is, how the formula works, how it compares to other readability metrics, and when to use it.

What Is the Automated Readability Index?

The Automated Readability Index (ARI) is a readability formula developed in 1967 for the US Air Force to evaluate the readability of technical documents and training manuals. It was designed to be computed automatically — without human syllable counting.

The formula is:

ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43

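The result is a number that approximates the US school grade level needed to understand the text. Fractional scores are conventionally rounded up to the next whole number and then read off the table below: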

ARI Score	Grade Level	Age Range
1	Kindergarten	5–6
2	1st Grade	6–7
3	2nd Grade	7–8
4	3rd Grade	8–9
5	4th Grade	9–10
6	5th Grade	10–11
7	6th Grade	11–12
8	7th Grade	12–13
9	8th Grade	13–14
10	9th Grade	14–15
11	10th Grade	15–16
12	11th Grade	16–17
13	12th Grade	17–18
14	College	18–22
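To make the arithmetic concrete, here is a minimal sketch that plugs made-up counts into the formula and looks up the label from the table above. The function names (`ari_from_counts`, `grade_label`) and the sample counts are illustrative assumptions, and the lookup assumes the common convention of rounding fractional scores up and clamping anything outside 1–14.

```python
import math

# Labels from the table above; index 0 corresponds to an ARI score of 1.
GRADE_LABELS = [
    "Kindergarten", "1st Grade", "2nd Grade", "3rd Grade", "4th Grade",
    "5th Grade", "6th Grade", "7th Grade", "8th Grade", "9th Grade",
    "10th Grade", "11th Grade", "12th Grade", "College",
]

def ari_from_counts(chars, words, sentences):
    """Apply the ARI formula to pre-computed counts."""
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43

def grade_label(score):
    """Round up (assumed convention) and clamp to the table's 1-14 range."""
    level = min(max(math.ceil(score), 1), len(GRADE_LABELS))
    return GRADE_LABELS[level - 1]

# Hypothetical text: 450 letters/digits, 100 words, 5 sentences.
# 4.71 * 4.5 + 0.5 * 20 - 21.43 = 9.765
score = ari_from_counts(chars=450, words=100, sentences=5)
print(round(score, 1), grade_label(score))  # 9.8 9th Grade
```

Very short, simple text can produce scores below 1, and dense text can exceed 14, which is why the sketch clamps before the lookup.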

Why Characters Instead of Syllables?

Most readability formulas (Flesch-Kincaid, Gunning Fog, SMOG) use syllable counts. Counting syllables accurately requires linguistic knowledge — you need a pronunciation dictionary or a syllabification algorithm.

ARI sidesteps this entirely. It uses character count per word instead. The reasoning: longer words (more characters) tend to have more syllables and be harder to read — so character count is a reasonable proxy, and it's trivially easy to compute.
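A quick way to see the difference is to put a character count next to a naive syllable counter. The sketch below is illustrative only: `naive_syllables` is a hypothetical vowel-run heuristic, not how any particular library counts syllables, and it is deliberately simple to show where such shortcuts go wrong.

```python
import re

def char_count(word):
    """The signal ARI uses: letters and digits in the word."""
    return sum(c.isalnum() for c in word)

def naive_syllables(word):
    """Hypothetical heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

for word in ["cat", "make", "rhythm"]:
    print(word, char_count(word), naive_syllables(word))
```

The heuristic reports 2 syllables for "make" (it has 1, because of the silent e) and 1 for "rhythm" (it has 2). Fixing cases like these is what pushes syllable-based formulas toward dictionaries or more elaborate rules, while the character count needs no special cases.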

In Python, implementing ARI from scratch takes about 10 lines:

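import re

def calculate_ari(text):
    # Split on sentence-ending punctuation and drop empty pieces, such as
    # the one re.split leaves after a final period.
    sentences = len([s for s in re.split(r'[.!?]+', text) if s.strip()])
    # Words are whitespace-separated tokens.
    words = len(text.split())
    # Count letters and digits only; spaces and punctuation are ignored.
    chars = sum(c.isalnum() for c in text)
    if words == 0 or sentences == 0:
        return 0.0
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43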

No syllable dictionary required.

ARI vs. Flesch-Kincaid Grade Level

Both ARI and Flesch-Kincaid Grade Level output a US school grade number. How do they compare?

For most English text, they agree closely — typically within 0.5–1.5 grade levels.

The difference emerges with words that are long but phonetically simple. Consider "throughput": it has 10 characters but only 2 syllables. Flesch-Kincaid treats it as a fairly simple word; ARI treats it as more complex because of its character length. This means ARI tends to score slightly higher than Flesch-Kincaid for texts heavy in long, low-syllable vocabulary.

For everyday prose, the difference is negligible. Choose based on your use case:

ARI: faster to compute, no syllable detection needed, designed for technical writing
Flesch-Kincaid: more established in academic and educational publishing, better for literary analysis
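If you want to see how far the two scores diverge on your own text, a side-by-side sketch like the one below can help. It uses the standard Flesch-Kincaid Grade Level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, but the syllable counter is a hypothetical vowel-run heuristic standing in for a real one, so treat the Flesch-Kincaid number as approximate. The names `estimate_syllables` and `compare_grades` are assumptions for illustration.

```python
import re

def estimate_syllables(word):
    # Hypothetical heuristic: one syllable per run of vowels, minimum one.
    # A production Flesch-Kincaid score would need something more accurate.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def compare_grades(text):
    """Return (ARI, approximate Flesch-Kincaid grade) for the same text."""
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    words = text.split()
    if not words or sentences == 0:
        return 0.0, 0.0
    n = len(words)
    chars = sum(c.isalnum() for c in text)
    syllables = sum(estimate_syllables(w) for w in words)
    ari = 4.71 * (chars / n) + 0.5 * (n / sentences) - 21.43
    fk = 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59
    return ari, fk

sample = "The cat sat on the mat. The dog ran to the park."
ari, fk = compare_grades(sample)
print(f"ARI {ari:.2f} vs Flesch-Kincaid {fk:.2f}")
```

Note that only the Flesch-Kincaid half needs the syllable estimate; the ARI half relies on counts that are exact by construction.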

Target ARI Scores by Content Type

Content Type	Target ARI
Children's content	3–5
General web content	6–9
Marketing emails	6–8
News articles	8–10
Business reports	10–12
Technical documentation	12–14
Academic papers	14–18+
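These ranges are easy to turn into an automated check, for example in a content linter. The sketch below is one possible shape, not a standard API: `TARGET_ARI` and `check_target` are hypothetical names, and the open-ended "18+" for academic papers is modelled as having no upper bound.

```python
import math

# Target ranges from the table above as (low, high), inclusive.
TARGET_ARI = {
    "children's content": (3, 5),
    "general web content": (6, 9),
    "marketing emails": (6, 8),
    "news articles": (8, 10),
    "business reports": (10, 12),
    "technical documentation": (12, 14),
    "academic papers": (14, math.inf),  # "14-18+" treated as open-ended
}

def check_target(score, content_type):
    """Report whether an ARI score falls inside the target range."""
    low, high = TARGET_ARI[content_type.lower()]
    if score < low:
        return "below target"
    if score > high:
        return "above target"
    return "within target"

print(check_target(9.8, "News articles"))    # within target
print(check_target(11.0, "Marketing emails"))  # above target
```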

Implementing ARI in Other Languages

JavaScript:

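function calculateARI(text) {
  // Split on sentence-ending punctuation and drop empty pieces.
  const sentences = text.split(/[.!?]+/).filter(s => s.trim()).length;
  // filter(Boolean) keeps empty input from counting as one word.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  // ASCII letters and digits only; accented letters are not counted here.
  const chars = text.replace(/[^a-zA-Z0-9]/g, '').length;
  if (!words || !sentences) return 0;
  return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43;
}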

Go:

import (
    "regexp"
    "strings"
    "unicode"
)

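func calculateARI(text string) float64 {
    // Count non-empty pieces between sentence-ending punctuation. Split
    // returns a trailing empty string when the text ends with a period,
    // so counting its raw length would overstate the sentence count.
    sentenceRe := regexp.MustCompile(`[.!?]+`)
    sentences := 0
    for _, s := range sentenceRe.Split(text, -1) {
        if strings.TrimSpace(s) != "" {
            sentences++
        }
    }
    // Words are whitespace-separated tokens.
    words := len(strings.Fields(text))
    // Count letters and digits only.
    chars := 0
    for _, r := range text {
        if unicode.IsLetter(r) || unicode.IsDigit(r) {
            chars++
        }
    }
    if words == 0 || sentences == 0 {
        return 0
    }
    return 4.71*float64(chars)/float64(words) + 0.5*float64(words)/float64(sentences) - 21.43
}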

When Should You Use ARI?

ARI is a good choice when:

You need a fast computation — no syllable dictionary, just character counting
You're analysing technical writing — ARI was designed for this use case
You're building a readability pipeline — ARI integrates cleanly as one of several metrics
You need a grade-level output — ARI maps directly to US grade levels without additional conversion
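To illustrate the pipeline point, here is a rough sketch in which the text is scanned once and the resulting counts are shared by several metrics. Everything here (`TextStats`, `collect_stats`, `METRICS`, `score_text`) is a hypothetical design, not an existing library. Coleman-Liau is included as the second metric because it also needs no syllables; its formula is 0.0588 × L − 0.296 × S − 15.8, where L is letters per 100 words and S is sentences per 100 words. For simplicity the sketch reuses the letters-and-digits count for L, which is an approximation.

```python
import re
from dataclasses import dataclass

@dataclass
class TextStats:
    chars: int      # letters and digits
    words: int
    sentences: int

def collect_stats(text):
    """Scan the text once; every metric below reuses these counts."""
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    return TextStats(
        chars=sum(c.isalnum() for c in text),
        words=len(text.split()),
        sentences=sentences,
    )

def ari(stats):
    return 4.71 * (stats.chars / stats.words) + 0.5 * (stats.words / stats.sentences) - 21.43

def coleman_liau(stats):
    letters_per_100 = stats.chars / stats.words * 100
    sentences_per_100 = stats.sentences / stats.words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

# Adding a metric means adding one function and one entry here.
METRICS = {"ari": ari, "coleman_liau": coleman_liau}

def score_text(text):
    stats = collect_stats(text)
    if stats.words == 0 or stats.sentences == 0:
        return {name: 0.0 for name in METRICS}
    return {name: fn(stats) for name, fn in METRICS.items()}

print(score_text("The cat sat on the mat. The dog ran to the park."))
```

A syllable-based metric could slot into the same registry, but it would need an extra field on `TextStats` and a syllable counter, which is exactly the dependency ARI avoids.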

If you want to run an ARI check on a piece of text right now without writing any code, you can use a free ARI checker online that calculates ARI alongside Flesch-Kincaid, Gunning Fog, SMOG, and Coleman-Liau — all in one pass, no signup.

Summary

The Automated Readability Index is a 1967 formula designed for fast, automatic readability scoring. It uses character-per-word ratios instead of syllable counts, making it simpler to implement than most alternatives. For most English texts, it produces results very close to Flesch-Kincaid Grade Level.

If you're building a readability scoring feature and want something lightweight with no linguistic dependencies, ARI is worth considering.