If you've ever needed to score the readability of a piece of text programmatically, you've probably encountered the Flesch-Kincaid formula. But there's a lesser-known formula that is simpler to compute and gives comparable grade-level estimates: the Automated Readability Index, or ARI.
This post explains what ARI is, how the formula works, how it compares to other readability metrics, and when to use it.
What Is the Automated Readability Index?
The Automated Readability Index (ARI) is a readability formula developed in 1967 for the US Air Force to evaluate the readability of technical documents and training manuals. It was designed to be computed automatically — without human syllable counting.
The formula is:
ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43
The result is a number that maps to a US school grade level. Non-integer scores are conventionally rounded up to the next whole number:
| ARI Score | Grade Level | Age Range |
|---|---|---|
| 1 | Kindergarten | 5–6 |
| 2 | 1st Grade | 6–7 |
| 3 | 2nd Grade | 7–8 |
| 4 | 3rd Grade | 8–9 |
| 5 | 4th Grade | 9–10 |
| 6 | 5th Grade | 10–11 |
| 7 | 6th Grade | 11–12 |
| 8 | 7th Grade | 12–13 |
| 9 | 8th Grade | 13–14 |
| 10 | 9th Grade | 14–15 |
| 11 | 10th Grade | 15–16 |
| 12 | 11th Grade | 16–17 |
| 13 | 12th Grade | 17–18 |
| 14 | College | 18–22 |
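To make the mapping concrete, here is a minimal sketch that applies the formula to raw counts and looks up the grade label. The names `ari_from_counts` and `grade_label` are illustrative, and rounding up and clamping to the table's 1 to 14 range are assumptions based on the usual ARI convention, not something the table itself specifies.

```python
import math

# Grade labels from the table above, keyed by the rounded-up ARI score.
GRADE_LABELS = {
    1: "Kindergarten", 2: "1st Grade", 3: "2nd Grade", 4: "3rd Grade",
    5: "4th Grade", 6: "5th Grade", 7: "6th Grade", 8: "7th Grade",
    9: "8th Grade", 10: "9th Grade", 11: "10th Grade", 12: "11th Grade",
    13: "12th Grade", 14: "College",
}

def ari_from_counts(chars, words, sentences):
    """Apply the ARI formula to pre-computed counts."""
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43

def grade_label(score):
    """Round the score up and clamp it to the table's range (assumed convention)."""
    level = min(max(math.ceil(score), 1), 14)
    return GRADE_LABELS[level]

# Example: 500 characters, 100 words, 5 sentences.
# 4.71 * 5 + 0.5 * 20 - 21.43 is about 12.12, which rounds up to 13.
score = ari_from_counts(chars=500, words=100, sentences=5)
print(f"{score:.2f}", grade_label(score))
```

Scores below 1 or above 14 are possible for very simple or very dense text, which is why the lookup clamps rather than raising an error.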
Why Characters Instead of Syllables?
Most readability formulas (Flesch-Kincaid, Gunning Fog, SMOG) use syllable counts. Counting syllables accurately requires linguistic knowledge — you need a pronunciation dictionary or a syllabification algorithm.
ARI sidesteps this entirely. It uses character count per word instead. The reasoning: longer words (more characters) tend to have more syllables and be harder to read — so character count is a reasonable proxy, and it's trivially easy to compute.
In Python, implementing ARI from scratch takes about 10 lines:
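```python
import re

def calculate_ari(text):
    # Split on sentence-ending punctuation and drop empty fragments,
    # so a trailing "." does not count as an extra sentence.
    sentences = len([s for s in re.split(r'[.!?]+', text) if s.strip()])
    words = len(text.split())
    # Count letters and digits only; punctuation and spaces are ignored.
    chars = sum(c.isalnum() for c in text)
    if words == 0 or sentences == 0:
        return 0
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43
```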
No syllable dictionary required.
ARI vs. Flesch-Kincaid Grade Level
Both ARI and Flesch-Kincaid Grade Level output a US school grade number. How do they compare?
For most English text, they agree closely — typically within 0.5–1.5 grade levels.
The difference emerges with words that are long but phonetically simple. Consider "algorithm": it has 9 characters but only 3 or 4 syllables, depending on how you count. Flesch-Kincaid treats it as a moderately complex word; ARI treats it as more complex because of its character length. As a result, ARI tends to score slightly higher than Flesch-Kincaid for texts heavy in technical vocabulary.
For everyday prose, the difference is negligible. Choose based on your use case:
- ARI: faster to compute, no syllable detection needed, designed for technical writing
- Flesch-Kincaid: more established in academic and educational publishing, better for literary analysis
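To see how the two metrics diverge on a given text, the sketch below computes both from the same counts. The Flesch-Kincaid Grade Level formula is the standard one (0.39 × words per sentence + 11.8 × syllables per word − 15.59). The `count_syllables` helper is a deliberately crude vowel-group heuristic assumed here for illustration, not a real syllabifier, so treat its Flesch-Kincaid numbers as approximate. Word splitting also uses a regex rather than whitespace, which is another assumption of this sketch.

```python
import re

def count_syllables(word):
    """Crude heuristic (assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(len(groups), 1)

def text_counts(text):
    """Return (characters, words, sentences, syllables) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    chars = sum(ch.isalnum() for w in words for ch in w)
    syllables = sum(count_syllables(w) for w in words)
    return chars, len(words), len(sentences), syllables

def ari(text):
    chars, words, sentences, _ = text_counts(text)
    if words == 0 or sentences == 0:
        return 0.0
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43

def flesch_kincaid_grade(text):
    _, words, sentences, syllables = text_counts(text)
    if words == 0 or sentences == 0:
        return 0.0
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

sample = "The algorithm partitions the array. It then sorts each half recursively."
print(f"ARI: {ari(sample):.2f}  FK grade: {flesch_kincaid_grade(sample):.2f}")
```

Running both on a sample of your own content is a quick way to check whether the gap matters for your use case before committing to one metric.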
Target ARI Scores by Content Type
| Content Type | Target ARI |
|---|---|
| Children's content | 3–5 |
| General web content | 6–9 |
| Marketing emails | 6–8 |
| News articles | 8–10 |
| Business reports | 10–12 |
| Technical documentation | 12–14 |
| Academic papers | 14–18+ |
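If you want to act on these targets in code, one option is a small lookup that reports whether a score falls below, within, or above the range for a content type. This is only a sketch: the dictionary keys and the `check_target` name are hypothetical, and the ranges are copied from the table above.

```python
# Target ranges copied from the table above. The upper bound for academic
# papers is open-ended there ("14–18+"), so it is stored as None.
TARGET_ARI = {
    "children": (3, 5),
    "general_web": (6, 9),
    "marketing_email": (6, 8),
    "news": (8, 10),
    "business_report": (10, 12),
    "technical_docs": (12, 14),
    "academic": (14, None),
}

def check_target(score, content_type):
    """Return 'below', 'within', or 'above' relative to the target range."""
    low, high = TARGET_ARI[content_type]
    if score < low:
        return "below"
    if high is not None and score > high:
        return "above"
    return "within"

print(check_target(11.3, "general_web"))  # above
```

A result of "above" suggests shortening sentences or swapping long words; "below" is rarely a problem outside of academic or technical contexts.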
Implementing ARI in Other Languages
JavaScript:
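```javascript
function calculateARI(text) {
  // Drop empty fragments so trailing punctuation doesn't add a sentence.
  const sentences = text.split(/[.!?]+/).filter(s => s.trim()).length;
  // filter(Boolean) makes empty or whitespace-only input count as 0 words.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  // ASCII letters and digits only.
  const chars = text.replace(/[^a-zA-Z0-9]/g, '').length;
  if (!words || !sentences) return 0;
  return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43;
}
```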
Go:
```go
import (
	"regexp"
	"strings"
	"unicode"
)

// Compile once at package level instead of on every call.
var sentenceRe = regexp.MustCompile(`[.!?]+`)

func calculateARI(text string) float64 {
	// Count only non-empty fragments so trailing punctuation
	// does not add an extra sentence.
	sentences := 0
	for _, s := range sentenceRe.Split(text, -1) {
		if strings.TrimSpace(s) != "" {
			sentences++
		}
	}
	words := len(strings.Fields(text))
	chars := 0
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			chars++
		}
	}
	if words == 0 || sentences == 0 {
		return 0
	}
	return 4.71*float64(chars)/float64(words) + 0.5*float64(words)/float64(sentences) - 21.43
}
```
When Should You Use ARI?
ARI is a good choice when:
- You need a fast computation — no syllable dictionary, just character counting
- You're analysing technical writing — ARI was designed for this use case
- You're building a readability pipeline — ARI integrates cleanly as one of several metrics
- You need a grade-level output — ARI maps directly to US grade levels without additional conversion
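As an illustration of the pipeline point, the sketch below computes the shared counts once and feeds them to a registry of metric functions. The structure (`basic_counts`, `METRICS`, `score_text`) is hypothetical. Coleman-Liau is included as the second metric because it also needs no syllable counts; its formula (0.0588 × letters per 100 words − 0.296 × sentences per 100 words − 15.8) is the standard one.

```python
import re

def basic_counts(text):
    """Compute shared counts once so each metric does not re-scan the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "letters": sum(ch.isalpha() for ch in text),
        "chars": sum(ch.isalnum() for ch in text),
        "words": len(text.split()),
        "sentences": len(sentences),
    }

def ari(c):
    return 4.71 * c["chars"] / c["words"] + 0.5 * c["words"] / c["sentences"] - 21.43

def coleman_liau(c):
    letters_per_100 = 100 * c["letters"] / c["words"]
    sentences_per_100 = 100 * c["sentences"] / c["words"]
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

# Registry of metrics; adding another is one more entry here.
METRICS = {"ari": ari, "coleman_liau": coleman_liau}

def score_text(text):
    """Run every registered metric; return an empty dict for empty input."""
    c = basic_counts(text)
    if c["words"] == 0 or c["sentences"] == 0:
        return {}
    return {name: fn(c) for name, fn in METRICS.items()}

print(score_text("Readability formulas are cheap to compute. Run several and compare."))
```

Syllable-based metrics such as Flesch-Kincaid could be registered the same way once a syllable count is added to the shared counts.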
If you want to run an ARI check on a piece of text right now without writing any code, you can use a free ARI checker online that calculates ARI alongside Flesch-Kincaid, Gunning Fog, SMOG, and Coleman-Liau — all in one pass, no signup.
Summary
The Automated Readability Index is a 1967 formula designed for fast, automatic readability scoring. It uses the characters-per-word ratio instead of syllable counts, making it simpler to implement than most alternatives. For most English texts, it produces results very close to Flesch-Kincaid Grade Level.
If you're building a readability scoring feature and want something lightweight with no linguistic dependencies, ARI is worth considering.