DEV Community

Snappy Tools

What Is the Automated Readability Index (ARI)? A Developer's Guide

If you've ever needed to score the readability of a piece of text programmatically, you've probably encountered the Flesch-Kincaid formula. But there's a lesser-known formula that is cheaper to compute and produces comparable grade-level estimates: the Automated Readability Index, or ARI.

This post explains what ARI is, how the formula works, how it compares to other readability metrics, and when to use it.


What Is the Automated Readability Index?

The Automated Readability Index (ARI) is a readability formula developed in 1967 for the US Air Force to evaluate the readability of technical documents and training manuals. It was designed to be computed automatically — without human syllable counting.

The formula is:

ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43

The result, rounded up to the nearest whole number, maps to a US school grade level:

ARI Score | Grade Level  | Age Range
1         | Kindergarten | 5–6
2         | 1st Grade    | 6–7
3         | 2nd Grade    | 7–8
4         | 3rd Grade    | 8–9
5         | 4th Grade    | 9–10
6         | 5th Grade    | 10–11
7         | 6th Grade    | 11–12
8         | 7th Grade    | 12–13
9         | 8th Grade    | 13–14
10        | 9th Grade    | 14–15
11        | 10th Grade   | 15–16
12        | 11th Grade   | 16–17
13        | 12th Grade   | 17–18
14        | College      | 18–22
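
To make the lookup concrete, here is a minimal Python sketch that applies the formula to pre-counted totals and maps the result to the grade labels in the table above. Rounding up follows the usual ARI convention; clamping to the 1–14 range, the function names, and the sample counts are assumptions made for illustration.

```python
import math

# Grade labels copied from the table above, keyed by rounded-up ARI score.
GRADE_LABELS = {
    1: "Kindergarten", 2: "1st Grade", 3: "2nd Grade", 4: "3rd Grade",
    5: "4th Grade", 6: "5th Grade", 7: "6th Grade", 8: "7th Grade",
    9: "8th Grade", 10: "9th Grade", 11: "10th Grade", 12: "11th Grade",
    13: "12th Grade", 14: "College",
}


def ari_score(chars: int, words: int, sentences: int) -> float:
    """Apply the ARI formula to pre-counted totals."""
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43


def grade_label(score: float) -> str:
    """Round the score up, then clamp it into the table's 1..14 range.

    Clamping is an assumption: the table has no rows below 1 or above 14.
    """
    level = min(max(math.ceil(score), 1), 14)
    return GRADE_LABELS[level]


# Hypothetical counts: 564 letters and digits, 120 words, 8 sentences.
score = ari_score(chars=564, words=120, sentences=8)
print(round(score, 2), grade_label(score))
```

With these sample counts the average word is 4.7 characters and the average sentence is 15 words, so the score is 4.71 × 4.7 + 0.5 × 15 − 21.43 ≈ 8.21, which rounds up to 9.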

Why Characters Instead of Syllables?

Most readability formulas (Flesch-Kincaid, Gunning Fog, SMOG) use syllable counts. Counting syllables accurately requires linguistic knowledge — you need a pronunciation dictionary or a syllabification algorithm.

ARI sidesteps this entirely. It uses character count per word instead. The reasoning: longer words (more characters) tend to have more syllables and be harder to read — so character count is a reasonable proxy, and it's trivially easy to compute.

In Python, implementing ARI from scratch takes about 10 lines:

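import re

def calculate_ari(text):
    # Split on sentence-ending punctuation and drop empty fragments,
    # so trailing punctuation does not add a phantom sentence.
    sentences = len([s for s in re.split(r'[.!?]+', text) if s.strip()])
    words = len(text.split())
    # ARI counts letters and digits only, not spaces or punctuation.
    chars = sum(c.isalnum() for c in text)
    if words == 0 or sentences == 0:
        return 0.0
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43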

No syllable dictionary required.
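
For contrast, here is a rough sketch of what a dictionary-free syllable counter tends to look like. The vowel-run heuristic below is an assumption chosen for illustration, not the method any particular readability library uses, and it shows why syllable counts are harder to get right than character counts.

```python
import re


def naive_syllables(word: str) -> int:
    """Count runs of vowels as syllables (a crude heuristic, not a syllabifier)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


# "table" is counted correctly (2), but "made" is overcounted
# (heuristic: 2, spoken: 1) and "rhythm" is undercounted
# (heuristic: 1, spoken: 2). len(word) has no such ambiguity.
for word in ["table", "made", "rhythm"]:
    print(word, len(word), naive_syllables(word))
```

Fixing cases like the silent "e" means adding special rules or a pronunciation dictionary, which is exactly the dependency ARI avoids.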


ARI vs. Flesch-Kincaid Grade Level

Both ARI and Flesch-Kincaid Grade Level output a US school grade number. How do they compare?

For most English text, they agree closely — typically within 0.5–1.5 grade levels.

The difference emerges with words that are long but phonetically simple. Consider "throughput": it has 10 characters but only 2 syllables. Flesch-Kincaid treats it as a fairly simple word; ARI treats it as complex because of its character length. This means ARI tends to score slightly higher than Flesch-Kincaid for texts heavy in long technical vocabulary.
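
The effect can be sketched by comparing only the word-length term of each formula. The Flesch-Kincaid Grade Level formula is 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, so its word term is 11.8 × syllables per word, against ARI's 4.71 × characters per word. The example words and their hand-counted values below are illustrative assumptions.

```python
def ari_word_term(chars_per_word: float) -> float:
    """Word-length term of ARI (sentence term and constant left out)."""
    return 4.71 * chars_per_word


def fk_word_term(syllables_per_word: float) -> float:
    """Word-length term of Flesch-Kincaid Grade Level (same omissions)."""
    return 11.8 * syllables_per_word


# (characters, syllables), counted by hand for two illustrative words.
SAMPLE_WORDS = {"throughput": (10, 2), "banana": (6, 3)}

for word, (chars, syllables) in SAMPLE_WORDS.items():
    print(word, round(ari_word_term(chars), 2), round(fk_word_term(syllables), 2))
```

The two terms are not on the same scale because the formulas subtract different constants, so only the ordering matters: ARI ranks "throughput" as the harder word, while Flesch-Kincaid ranks "banana" as harder.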

For everyday prose, the difference is negligible. Choose based on your use case:

  • ARI: faster to compute, no syllable detection needed, designed for technical writing
  • Flesch-Kincaid: more established in academic and educational publishing, better for literary analysis

Target ARI Scores by Content Type

Content Type            | Target ARI
Children's content      | 3–5
General web content     | 6–9
Marketing emails        | 6–8
News articles           | 8–10
Business reports        | 10–12
Technical documentation | 12–14
Academic papers         | 14–18+
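
One way to put these targets to work is a small lookup that flags text outside its intended range. This is a sketch only: the ranges are copied from the table above, while the names `TARGET_ARI` and `check_target`, and the choice to treat "18+" as having no upper bound, are assumptions.

```python
from typing import Optional

# (low, high) targets from the table above; None marks the open-ended "18+".
TARGET_ARI: dict[str, tuple[float, Optional[float]]] = {
    "Children's content": (3, 5),
    "General web content": (6, 9),
    "Marketing emails": (6, 8),
    "News articles": (8, 10),
    "Business reports": (10, 12),
    "Technical documentation": (12, 14),
    "Academic papers": (14, None),
}


def check_target(content_type: str, score: float) -> str:
    """Compare an ARI score against the target range for a content type."""
    low, high = TARGET_ARI[content_type]
    if score < low:
        return "below target"
    if high is not None and score > high:
        return "above target"
    return "on target"


print(check_target("Marketing emails", 10.5))
```

A check like this could run in a content pipeline to warn when, say, a marketing email reads at a business-report level.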

Implementing ARI in Other Languages

JavaScript:

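function calculateARI(text) {
  // Drop empty fragments so trailing punctuation or whitespace is not counted.
  const sentences = text.split(/[.!?]+/).filter(s => s.trim()).length;
  const words = text.split(/\s+/).filter(Boolean).length;
  // Letters and digits only (ASCII; widen the class for non-English text).
  const chars = text.replace(/[^a-zA-Z0-9]/g, '').length;
  if (!words || !sentences) return 0;
  return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43;
}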

Go:

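import (
    "regexp"
    "strings"
    "unicode"
)

// Compiled once at package level rather than on every call.
var sentenceRe = regexp.MustCompile(`[.!?]+`)

func calculateARI(text string) float64 {
    // Count only non-empty fragments so trailing punctuation
    // does not add a phantom sentence.
    sentences := 0
    for _, s := range sentenceRe.Split(text, -1) {
        if strings.TrimSpace(s) != "" {
            sentences++
        }
    }
    words := len(strings.Fields(text))
    // Letters and digits only, matching the other implementations.
    chars := 0
    for _, r := range text {
        if unicode.IsLetter(r) || unicode.IsDigit(r) {
            chars++
        }
    }
    if words == 0 || sentences == 0 {
        return 0
    }
    return 4.71*float64(chars)/float64(words) + 0.5*float64(words)/float64(sentences) - 21.43
}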

When Should You Use ARI?

ARI is a good choice when:

  1. You need a fast computation — no syllable dictionary, just character counting
  2. You're analysing technical writing — ARI was designed for this use case
  3. You're building a readability pipeline — ARI integrates cleanly as one of several metrics
  4. You need a grade-level output — ARI maps directly to US grade levels without additional conversion
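
As a sketch of the pipeline idea in point 3, the snippet below counts characters, words, and sentences once and feeds the totals to several metrics. It pairs ARI with Coleman-Liau, another character-based formula (0.0588 × L − 0.296 × S − 15.8, where L is letters per 100 words and S is sentences per 100 words). The registry layout and function names are assumptions, and passing letters plus digits to Coleman-Liau is a simplification.

```python
import re
from typing import Callable


def counts(text: str) -> tuple[int, int, int]:
    """Return (characters, words, sentences) using simple heuristics."""
    chars = sum(ch.isalnum() for ch in text)
    words = len(text.split())
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    return chars, words, sentences


def ari(chars: int, words: int, sentences: int) -> float:
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43


def coleman_liau(chars: int, words: int, sentences: int) -> float:
    # L = letters per 100 words, S = sentences per 100 words.
    # Simplification: chars here includes digits as well as letters.
    return 0.0588 * (100 * chars / words) - 0.296 * (100 * sentences / words) - 15.8


# Hypothetical registry: add more metrics with the same signature.
METRICS: dict[str, Callable[[int, int, int], float]] = {
    "ari": ari,
    "coleman_liau": coleman_liau,
}


def score_all(text: str) -> dict[str, float]:
    """Count once, then run every registered metric on the same totals."""
    chars, words, sentences = counts(text)
    if words == 0 or sentences == 0:
        return {name: 0.0 for name in METRICS}
    return {name: fn(chars, words, sentences) for name, fn in METRICS.items()}


print(score_all("The cat sat on the mat. It was warm."))
```

Because both metrics reuse the same three totals, adding another character-based formula costs one more function, not another pass over the text.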

If you want to run an ARI check on a piece of text right now without writing any code, you can use a free ARI checker online that calculates ARI alongside Flesch-Kincaid, Gunning Fog, SMOG, and Coleman-Liau — all in one pass, no signup.


Summary

The Automated Readability Index is a 1967 formula designed for fast, automatic readability scoring. It uses character-per-word ratios instead of syllable counts, making it simpler to implement than most alternatives. For most English texts, it produces results very close to Flesch-Kincaid Grade Level.

If you're building a readability scoring feature and want something lightweight with no linguistic dependencies, ARI is worth considering.
