ckmtools

Posted on Mar 8

I Replaced 5 npm Packages with One for Text Analysis

#javascript #webdev #showdev #npm

Every project I built that needed text analysis ended up with the same dependency list:

npm install flesch flesch-kincaid coleman-liau automated-readability sentiment

Five packages. Five different APIs. Five entries in package.json. And none of them share the underlying text parsing work: the readability packages expect pre-computed counts, so you tokenize the string yourself, and sentiment tokenizes it all over again.

I got tired of it and built textlens.

The Problem

Say you want to score an article's readability and sentiment. Here's what that looked like before:

const flesch = require('flesch');
const fleschKincaid = require('flesch-kincaid');
const colemanLiau = require('coleman-liau');
const ari = require('automated-readability');
const Sentiment = require('sentiment');

// Each package needs its own input format
const counts = {
  sentence: countSentences(text),
  word: countWords(text),
  syllable: countSyllables(text) // you write this yourself
};

const fleschScore = flesch(counts);
const fkGrade = fleschKincaid(counts);
const clIndex = colemanLiau({
  sentence: counts.sentence,
  word: counts.word,
  letter: countLetters(text) // different input shape
});
const ariScore = ari({
  sentence: counts.sentence,
  word: counts.word,
  character: countCharacters(text) // yet another shape
});

const sentiment = new Sentiment();
const sentimentResult = sentiment.analyze(text);

Notice the problems:

You still need to write countSyllables() and the other counting helpers yourself (or add a sixth package)
Each package expects a slightly different input object
None of them share the text parsing: every counting helper, plus sentiment, scans the string separately
No TypeScript types (or outdated ones)

The After

npm install textlens

import { analyze } from 'textlens';

const result = analyze(text);

console.log(result.readability.fleschReadingEase.score);  // 0-100
console.log(result.readability.fleschKincaidGrade.grade); // US grade level
console.log(result.readability.colemanLiau.grade);        // grade level
console.log(result.readability.automatedReadability.grade); // grade level
console.log(result.readability.consensusGrade);           // weighted avg
console.log(result.sentiment.label);                      // 'positive' | 'negative' | 'neutral'
console.log(result.keywords);                             // TF-IDF ranked
console.log(result.readingTime.minutes);                  // estimated read time

One import. One function call. The text is parsed once.
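To make "parsed once" concrete, here is a minimal sketch of the idea. It is not textlens's actual internals: `textStats()` is a hypothetical helper with deliberately naive regexes, and the two formulas are the standard published Coleman-Liau and ARI equations reading from the same shared counts.

```javascript
// Hypothetical helper: scan the text once and return every count the formulas need.
function textStats(text) {
  const words = text.match(/[A-Za-z0-9']+/g) || [];
  // Naive sentence split: any run of ., ! or ? ends a sentence.
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const letters = (text.match(/[A-Za-z]/g) || []).length;
  const characters = (text.match(/[A-Za-z0-9]/g) || []).length;
  return {
    word: words.length,
    sentence: sentences.length,
    letter: letters,
    character: characters,
  };
}

// Coleman-Liau: 0.0588 * L - 0.296 * S - 15.8,
// where L = letters per 100 words and S = sentences per 100 words.
function colemanLiau({ letter, word, sentence }) {
  const L = (letter / word) * 100;
  const S = (sentence / word) * 100;
  return 0.0588 * L - 0.296 * S - 15.8;
}

// Automated Readability Index: 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43.
function automatedReadability({ character, word, sentence }) {
  return 4.71 * (character / word) + 0.5 * (word / sentence) - 21.43;
}

// Both formulas reuse the same stats object instead of re-scanning the text.
const stats = textStats('The cat sat on the mat. It was happy!');
console.log(stats);
console.log(colemanLiau(stats), automatedReadability(stats));
```

Very short, simple text produces negative grades with these formulas, which is expected: they were calibrated on longer passages.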

What You Get

Here's the full comparison table:

Capability	Separate packages	textlens
Flesch Reading Ease	`flesch`	`readability(text).fleschReadingEase`
Flesch-Kincaid Grade	`flesch-kincaid`	`readability(text).fleschKincaidGrade`
Coleman-Liau Index	`coleman-liau`	`readability(text).colemanLiau`
Automated Readability	`automated-readability`	`readability(text).automatedReadability`
Sentiment analysis	`sentiment`	`sentiment(text)`
Keyword extraction	`keyword-extractor`	`keywords(text)`
Reading time	`reading-time`	`readingTime(text)`
Gunning Fog, SMOG, Dale-Chall, Linsear Write	no popular package	Included
Keyword density (n-grams)	no popular package	`density(text)`
SEO scoring	no popular package	`seoScore(text)`
Extractive summarization	no popular package	`summarize(text)`
CLI tool	assemble yourself	`npx textlens file.txt`
TypeScript types	varies	Built-in

Individual Functions

You don't have to use the kitchen-sink analyze() call. Each function works standalone:

import { readability, sentiment, keywords, readingTime, seoScore, summarize } from 'textlens';

// Just readability
const r = readability(text);
console.log(r.consensusGrade); // weighted average of 8 formulas

// Just sentiment
const s = sentiment('I love this product, it works great');
console.log(s.label);      // 'positive'
console.log(s.positive);   // ['love', 'great']

// Just keywords
const kw = keywords(text, { topN: 5, minLength: 3 });
// [{ word: 'readability', score: 0.42, count: 3, density: 0.05 }, ...]

// SEO scoring
const seo = seoScore(text, { targetKeyword: 'typescript' });
console.log(seo.grade);       // 'A' through 'F'
console.log(seo.suggestions); // actionable tips

// Summarize
const summary = summarize(text, { sentences: 2 });
console.log(summary.sentences); // top 2 sentences by importance

CLI Included

No code needed for quick checks:

$ npx textlens article.txt

  Words:      1,247
  Sentences:  68
  Grade:      8.2 (8th grade)
  Flesch:     62.4 (Standard)
  Sentiment:  Positive (0.12)
  Read time:  5 min

$ npx textlens article.txt --json | jq '.readability.consensusGrade'
8.2

$ npx textlens article.txt --keywords 5
  readability   0.42
  typescript    0.38
  analysis      0.31
  sentiment     0.27
  text          0.24

Technical Decisions

A few choices I made and why:

Zero dependencies. The AFINN-165 sentiment lexicon (~3,300 words) and Dale-Chall word list (~3,000 words) are bundled as TypeScript objects. Install size is ~200KB. No transitive dependency surprises.

Consensus grade. Individual readability formulas can vary by 2-3 grade levels on the same text. The consensus grade is a weighted average across all 8 formulas, which gives a more stable estimate than any single formula.
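As a rough illustration of the averaging step, here is a sketch under stated assumptions: the function name, the formula keys, the sample grades, and the weights are all hypothetical, since the actual weights textlens uses are not listed here.

```javascript
// Hypothetical sketch: combine per-formula grade levels into one estimate.
// Weights default to 1 (a plain average); pass overrides to weight formulas differently.
function consensusGrade(grades, weights = {}) {
  let total = 0;
  let weightSum = 0;
  for (const [name, grade] of Object.entries(grades)) {
    if (!Number.isFinite(grade)) continue; // skip formulas that produced no usable grade
    const w = weights[name] ?? 1;
    total += grade * w;
    weightSum += w;
  }
  return weightSum === 0 ? NaN : total / weightSum;
}

// Made-up grades that disagree by about 2.5 levels on the same text.
const grades = {
  fleschKincaid: 7.5,
  colemanLiau: 9.5,
  automatedReadability: 8.0,
  gunningFog: 10.0,
};

console.log(consensusGrade(grades));                      // 8.75 with equal weights
console.log(consensusGrade(grades, { gunningFog: 0.5 })); // down-weights one formula
```

The point of the design is visible even in this toy: one outlier formula moves the combined number far less than it would if you picked that formula alone.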

Syllable counting. Uses a rule-based algorithm with English syllable patterns. About 95% accurate on standard text. Good enough for readability formulas, which are themselves approximations.
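For a sense of what "rule-based" means here, this is a minimal sketch of a common vowel-group heuristic. It is an assumption for illustration, not the algorithm textlens ships, and it miscounts plenty of irregular words (for example, it over-counts "beautiful").

```javascript
// Heuristic syllable counter: strip common silent endings, then count vowel groups.
function countSyllables(word) {
  const w = word.toLowerCase().replace(/[^a-z]/g, '');
  if (w.length === 0) return 0;
  if (w.length <= 3) return 1; // short words are treated as one syllable
  const trimmed = w
    // Drop silent endings such as "-es", "-ed", and a trailing "e" (but keep "-le" as in "table").
    .replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, '')
    // A leading "y" acts as a consonant ("yellow").
    .replace(/^y/, '');
  const groups = trimmed.match(/[aeiouy]{1,2}/g);
  return groups ? groups.length : 1; // every word has at least one syllable
}

// Sum syllables across all words in a text.
function countTextSyllables(text) {
  const words = text.match(/[A-Za-z']+/g) || [];
  return words.reduce((sum, word) => sum + countSyllables(word), 0);
}

console.log(countSyllables('readability'));              // 5
console.log(countTextSyllables('The cat made a table')); // 6
```

Errors like these mostly wash out when averaged over a full article, which is why a heuristic is usually acceptable as input to readability formulas.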

Lexicon-based sentiment. I chose AFINN-165 over ML approaches intentionally. It's fast, deterministic, and runs anywhere (no GPU, no API calls). The trade-off is accuracy: it agrees with human judgment only about 60% of the time. If you need better accuracy, you need a different class of tool.
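The lexicon approach is simple enough to sketch in a few lines. The version below is illustrative only: `LEXICON` is a tiny hand-picked word list using AFINN-style scores (-5 to +5), not the bundled AFINN-165 data, and `scoreSentiment()` is a hypothetical name rather than the textlens API.

```javascript
// Tiny illustrative lexicon with AFINN-style scores. A Map avoids accidental
// hits on inherited object keys such as "constructor".
const LEXICON = new Map([
  ['love', 3],
  ['great', 3],
  ['good', 3],
  ['bad', -3],
  ['hate', -3],
  ['terrible', -3],
]);

function scoreSentiment(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  const positive = [];
  const negative = [];
  let score = 0;
  for (const token of tokens) {
    const value = LEXICON.get(token);
    if (value === undefined) continue; // word carries no sentiment in the lexicon
    score += value;
    (value > 0 ? positive : negative).push(token);
  }
  // Normalize by length so long texts are comparable with short ones.
  const comparative = tokens.length ? score / tokens.length : 0;
  const label = score > 0 ? 'positive' : score < 0 ? 'negative' : 'neutral';
  return { score, comparative, label, positive, negative };
}

console.log(scoreSentiment('I love this product, it works great'));
console.log(scoreSentiment('not good').label); // 'positive': this sketch ignores negation
```

The second call shows where the accuracy ceiling comes from: a plain word lookup has no notion of negation, sarcasm, or context, but the same input always yields the same result.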

Install

npm install textlens

Ships ESM and CommonJS with full TypeScript types.

If you're juggling multiple text analysis packages, give it a try. Feedback and issues welcome on GitHub.
