CS50 Week 6 - Readability in Python

The Problem

The task here is to write a Python program that determines the "grade" (or reading level) of a passage of text using a formula called the Coleman-Liau index.

The Solution

First things first, we're going to import the get_string function from the cs50 library and also import Python's built-in string module. The string module will be useful later for recognising punctuation and whitespace.

from cs50 import get_string
import string

Next we get the user input and declare our counters. words starts at 1 because we count spaces to find how many words there are in a passage of text. There normally isn't a space before the first word, so the number of words is generally the number of spaces + 1.

# Get user input
sentence = get_string("Text: ")

# Declare variables
letters = 0
words = 1
sentences = 0
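As a quick sanity check of the "spaces + 1" idea, here is a minimal sketch. The sample text is made up for illustration, and the comparison with str.split() is an aside rather than part of the solution.

```python
# Hypothetical sample text; any single-spaced passage works the same way.
text = "Congratulations! Today is your day."

# Count the spaces and add one for the first word.
words_from_spaces = text.count(" ") + 1

# str.split() with no arguments splits on runs of whitespace, so it gives
# the same answer here and would also tolerate accidental double spaces.
words_from_split = len(text.split())

print(words_from_spaces, words_from_split)  # 5 5
```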

To find how many sentences there are in our text, we check whether the current character is an exclamation mark, question mark or full stop.
If the character is any other punctuation (such as a comma), we don't want to increment anything, so we just continue. This check has to come before the else block so that our program won't count a comma as a letter.
To find the number of words, we count the whitespace characters, as mentioned before.
Finally, the else block counts anything that's left as a letter. Strictly speaking this also catches digits, so text containing numbers will slightly overcount letters.

# Loop through string, identifying each letter
for letter in sentence:
    if letter == "!" or letter == "?" or letter == ".":
        sentences += 1
    elif letter in string.punctuation:
        continue
    elif letter in string.whitespace:
        words += 1
    else:
        letters += 1
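To see what the loop produces on a concrete input, here is a minimal sketch that wraps the same rules in a function. The name count_text and the sample sentence are hypothetical, added only so the counts can be checked by hand.

```python
import string


def count_text(text):
    """Return (letters, words, sentences) using the same rules as the loop above."""
    letters, words, sentences = 0, 1, 0
    for ch in text:
        if ch in "!?.":
            sentences += 1          # sentence terminator
        elif ch in string.punctuation:
            continue                # other punctuation is ignored
        elif ch in string.whitespace:
            words += 1              # each space marks a new word
        else:
            letters += 1            # everything else counts as a letter
    return letters, words, sentences


print(count_text("Hi, there! How are you?"))  # (16, 5, 2)
```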

Before we do any calculations we need the word factor, which is the number of words divided by 100. Dividing the letter and sentence counts by this factor gives us the average number of letters and sentences per 100 words, which is what the formula expects.
The result of the Coleman-Liau calculation is rounded to the nearest integer.

# Calculations for the Coleman-Liau index
wordFactor = words / 100
lettersPer100 = letters / wordFactor
sentencesPer100 = sentences / wordFactor
liauIndex = round((0.0588 * lettersPer100) - (0.296 * sentencesPer100) - 15.8)
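Here is a worked example of the same formula as a small function. The function name and the counts (65 letters, 14 words, 4 sentences) are illustrative assumptions, not output from the program above.

```python
def coleman_liau(letters, words, sentences):
    """Coleman-Liau index, rounded to the nearest integer."""
    letters_per_100 = letters / words * 100      # about 464.29 for the example
    sentences_per_100 = sentences / words * 100  # about 28.57 for the example
    return round(0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8)


# 0.0588 * 464.29 - 0.296 * 28.57 - 15.8 is roughly 3.04, which rounds to 3.
print(coleman_liau(65, 14, 4))  # 3
```

Multiplying by 100 / words is the same as dividing by the word factor, just written in one step.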

Finally, we just have to print the grade based on what the Coleman-Liau index is! For 16 or above we print "Grade 16+", below 1 is "Before Grade 1", and anything else is just the grade itself, so we can inject the Coleman-Liau value into an f-string that says "Grade {liauIndex}".

# Print correct grade based on the Coleman-Liau index
if liauIndex >= 16:
    print("Grade 16+")
elif liauIndex < 1:
    print("Before Grade 1")
else:
    print(f"Grade {liauIndex}")
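If you want to check the boundaries without typing in text each time, the same branching can be sketched as a function that returns the label. The name grade_label is hypothetical, and the cut-off assumes that an index of 16 or higher should print "Grade 16+".

```python
def grade_label(index):
    """Map a rounded Coleman-Liau index to the text the program prints."""
    if index >= 16:
        return "Grade 16+"
    if index < 1:
        return "Before Grade 1"
    return f"Grade {index}"


# Check the edges and one value in the middle.
for value in (0, 1, 3, 15, 16):
    print(value, "->", grade_label(value))
```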

Job's a good'un!

Top comments (2)

George • Nov 22 '21

Counting sentences by counting the dots as you iterate over the string is bad practice. If there are several dots in a row (e.g. "So it goes..."), each one is counted as a separate sentence. Instead, you should consider using a regex:

import re
# Drop empty pieces: re.split leaves one after the final terminator
sentences = len([s for s in re.split(r'[.!?]+', your_text_variable) if s.strip()])
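One thing to watch with a split-based count: when the text ends with a terminator, a plain re.split returns a trailing empty string, so taking its length directly comes out one too high. A minimal alternative sketch is to count the runs of terminators with re.findall. The function name count_sentences and the sample text are assumptions for illustration.

```python
import re


def count_sentences(text):
    """Count sentence endings, treating a run like '...' as one ending."""
    return len(re.findall(r"[.!?]+", text))


sample = "So it goes... And so on."
print(count_sentences(sample))              # 2
print(len(re.split(r"[.!?]+", sample)))     # 3, because of the trailing ''
```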