DEV Community

Tommy

CS50 Week 6 - Readability in Python

The Problem

The task is to write a Python program that determines the "grade" (or reading level) of a passage of text using a formula called the Coleman-Liau index.

The Solution

First things first: we import the get_string function from cs50 and Python's built-in string module. Both will be useful later.

from cs50 import get_string
import string
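For a sense of what the string module provides, here is a quick look at the two constants used further down. Both are part of Python's standard library; the two boolean names at the end are just illustrative.

```python
import string

# All ASCII punctuation characters, including . ! ? and ,
print(string.punctuation)

# Whitespace covers more than the space character:
# tab, newline, carriage return, vertical tab and form feed are included
print(repr(string.whitespace))

# Membership tests like these are what the counting loop relies on
comma_is_punctuation = "," in string.punctuation
newline_is_whitespace = "\n" in string.whitespace
```

Because the apostrophe is also in string.punctuation, a word like "You're" contributes five letters and no extra word.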

Next we get the user input and declare our counters. words starts at 1 because we count spaces to work out how many words there are, and a passage normally has no space before its first word, so the number of words is generally the number of spaces + 1.

# Get user input
sentence = get_string("Text: ")

# Declare variables
letters = 0
words = 1
sentences = 0
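The "spaces + 1" idea holds as long as words are separated by exactly one space. A minimal sketch comparing it with str.split(), which splits on any run of whitespace; both function names are made up for this illustration.

```python
def count_words_by_spaces(text):
    """Count words as (number of spaces) + 1."""
    return text.count(" ") + 1


def count_words_by_split(text):
    """Count words by splitting on any run of whitespace."""
    return len(text.split())


sample = "Today is your day."
print(count_words_by_spaces(sample))  # 4
print(count_words_by_split(sample))   # 4

# The two approaches disagree once there is a doubled space
doubled = "Today  is your day."
print(count_words_by_spaces(doubled))  # 5
print(count_words_by_split(doubled))   # 4
```

For tidy input the two give the same answer, so the simpler counter is fine as long as the text is typed with single spaces.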

To find how many sentences there are in our text, we check whether the current character is an exclamation mark, question mark or full stop.
If the character is any other punctuation (such as a comma), we don't want to increment anything, so we just continue. This stops the else block from counting a comma as a letter.
To count words, we count whitespace characters, as mentioned before.
Finally, the else block treats anything left over as a letter. Note that this includes digits, so a passage containing numbers will get a slightly inflated letter count.

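# Loop through the text, classifying each character
for letter in sentence:
    if letter in ".!?":
        # Sentence-ending punctuation
        sentences += 1
    elif letter in string.punctuation:
        # Other punctuation (commas, quotes, ...) counts as nothing
        continue
    elif letter in string.whitespace:
        # Each whitespace character marks the start of another word
        words += 1
    else:
        # Everything left over is treated as a letter (digits included)
        letters += 1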
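If digits should not count as letters, one possible variant swaps the catch-all else for str.isalpha(). This is a sketch under that assumption, not the approach used in this post, and count_text is a hypothetical helper name.

```python
import string


def count_text(text):
    """Return (letters, words, sentences) for a passage of text."""
    letters = 0
    words = 1
    sentences = 0
    for char in text:
        if char in ".!?":
            sentences += 1
        elif char in string.whitespace:
            words += 1
        elif char.isalpha():
            # Only alphabetic characters count; digits and symbols are skipped
            letters += 1
    return letters, words, sentences


print(count_text("I have 2 cats, 3 dogs!"))  # (13, 6, 1)
```

With the catch-all else, the same sentence would report 15 letters because the "2" and "3" are counted too.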

Before doing the calculation we need the word factor: the number of words divided by 100. Dividing the letter and sentence counts by it gives us the number of letters and sentences per 100 words, which is what the formula expects.
The result of the Coleman-Liau calculation is then rounded to the nearest integer.

# Calculations for liau index
wordFactor = words / 100
lettersPer100 = letters / wordFactor
sentencesPer100 = sentences / wordFactor
liauIndex = round((0.0588 * lettersPer100) - (0.296 * sentencesPer100) - 15.8)
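To make the arithmetic concrete, here is a small worked example with sample counts picked purely for illustration. The function name coleman_liau is hypothetical; the formula is the same one used above, written with "per 100 words" computed directly.

```python
def coleman_liau(letters, words, sentences):
    """Return the rounded Coleman-Liau index from raw counts."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return round(0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8)


# Sample counts: 65 letters, 14 words, 4 sentences
# letters per 100 words   = 65 / 14 * 100, about 464.29
# sentences per 100 words =  4 / 14 * 100, about 28.57
# index = 0.0588 * 464.29 - 0.296 * 28.57 - 15.8, about 3.04
print(coleman_liau(65, 14, 4))  # 3
```

Dividing by the word factor and multiplying by 100 / words are the same operation, so either form gives the same result.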

Finally, we just have to print the grade based on the Coleman-Liau index! At 16 or above we print "Grade 16+", below 1 we print "Before Grade 1", and for anything else we inject the value into an f-string: "Grade {liauIndex}".

# Print correct grade based on liau index
if liauIndex >= 16:
    print("Grade 16+")
elif liauIndex < 1:
    print("Before Grade 1")
else:
    print(f"Grade {liauIndex}")
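Wrapping the grade logic in a function makes the boundaries easy to check. A minimal sketch, assuming an index of exactly 16 should also print "Grade 16+" (adjust the comparison if your spec says otherwise); grade_label is a made-up name.

```python
def grade_label(index):
    """Turn a rounded Coleman-Liau index into the text to print."""
    if index >= 16:
        return "Grade 16+"
    if index < 1:
        return "Before Grade 1"
    return f"Grade {index}"


# Try values on both sides of each boundary
for value in (-2, 0, 1, 3, 15, 16, 20):
    print(value, "->", grade_label(value))
```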

Jobs a goodun!

Latest comments (2)

George

Counting sentences by counting dots while iterating over the string is bad practice. If there are several dots in a row (e.g. "So it goes..."), each one is counted as a separate sentence. Instead, consider using a regex:

import re
# Count runs of sentence-ending punctuation, so "..." counts once
sentences = len(re.findall(r'[.!?]+', your_text_variable))
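To see the difference between counting every terminator and counting runs of them, here is a small comparison. Both function names are hypothetical, and the regex approach is only one way to group repeated punctuation.

```python
import re


def count_sentences_per_char(text):
    """Count every ., ! or ? as its own sentence end."""
    return sum(1 for char in text if char in ".!?")


def count_sentences_by_runs(text):
    """Count each run of ., ! or ? as a single sentence end."""
    return len(re.findall(r"[.!?]+", text))


sample = "So it goes... Really?"
print(count_sentences_per_char(sample))  # 4
print(count_sentences_by_runs(sample))   # 2
```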

Mohammad Zolghadr

Thank you, Tommy! I have learned a lot from this code.