datatoinfinity

Spell Checker- Operation on Text-NLP

Operations on text:

  • Delete
  • Swap
  • Replace
  • Insert

Split

Split is the helper that all the other operations use.

def split(word):
    # Collect every (left, right) pair, cutting at positions 0..len(word).
    parts = []
    for i in range(len(word) + 1):
        parts.append((word[:i], word[i:]))
    return parts

split('datatoinfinity')
Output:
[('', 'datatoinfinity'),
 ('d', 'atatoinfinity'),
 ('da', 'tatoinfinity'),
 ('dat', 'atoinfinity'),
 ('data', 'toinfinity'),
 ('datat', 'oinfinity'),
 ('datato', 'infinity'),
 ('datatoi', 'nfinity'),
 ('datatoin', 'finity'),
 ('datatoinf', 'inity'),
 ('datatoinfi', 'nity'),
 ('datatoinfin', 'ity'),
 ('datatoinfini', 'ty'),
 ('datatoinfinit', 'y'),
 ('datatoinfinity', '')]

Explanation

  1. The function takes a single word and returns every possible way to split it into two parts.
  2. len('datatoinfinity') = 14, so the loop runs from i=0 to i=14, which gives 15 splits.
  3. For each i it creates a tuple (word[:i], word[i:]), i.e., a split between the first i characters and the rest.
  4. Each tuple represents:
    • First part: word[:i]
    • Second part: word[i:]
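
To make the slicing concrete, here is a minimal sketch that rebuilds the same splits with a list comprehension and checks two properties: a word of length n has n + 1 splits, and joining the two parts always restores the word. The name split_word is only illustrative and is not part of the code above.

```python
def split_word(word):
    # One split for every cut position 0..len(word), inclusive.
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

splits = split_word('data')
print(splits)
# [('', 'data'), ('d', 'ata'), ('da', 'ta'), ('dat', 'a'), ('data', '')]

# A word of length n has n + 1 cut positions.
print(len(splits) == len('data') + 1)  # True

# Gluing the two parts back together always restores the word.
print(all(left + right == 'data' for left, right in splits))  # True
```

The first and last tuples have an empty part. That matters for the operations that follow, because some of them need at least one character on the right side.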

Delete

def delete(word):
    output = []
    for l, r in split(word):
        if r:  # skip the last split, where r is empty and nothing can be deleted
            output.append(l + r[1:])
    return output

delete('hello')
Output:
['ello', 'hllo', 'helo', 'helo', 'hell']

Explanation

  1. The delete() function returns a list of words formed by deleting one character at every possible position in the input word.
  2. The function uses the previously defined split(word) to divide the word at every position.
  3. For each split (l, r), it removes the first character of r, effectively deleting one character from the original word.
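
The sketch below shows which character each deletion removes, and why 'helo' shows up twice for 'hello'. It assumes the final split, whose right part is empty, should be skipped because there is nothing to delete there. The names split_word and delete_with_positions are made up for this illustration.

```python
def split_word(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def delete_with_positions(word):
    # Pair each result with the index and value of the removed character.
    # The final split has an empty right part, so it is skipped.
    results = []
    for left, right in split_word(word):
        if right:
            results.append((len(left), right[0], left + right[1:]))
    return results

for index, removed, result in delete_with_positions('hello'):
    print(index, removed, result)
# 0 h ello
# 1 e hllo
# 2 l helo
# 3 l helo
# 4 o hell

# Deleting either 'l' gives 'helo'; a set keeps only distinct candidates.
unique = {result for _, _, result in delete_with_positions('hello')}
print(len(unique))  # 4
```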

Swap

def swap(word):
    output = []
    for l, r in split(word):
        if len(r) > 1:
            output.append(l + r[1] + r[0] + r[2:])
    return output

swap('Hello')
Output:
['eHllo', 'Hlelo', 'Hello', 'Helol']

Explanation

  1. This function returns a list of words created by swapping two adjacent characters in the word. This simulates a common typo where two letters are typed in the wrong order (like "hlelo" instead of "hello").
  2. It uses the previously defined split(word) to divide the word at each position.
  3. If the right part r has at least 2 characters, it swaps the first two characters of r and joins the result back with l. Swapping two identical letters, such as the "ll" in "Hello", returns the word unchanged.
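
A swap undoes itself, so running the operation on a typo brings back the intended word as one of the candidates. The sketch below shows this for "hlelo". The names split_word and swap_adjacent are illustrative only.

```python
def split_word(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def swap_adjacent(word):
    # Swap the first two characters of the right part when it has at least two.
    return [left + right[1] + right[0] + right[2:]
            for left, right in split_word(word)
            if len(right) > 1]

candidates = swap_adjacent('hlelo')
print(candidates)
# ['lhelo', 'hello', 'hlleo', 'hleol']
print('hello' in candidates)  # True

# A word of length n has n - 1 adjacent pairs, so n - 1 results.
print(len(candidates) == len('hlelo') - 1)  # True
```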

Replace

def replace(word):
    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []

    for l, r in split(word):
        if r:  # skip the last split: with r empty this would append a letter, not replace one
            for char in characters:
                output.append(l + char + r[1:])
    return output

len(replace('lave'))
Output:

104

Explanation

  1. This function simulates replacing each character in a word with every letter of the alphabet, one at a time. It generates all possible one-character replacements.
  2. For each position in the word (using split()), it:
    • Keeps the left part l.
    • Replaces the first character of the right part r (i.e., the current character) with every letter in the alphabet.
    • Appends the result to output.
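
As a rough sketch of the same idea, the variant below only replaces where there is a character to replace, and it also skips swapping a letter for itself so that the original word never appears among the candidates. Both choices are assumptions made for this example, and split_word and replace_one are hypothetical names.

```python
import string

def split_word(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def replace_one(word, alphabet=string.ascii_lowercase):
    results = []
    for left, right in split_word(word):
        if not right:
            continue  # nothing to replace after the last character
        for char in alphabet:
            if char != right[0]:  # skip replacing a letter with itself
                results.append(left + char + right[1:])
    return results

candidates = replace_one('lave')
print(len(candidates))       # 100 (4 positions x 25 other letters)
print('love' in candidates)  # True
print('lave' in candidates)  # False
```

Whether to keep the unchanged word in the list is a design choice: leaving it out gives a list of pure one-letter substitutions, while keeping it is harmless if the candidates are later filtered against a dictionary.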

Insert

def insert(word):

    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []

    for l,r in split(word):
        for char in characters:
            output.append(l + char + r)

    return output

len(insert('lve'))
Output:

104

Explanation

  1. This function simulates inserting one character (from 'a' to 'z') at every possible position in the word.
  2. It uses split(word) to cut the word at every position, including before the first character and after the last one.
  3. At each position, it inserts each character from 'a' to 'z'.
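
To illustrate how the insert candidates could be used, the sketch below generates them for 'lve', counts the distinct ones, and keeps those found in a small word list. The word list is invented for this example, and split_word and insert_one are illustrative names.

```python
import string

def split_word(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def insert_one(word, alphabet=string.ascii_lowercase):
    # Insert every letter at every cut position, including both ends.
    return [left + char + right
            for left, right in split_word(word)
            for char in alphabet]

candidates = insert_one('lve')
print(len(candidates))       # 104 (4 positions x 26 letters)
print(len(set(candidates)))  # 101 ('llve', 'lvve' and 'lvee' each appear twice)

# Hypothetical vocabulary, made up for this example.
known_words = {'love', 'live', 'leave'}
print(sorted(set(candidates) & known_words))  # ['live', 'love']
```

'leave' is not returned because it needs two inserted letters, and each call adds only one.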
