datatoinfinity

Posted on Jun 29

Spell Checker: Operations on Text (NLP)

#nlp #machinelearning #devto #python

The spell checker relies on four edit operations on text:

Delete
Swap
Replace
Insert

Split

Split is a helper that all the other operations use.

def split(word):
    # Collect every (left, right) pair, including the two with an empty side.
    parts = []
    for i in range(len(word) + 1):
        parts.append((word[:i], word[i:]))
    return parts

split('datatoinfinity')

Output:
[('', 'datatoinfinity'),
 ('d', 'atatoinfinity'),
 ('da', 'tatoinfinity'),
 ('dat', 'atoinfinity'),
 ('data', 'toinfinity'),
 ('datat', 'oinfinity'),
 ('datato', 'infinity'),
 ('datatoi', 'nfinity'),
 ('datatoin', 'finity'),
 ('datatoinf', 'inity'),
 ('datatoinfi', 'nity'),
 ('datatoinfin', 'ity'),
 ('datatoinfini', 'ty'),
 ('datatoinfinit', 'y'),
 ('datatoinfinity', '')]

Explanation

The function takes a single word and returns all the possible ways to split it into two parts.
len('datatoinfinity') = 14, so the loop runs from i=0 to i=14, which gives 15 splits.
For each i it creates a tuple (word[:i], word[i:]), i.e. a split between the first i characters and the rest.
Each tuple represents:
- First part: word[:i]
- Second part: word[i:]
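As a quick check of the explanation above, here is a minimal sketch of the same helper written as a list comprehension. It is only an alternative spelling of the loop, not a different algorithm, and the short word 'data' is just an illustrative input.

```python
def split(word):
    # Same splits as the loop version, written as a list comprehension.
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

parts = split('data')
print(parts)
# [('', 'data'), ('d', 'ata'), ('da', 'ta'), ('dat', 'a'), ('data', '')]

# A word of length n always has n + 1 split points.
print(len(parts) == len('data') + 1)  # True
```

Joining the two halves of any tuple gives back the original word, which is what lets the other operations edit only the start of the right part.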

Delete

def delete(word):
    output = []
    for l, r in split(word):
        # Skip the final split, where r is empty and there is nothing to delete.
        if r:
            output.append(l + r[1:])
    return output

delete('hello')

Output:
['ello', 'hllo', 'helo', 'helo', 'hell']

Explanation

The delete() function returns a list of words formed by deleting one character at every possible position from the input word.
The function uses the split(word) function defined earlier to divide the word at every position.
For each split (l, r), it removes the first character of r, effectively deleting one character from the original word.
Deleting either 'l' gives the same result, which is why 'helo' appears twice in the output.
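Because repeated letters produce repeated candidates, it can help to collect the deletions in a set. The sketch below assumes duplicates are not needed downstream; unique_deletes is a hypothetical name that is not part of the original post.

```python
def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def unique_deletes(word):
    # Hypothetical helper: one-character deletions without duplicates.
    # `if r` skips the final split, where nothing is left to delete.
    return {l + r[1:] for l, r in split(word) if r}

print(sorted(unique_deletes('hello')))
# ['ello', 'hell', 'helo', 'hllo']
```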

Swap

def swap(word):
    output = []
    for l, r in split(word):
        # A swap needs at least two characters on the right side.
        if len(r) > 1:
            output.append(l + r[1] + r[0] + r[2:])
    return output

swap('Hello')

Output:
['eHllo', 'Hlelo', 'Hello', 'Helol']

Explanation

This function returns a list of words created by swapping two adjacent characters in the word. This simulates a common typo where two letters are typed in the wrong order (like "hlelo" instead of "hello").
It uses the split(word) function defined earlier to divide the word at each position.
If the right part r has at least 2 characters, it swaps the first two characters of r and joins the result back with l. Swapping the two identical l's leaves the word unchanged, which is why 'Hello' itself appears in the output.
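To see how swap can undo the typo mentioned above, here is a small sketch that keeps only the candidates found in a word list. The vocabulary set is a made-up example; a real spell checker would load a much larger dictionary.

```python
def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def swap(word):
    # Swap the first two characters of the right part at every position.
    return [l + r[1] + r[0] + r[2:] for l, r in split(word) if len(r) > 1]

vocabulary = {'hello', 'help', 'world'}  # hypothetical word list

candidates = [w for w in swap('hlelo') if w in vocabulary]
print(candidates)  # ['hello']
```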

Replace

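def replace(word):
    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []

    for l, r in split(word):
        # Skip the final split: with r empty there is no character to replace,
        # and l + char would be an insertion instead.
        if r:
            for char in characters:
                output.append(l + char + r[1:])
    return output

len(replace('lave'))

Output:

104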

Explanation

This function simulates replacing each character in a word with every letter of the alphabet, one at a time. It generates all possible one-character replacements.

For each position in the word (using split()), it:
- Keeps the left part l.
- Replaces the first character of the right part r (i.e., the current character) with every letter in the alphabet.
- Appends the result to output.
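One detail worth checking is that replacing a character with itself returns the original word. The sketch below counts how often that happens, assuming only positions with a character to replace are used (the `if r` filter); string.ascii_lowercase is the standard-library constant for 'a' to 'z'.

```python
import string

def split(word):
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def replace(word, characters=string.ascii_lowercase):
    # `if r` keeps only positions that actually hold a character to replace.
    return [l + c + r[1:] for l, r in split(word) if r for c in characters]

candidates = replace('lave')
print(len(candidates))                  # 104 = 4 positions * 26 letters
print(candidates.count('lave'))         # 4, once per position
print(len(set(candidates) - {'lave'}))  # 100 distinct new words
```

Filtering out the original word, as in the last line, avoids suggesting the misspelling as its own correction.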

Insert

def insert(word):

    characters = 'abcdefghijklmnopqrstuvwxyz'
    output = []

    for l,r in split(word):
        for char in characters:
            output.append(l + char + r)

    return output

len(insert('lve'))

Output:

104

Explanation

This function simulates inserting one character (from 'a' to 'z') at every possible position in the word.
It uses the split(word) function defined earlier to split at every position.
At each position, it inserts each character from 'a' to 'z'. 'lve' has 4 insertion positions, so the result contains 4 × 26 = 104 words.
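The four operations can be combined into a single set of candidates that are one edit away from the input. This is a minimal sketch under that assumption, not necessarily how the rest of the spell checker is built; one_edit_away is a hypothetical name.

```python
import string

def one_edit_away(word, characters=string.ascii_lowercase):
    # Hypothetical helper: every word reachable by one delete, swap,
    # replace, or insert.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    swaps = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in characters]
    inserts = [l + c + r for l, r in splits for c in characters]
    # A set removes candidates that several operations produce.
    return set(deletes + swaps + replaces + inserts)

candidates = one_edit_away('lve')
print('love' in candidates)  # True, insert 'o'
print('live' in candidates)  # True, insert 'i'
print('lie' in candidates)   # True, replace 'v' with 'i'
```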
