DEV Community

petercour

Machine Learning with Simple Text Messages

You can classify text data automatically. Say you collect messages about cooking and messages about programming.

A machine learning algorithm can then decide which group (cooking, programming) a new message belongs to.

Suppose you have a list of messages like this:

#!/usr/bin/python3
data = [ "Help me impress the girl of my dreams!", 
         "How do you measure ingredients like butter in cups?",
         "Tips on making fried rice", 
         "immutability in javascript. It has a declarative approach of programming, which means that you focus on describing what your program must accomplish", 
         "Facing a Programming Problem. Everybody has encountered it, the programming problem that makes NO sense. This problem has no fix, it just cannot be done",
         " 5 Uses for the Spread Operator. The spread operator is a favorite of JavaScript developers. It's a powerful piece of syntax that has numerous applications."]

Each message belongs to a class, 0 for cooking and 1 for programming:

target = [ 0,0,0,1,1,1 ]
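
The integer labels are easier to read when paired with names. Here is a minimal sketch, assuming a hypothetical `label_names` list and `describe` helper (neither is in the original code), where index 0 stands for cooking and index 1 for programming:

```python
# Hypothetical names for illustration: label_names and describe are assumptions.
label_names = ["cooking", "programming"]

def describe(labels):
    """Translate integer class labels into readable category names."""
    return [label_names[i] for i in labels]

target = [0, 0, 0, 1, 1, 1]
print(describe(target))
```

The same helper could be applied to the array a classifier returns, since its predictions use the same integers.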

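Once a vectorizer (transfer) and a classifier (estimator) have been trained, as in the full program further down, you can predict the class for a new message: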

#!/usr/bin/python3
sentence = input("Enter some text: ")
sentence_x = transfer.transform([sentence])
y_predict = estimator.predict(sentence_x)
print("y_predict:", y_predict)
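
A prediction alone does not show how sure the model is. The sketch below is one possible way to look at that, not the author's method: it trains on all six messages (the full program holds some out for testing) and uses a hypothetical helper, `predict_with_confidence`, that returns the class probabilities next to the predicted label.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

data = ["Help me impress the girl of my dreams!",
        "How do you measure ingredients like butter in cups?",
        "Tips on making fried rice",
        "immutability in javascript. It has a declarative approach of programming, which means that you focus on describing what your program must accomplish",
        "Facing a Programming Problem. Everybody has encountered it, the programming problem that makes NO sense. This problem has no fix, it just cannot be done",
        " 5 Uses for the Spread Operator. The spread operator is a favorite of JavaScript developers. It's a powerful piece of syntax that has numerous applications."]
target = [0, 0, 0, 1, 1, 1]

# Assumption: train on every message, with no train/test split.
transfer = TfidfVectorizer()
estimator = MultinomialNB()
estimator.fit(transfer.fit_transform(data), target)

def predict_with_confidence(sentence):
    """Hypothetical helper: return (predicted label, per-class probabilities)."""
    probs = estimator.predict_proba(transfer.transform([sentence]))[0]
    label = int(estimator.classes_[probs.argmax()])
    return label, probs

print(predict_with_confidence("programming javascript is great"))
```

A sentence whose words never appeared in the training messages contributes no features, so the result then reflects only the class priors.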

Give it a spin:

Enter some text: im a cook
y_predict: [0]

Another run:

Enter some text: programming javascript is great
y_predict: [1]

The program

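The data set we have defined is extremely small (6 samples). Because train_test_split shuffles the data randomly, the score and the predictions can change from run to run with so few samples. More samples generally give more reliable results.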

#!/usr/bin/python3
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split

def text_classify():
    # Training messages: the first three are labeled cooking, the last three programming.
    data = [ "Help me impress the girl of my dreams!",
             "How do you measure ingredients like butter in cups?",
             "Tips on making fried rice",
             "immutability in javascript. It has a declarative approach of programming, which means that you focus on describing what your program must accomplish",
             "Facing a Programming Problem. Everybody has encountered it, the programming problem that makes NO sense. This problem has no fix, it just cannot be done",
             " 5 Uses for the Spread Operator. The spread operator is a favorite of JavaScript developers. It's a powerful piece of syntax that has numerous applications."]
    # One class label per message: 0 = cooking, 1 = programming.
    target = [ 0,0,0,1,1,1 ]
    # Random split into training and test sets (by default 25% is held out for testing).
    x_train, x_test, y_train, y_test = train_test_split(data,target)

    # Turn text into TF-IDF features; the vocabulary is learned from the training set only.
    transfer = TfidfVectorizer()
    x_train = transfer.fit_transform(x_train)
    x_test = transfer.transform(x_test)

    # Train a multinomial Naive Bayes classifier.
    estimator = MultinomialNB()
    estimator.fit(x_train,y_train)

    # Mean accuracy on the held-out test messages.
    score = estimator.score(x_test, y_test)
    print("score:\n", score)

    # Classify a message typed by the user.
    sentence = input("Enter some text: ")
    sentence_x = transfer.transform([sentence])
    y_predict = estimator.predict(sentence_x)
    print("y_predict:", y_predict)

text_classify()
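
To see what the `TfidfVectorizer` step in the program produces, it can help to inspect a tiny example. The sketch below is illustrative only: the two-message `messages` list is a shortened sample chosen here, not the post's training set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Assumption: two short sample messages, picked only to keep the output small.
messages = ["Tips on making fried rice", "Facing a Programming Problem"]

transfer = TfidfVectorizer()
matrix = transfer.fit_transform(messages)

# vocabulary_ maps each learned word to its column index in the matrix.
vocabulary = sorted(transfer.vocabulary_)
print(vocabulary)
print(matrix.shape)  # (number of messages, number of distinct words)
```

With the default settings the vectorizer lowercases the text and ignores one-character tokens such as "a", so each message becomes a row of weights over the remaining words.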
