Intro
I recently started learning ML and Data Science, and I want to share with you a powerful Python library for semantic text analysis called spaCy.
Description
Similarity is determined by comparing word vectors or “word embeddings”, multi-dimensional meaning representations of a word.
spaCy is designed to help you do real work — to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. We like to think of spaCy as the Ruby on Rails of Natural Language Processing.
spaCy is able to compare two objects, and make a prediction of how similar they are. Predicting similarity is useful for building recommendation systems or flagging duplicates. For example, you can suggest a user content that’s similar to what they’re currently looking at, or label a support ticket as a duplicate if it’s very similar to an already existing one.
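To make the idea of comparing word vectors more concrete, here is a minimal sketch of cosine similarity, which, as far as I know, is the default measure spaCy uses on averaged word vectors. The function name `cosine_similarity` and the tiny 3-dimensional vectors are my own made-up illustration, not values taken from a spaCy model; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy "embeddings" (invented numbers, for illustration only).
developer = [0.9, 0.1, 0.3]
programmer = [0.8, 0.2, 0.4]
banana = [0.1, 0.9, 0.0]

print(cosine_similarity(developer, programmer))  # close to 1: similar direction
print(cosine_similarity(developer, banana))      # much lower: different direction
```

Vectors that point in nearly the same direction score close to 1, while unrelated ones score much lower, which is the intuition behind the similarity value spaCy reports.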
Installation instructions
pip
Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13).
pip3 install spacy
Models
spaCy’s models can be installed as Python packages. This means that they’re a component of your application, just like any other module. They’re versioned and can be defined as a dependency in your requirements.txt. Models can be installed from a download URL or a local directory, manually or via pip. Their data can be located anywhere on your file system.
python3 -m spacy download en_core_web_sm
spaCy provides models for many languages; the complete list is available in the spaCy documentation.
Example
Here is a short piece of Python code that compares two strings.
import spacy
nlp = spacy.load("en_core_web_sm")
first_text = nlp(input("insert first text: "))
second_text = nlp(input("insert second text: "))
print(f"similarity: {first_text.similarity(second_text)}")
Result
insert first text: i'm a software developer
insert second text: i'm a software web developer
similarity: 0.9302790237853475
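The duplicate-flagging use case mentioned in the description could be sketched along the same lines. This is only a rough illustration: `find_duplicates` is a hypothetical helper of my own, and the `0.9` threshold is an arbitrary assumption that you would need to tune on your own data. Also keep in mind that small models such as `en_core_web_sm` do not ship with full word vectors, so scores may be less reliable than with a larger model.

```python
def find_duplicates(new_text, existing_texts, nlp, threshold=0.9):
    """Return (text, score) pairs from existing_texts that look like new_text.

    `nlp` is any callable that turns a string into an object with a
    `.similarity(other)` method, such as a loaded spaCy pipeline.
    The default threshold is an arbitrary assumption, not a recommended value.
    """
    new_doc = nlp(new_text)
    matches = []
    for text in existing_texts:
        score = new_doc.similarity(nlp(text))
        if score >= threshold:
            matches.append((text, score))
    # Most similar candidates first.
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches
```

With the pipeline loaded as in the example above, it could be called as `find_duplicates(new_ticket, old_tickets, nlp)`, where `new_ticket` is a string and `old_tickets` is a list of strings.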
Enjoy!!