James Briggs

Sentence Similarity With Transformers and PyTorch

All we ever seem to talk about nowadays is BERT this, BERT that. I want to talk about something else, but BERT is just too good, so this video will be about BERT for sentence similarity.

A big part of NLP relies on similarity in high-dimensional spaces. Typically, an NLP solution will take some text and process it to create a big vector/array representing that text, then perform several transformations on it.

It's high-dimensional magic.

Sentence similarity is one of the clearest examples of how powerful high-dimensional magic can be.

The logic is this:

  • Take a sentence, convert it into a vector.
  • Take many other sentences, and convert them into vectors.
  • Find the sentences with the smallest distance (Euclidean) or the smallest angle (highest cosine similarity) between them - more on that here.
  • We now have a measure of semantic similarity between sentences - easy!
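
To make the comparison step concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the sentence labels are made-up placeholders, not real model output; in practice each vector would come from a model such as BERT and have hundreds of dimensions. The helper names `cosine_similarity` and `euclidean_distance` are just illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy "sentence vectors" (assumption: stand-ins for real embeddings)
query = [1.0, 2.0, 3.0]
candidates = {
    "sentence A": [2.0, 4.0, 6.0],   # same direction as the query, but longer
    "sentence B": [-1.0, 0.0, 1.0],
    "sentence C": [3.0, 2.0, 1.0],
}

# Rank candidates from most to least similar by cosine similarity
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)

for name in ranked:
    vec = candidates[name]
    print(name,
          round(cosine_similarity(query, vec), 3),
          round(euclidean_distance(query, vec), 3))
```

Note that with these toy vectors the two measures disagree: "sentence A" has the highest cosine similarity but also the largest Euclidean distance, because cosine similarity ignores vector length while Euclidean distance does not.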

At a high level, there's not much else to it. But of course, we want to understand what is happening in a little more detail and implement this in Python too.

Medium article

Easy mode
