What is Full-Text Search in MySQL?
How does the Full-Text Search in MySQL work?
LIKE vs Full-Text Search operators in MySQL
Advantages and disadvantages of Full Text Search and LIKE operator in MySQL
Full-text Search Restrictions
Example of Using the Full-text Search in MySQL
You can have Google-like superpowers to find words and phrases across your entire database. Search engines use Full-Text Search to find results in databases. These databases often contain a lot of textual data. For example, news websites contain lots of news articles and may want to find a specific phrase. Or users may file support tickets with the same wording that indicates a common problem.
Full-Text Search does not try to find a word-for-word phrase written by the user. It is trying to guess what the user had in mind, to learn the deeper meaning, so to speak.
Its goal is to find results that are approximate in meaning and quickly issue them. Thus, the result will be much more useful to the user, as it will be more relevant.
Typically, a search service has two components.Search engine and indexer:
- The indexer receives text as input, does text processing (cutting out endings, insignificant words, etc.), and stores everything in the index. The structure of such an index allows for very fast searches;
- The search engine, an index search interface, accepts a request from the client, processes the phrase, and searches for it in the index.
To use full-text search in MySQL you need to use full-text indexes and the MATCH () function. The full-text index is FULLTEXT.
Mysql supports full-text indexes on MyISAM tables. InnoDB support has been added since version 5.6.4.
When you create a table using the CREATE TABLE command, you can create these indexes on the VARCHAR and TEXT columns. You can also add indexes later using the ALTER TABLE or CREATE INDEX commands.
However, you should refrain from adding the FULLTEXT index right away when creating the table. Large amounts of data are loaded into the table much faster if no index is added. That is why indexes are usually added later.
The MATCH () function executes a language search. It compares a string to the text content. By content I mean the combination of one or more columns included in the FULLTEXT index.
The search string is specified with an argument in the AGAINST () expression. The search is case-insensitive. The MATCH () command returns the relevance value for each row in a column. Relevance value is how similar the search string and the text in the string are.
MATCH (column1,column2,column3...) AGAINST (expression [search_modifier])
When you use the MATCH () command in a WHERE clause, the result column rows are automatically sorted by relevance.
The relevance value is a non-negative floating-point number.
Relevance is calculated based on:
- Words in a given row of a column;
- Unique words in that row;
- Number of words in the text;
- Rows containing a single word.
Full-Text Search is not a complex tool, it is almost as simple as the LIKE operator. But the LIKE operator alone is not as useful as FTS. I'll talk about this in the next subsection.
A synonym for the LIKE operator is precision, for full-text input it is flexibility. Thus, you can easily understand the intricacies of the work of each method. And now for more details.
High LIKE operator precision means that fewer false results are displayed. A high recall rate means fewer relevant results are overlooked. This means that LIKE is 100% accurate when searching for a phrase.
However, the MySQL LIKE statement may not be useful in some circumstances.
If you apply it to a non-indexed column, the database will use a full scan to find matches.
If the column is indexed, matching can be done against the index keys, but with much less efficient than most index searches.
In the worst case, the LIKE pattern will have leading wildcards requiring each index key to be checked.
The LIKE operator is better when there are not a huge number of records and it is practically impossible to use it in tables with hundreds of thousands of rows since the search is performed on all records and this reduces performance.
Most full-text search implementations use an index, where keys are individual terms and related values are recordsets that contain the term.
FTS is optimized to compute the intersection, combining these recordsets. It often provides a ranking algorithm to quantify how closely a given entry matches your search keywords.
Full-text search features include:
- Splitting a block of unstructured text into separate words, phrases, and special tokens;
- Combining variations of a word into one index term;
- Measuring the similarity of a matched record to a query string.
Let's take a quick look at the comparison. Morphology means searching based on the root of the word using specific language and grammar knowledge.
|Less than O(n^2) complexity||No morphology support|
|Sort the results||No modifiers|
|Used on any type of field||Search in all lines|
|Support for Morphology||Lack of sorting capability|
|Results ordered by relevance||VARCHAR and TEXT with indexes only|
|Google style modifiers||Resource intensive process|
|Stop words||Native support for MyISAM tables only|
|Ability to customize||Adding data takes longer with FULLTEXT|
When using full-text search, it is also worth remembering that the MATCH () function has its limitations:
- All parameters of the MATCH () function must be columns of the same table. This means that they must be part of the same FULLTEXT index. Does not apply to BOOLEAN MODE;
- The list of columns in the MATCH () command must exactly match the list of columns in the FULLTEXT index definition for this table. Does not apply to BOOLEAN MODE;
- The argument in AGAINST () must be an immutable string.
In this example, you will see how relevant the results of a full-text search are.
First, let's create a schema:
CREATE SCHEMA IF NOT EXISTS `mydb`;
Our schema will be a database of cosmetic products. That is, now we are creating a table of categories:
CREATE TABLE IF NOT EXISTS `mydb`.`category` ( `id` INT NOT NULL, `name_category` VARCHAR(45) NOT NULL, FULLTEXT(name_category), PRIMARY KEY (`id`)) ENGINE = InnoDB;
And now we create a table of products:
CREATE TABLE IF NOT EXISTS `mydb`.`product` ( `id` INT NOT NULL AUTO_INCREMENT, `name_product` VARCHAR(45) NOT NULL, `price_product` DECIMAL NOT NULL, `amount_product` INT NOT NULL, `desc_product` VARCHAR(255) NOT NULL, `idcategory` INT NOT NULL, FULLTEXT(name_product, desc_product), PRIMARY KEY (`id`), INDEX `fk_product_category_idx` (`idcategory` ASC), CONSTRAINT `fk_product_category` FOREIGN KEY (`idcategory`) REFERENCES `mydb`.`category` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION) ENGINE = InnoDB;
Now that we have the tables, we fill them in starting with the category table:
INSERT INTO `mydb`.category (`id`, `name_category`) VALUES (1,'Shampoo'), (2,'Eyeshadow'), (3,'Face lotionproduct'), (4,'Face Primer'), (5,'Lipstick'), (6,'Powder'), (7,'Mascara'), (8,'Blush'), (9,'Highlighter'), (10,'Toner');
And also fill in the product table:
INSERT INTO product (id, name_product, price_product, amount_product, desc_product, idcategory) VALUES (1, 'Shampoo perfumes "Clear"', 130, 100, 'Styling products and environmental pollutants have been filling the hair structure from time to time, making it dull and hard. This shampoo is formulated to effectively remove this impurity while rejuvenating the hair structure.',1), (2, 'Eyeshadow "Makeup"', 245, 104, 'Contains an all-natural mineral wax with added kaolin and zinc oxide to create lush, healthy lashes. The brush is designed to reduce clumping of the product. Apply in a zigzag motion from root to tip to completely cover each lash. Reapply if desired.',2), (3, 'Face Lotion "Gigi"', 120, 99, 'The lotion promotes the resorption of inflammation, maturation and drying of abscesses. The lotion has a disinfectant, anti-inflammatory, antibacterial and antifungal effect, and also prevents the development of a secondary infection.',3), (4, 'Lipstick Set "LP"', 250, 165, 'The properties of lipstick will be appreciated by professional makeup artists and lovers of the perfect make-up. A variety of dazzling shades allow you to choose the right tone for elegant daytime and spicy evening make-up.',5), (5, 'Highlighter "NAC"', 175, 111, 'The skin seems to glow from the inside.',9), (6, 'Powder "Pow"', 190, 205, 'With a high concentration of shimmery pigments and a unique holographic shade, the powder enhances facial features and fills the skin with radiance.',6), (7, 'Toner "Tonn"', 160, 198, 'This is a versatile toner that removes deep impurities, tones the skin and gently removes dead skin cells from the skin!',10), (8, 'Blush "Gigi"', 135, 57, 'The cosmetic product has a silky texture that lays down on the skin very softly and smoothly, without clogging pores or making the make-up heavier.',8), (9, 'Highlighter "Lily"', 200, 30, 'It has a light texture that creates a weightless shimmer on the skin. Thanks to the smallest reflective particles, the face acquires a delicate, soft, natural shine.',9), (10, 'Shampoo "Lux"', 160, 11, 'Dry and damaged hair requires intensive repair. Gentle cleansing and deep nourishment in minutes.',1), (11, 'Lipstick "Gigi"', 90, 225, 'On your lips, lipstick will retain a flawless look for up to 12 hours, decorating them with a luxurious matte finish and a sophisticated shade.',5);
Outputting fields from a table by the word “shampoo” for two indirectly joined tables:
SELECT * FROM category c JOIN product p ON c.id = p.idcategory WHERE MATCH(p.name_product, p.desc_product) AGAINST ('shampoo');
You can notice that the search found this word in the product description and product name (line 1), and in line 2 this word is present only in the product name, but not in the description.
This was a short piece of information about full-text search in MySQL. A basic understanding of the usefulness of Full-Text Search will come from using this search in practice. Now you know that the LIKE operator is not the only way to search through text in the MySQL database. Full-Text Search in MySQL gives you search superpowers like Google has to find any string or word across all tables in your database.