Alin Climente
Meilisearch Python setup

First, create a Docker Compose service. Name the service whatever you like, and check for the latest Meilisearch release (v1.16 may be outdated by now):

  chatcodfiscal-meilisearch:
    image: getmeili/meilisearch:v1.16
    container_name: chatcodfiscal-meilisearch
    restart: unless-stopped
    expose:
      - "7700"
    env_file:
      - ./webapp/.env.prod
    environment:
      - MEILI_HTTP_ADDR=0.0.0.0:7700
      # uncomment this for prod if new index is created
      # - MEILI_IMPORT_DUMP=/meili_data/dumps/backup.dump
    volumes:
      # uncomment this for prod if new index is created
      # - ./webapp/data/meili_data/dumps/backup.dump:/meili_data/dumps/backup.dump
      - ./webapp/data/meili_data:/meili_data
    networks:
      - web
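Once the container is up, it can help to confirm that Meilisearch is reachable before indexing anything. The following is a minimal sketch using only the standard library and Meilisearch's `/health` endpoint. The helper name `is_meili_healthy` is an assumption made for illustration. Note that with `expose` alone the port is reachable from other containers on the `web` network, not from the host.

```python
import json
import urllib.error
import urllib.request


def is_meili_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health reports status "available"."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, bad URL, or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"


# Hypothetical call from another container on the same Docker network:
# is_meili_healthy("http://chatcodfiscal-meilisearch:7700")
```

The function returns False instead of raising, so it can be used in a simple retry loop at startup.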

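In your env file, add MEILI_MASTER_KEY and MEILI_HTTP_ADDR. Here MEILI_HTTP_ADDR is the URL your Python code uses to reach Meilisearch, so use the Meilisearch service name from Docker Compose when the API is called from inside the Docker network. The container itself still listens on 0.0.0.0:7700, because the environment entry in the compose file takes precedence over env_file.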

MEILI_MASTER_KEY=super-secret-key
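# uncomment this for prod (service name on the Docker network)
# MEILI_HTTP_ADDR=http://chatcodfiscal-meilisearch:7700
# use this for local development (0.0.0.0 is a bind address, not a reliable destination)
MEILI_HTTP_ADDR=http://localhost:7700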
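The code in this post reads these two values from the Django settings module. One way to get them there is sketched below. It assumes the env file has already been loaded into the process environment (by Docker Compose or a dotenv loader), and the helper name `read_meili_settings` and the fallback URL are illustrative, not part of the original project.

```python
import os
from typing import Mapping


def read_meili_settings(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (address, master_key), failing early if the key is missing."""
    # Assumed fallback for local development; adjust to your setup.
    addr = env.get("MEILI_HTTP_ADDR", "http://localhost:7700")
    key = env.get("MEILI_MASTER_KEY")
    if not key:
        raise RuntimeError("MEILI_MASTER_KEY is not set")
    return addr, key


# In webapp/settings.py this could become:
# MEILI_HTTP_ADDR, MEILI_MASTER_KEY = read_meili_settings()
```

Failing at startup when the key is missing is usually easier to debug than an authorization error on the first search.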

Index some documents:

import meilisearch
from django.conf import settings
from webapp.logger import log

from ingestor.models import ChunksModel

LEGAL_SYNONYMS = {
    # -------------------------------------------------------------
    # I. IMPOZITE, CONTRIBUȚII ȘI DECLARAȚII FISCALE
    # -------------------------------------------------------------
    "tva": [
        "taxa pe valoarea adaugata",
        "platitor de taxa",
        "inregistrare tva",
        "cod de inregistrare in scopuri de tva",
    ],
    "impozit": ["taxa", "bir", "obligatii fiscale", "de dat la stat"],
    "cif": ["cod de identificare fiscala", "cod fiscal"],
    "cas": [
        "contributia de asigurari sociale",
        "pensii",
        "contributie pensii",
        "cota cas",
    ],
    "cass": [
        "contributia de asigurari sociale de sanatate",
        "sanatate",
        "asigurari sociale de sanatate",
        "cota cass",
    ],
    # etc - You can add some usual synonyms useful for similar search.
}


def index_chunks():
    client = meilisearch.Client(settings.MEILI_HTTP_ADDR, settings.MEILI_MASTER_KEY)

    index_uid = "chunks"
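    log.info(f"Deleting existing index '{index_uid}' (if any)...")
    try:
        client.delete_index(index_uid)
    except Exception as exc:
        # Best effort: a missing index is fine on the first run, but leave a trace.
        log.debug(f"Could not delete index '{index_uid}': {exc}")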

    index = client.index(index_uid)

    log.debug(f"Configuring index settings for {index_uid}...")

    index.update_filterable_attributes(["nume_fisier"])

    index.update_searchable_attributes(["nume_sursa", "text_summary", "text_markdown"])

    log.debug("Updating synonyms dictionary...")
    synonyms_task = index.update_synonyms(LEGAL_SYNONYMS)

    client.wait_for_task(synonyms_task.task_uid)

    log.debug("Fetching data from Django DB...")
    chunks = ChunksModel.objects.all().iterator()

    documents = []
    for chunk in chunks:
        doc = {
            "id": str(chunk.pk),
            "nume_sursa": chunk.nume_sursa,
            "nume_fisier": chunk.nume_fisier,
            "text_markdown": chunk.text_markdown,
            "text_summary": chunk.text_summary,
        }
        documents.append(doc)

    log.debug(
        f"Prepared {len(documents)} documents. Sending to Meilisearch index '{index_uid}'..."
    )

    response = index.add_documents(documents)

    task_uid = response.task_uid

    log.info(f"Upload started. Task UID: {task_uid}")

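    # wait_for_task gives up after 5 seconds by default; large uploads need longer.
    # The 10-minute value is an arbitrary choice, tune it to your data size.
    client.wait_for_task(task_uid, timeout_in_ms=10 * 60 * 1000)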

    log.debug("Creating Meilisearch dump...")

    response = client.create_dump()

    # Dumps can also outlast the default 5-second wait.
    client.wait_for_task(response.task_uid, timeout_in_ms=10 * 60 * 1000)

    log.success("DONE")

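The function above collects every document in one list before uploading, which can use a lot of memory on large tables. A possible alternative is to send the documents in batches. The sketch below shows only the batching part with the standard library; `iter_batches` and the batch size of 1000 are assumptions, not something the original code uses.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield lists of at most `size` items without materializing the input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical use inside index_chunks(), where `docs` is a generator that
# builds the same dicts as the loop above:
#
# for batch in iter_batches(docs, 1000):
#     task = index.add_documents(batch)
#     client.wait_for_task(task.task_uid, timeout_in_ms=60_000)
```

Because the helper consumes a plain iterator, it works directly with the `.iterator()` queryset used above.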

Here ChunksModel is a Django model, but you can index documents from whatever source you want.

Let's start searching!

Here we initialize the Meilisearch client for searching. chunksfts is the index handle we will import wherever we need to run a search. The Pydantic models parse the result returned by Meilisearch.

import meilisearch
from pydantic import BaseModel
from webapp.settings import MEILI_HTTP_ADDR, MEILI_MASTER_KEY

meiliclient: meilisearch.Client = meilisearch.Client(MEILI_HTTP_ADDR, MEILI_MASTER_KEY)

chunksfts = meiliclient.index("chunks")


class FTSHit(BaseModel):
    id: int
    nume_sursa: str
    nume_fisier: str
    text_markdown: str
    text_summary: str


class FTSHits(BaseModel):
    hits: list[FTSHit]
    query: str


To search you'll just need to do this:

from .fts import chunksfts, FTSHits

meili_query = "Some query"
meili_options = {"limit": 20}  # See the Meilisearch docs for more search options

result = chunksfts.search(meili_query, meili_options)
fts_results = FTSHits(**result)

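Since the index declares nume_fisier as a filterable attribute, searches can also be restricted to a single file. The helper below is a sketch of how the options dict could be built. The name `build_search_options` is made up for this example, and the quote escaping reflects my understanding of Meilisearch's filter syntax, so check it against the docs for your version.

```python
from typing import Any, Optional


def build_search_options(
    limit: int = 20, nume_fisier: Optional[str] = None
) -> dict[str, Any]:
    """Build search options, optionally filtering on the nume_fisier attribute."""
    options: dict[str, Any] = {"limit": limit}
    if nume_fisier is not None:
        # Escape backslashes first, then double quotes, before quoting the value.
        escaped = nume_fisier.replace("\\", "\\\\").replace('"', '\\"')
        options["filter"] = f'nume_fisier = "{escaped}"'
    return options


# Hypothetical use with the index handle from above (the file name is invented):
# result = chunksfts.search("Some query", build_search_options(nume_fisier="cod_fiscal.pdf"))
```

Without a file name, the helper returns the same `{"limit": 20}` options used in the snippet above.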

That's it! Now you've got full text search with Meilisearch.
