Sarah Guthals, PhD for Tensorlake

Originally published at tensorlake.ai

Unlocking Smarter RAG with Qdrant + Tensorlake: Structured Filters Meet Semantic Search

RAG isn't new. But the bar for what constitutes "good RAG" has risen dramatically. Simple embedding lookup was sufficient in 2023, but today's applications demand far more sophisticated retrieval to avoid hallucinations and irrelevant responses.

Modern baseline RAG requires:

  • Hybrid search: Vector + full-text search for comprehensive coverage
  • Rich metadata filtering: Structured data to narrow search scope before retrieval
  • Table and figure understanding: Complex visual content properly parsed, summarized, and embedded
  • Global document context: Every chunk enriched with document-level metadata

To meet these requirements, you are likely chaining multiple model calls through an orchestration system, managing state between calls by writing intermediate outputs to some storage system. That pipeline requires ongoing maintenance and a real amount of infrastructure work.

With Tensorlake, you get the power of the best models without needing to maintain any infrastructure. Tensorlake delivers everything you need to build modern RAG in a single API call. In this post, we'll show you how to implement next-level RAG with structured filtering, semantic search, and comprehensive document understanding, all integrated seamlessly.

Beyond simple embedding lookup: What modern RAG demands

Traditional RAG implementations that rely solely on vector similarity often fail in production because they miss critical context and retrieve irrelevant passages. When dealing with complex documents like financial reports, research papers, or legal contracts, you need structured filters to narrow your search space before running semantic queries.

Tensorlake extracts structured data, parses complex tables and figures into markdown (with summaries), and provides document-level metadata in a single API call. Combined with Qdrant's filtering and hybrid search capabilities, you get production-grade RAG without building complex parsing pipelines.

[Diagram] Tensorlake processes PDFs into structured data, document layout, and markdown chunks, which are embedded and sent to Qdrant with metadata. A user query then combines a filtered search over the structured data with an embedding search over the markdown chunks for accurate retrieval.

Try it yourself: Academic research papers demo

To show how simple this really is, we put together a live demo using academic research papers about computer science education. You can try this out for yourself using this Colab Notebook.

This example demonstrates modern RAG requirements with complex documents that have:

  • Complex reading order: Author names in columns with mixed affiliations, text in multiple columns
  • Critical figures and tables: Data that's essential for accurate embeddings (already parsed into markdown)
  • Rich metadata: Conference info, authors, universities for precise filtering

We'll build a hybrid search system (with structured filters) that can answer questions like "Does computer science education improve problem solving skills?" while filtering by specific authors, conferences, or years.

It's really just three quick steps:

  1. Parse documents with Tensorlake
  2. Create embeddings and upsert to Qdrant
  3. Perform hybrid search with structured filtering

Step 1: Parse documents with Tensorlake

First, define a schema for the structured data you want extracted. For research papers, we'll extract metadata like authors and conference information:

from pydantic import BaseModel, Field
from typing import List

class Author(BaseModel):
  """Author information for a research paper"""
  name: str = Field(description="Full name of the author")
  affiliation: str = Field(description="Institution or organization affiliation")

class Conference(BaseModel):
  """Conference or journal information"""
  name: str = Field(description="Name of the conference or journal")
  year: str = Field(description="Year of publication")
  location: str = Field(description="Location of the conference or journal publication")

class ResearchPaperMetadata(BaseModel):
  """Complete schema for extracting research paper information"""
  authors: List[Author] = Field(description="List of authors with their affiliations. Authors will be listed below the title and above the main text of the paper. Authors will often be in multiple columns and there may be multiple authors associated to a single affiliation.")
  conference_journal: Conference = Field(description="Conference or journal information")
  title: str = Field(description="Title of the research paper")

# Convert to JSON schema for Tensorlake
json_schema = ResearchPaperMetadata.model_json_schema()
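
If you want to sanity-check the schema before it goes to Tensorlake, print it. The Field descriptions travel with the schema and act as extraction hints for the model:

import json

# Peek at the JSON schema that will guide structured extraction
print(json.dumps(json_schema, indent=2))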

Then, use Tensorlake's Document Ingestion API to parse research paper PDFs and extract both structured JSON data and markdown chunks:

# Assumes the Tensorlake client and option classes were imported and initialized, e.g.:
# from tensorlake.documentai import (DocumentAI, ParsingOptions, ChunkingStrategy,
#     TableParsingFormat, TableOutputMode, StructuredExtractionOptions, EnrichmentOptions)
# doc_ai = DocumentAI(api_key="your-tensorlake-api-key")

file_path = "path/to/your/research_paper.pdf"
file_id = doc_ai.upload(path=file_path)

# Configure parsing options
parsing_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.SECTION,
    table_parsing_strategy=TableParsingFormat.TSR,
    table_output_mode=TableOutputMode.MARKDOWN,
)

# Create structured extraction options with the JSON schema
structured_extraction_options = [StructuredExtractionOptions(
    schema_name="ResearchPaper",
    json_schema=json_schema,
)]

# Create enrichment options
enrichment_options = EnrichmentOptions(
    figure_summarization=True,
    figure_summarization_prompt="Summarize the figure beyond the caption by describing the data as it relates to the context of the research paper.",
    table_summarization=True,
    table_summarization_prompt="Summarize the table beyond the caption by describing the data as it relates to the context of the research paper.",
)

# Initiate the parsing job
parse_id = doc_ai.parse(file_id, parsing_options, structured_extraction_options, enrichment_options)
result = doc_ai.wait_for_completion(parse_id)

# Collect results - structured data and markdown chunks (table and figure summaries already in chunks)
structured_data = result.structured_data[0] if result.structured_data else {}
chunks = result.chunks if result.chunks else []
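
Before moving on, it's worth a quick look at what came back. A minimal sketch (structured_data.data holds the extracted fields, matching how it's read in Step 2):

# Quick inspection of the parse results
print(f"Extracted {len(chunks)} chunks")
if structured_data:
    print("Title:", structured_data.data.get("title", ""))
    print("Authors:", [a.get("name", "") for a in structured_data.data.get("authors", [])])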

Step 2: Create embeddings and upsert to Qdrant

Create embeddings for markdown chunks (which include table and figure summaries) and store in Qdrant with structured metadata for filtering:

from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient, models
from uuid import uuid4

# Initialize embedding model and Qdrant client
model = SentenceTransformer('all-MiniLM-L6-v2')
qdrant_client = QdrantClient(url="your-qdrant-url", api_key="your-api-key")

def create_qdrant_collection(structured_data, chunks, doc_idx=0):
  # doc_idx tracks which source document these chunks came from
  # Create collection if it doesn't exist
  collection_name = "research_papers"
  if not qdrant_client.collection_exists(collection_name):
      qdrant_client.create_collection(
          collection_name=collection_name,
          # 384 dimensions matches the all-MiniLM-L6-v2 embedding size
          vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE)
      )

  # Extract structured metadata for global document context
  author_names = []
  conference_name = ""
  conference_year = ""
  title = ""

  if structured_data:
      # Extract author information for filtering
      if 'authors' in structured_data.data:
          for author in structured_data.data['authors']:
              if isinstance(author, dict):
                  author_name = author.get('name', '')
                  author_names.append(author_name)

      # Extract conference information for filtering
      if 'conference_journal' in structured_data.data:
          conf = structured_data.data['conference_journal']
          if isinstance(conf, dict):
              conference_name = conf.get('name', '')
              conference_year = conf.get('year', '')

      title = structured_data.data.get('title', '')

  points = []

  # Create embeddings for markdown chunks (tables/figures already included)
  chunk_texts = [chunk.content for chunk in chunks]
  vectors = model.encode(chunk_texts).tolist()

  # Process each chunk with global document context
  for i, chunk in enumerate(chunk_texts):
      # Enhanced payload with structured metadata for filtering
      payload = {
          "content": chunk,
          "document_index": doc_idx,
          # Structured data fields for filtering
          "title": title,
          "author_names": author_names,  # List for filtering by specific authors
          "conference_name": conference_name,
          "conference_year": conference_year
      }

      points.append(models.PointStruct(
          id=str(uuid4()),
          vector=vectors[i],
          payload=payload
      ))

  # Insert data into Qdrant
  qdrant_client.upsert(collection_name=collection_name, points=points)

  # Create keyword indices so these payload fields can be filtered efficiently
  # (all four fields are stored as strings, per the schema above)
  for field in ["title", "author_names", "conference_name", "conference_year"]:
      qdrant_client.create_payload_index(
          collection_name=collection_name,
          field_name=field,
          field_schema=models.PayloadSchemaType.KEYWORD,
      )
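
To index more than one paper, call the function in a loop and pass the document index through. A sketch, assuming parsed_results is a list of (structured_data, chunks) pairs collected in Step 1:

# parsed_results is hypothetical: one (structured_data, chunks) pair per paper
parsed_results = [(structured_data, chunks)]

for doc_idx, (paper_data, paper_chunks) in enumerate(parsed_results):
    create_qdrant_collection(paper_data, paper_chunks, doc_idx=doc_idx)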

Step 3: Perform hybrid search with structured filtering

Now you can run both simple semantic queries and filtered searches, and compare the results.

Simple search

points = qdrant_client.query_points(
  collection_name="research_papers",
  query=model.encode("Does computer science education improve problem solving skills?").tolist(),
  limit=3,
).points

for point in points:
  print(point.payload.get('title', 'Unknown'), "score:", point.score)

With output:

CodeSpells: Bridging Educational Language Features with Industry-Standard Languages score: 0.57552844
CHILDREN'S PERCEPTIONS OF WHAT COUNTS AS A PROGRAMMING LANGUAGE score: 0.55624765
Experience Report: an AP CS Principles University Pilot score: 0.54369175

Filtered Search

points = qdrant_client.query_points(
  collection_name="research_papers",
  query=model.encode("Does computer science education improve problem solving skills?").tolist(),
  query_filter=models.Filter(
      must=[
          models.FieldCondition(
              key="author_names",
              match=models.MatchValue(
                  value="William G. Griswold",
              ),
          )
      ]
  ),
  limit=3,
).points

for point in points:
  print(point.payload.get('title', 'Unknown'), point.payload.get('conference_name', 'Unknown'), "score:", point.score)

With output:

CodeSpells: Bridging Educational Language Features with Industry-Standard Languages Koli Calling '14 score: 0.57552844
CODESPELLS: HOW TO DESIGN QUESTS TO TEACH JAVA CONCEPTS Consortium for Computing Sciences in Colleges score: 0.4907498
CodeSpells: Bridging Educational Language Features with Industry-Standard Languages Koli Calling '14 score: 0.4823265
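
Filter conditions compose, so you can narrow by author and year at the same time. A minimal sketch (the year value is illustrative):

points = qdrant_client.query_points(
  collection_name="research_papers",
  query=model.encode("Does computer science education improve problem solving skills?").tolist(),
  query_filter=models.Filter(
      must=[
          models.FieldCondition(
              key="author_names",
              match=models.MatchValue(value="William G. Griswold"),
          ),
          models.FieldCondition(
              key="conference_year",
              match=models.MatchValue(value="2014"),
          ),
      ]
  ),
  limit=3,
).points

Because conference_year was extracted as a string, MatchValue compares it as a keyword; store it as an integer if you want numeric range filters.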

Modern RAG in production: Structured + semantic

Building production-grade RAG requires more than embedding models and vector search. You need structured metadata extraction, hybrid search capabilities, comprehensive document understanding, and global context—all integrated seamlessly.

Whether you're building an agent that searches academic literature, audits financial reports, or summarizes patient history, the combination of Tensorlake's document understanding and Qdrant's hybrid search gives you the foundation for reliable, accurate RAG systems.

With documents like research papers, you can see how a single API call to Tensorlake for complete markdown conversion, structured data extraction, and table and figure summarization can provide the rich context modern RAG demands. You get complete, accurate data while reducing pipeline complexity.

Next Steps: Try Tensorlake

When you sign up with Tensorlake you get 100 free credits so that you can parse your own documents.

Explore the rest of our cookbooks and learn how to make vector database search more effective for your RAG workflows today. Start with the research papers colab notebook or build your own workflow from the Tensorlake API docs.

Got feedback or want to show us what you built? Join the conversation in our Slack Community!

Ready to get started? Head over to our Qdrant docs to find all the documentation, tips, and samples.

Happy parsing!
