Pierre Brunelle

Posted on Jul 7

Stop Gluing Data Infrastructure Tools: Build Multimodal AI Workloads and Applications with One Declarative Python SDK

#python #multimodal #machinelearning #opensource

Introducing Pixeltable: open-source data infrastructure that unifies data storage, transformation, indexing, and retrieval for your AI applications and pipelines, so you can stop wrestling with infrastructure and spaghetti code and start building.

Building AI applications today feels like assembling a puzzle in the dark. You're constantly gluing together a relational database for metadata, an object store for files, a separate vector database for search, and a complex web of scripts, orchestrators, caches, and state managers to make them all work together.

Every new data type (video, audio, PDFs, and so on) adds another layer of complexity. Every new AI model means another pipeline to build, manage, and maintain. The result? You spend more time on plumbing and infrastructure than on the high-impact work of building AI-powered products.

What if we could change that? What if there were a single, declarative platform that could manage it all?

Meet Pixeltable

You define your entire data processing and AI workflow declaratively using computed columns on tables. Pixeltable's engine then automatically handles:

Data Ingestion & Storage: References files (images, videos, audio, docs) in place, handles structured data.
Transformation & Processing: Applies any Python function (UDFs) or built-in operations (chunking, frame extraction) automatically.
AI Model Integration: Runs inference (embeddings, object detection, LLMs) as part of the data pipeline.
Indexing & Retrieval: Creates and manages vector indexes for fast semantic search alongside traditional filtering.
Incremental Computation: Only recomputes what's necessary when data or code changes, saving time and cost.
Versioning & Lineage: Automatically tracks data and schema changes for reproducibility.

The "Aha!" Moment: How It Works

Pixeltable's magic lies in its declarative nature. You simply define what you want, and Pixeltable orchestrates how to get it done—incrementally and efficiently.

Let's walk through a 4-step workflow that shows just how powerful this is.

Step 1: Create a Table & Add a Computed Column

Everything starts with a table. But unlike traditional databases, a Pixeltable table can natively handle any data type—images, videos, documents, and more—right alongside your structured data.

Then, you can add computed columns that transform your data using simple Python expressions. Pixeltable's engine automatically orchestrates the computation for all existing and future rows.

```python
import pixeltable as pxt

# Create a table for films with revenue and budget
t = pxt.create_table(
    'films',
    {'name': pxt.String, 'revenue': pxt.Float, 'budget': pxt.Float},
    if_exists='replace'
)

# Insert some data
t.insert([
    {'name': 'Inside Out', 'revenue': 800.5, 'budget': 200.0},
    {'name': 'Toy Story', 'revenue': 1073.4, 'budget': 200.0}
])

# Add a computed column for profit.
# Pixeltable calculates this automatically.
t.add_computed_column(profit=(t.revenue - t.budget))

# Query the results to see the computed profit
print(t.select(t.name, t.profit).collect())
#
# +------------+--------+
# | name       | profit |
# +------------+--------+
# | Inside Out | 600.5  |
# | Toy Story  | 873.4  |
# +------------+--------+
```
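
To make "all existing and future rows" concrete, here is a toy model of a computed column in plain Python. It is only an illustration of the behavior described above, not Pixeltable's implementation: the `ToyTable` class and its methods are hypothetical names invented for this sketch.

```python
from typing import Any, Callable

Row = dict[str, Any]


class ToyTable:
    """Toy stand-in for a table with computed columns (illustration only)."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.computed: dict[str, Callable[[Row], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        # Register the definition, then backfill every existing row.
        self.computed[name] = fn
        for row in self.rows:
            row[name] = fn(row)

    def insert(self, new_rows: list[Row]) -> None:
        # New rows get every registered computed column at insert time.
        for values in new_rows:
            row = dict(values)
            for name, fn in self.computed.items():
                row[name] = fn(row)
            self.rows.append(row)


films = ToyTable()
films.insert([{'name': 'Inside Out', 'revenue': 800.5, 'budget': 200.0}])
films.add_computed_column('profit', lambda r: r['revenue'] - r['budget'])
films.insert([{'name': 'Toy Story', 'revenue': 1073.4, 'budget': 200.0}])

profits = {row['name']: row['profit'] for row in films.rows}
```

The first film is backfilled when the column is added, and the second gets its profit at insert time. The column is defined once and the table keeps it up to date.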

Step 2: Run an AI Vision Pipeline with a UDF

Want to bring your own logic? Decorate any Python function with @pxt.udf. Pixeltable integrates this user-defined function (UDF) into its declarative framework and automatically manages dependencies, caching, lineage, versioning, parallelization, and more.

```python
import PIL.Image  # import the submodule explicitly so PIL.Image.Image resolves
import pixeltable as pxt
from yolox.models import Yolox
from yolox.data.datasets import COCO_CLASSES

# Assumes table 't' exists with an 'image' column
# t = pxt.get_table('my_images')
# t.insert([{'image': 'path/to/cat.jpg'}])

# Wrap any Python code in a UDF
@pxt.udf
def detect(image: PIL.Image.Image) -> list[str]:
    # Load a pre-trained YOLOX model
    model = Yolox.from_pretrained("yolox_s")
    result = model([image])
    # Return a list of detected class labels
    coco_labels = [COCO_CLASSES[label] for label in result[0]["labels"]]
    return coco_labels

# Apply the UDF as a computed column
t.add_computed_column(classification=detect(t.image))

# The 'classification' column is now automatically populated!
#
# +----------------------+------------------+
# | image                | classification   |
# +----------------------+------------------+
# | <Image: cat.jpg>     | ['cat', 'couch'] |
# | <Image: birds.png>   | ['bird']         |
# +----------------------+------------------+
```
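
One design note on UDFs like `detect`: the function body runs for each row, so a model load placed inside it is requested again on every call. A common Python pattern is to move the load into a cached helper. The sketch below shows the pattern with the standard library only; `load_model` and `detect_labels` are hypothetical stand-ins, and nothing here assumes how YOLOX or Pixeltable cache things internally.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive load actually runs


@lru_cache(maxsize=None)
def load_model(model_id: str) -> dict:
    # Hypothetical stand-in for an expensive call such as loading model weights.
    global load_count
    load_count += 1
    return {'model_id': model_id}


def detect_labels(image_name: str) -> str:
    # Reuses the cached model instead of reloading it for every row.
    model = load_model('yolox_s')
    return f"{model['model_id']} processed {image_name}"


outputs = [detect_labels(name) for name in ['cat.jpg', 'birds.png', 'dog.jpg']]
```

Three rows are processed, but `load_count` ends at 1 because the cached helper does the expensive work only on the first call.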

Step 3: Perform Multimodal Vector Search

Forget setting up and managing a separate vector database. With Pixeltable, you can add a multimodal embedding index to any column with a single line of code.

Pixeltable handles the entire lifecycle: generating embeddings, storing them efficiently, and—crucially—keeping them automatically in sync with your source data. This enables powerful, co-located semantic search across all your data types.

```python
import pixeltable as pxt
from pixeltable.functions.huggingface import clip

# Assumes table 'images' exists with an 'img' column
# images = pxt.get_table('my_images')

# 1. Add a CLIP embedding index to the image column
images.add_embedding_index(
    'img',
    embedding=clip.using(model_id='openai/clip-vit-base-patch32')
)

# 2. Perform text-to-image similarity search
query_text = "a dog playing fetch"
sim_text = images.img.similarity(query_text)
results = images.order_by(sim_text, asc=False).limit(1).collect()
```
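
If you are wondering what ordering by similarity and taking the top result means underneath, here is a plain-Python sketch of the idea. It makes several assumptions: the vectors are made-up three-dimensional "embeddings", cosine similarity is used as one common metric (this post does not say which metric Pixeltable's index uses), and it scans every item, whereas a real vector index avoids a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "embeddings"; real CLIP vectors have hundreds of dimensions.
image_embeddings = {
    'dog_fetch.jpg': [0.9, 0.1, 0.0],
    'cat_couch.jpg': [0.1, 0.9, 0.0],
    'birds.png': [0.0, 0.2, 0.9],
}
query_embedding = [1.0, 0.2, 0.0]  # pretend embedding of "a dog playing fetch"

# Rank image names from most to least similar to the query.
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine_similarity(image_embeddings[name], query_embedding),
    reverse=True,
)
top_match = ranked[0]
```

Because text and images are embedded into the same vector space, a text query can be compared directly against image vectors, which is what makes text-to-image search possible.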

Step 4: Build an Incremental RAG Workflow

Now, let's put it all together. You can build a powerful, end-to-end RAG system with just a few declarative statements. Pixeltable orchestrates the entire workflow, from chunking documents and generating embeddings to retrieving relevant context and prompting an LLM.

Because the pipeline is incremental, only new or updated documents are processed, making your RAG application highly efficient and cost-effective.

```python
import pixeltable as pxt
from pixeltable.functions import huggingface, openai, string as pxt_string
from pixeltable.iterators import DocumentSplitter

# 1. Create tables for documents and Q&A
docs = pxt.create_table('docs', {'doc': pxt.Document})
qa = pxt.create_table('qa', {'prompt': pxt.String})

# 2. Create a view to chunk documents
chunks = pxt.create_view('chunks', docs,
    iterator=DocumentSplitter.create(document=docs.doc, separators='sentence'))

# 3. Create an embedding index on the chunks
embed_model = huggingface.sentence_transformer.using(model_id='all-MiniLM-L6-v2')
chunks.add_embedding_index('text', string_embed=embed_model)

# 4. Define a query function to retrieve context
@pxt.query
def get_context(query_text: str, limit: int = 3):
    sim = chunks.text.similarity(query_text)
    return chunks.order_by(sim, asc=False).limit(limit)

# 5. Build the RAG pipeline on the Q&A table
qa.add_computed_column(context=get_context(qa.prompt))
qa.add_computed_column(
    answer=openai.chat_completions(
        model='gpt-4o-mini',
        messages=[{
            'role': 'user',
            # Build the prompt per row with string.format. A plain f-string would be
            # evaluated once, at definition time, and would embed the column
            # expressions themselves rather than each row's values.
            'content': pxt_string.format(
                'Context: {0}\nQuestion: {1}', qa.context, qa.prompt
            )
        }]
    ).choices[0].message.content
)

# 6. Ask a question - Pixeltable runs the whole pipeline!
qa.insert([{'prompt': 'What were the key takeaways from the report?'}])
```
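
To see why incremental processing saves work, here is a toy sketch of the idea in plain Python: remember a hash of each document's content and skip anything unchanged. This is not how Pixeltable tracks changes internally. The `sync` and `expensive_step` functions and the sample documents are all made up for illustration.

```python
import hashlib

processed: dict[str, str] = {}   # doc id -> hash of the content last processed
calls: list[str] = []            # records which docs were actually (re)processed


def expensive_step(doc_id: str, text: str) -> None:
    # Stand-in for chunking + embedding; here it only records the call.
    calls.append(doc_id)


def sync(docs: dict[str, str]) -> None:
    for doc_id, text in docs.items():
        digest = hashlib.sha256(text.encode('utf-8')).hexdigest()
        if processed.get(doc_id) == digest:
            continue  # unchanged since the last run: skip
        expensive_step(doc_id, text)
        processed[doc_id] = digest


docs = {'report.pdf': 'Q1 revenue grew.', 'memo.pdf': 'Ship on Friday.'}
sync(docs)                                   # first run processes both
docs['memo.pdf'] = 'Ship on Monday.'         # one document changes
docs['notes.pdf'] = 'New customer call.'     # one document is added
sync(docs)                                   # second run processes only those two
```

On the second run, `report.pdf` is skipped because its content hash has not changed. Only the edited memo and the new notes go through the expensive step.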

Why Pixeltable? The Core Principles

Declarative & Simple: Focus on your logic, not the infrastructure. Define your entire workflow in Python, and let Pixeltable handle the complex orchestration.
Unified & Multimodal: Your data, embeddings, and transformations live together. No more data silos or brittle integration scripts.
Incremental & Cost-Effective: Pixeltable understands your data dependencies and only recomputes what's necessary when data changes, saving you time and compute cost.
Extensible & Open: Built on an open-source core, Pixeltable allows you to bring your own Python functions, AI models, and business logic directly into the data layer.

Get Started Today!

Ready to stop gluing tools and start building?

Install Pixeltable:
```
pip install pixeltable
```
Star us on GitHub: https://github.com/pixeltable/pixeltable
Read the Docs: https://docs.pixeltable.com
Join our Community: We're active on Discord and would love to hear from you!

We're incredibly excited to see what you build with Pixeltable. Happy coding!

DEV Community