There are things I do with DataFrames all the time that pandas was never built for. Filtering by subjective criteria. Joining tables that don't share a key. Looking up information that only exists on the web. Recently I've been using LLMs, and the results have been surprisingly cheap and accurate.
Here are five operations I now handle with LLMs (with working code).
1. Filter by Qualitative Criteria
You have 3,616 job postings and want only the ones that are remote-friendly, senior-level, AND disclose salary. A keyword filter such as df[df['posting'].str.contains('remote')] also matches "No remote work available."
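The false positive is easy to reproduce. Here is a minimal sketch with three made-up postings (the sample rows are illustrative and not taken from the dataset):

```python
import pandas as pd

# Three hypothetical postings, invented for this illustration
jobs_sample = pd.DataFrame({
    "posting": [
        "Senior engineer, fully remote, $180k-$210k",
        "No remote work available. On-site in Austin.",
        "Staff engineer, hybrid, salary DOE",
    ]
})

# Plain substring match: True wherever the text contains "remote"
mask = jobs_sample["posting"].str.contains("remote")
keyword_matches = jobs_sample[mask]

print(keyword_matches["posting"].tolist())
```

The second posting is returned even though it rules remote work out. Negation rules can patch this one phrasing, but only until the next variant shows up.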
Cost: $4.24 for 3,616 rows (9.9 minutes)
from everyrow.ops import screen
from pydantic import BaseModel, Field

class JobScreenResult(BaseModel):
    qualifies: bool = Field(description="True if meets ALL criteria")

# jobs is the DataFrame of postings; top-level await works in a notebook or inside an async function
result = await screen(
    task="""
    A job posting qualifies if it meets ALL THREE criteria:
    1. Remote-friendly: Explicitly allows remote work
    2. Senior-level: Title contains Senior/Staff/Lead/Principal
    3. Salary disclosed: Specific compensation numbers mentioned
    """,
    input=jobs,
    response_model=JobScreenResult,
)
216 of 3,616 passed (6%). Interestingly, the pass rate has climbed from 1.7% in 2020 to 14.5% in 2025 as more companies are offering remote work and disclosing salaries.
Full guide with dataset · See it applied to real job postings: Screening job postings by criteria
2. Classify Rows Into Categories
You need to label 200 job postings into categories (backend, frontend, data, ML/AI, devops, etc.). Keyword matching misses anything that's not an exact match, but training a classifier is overkill for a one-off task like this.
Cost: $1.74 for 200 rows (2.1 minutes). At scale: ~$9 for 1,000 rows, ~$90 for 10,000.
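The at-scale figures follow from the per-row cost of the 200-row run. Here is a quick check, assuming cost scales linearly with row count (my assumption; real cost may vary with posting length):

```python
from decimal import Decimal

run_cost = Decimal("1.74")  # measured cost of the 200-row run
run_rows = 200
cost_per_row = run_cost / run_rows


def projected_cost(rows: int) -> Decimal:
    """Linear extrapolation from the measured run, rounded to cents."""
    return (cost_per_row * rows).quantize(Decimal("0.01"))


for rows in (1_000, 10_000):
    print(f"{rows:>6} rows: ${projected_cost(rows)}")
```

That works out to $8.70 and $87.00, which the estimates above round to ~$9 and ~$90.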
from everyrow.ops import agent_map
from pydantic import BaseModel, Field
from typing import Literal

class JobClassification(BaseModel):
    category: Literal[
        "backend", "frontend", "fullstack", "data",
        "ml_ai", "devops_sre", "mobile", "security", "other"
    ] = Field(description="Primary role category")
    reasoning: str = Field(description="Why this category was chosen")

result = await agent_map(
    task="Classify this job posting by primary role...",
    input=jobs,
    response_model=JobClassification,
)
The Literal type constrains the LLM to your predefined set, so there's no post-processing needed. You can add confidence scores and multi-label support by extending the Pydantic model.
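Here is one way the extended model could look. This is a sketch: the class name `JobClassificationExtended` and the `categories` and `confidence` fields are my own, and I have not confirmed how the SDK handles list fields or numeric bounds. The Pydantic side is standard, though, and shows how out-of-set labels get rejected:

```python
from typing import List, Literal

from pydantic import BaseModel, Field, ValidationError

Category = Literal[
    "backend", "frontend", "fullstack", "data",
    "ml_ai", "devops_sre", "mobile", "security", "other",
]


class JobClassificationExtended(BaseModel):
    # Multi-label: every entry must still come from the predefined set
    categories: List[Category] = Field(description="All role categories that apply")
    # Confidence bounded to [0, 1]
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence in the labels")
    reasoning: str = Field(description="Why these categories were chosen")


ok = JobClassificationExtended(
    categories=["backend", "devops_sre"],
    confidence=0.82,
    reasoning="Owns services and the deployment pipeline",
)

try:
    JobClassificationExtended(
        categories=["blockchain"],  # not in the allowed set
        confidence=0.9,
        reasoning="Mentions smart contracts",
    )
    rejected = False
except ValidationError:
    rejected = True

print(ok.categories, rejected)
```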
3. Add a Column Using Web Research
You have a list of 246 SaaS products and need the annual price of each one's lowest paid tier. There's no API for this kind of problem because it requires visiting pricing pages that all present information differently.
Cost: $6.68 for 246 rows (15.7 minutes), 99.6% success rate
from everyrow.ops import agent_map
from pydantic import BaseModel, Field

class PricingInfo(BaseModel):
    lowest_paid_tier_annual_price: float = Field(
        description="Annual price in USD for the lowest paid tier"
    )
    tier_name: str = Field(description="Name of the tier")

result = await agent_map(
    task="""
    Find the pricing for this SaaS product's lowest paid tier.
    Visit the product's pricing page.
    Report the annual price in USD and the tier name.
    """,
    input=df,
    response_model=PricingInfo,
)
Each result comes with a research column showing how the agent found the answer, with citations. For example, Slack's entry references slack.com/pricing/pro and shows the math: $7.25/month × 12 = $87/year.
Full guide with dataset · See it applied to vendor matching: Matching software vendors to requirements
4. Join DataFrames Without a Shared Key
You have two tables of S&P 500 data — one with company names and market caps, the other with stock tickers and fair values. Without a shared column across both datasets, pd.merge() is useless.
Cost: $1.00 for 438 rows (~30 seconds), 100% accuracy
from everyrow.ops import merge

result = await merge(
    task="Match companies to their stock tickers",
    left_table=companies,    # has: company, price, mkt_cap
    right_table=valuations,  # has: ticker, fair_value
)
# 3M → MMM, Alphabet Inc. → GOOGL, etc.
Under the hood, it uses a cascade: exact match → fuzzy match → LLM reasoning → web search. The results show 99.8% of rows matched via LLM alone. Even with 10% character-level noise ("Alphaeet Iqc." instead of "Alphabet Inc."), it hit 100% accuracy at $0.44. I'd much rather manually review unmatched rows than deal with false positives.
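To make the cascade idea concrete, here is a sketch of its two cheap stages using only the standard library. This is not everyrow's implementation. The function `cheap_match`, the 0.8 cutoff, and the sample names are all my own choices for illustration. Anything the cheap stages cannot resolve is returned as unresolved, which is where an LLM or web search stage would take over:

```python
from difflib import get_close_matches


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(name.lower().split())


def cheap_match(names, candidates, cutoff=0.8):
    """Try exact, then fuzzy matching. Returns (matched, unresolved)."""
    lookup = {normalize(c): c for c in candidates}
    matched, unresolved = {}, []
    for name in names:
        key = normalize(name)
        if key in lookup:  # stage 1: exact match on the normalized name
            matched[name] = (lookup[key], "exact")
            continue
        close = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        if close:  # stage 2: fuzzy match above the similarity cutoff
            matched[name] = (lookup[close[0]], "fuzzy")
        else:  # left for the more expensive stages
            unresolved.append(name)
    return matched, unresolved


candidates = ["3M", "Alphabet Inc.", "Apple Inc."]
matched, unresolved = cheap_match(
    ["3m", "Alphaeet Iqc.", "Google's parent"], candidates
)
print(matched)
print(unresolved)
```

Fuzzy matching recovers the noisy "Alphaeet Iqc.", but "Google's parent" shares almost no characters with "Alphabet Inc." and needs reasoning to resolve. A name-to-ticker pair like 3M → MMM falls in that same category.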
Full guide with dataset · See it applied at scale: LLM-powered merging at scale
5. Rank by a Metric That's Not in Your Data
You have 300 PyPI packages and want to rank them by days since last release and number of GitHub contributors. This data is on PyPI and GitHub (not in your DataFrame).
Cost: $3.90 for days-since-release, $4.13 for GitHub contributors (300 rows each, ~5 minutes)
from everyrow.ops import rank

result = await rank(
    task="Rank by number of days since the last PyPI release",
    input=packages,
    field_name="days_since_release",
)
The SDK sends a web research agent per row to look up the metric, then ranks by the result. And it works for any metric you can describe in natural language, as long as it's findable on the web.
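The look-up-then-rank pattern can be sketched with plain asyncio. This is not the SDK's code: the lookup is a stub backed by a hard-coded dictionary of made-up release dates, standing in for whatever a research agent would fetch, and all names here are hypothetical.

```python
import asyncio
from datetime import date

# Made-up release dates standing in for a real web lookup
FAKE_LAST_RELEASE = {
    "pkg-a": date(2024, 1, 10),
    "pkg-b": date(2025, 3, 1),
    "pkg-c": date(2023, 6, 30),
}


async def lookup_days_since_release(package: str, today: date) -> int:
    """Stub for a per-row lookup; a real agent would do network I/O here."""
    await asyncio.sleep(0)
    return (today - FAKE_LAST_RELEASE[package]).days


async def rank_by_staleness(packages, today):
    # One lookup per row, run concurrently, then sort by the fetched metric
    days = await asyncio.gather(
        *(lookup_days_since_release(p, today) for p in packages)
    )
    return sorted(zip(packages, days), key=lambda pair: pair[1], reverse=True)


ranked = asyncio.run(rank_by_staleness(list(FAKE_LAST_RELEASE), date(2025, 6, 1)))
for package, days in ranked:
    print(package, days)
```

The reference date is passed in explicitly so the ranking is reproducible. With a live lookup the metric changes daily.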
Cost Summary
| Operation | Rows | Cost | Time |
|---|---|---|---|
| Filter job postings | 3,616 | $4.24 | 9.9 min |
| Classify into categories | 200 | $1.74 | 2.1 min |
| Web research (pricing) | 246 | $6.68 | 15.7 min |
| Fuzzy join (no key) | 438 | $1.00 | 30 sec |
| Rank by external metric | 300 | $3.90 | 4.3 min |
Each of these is a single function call on a pandas DataFrame. The orchestration (batching, parallelism, retries, rate limiting, model selection) is handled by everyrow, an open-source Python SDK. New accounts get $20 in free credit, which covers the five runs in the table above (about $17.56 in total).
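For budgeting, the numbers in the table can be totalled and compared per row. This is just arithmetic on the figures reported in this post:

```python
from decimal import Decimal

# operation: (rows, cost in USD), copied from the table above
runs = {
    "Filter job postings": (3616, Decimal("4.24")),
    "Classify into categories": (200, Decimal("1.74")),
    "Web research (pricing)": (246, Decimal("6.68")),
    "Fuzzy join (no key)": (438, Decimal("1.00")),
    "Rank by external metric": (300, Decimal("3.90")),
}

total = sum(cost for _, cost in runs.values())
remaining_credit = Decimal("20") - total

# The second ranking run (GitHub contributors) is priced separately in section 5
total_with_second_rank = total + Decimal("4.13")

per_row = {name: cost / rows for name, (rows, cost) in runs.items()}
priciest_per_row = max(per_row, key=per_row.get)

print(f"total: ${total}, remaining credit: ${remaining_credit}")
print(f"with both ranking runs: ${total_with_second_rank}")
print(f"most expensive per row: {priciest_per_row}")
```

Web research is the most expensive operation per row, and running both ranking metrics pushes the total slightly past the free credit.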
The full code and datasets for each example are linked above.