brightplace

Posted on Apr 24

How we built an AI-native apartment search (and why filter-based search is dead)

#ai #webdev #architecture #elasticsearch

We are building brightplace, an AI-native apartment rental platform. This is a write-up of why we threw out the filter-based search model the rental industry has used since 2008, what we are replacing it with, and the three architectural decisions that made it possible.

If you work on search, LLM-powered products, or you are just curious about what happens when you rebuild a consumer vertical around natural language, this one is for you.

The problem with filter-based apartment search

Go to Zillow. Go to Apartments.com. Go to Rent.com. Open any of them.

You will see the same UI. A map. Filter chips at the top: beds, baths, price, pets. Maybe a dropdown for neighborhoods. You pick filters. The system returns a list of apartments that match those filters.

This is the model the entire rental industry has run on for 15+ years. And it is fundamentally broken for how people actually search in 2026.

Here is what is wrong with it:

Filters are a lossy compression of real renter intent. When someone says "I want a 1-bedroom under $2,000 in East Austin," that is a sentence with at least 12 unstated assumptions. They want a real second room for an office. They have a dog. They work hybrid. They value walkability. They care about fiber coverage. They want to be near specific coffee shops. None of this fits in a filter.

The index is static. A filter-based system matches queries against an index of units tagged with a fixed set of attributes. Any attribute not in the schema is invisible. "Good for introverts" is not a filter. Neither is "feels safe to walk at night."

Most of the signal lives outside the listing. Things that matter most to a renter (neighborhood vibe, commute reality, building culture, noise, natural light, building responsiveness) are not in the listing database at all. They live in reviews, social posts, local forums, and the experience of people who have lived there.

AI has raised the bar. Once renters learned to ask ChatGPT "where should I live in Austin if I work remotely and have a dog," filter-based search stopped being acceptable. The comparison is no longer "is this UI fast" but "can this answer my actual question."

We decided the only way to fix this was to throw out the model and start over.

The architecture: three layers

We call the platform IntentOS. It has three layers, each solving a different part of the problem.

┌──────────────────────────────────────────────────┐
│  LAYER 3: Context Engine                         │
│  Natural language prompts → structured queries   │
│  → enriched results                              │
└──────────────────────────────────────────────────┘
                        ▲
                        │
┌──────────────────────────────────────────────────┐
│  LAYER 2: Listings Source API                    │
│  Machine-readable feed of verified listings      │
│  Enriched with contextual metadata               │
│  LLM-optimized documentation                     │
└──────────────────────────────────────────────────┘
                        ▲
                        │
┌──────────────────────────────────────────────────┐
│  LAYER 1: Data Lake                              │
│  Listings · Availability · Pricing · Reviews     │
│  Schools · Commute scores · Landmarks · Social   │
└──────────────────────────────────────────────────┘

The flow: renters query the Context Engine in natural language. The Context Engine turns the query into structured requests against the Listings Source API. The API serves back results enriched from the Data Lake.

Reading this, you will probably think: "that's just RAG with extra steps." It is not. And the difference between a generic RAG stack and what we are building is in the decisions we made about each layer.

Decision 1: the data lake is not a database

The first instinct when building AI search for any vertical is to load everything into a single structured database and call it a day. We tried that. It does not work.

The problem is that apartment data lives in three completely different shapes:

Transactional data (availability, pricing, floor plans, amenities): lives in operators' property management systems (PMS) such as RealPage, Yardi, Entrata, and AppFolio. Structured, relational, updated in real time, and accessible through a mix of APIs and auth models. Easy to normalize once you solve the integration problem.

Semi-structured data (property descriptions, amenity lists, policies): lives in operator marketing sites. Partially structured, frequently out of date, inconsistent across operators. Hard to normalize, critical to enrich.

Unstructured data (reviews, social posts, local news, neighborhood forums): lives everywhere. Yelp, Reddit, Niche, City-Data, TikTok comments. This is where the real signal about "what is it like to live here" comes from. Completely unstructured, high noise, critical for quality answers.

Our Data Lake holds all three, normalized where possible, linked by a unified entity schema (property ID, neighborhood ID, city ID, cohort ID). The unstructured data gets chunked and embedded alongside structured data so the search layer can reason across both in one query.
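To make the entity linking concrete, here is a minimal Python sketch of the idea. It is an illustration under assumptions, not production code: the class names, the `shape` and `source` labels, and the sample records are all hypothetical, and the chunking and embedding step is left out entirely. The only part taken from the text above is the set of shared keys (property, neighborhood, city, cohort).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class EntityKey:
    """Shared keys that tie records of different shapes together."""
    city_id: str
    neighborhood_id: Optional[str] = None
    property_id: Optional[str] = None
    cohort_id: Optional[str] = None


@dataclass
class Record:
    key: EntityKey
    shape: str    # "transactional" | "semi_structured" | "unstructured"
    source: str   # hypothetical label, e.g. "pms" or "forum"
    payload: dict = field(default_factory=dict)


def records_for_property(records: List[Record], property_id: str,
                         neighborhood_id: str) -> List[Record]:
    """Records tied to the property, plus neighborhood-level context."""
    return [
        r for r in records
        if r.key.property_id == property_id
        or (r.key.property_id is None
            and r.key.neighborhood_id == neighborhood_id)
    ]


# Made-up sample data covering all three shapes.
lake = [
    Record(EntityKey("austin", "east-austin", "prop-1"),
           "transactional", "pms", {"rent": 1850, "beds": 1}),
    Record(EntityKey("austin", "east-austin", "prop-1"),
           "semi_structured", "marketing_site", {"pets": "dogs allowed"}),
    Record(EntityKey("austin", "east-austin"),
           "unstructured", "forum", {"text": "Well lit, quiet after 10pm."}),
    Record(EntityKey("austin", "mueller", "prop-2"),
           "transactional", "pms", {"rent": 2100, "beds": 1}),
]

linked = records_for_property(lake, "prop-1", "east-austin")
shapes = sorted({r.shape for r in linked})
print(shapes)
```

The point of the shared key is that a forum post about a neighborhood and a PMS row about a unit in that neighborhood come back from the same lookup, even though they arrived through completely different pipelines.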

Why this matters: without the unstructured layer, you get the same answers as Apartments.com. With it, you can answer "is this neighborhood safe to walk at night," a question that does not map to any filter but matters enormously to real renters.

Decision 2: the API is the product

Early on, we made a call that shaped everything else: the Listings Source API is a product, not plumbing.

We built it with LLMs as first-class consumers, not as an afterthought. Which meant:

LLM-readable documentation. Our API docs are written to be crawled and understood by GPTBot, ClaudeBot, and PerplexityBot, in addition to human developers. We follow the emerging llms.txt convention and publish Markdown-formatted endpoints so LLMs can understand not just our data shape but the semantic meaning of each field.

Structured responses optimized for agents. Every response includes a compact summary field designed to be inserted directly into an LLM context window. We stopped returning verbose JSON blobs that waste tokens and started returning shaped responses that an agent can paste into its own output with minimal transformation.

Operator-scoped deployment. The same API can be deployed to serve only one operator's inventory (for their own site) or the full index (for brightplace.ai or third-party agents). Scoping is a first-class parameter, not a retrofit.

Attribution as a primary feature. Every API response carries provenance. Where the renter came from (ChatGPT, Perplexity, Google, an operator site). Where they went after. What they did. This is how we monetize (operators pay per signed lease sourced through brightplace), but more importantly, it is how we learn which queries actually convert.
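The sketch below shows how three of those ideas (a compact summary field, scoping as a parameter, and attribution on every response) could fit into one response shape. Everything named here is an assumption for illustration: `Listing`, `build_response`, the `operator_scope` and `referrer` parameters, and the JSON field names are hypothetical and are not the real Listings Source API.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listing:
    property_id: str
    operator_id: str
    name: str
    beds: int
    rent: int
    neighborhood: str


def summarize(listing: Listing) -> str:
    """One compact line an agent can drop into its context window."""
    return (f"{listing.name}: {listing.beds}BR in {listing.neighborhood}, "
            f"${listing.rent}/mo")


def build_response(listings: List[Listing], referrer: str,
                   operator_scope: Optional[str] = None) -> dict:
    """Shape a response. operator_scope=None means the full index."""
    in_scope = [item for item in listings
                if operator_scope is None or item.operator_id == operator_scope]
    return {
        "scope": operator_scope or "full_index",
        "attribution": {"referrer": referrer},
        "results": [
            {"property_id": item.property_id, "summary": summarize(item)}
            for item in in_scope
        ],
    }


# Made-up inventory from two hypothetical operators.
inventory = [
    Listing("prop-1", "op-a", "The Example", 1, 1850, "East Austin"),
    Listing("prop-2", "op-b", "Sample Lofts", 2, 2400, "Mueller"),
]

scoped = build_response(inventory, referrer="chatgpt", operator_scope="op-a")
full = build_response(inventory, referrer="operator_site")
print(json.dumps(scoped, indent=2))
```

Because scope is just an argument, the same code path serves a single operator's site and the full index, which is the design choice the "operator-scoped deployment" point describes.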

In practice: as AI agents increasingly mediate the internet, the API may end up driving more renter outcomes than the consumer UI. We are building for that world.

Decision 3: the context engine replaces the filter bar

The Context Engine is the layer that actually handles renter queries. It is a three-step pipeline:

Step 1: intent extraction. Take the natural language query. Extract structured intent. Not just entities (beds, price, neighborhood) but cohort (are they a student, remote worker, family, relocator?) and lifecycle stage (are they just exploring, actively searching, ready to apply?).

Step 2: context enrichment. Pull from the Data Lake. Combine structured inventory data with the unstructured context layer. If the query involves neighborhood questions, retrieve from the forum/review chunks. If it involves schools, pull from the school data pipeline. If it involves commute, run real commute calculations.

Step 3: response generation. Generate a response that answers the question. Not a list of apartments. An actual answer, with apartments as one element of it. "Here are three East Austin neighborhoods that fit your situation. East Cesar Chavez has the most walkable density. Mueller is quieter. Here are the specific units available in each."
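Steps 1 and 2 can be sketched in a few lines of Python. Treat this as a toy under stated assumptions: the keyword tables and regexes stand in for whatever model does the real extraction, and the `Intent` fields, source names, and function names are hypothetical. What it does show is the shape of the hand-off, where a sentence becomes structured intent and the intent decides which parts of the Data Lake to pull from.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Intent:
    beds: Optional[int] = None
    max_rent: Optional[int] = None
    cohort: Optional[str] = None     # e.g. "remote_worker", "student"
    stage: str = "exploring"         # "exploring" | "searching" | "ready_to_apply"
    topics: set = field(default_factory=set)


# Hypothetical keyword tables; a real system would use a model here.
COHORT_HINTS = {"remote": "remote_worker", "student": "student", "kids": "family"}
TOPIC_HINTS = {"safe": "neighborhood", "walk": "neighborhood",
               "school": "schools", "commute": "commute"}
SOURCES = {"neighborhood": "forum_review_chunks",
           "schools": "school_pipeline",
           "commute": "commute_calculator"}


def extract_intent(query: str) -> Intent:
    """Step 1: turn a sentence into structured intent (rule-based stand-in)."""
    q = query.lower()
    intent = Intent()
    if m := re.search(r"(\d+)[- ]?(?:bed(?:room)?|br)\b", q):
        intent.beds = int(m.group(1))
    if m := re.search(r"under \$?([\d,]+)", q):
        intent.max_rent = int(m.group(1).replace(",", ""))
    for word, cohort in COHORT_HINTS.items():
        if word in q:
            intent.cohort = cohort
    intent.topics = {topic for word, topic in TOPIC_HINTS.items() if word in q}
    if "apply" in q:
        intent.stage = "ready_to_apply"
    elif intent.beds or intent.max_rent:
        intent.stage = "searching"
    return intent


def plan_enrichment(intent: Intent) -> List[str]:
    """Step 2: always query inventory, then add context sources by topic."""
    return ["inventory"] + sorted(SOURCES[t] for t in intent.topics)


intent = extract_intent(
    "1-bedroom under $2,000 in East Austin, I work remote, "
    "is it safe to walk at night?"
)
plan = plan_enrichment(intent)
print(intent, plan)
```

Step 3 is deliberately left out: generation takes the intent and the retrieved context as input, and the interesting part is the prompt and the model, not the plumbing.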

This is the single biggest shift from the old model. Filter-based search returns a list. Context-based search returns an answer. Lists are a commodity. Answers are a product.

The three hardest problems we hit

I am going to be honest about what was harder than we expected.

Problem 1: ground truth for unstructured data is a nightmare. Reviews are noisy. Social posts are dated. Local forums have strong biases. We spent weeks building a trust-scoring system that weights different sources against each other based on recency, specificity, and corroboration. Still imperfect. Probably always will be.
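As an illustration of the general approach, here is a minimal scoring sketch built on the three signals named above: recency, specificity, and corroboration. The formulas, the one-year half-life, the word-count proxy for specificity, and the weights are all assumptions picked for readability, not the values or functions of any real trust-scoring system.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    age_days: int         # how old the review or post is
    corroborations: int   # independent sources saying the same thing


def recency(age_days: int, half_life_days: float = 365.0) -> float:
    """Exponential decay: a claim one half-life old scores 0.5."""
    return 0.5 ** (age_days / half_life_days)


def specificity(text: str) -> float:
    """Crude proxy: longer claims tend to carry more detail. Capped at 1.0."""
    return min(1.0, len(text.split()) / 20)


def corroboration(n: int) -> float:
    """Saturating curve: 0 -> 0.0, 1 -> 0.5, 3 -> 0.75."""
    return 1.0 - 1.0 / (1 + n)


def trust_score(claim: Claim, w_recency: float = 0.4,
                w_specificity: float = 0.2,
                w_corroboration: float = 0.4) -> float:
    """Weighted sum of the three signals; weights are illustrative."""
    return (w_recency * recency(claim.age_days)
            + w_specificity * specificity(claim.text)
            + w_corroboration * corroboration(claim.corroborations))


fresh = Claim("Street lights on every block; three of us walk home "
              "at 11pm most nights.", age_days=30, corroborations=3)
stale = Claim("Seems fine.", age_days=1500, corroborations=0)

print(round(trust_score(fresh), 3), round(trust_score(stale), 3))
```

A score like this does not make a noisy source true. It only decides which claims get to influence an answer and which get ignored, which is why the result stays imperfect.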

Problem 2: operators have every kind of API auth you can imagine. RealPage, Yardi, Entrata, AppFolio. Some use OAuth. Some use API keys. Some use IP allowlists. Some have different auth schemes per feature. Integration work ate more engineering time than anyone planned for.
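One common way to contain this kind of variety is to hide each auth scheme behind a small strategy interface so the rest of the integration code never branches on it. The sketch below is a generic pattern, not a description of any vendor's actual API: the class names, the `X-API-Key` header name, and the `operator-a` style identifiers are hypothetical, and the token refresh that real OAuth needs is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class AuthStrategy(Protocol):
    def headers(self) -> Dict[str, str]:
        """Headers to attach to an outgoing request."""
        ...


@dataclass
class BearerToken:
    """OAuth-style access token; refresh logic is out of scope here."""
    token: str

    def headers(self) -> Dict[str, str]:
        return {"Authorization": f"Bearer {self.token}"}


@dataclass
class ApiKey:
    key: str
    header_name: str = "X-API-Key"  # assumed default; varies by provider

    def headers(self) -> Dict[str, str]:
        return {self.header_name: self.key}


class IpAllowlist:
    """Nothing goes on the wire; access depends on the caller's egress IP."""

    def headers(self) -> Dict[str, str]:
        return {}


# Hypothetical registry: one strategy per integration.
REGISTRY: Dict[str, AuthStrategy] = {
    "operator-a": BearerToken("example-token"),
    "operator-b": ApiKey("example-key"),
    "operator-c": IpAllowlist(),
}


def auth_headers_for(operator_id: str) -> Dict[str, str]:
    return REGISTRY[operator_id].headers()


print(auth_headers_for("operator-a"))
```

This does not reduce the integration work itself, but it keeps the per-feature and per-operator differences in one place.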

Problem 3: LLM latency kills UX if you let it. A conversational search experience has to feel fast. If the first token takes 3 seconds to arrive, renters bounce. We cache aggressively, stream responses, and pre-compute common query patterns for major markets. Latency is a product requirement, not an engineering nice-to-have.
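The three tactics combine naturally: check a cache first, stream whatever has to be generated, and warm the cache ahead of time for queries you expect. The sketch below is a toy version under assumptions: `AnswerCache`, `slow_generate`, and `answer_stream` are hypothetical names, the "model" is a stub that yields words, and a real cache would need expiry because availability and pricing change.

```python
from typing import Dict, Iterator, Optional


def normalize(query: str) -> str:
    """Collapse case and whitespace so near-identical queries share a key."""
    return " ".join(query.lower().split())


class AnswerCache:
    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def warm(self, query: str, answer: str) -> None:
        """Pre-compute (or save) an answer for a query pattern."""
        self._store[normalize(query)] = answer

    def get(self, query: str) -> Optional[str]:
        return self._store.get(normalize(query))


def slow_generate(query: str) -> Iterator[str]:
    """Stub for a model call: yields the answer one word at a time."""
    for word in f"Generated answer for: {query}".split():
        yield word + " "


def answer_stream(query: str, cache: AnswerCache) -> Iterator[str]:
    cached = cache.get(query)
    if cached is not None:
        yield cached  # cache hit: nothing to wait for
        return
    chunks = []
    for chunk in slow_generate(query):
        chunks.append(chunk)
        yield chunk  # stream each chunk as soon as it exists
    cache.warm(query, "".join(chunks).strip())


cache = AnswerCache()
cache.warm("dog friendly 1br in austin", "Three areas to start with.")

hit = "".join(answer_stream("Dog friendly  1BR in Austin", cache))
miss_chunks = list(answer_stream("best areas for remote workers", cache))
print(hit, len(miss_chunks))
```

Streaming does not make generation faster. It moves the first visible word earlier, which is the part the renter actually feels.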

What is next

We are building toward a world where the renter does not visit a website to search for an apartment. They ask an AI agent. The agent queries our API. The agent returns a contextualized answer, with properties as first-class citations.

In that world, the winners are the companies with the cleanest data, the best context layer, and the most LLM-native APIs. The losers are the companies still optimizing their filter UI.

If this is the kind of problem you like thinking about, we are hiring. If you think we are wrong, we welcome the pushback. Both are useful.

Filter-based search made sense when the internet was a collection of indexes. It does not make sense in a world where the primary interface is conversation and the primary consumer of your API is an AI agent.

The rental industry is early on this curve. Most operators are still tweaking their filter chips. A few are starting to understand that the ground is shifting. By the time everyone figures it out, the companies that moved early will own the layer underneath everything.

If you want to see what we are building, it lives at brightplace.ai. The rental guides we publish as part of the data flywheel are at brightplace.ai/guides.

brightplace is an AI-native apartment rental platform. We publish engineering and product writing as we build. Follow for more.