How Natural Language Search Actually Works for Property Listings

#dublin #renting #ireland #proptech

Most NLP tutorials use toy examples. "What's the weather?" "Book me a flight." Property search is structurally messier than either, and the failure modes are worth understanding before you build anything.

Here's how I implemented natural language search for Dublin rental listings, what broke, and what actually works.

The problem with filter dropdowns

Classic rental portals give you: min price, max price, min beds, max beds, area (dropdown), and maybe a keyword field. That's fine if you know exactly what you want and the options the site gives you match how you think. Most tools for finding apartments in Dublin are built around this exact model.

In practice, people search like this:

"2-bed near the DART, pet-friendly, under 1800, ground floor preferred"

That query has five separate intent signals. A dropdown UI forces you to split those across multiple controls, remember which areas are "near the DART", manually scan results for pet-friendly mentions, and drop "ground floor" entirely because portals rarely offer that filter.

The fix isn't just "add more filters." It's parsing intent directly from text.

Intent extraction pipeline

My pipeline has four stages:

1. Query parsing with a structured extraction prompt

I use GPT-4o with a system prompt that instructs it to return a JSON schema. The schema has typed fields: max_price, min_beds, max_beds, transport_proximity (array of station names), pet_friendly (bool or null), floor_preference, areas (array), keywords (freetext remainder).

The LLM handles the conversion from natural language to typed values. "Under 1800" becomes max_price: 1800. "Near the DART" is not resolved to stations at this point; it is passed through in transport_proximity as an intent flag for a second lookup stage.
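
A minimal sketch of the typed object that schema could map to, plus a validation step for the model's JSON. The field names come from the schema above; the ParsedQuery class, the parse_llm_output helper, and the choice to represent "near the DART" as the value "DART" are assumptions made for illustration, not the production code.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParsedQuery:
    """Typed form of the extraction schema (class name is hypothetical)."""
    max_price: Optional[int] = None
    min_beds: Optional[int] = None
    max_beds: Optional[int] = None
    transport_proximity: list[str] = field(default_factory=list)
    pet_friendly: Optional[bool] = None
    floor_preference: Optional[str] = None
    areas: list[str] = field(default_factory=list)
    keywords: str = ""


def parse_llm_output(raw: str) -> ParsedQuery:
    """Validate the model's JSON before anything downstream trusts it."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    unknown = set(data) - set(ParsedQuery.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    for name in ("max_price", "min_beds", "max_beds"):
        value = data.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            raise ValueError(f"{name} must be an integer or null")
    # Drop explicit nulls so the dataclass defaults apply.
    cleaned = {key: value for key, value in data.items() if value is not None}
    return ParsedQuery(**cleaned)


# Hand-written stand-in for a model response to the example query.
raw = (
    '{"max_price": 1800, "min_beds": 2, "max_beds": 2, '
    '"transport_proximity": ["DART"], "pet_friendly": true, '
    '"floor_preference": "ground", "areas": null, "keywords": ""}'
)
query = parse_llm_output(raw)
```

Rejecting unknown fields and non-integer prices at this boundary keeps a malformed model response from turning into a silently wrong database query.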

2. Geographic intent resolution

"Near the DART" is ambiguous. The DART runs from Malahide and Howth in the north to Greystones in the south. You have to ask: how near? Which stations matter for this user?

I don't try to resolve this in the LLM prompt. Instead I maintain a static lookup table mapping Dublin neighborhoods and transport stops to lat/lng bounding boxes. "Near the DART" expands to the union of the boxes for every station, each extending roughly 1 km around the stop. If the user said "near Sandymount DART", that narrows it to one station's box.

This is faster, cheaper, and more predictable than asking the LLM to do geography.
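
A rough sketch of what such a lookup could look like. The station coordinates are approximate and only three stations are listed; the table layout, the function names, and the flat-earth conversion from kilometres to degrees are all assumptions for illustration.

```python
import math
from typing import NamedTuple, Optional


class BBox(NamedTuple):
    min_lat: float
    min_lng: float
    max_lat: float
    max_lng: float

    def contains(self, lat: float, lng: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lng <= lng <= self.max_lng)


# Approximate, illustrative coordinates; a real table would cover every station.
DART_STATIONS = {
    "sandymount": (53.3281, -6.2211),
    "dun laoghaire": (53.2951, -6.1349),
    "howth": (53.3891, -6.0741),
}

KM_PER_DEG_LAT = 111.32  # roughly constant everywhere


def bbox_around(lat: float, lng: float, radius_km: float = 1.0) -> BBox:
    """Box extending radius_km in each direction from a point."""
    dlat = radius_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with latitude, so scale by cos(lat).
    dlng = radius_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return BBox(lat - dlat, lng - dlng, lat + dlat, lng + dlng)


def resolve_dart(station: Optional[str] = None, radius_km: float = 1.0) -> list[BBox]:
    """One box for a named station, or one box per station for 'near the DART'."""
    if station is not None:
        lat, lng = DART_STATIONS[station.lower()]
        return [bbox_around(lat, lng, radius_km)]
    return [bbox_around(lat, lng, radius_km) for lat, lng in DART_STATIONS.values()]


def near_any(boxes: list[BBox], lat: float, lng: float) -> bool:
    """True if the point falls inside at least one box (the union)."""
    return any(box.contains(lat, lng) for box in boxes)


sandymount_only = resolve_dart("Sandymount")
whole_line = resolve_dart()
```

Keeping the union as a list of boxes rather than merging them into one shape is the simplest option here; the boxes can overlap without affecting the result of near_any.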

3. Structured query execution

Now I have a typed query object. This goes against the listings database as a standard structured query. Postgres, indexed on price, beds, area polygons. No vector search at this stage. Structured data matches structured queries better than embeddings do.
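
As an illustration of this step, here is one way a typed query could be turned into a parameterized WHERE clause. The table name, the column names, and the use of a plain lat/lng box in place of the area polygons are assumptions; the %s placeholders follow the style used by psycopg.

```python
from typing import Any, Optional


def build_listing_query(
    max_price: Optional[int] = None,
    min_beds: Optional[int] = None,
    max_beds: Optional[int] = None,
    pet_friendly: Optional[bool] = None,
    bbox: Optional[tuple[float, float, float, float]] = None,
) -> tuple[str, list[Any]]:
    """Return (sql, params); values are never interpolated into the SQL text."""
    clauses: list[str] = []
    params: list[Any] = []
    if max_price is not None:
        clauses.append("price <= %s")
        params.append(max_price)
    if min_beds is not None:
        clauses.append("beds >= %s")
        params.append(min_beds)
    if max_beds is not None:
        clauses.append("beds <= %s")
        params.append(max_beds)
    if pet_friendly:
        # Strict choice: listings with an unknown pet policy are excluded.
        clauses.append("pet_friendly IS TRUE")
    if bbox is not None:
        min_lat, min_lng, max_lat, max_lng = bbox
        clauses.append("lat BETWEEN %s AND %s AND lng BETWEEN %s AND %s")
        params.extend([min_lat, max_lat, min_lng, max_lng])
    where = " AND ".join(clauses) if clauses else "TRUE"
    return f"SELECT id FROM listings WHERE {where}", params


sql, params = build_listing_query(max_price=1800, min_beds=2, max_beds=2)
```

Fields left as null by the parser simply contribute no clause, so an underspecified query widens the result set instead of failing.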

4. Semantic re-ranking

For the freetext remainder that doesn't map to structured fields ("quiet street", "modern kitchen", "nice landlord"), I use embeddings. The listing description text is embedded at index time (OpenAI text-embedding-3-small). The freetext query fragment is embedded at search time. Cosine similarity gives a re-ranking score that gets blended with the structured match score.

The blend ratio matters. Too much weight on embeddings and structured matches get buried. Too little and you lose the semantic layer entirely. I landed on roughly 70% structured, 30% semantic after testing on actual Dublin listing data.
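
The blend itself is a weighted sum. A small sketch, assuming the structured match score is already normalized to the range 0 to 1 (how that score is computed is not covered here) and that negative cosine values are clamped to zero:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no direction to compare against
    return dot / (norm_a * norm_b)


def blended_score(
    structured_score: float,
    query_vec: Sequence[float],
    listing_vec: Sequence[float],
    structured_weight: float = 0.7,
) -> float:
    """Weighted sum of the structured score and the semantic similarity."""
    semantic = max(0.0, cosine_similarity(query_vec, listing_vec))
    return structured_weight * structured_score + (1.0 - structured_weight) * semantic
```

With the default weight, a listing that fully satisfies the structured part but has no semantic overlap scores 0.7, while one that also matches the freetext fragment perfectly scores 1.0.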

What GPT gets wrong on real estate data

A few failure modes I hit:

Price hallucination. Early prompts without strong grounding would occasionally invent constraints. "Around 1800" might become max_price: 1750 or max_price: 1900 depending on prompt phrasing. Fix: require exact numeric extraction and default to null when the query is ambiguous, then ask the user to clarify.
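
One way to enforce that rule outside the prompt is a deterministic check on the extracted value. This is a sketch of a possible guard, not the fix as implemented; the hedge-word list and the guard_max_price name are assumptions, and formats like "1.8k" are not handled.

```python
import re
from typing import Optional

# Words that make a price approximate rather than a hard ceiling (illustrative list).
HEDGE_WORDS = re.compile(r"\b(around|about|roughly|approx(?:imately)?)\b|~", re.IGNORECASE)
NUMBER = re.compile(r"\d[\d,]*")


def guard_max_price(query: str, extracted: Optional[int]) -> Optional[int]:
    """Keep the extracted price only if it appears literally and unhedged."""
    if extracted is None:
        return None
    literal_numbers = {
        int(match.replace(",", "")) for match in NUMBER.findall(query)
    }
    if extracted not in literal_numbers:
        return None  # the model adjusted or invented the number
    if HEDGE_WORDS.search(query):
        return None  # ambiguous: ask the user to clarify instead
    return extracted
```

The hedge check is deliberately crude: a hedge word anywhere in the query nulls the price, even if it was attached to something else, which errs on the side of asking the user.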

Overfitting to query phrasing. If someone writes "not too far from town" the LLM might return areas: ["Dublin 1", "Dublin 2"] with high confidence. That's one interpretation. Another user might mean the same thing but expect Rathmines to be included. The structured query I build from that is deterministic; the interpretation step isn't.

Missing negation. "No agency fees" or "not a studio" are common. LLMs handle negation inconsistently in structured extraction. Explicit prompt engineering helps but doesn't fully solve it.

Listings data quality. Even with perfect parsing, listings have inconsistent data. "Pet-friendly" might be mentioned in the description, in an amenity tag, or not at all even if it's true. The semantic layer helps here, but it's not a complete solution.

Embeddings for listings: when they help, when they don't

Embeddings are good at capturing soft preferences that don't map to structured fields. "Quiet" vs "vibrant." "Modern" vs "period features." These are real preferences that affect decisions.

They're bad at exact constraints. Embeddings have no notion of a price ceiling: if someone says "max 1800" and a listing is 2100, a description that matches the soft preferences in the query can still score highly on cosine similarity. Structured filtering has to happen first, before embeddings get anywhere near the results.

The architecture I landed on: structured query first (hard filters), then embedding re-rank within the filtered set. Never embedding search across the full corpus for price-sensitive queries.
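
The ordering can be shown with a small in-memory sketch. The Listing record, the two-dimensional embeddings, and the search function are simplified stand-ins for illustration; in the real system the filter is a database query.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Listing:
    id: str
    price: int
    beds: int
    embedding: list[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def search(
    listings: list[Listing],
    max_price: int,
    min_beds: int,
    query_embedding: Sequence[float],
    limit: int = 10,
) -> list[Listing]:
    # Stage 1: hard filters. Anything over budget is gone before ranking.
    candidates = [
        item for item in listings
        if item.price <= max_price and item.beds >= min_beds
    ]
    # Stage 2: semantic re-rank, only within the filtered set.
    candidates.sort(key=lambda item: cosine(query_embedding, item.embedding), reverse=True)
    return candidates[:limit]


listings = [
    Listing("a", 1700, 2, [1.0, 0.0]),
    Listing("b", 2100, 2, [0.0, 1.0]),  # best semantic match, but over budget
    Listing("c", 1750, 2, [0.6, 0.8]),
]
results = search(listings, max_price=1800, min_beds=2, query_embedding=[0.0, 1.0])
```

Listing "b" has the highest similarity to the query vector, but it never reaches the ranking stage because it fails the price filter.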

The index structure

Each listing in the database has:

Structured fields: price, beds, baths, area, lat/lng, available date, pet policy (when known), parking
A generated text blob: all descriptive text from the listing concatenated
A precomputed embedding of that text blob
Source metadata for deduplication across the 90+ sites we aggregate

At search time, a structured WHERE clause narrows the set, then embedding similarity re-ranks within it. Most queries return in under 200ms at the current size of the Dublin dataset.

What I'd do differently

If I were starting over, I'd invest earlier in the geographic resolution layer. Dublin's transport network is the single biggest intent signal in rental searches here, and the freetext-to-station-to-bbox mapping took longer to get right than the LLM integration.

I'd also be more conservative about what the LLM touches. The more you can push to deterministic lookups and structured queries, the more predictable the system is. LLMs are best at the translation step, not the search step.

The non-technical side of all this, including why the search actually feels different for users, is covered in the user-facing guide on the HomeScout blog.

Caspar Bannink. Founder of HomeScout.io. Building AI-powered rental search for Dublin.