What RAG Looks Like in Practice

"RAG" comes up a lot in conversations about AI. It stands for Retrieval-Augmented Generation, which is jargon for a simple idea. Give an AI access to your own documents and have it answer questions from them, with the source for every answer attached. Few of the AI tools your team has tried do this. Most should.

We wanted to show what one actually does, end-to-end, on real data. So we built one. The data is a public set of UK property listings. Property is a useful proxy. A buyer's brief (long descriptions, structured facts, a few specific must-haves) reads like a lawyer's question to a contract library, an auditor's question to last year's working papers, or any team's question to the shared drive they have been searching with Cmd-F for ten years.

If you want the buyer-facing summary of the technique, see the case study. This post is a walkthrough of what we built, with the diagrams, an example query, and the response it produces.

The problem we picked

Imagine a buyer who tells you, in plain English, what they want.

"Three-bedroom flat in Putney, under £600k, with a garden. Walking distance from a station."

An estate agent's website cannot answer this. Their search has dropdowns for bedrooms and price. It has no field for "with a garden, walking distance from a station." The buyer has to translate their question into the website's filters, then read each listing to find out which ones actually have a garden, then check each location on a map to see which are near a station. Twenty minutes per question.

The same gap shows up everywhere. A paralegal who asks "show me clauses that cap liability at twelve months in our last twenty SaaS contracts." A compliance officer who asks "where do we say what we do when a client misses a deadline." A new joiner who asks "how do we handle a chargeback dispute." The answer exists in writing. The search tool cannot reach it. McKinsey's number for knowledge workers is 1.8 hours a day spent looking for information they could otherwise be applying. For a five-person legal or compliance team, that adds up to nine hours a day, roughly a full extra headcount lost to search.

The pattern below closes that gap.

What we built, in four steps

Imagine someone in the team who has read every document you own, remembers what each passage means, and is on call to answer any question. When you ask something, they recall the passages most relevant and read them back to you with the source attached. That is the system.

From a corpus of documents to a cited answer.

From left to right, the system does four things. It loads your documents once and turns each section into a searchable fingerprint. It reads the user's question and pulls out the hard requirements. It matches the question to the closest fingerprints and shortlists them. It writes a cited answer from the shortlisted passages. The four sections below explain each step in turn.

1. Read every document, once

The system loads each document and pulls it apart into meaningful sections. A property listing yields its summary, its description, its features list, and its location. A contract yields its clauses. A regulation yields its individual sub-paragraphs. The sections are not arbitrary chunks of fixed length. They follow the structure the document already has.
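As a rough illustration of splitting along a document's own structure, here is a minimal Python sketch. It is not the code from our build. The field names (summary, description, features, location) follow the listing sections named above, but the dictionary layout, the Section class, and the split_listing helper are assumptions made for this example.

```python
from dataclasses import dataclass

# Assumed field names; a real adapter would map whatever the source data provides.
SECTION_FIELDS = ("summary", "description", "features", "location")


@dataclass(frozen=True)
class Section:
    doc_id: str  # which document the section came from
    kind: str    # which structural part it is
    text: str    # the section's own text


def split_listing(doc_id: str, listing: dict) -> list[Section]:
    """Return one Section per non-empty structural field of the listing."""
    sections = []
    for kind in SECTION_FIELDS:
        value = listing.get(kind)
        if isinstance(value, list):
            # A features list becomes one section, not one section per bullet.
            value = "; ".join(value)
        if value and value.strip():
            sections.append(Section(doc_id, kind, value.strip()))
    return sections


listing = {
    "summary": "Three-bedroom garden flat in Putney.",
    "description": "Generous private garden, west-facing, with mature planting and a small patio.",
    "features": ["Private garden", "Three bedrooms"],
    "location": "0.3 miles from East Putney Underground (District line).",
}

for section in split_listing("felsham-road", listing):
    print(section.kind, "->", section.text)
```

Keeping the document id and the section kind on every section is what later lets an answer point back to the exact passage it came from.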

2. Learn what each section means

For every section, the system creates a numerical fingerprint that captures what the section means, rather than which words it uses. A buyer asking about "somewhere I could grow vegetables" will find a listing that mentions a south-facing garden, even though the question and the listing share no words.

Sections cluster by meaning. The buyer's question lands closest to the description sections of a few specific listings.
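To make "closest by meaning" concrete, here is a small Python sketch of the comparison step. The three-number vectors are invented for illustration, and so is the "high-street" listing. A real system would get its fingerprints from an embedding model, with hundreds of dimensions. Cosine similarity is a common choice for comparing them, though we are not claiming it is the only one.

```python
import math


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy fingerprints, made up for this example. Real ones come from an embedding model.
SECTION_VECTORS = {
    "felsham-road/description": [0.9, 0.1, 0.2],  # talks about a garden
    "high-street/description": [0.1, 0.9, 0.3],   # hypothetical listing, no garden
    "felsham-road/location": [0.2, 0.2, 0.9],     # talks about the station
}


def nearest(query_vec, vectors, k=2):
    """Return the names of the k sections whose fingerprints are closest to the query."""
    ranked = sorted(vectors.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Pretend fingerprint of "somewhere I could grow vegetables".
query = [0.8, 0.2, 0.1]
print(nearest(query, SECTION_VECTORS, k=1))
```

The question never mentions the word "garden". It lands next to the garden description only because the two fingerprints point in a similar direction.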

3. Pick out the hard requirements

Meaning alone is not enough. "Under £600k" is a hard number, and a system that only matches on meaning will happily recommend a £900k flat in Putney because it is semantically close to a £550k one. So before anything else, the system reads the question and pulls out the structured requirements (price ceiling, bedroom count, location, property type). Listings that fail those requirements are filtered out. Listings that survive are ranked by how close their meaning is to the rest of the question.
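A minimal Python sketch of that filter-then-rank order follows. It is an illustration under stated assumptions, not our extractor. The regular expressions, the list of known locations, the listing fields, and the choice to treat bedroom count as a minimum are all assumptions. A production extractor might use a language model rather than patterns. The £900k listing is invented to show the filter working.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

WORD_NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
PROPERTY_TYPES = ("flat", "house", "maisonette", "bungalow")


@dataclass
class Constraints:
    max_price: Optional[int] = None
    min_bedrooms: Optional[int] = None
    location: Optional[str] = None
    property_type: Optional[str] = None


def extract_constraints(question: str, known_locations=("Putney", "East Sheen")) -> Constraints:
    """Pull the structured requirements out of a plain-English brief."""
    found = Constraints()
    lowered = question.lower()

    price = re.search(r"under\s+£\s*(\d[\d,]*)\s*(k|m)?\b", lowered)
    if price:
        amount = int(price.group(1).replace(",", ""))
        multiplier = {"k": 1_000, "m": 1_000_000}.get(price.group(2), 1)
        found.max_price = amount * multiplier

    beds = re.search(r"\b(\d+|one|two|three|four|five)[-\s]bed", lowered)
    if beds:
        token = beds.group(1)
        found.min_bedrooms = int(token) if token.isdigit() else WORD_NUMBERS[token]

    found.location = next((p for p in known_locations if p.lower() in lowered), None)

    kind = re.search(r"\b(" + "|".join(PROPERTY_TYPES) + r")\b", lowered)
    if kind:
        found.property_type = kind.group(1)
    return found


def passes(listing: dict, c: Constraints) -> bool:
    """A listing survives only if it meets every requirement that was stated."""
    return (
        (c.max_price is None or listing["price"] <= c.max_price)
        and (c.min_bedrooms is None or listing["bedrooms"] >= c.min_bedrooms)
        and (c.location is None or listing["location"] == c.location)
        and (c.property_type is None or listing["type"] == c.property_type)
    )


def filter_then_rank(listings, c: Constraints, similarity: Callable[[dict], float]):
    """Hard filter first, then order the survivors by closeness of meaning."""
    survivors = [item for item in listings if passes(item, c)]
    return sorted(survivors, key=similarity, reverse=True)


LISTINGS = [
    {"name": "Felsham Road", "price": 575_000, "bedrooms": 3, "location": "Putney", "type": "flat"},
    {"name": "Sheen Lane", "price": 585_000, "bedrooms": 3, "location": "East Sheen", "type": "maisonette"},
    {"name": "Hypothetical Penthouse", "price": 900_000, "bedrooms": 3, "location": "Putney", "type": "flat"},
]

question = "Three-bedroom flat in Putney, under £600k, with a garden. Walking distance from a station."
constraints = extract_constraints(question)
# The similarity score is a stand-in here; in practice it comes from the fingerprints.
shortlist = filter_then_rank(LISTINGS, constraints, similarity=lambda item: 1.0)
print(constraints)
print([item["name"] for item in shortlist])
```

The £900k flat never reaches the ranking step, however close its description is in meaning. That is the point of filtering before ranking.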

If the filter leaves too few listings, the system relaxes a requirement and tries again, telling the buyer what it relaxed. "No exact match for three bedrooms in Putney under £600k. Here are three close ones, just outside Putney." Knowing which requirement is safe to relax is a meaningful product question in its own right, and we come back to it below.
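The relax-and-retry loop can be sketched in a few lines of Python. This is a simplified illustration, not the retriever from our build. It drops a requirement entirely, where a real system would more likely widen it (nearby areas rather than any area). The relaxation order follows the property defaults discussed later in this post, the min_results threshold is an assumption, and the £900k listing is invented.

```python
# Softest requirement first. Assumed order for property search.
RELAX_ORDER = ("location", "min_bedrooms", "max_price", "property_type")


def matches(listing: dict, constraints: dict) -> bool:
    """Check a listing against every requirement that is still active."""
    for key, wanted in constraints.items():
        if key == "max_price":
            ok = listing["price"] <= wanted
        elif key == "min_bedrooms":
            ok = listing["bedrooms"] >= wanted
        else:
            ok = listing[key] == wanted
        if not ok:
            return False
    return True


def search_with_relaxation(listings, constraints: dict, min_results: int = 2):
    """Return (hits, relaxed), where relaxed lists the requirements that were dropped."""
    active = dict(constraints)
    relaxed = []
    while True:
        hits = [item for item in listings if matches(item, active)]
        droppable = [key for key in RELAX_ORDER if key in active]
        if len(hits) >= min_results or not droppable:
            return hits, relaxed
        relaxed.append(droppable[0])
        del active[droppable[0]]


LISTINGS = [
    {"name": "Felsham Road", "price": 575_000, "bedrooms": 3, "location": "Putney"},
    {"name": "Sheen Lane", "price": 585_000, "bedrooms": 3, "location": "East Sheen"},
    {"name": "Hypothetical Penthouse", "price": 900_000, "bedrooms": 3, "location": "Putney"},
]

hits, relaxed = search_with_relaxation(
    LISTINGS, {"location": "Putney", "min_bedrooms": 3, "max_price": 600_000}
)
print([item["name"] for item in hits])
print("Relaxed:", relaxed or "nothing")
```

Returning the list of relaxed requirements alongside the hits is what lets the interface tell the buyer exactly what was loosened.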

4. Write the answer, with sources attached

The shortlisted sections are handed to a language model along with the original question. The model writes a direct answer, citing which document each part came from. The buyer sees the answer, the citations, and a link straight through to the source.

One thing we enforce on top of the model. Before the buyer sees the answer, the system checks that every citation the model produced is real. If the model claims a source that does not exist in the shortlist we gave it, the model's answer is discarded and the system falls back to a plainer answer assembled directly from the retrieved passages. That guarantee is what makes the system usable in regulated work.
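Here is a minimal Python sketch of that check. It assumes the model marks citations as section ids in square brackets, which is a convention chosen for this example and not necessarily the one in our build. The validate_answer and fallback_answer helpers are hypothetical names.

```python
import re

# Assumed citation convention: a section id in square brackets, e.g. [felsham-road/location].
CITATION = re.compile(r"\[([^\[\]]+)\]")


def fallback_answer(shortlist: dict[str, str]) -> str:
    """A plainer answer built directly from the retrieved passages."""
    if not shortlist:
        return "I do not have enough information to answer this from the available material"
    lines = [f'[{section_id}] "{text}"' for section_id, text in shortlist.items()]
    return "Relevant passages:\n" + "\n".join(lines)


def validate_answer(answer: str, shortlist: dict[str, str]) -> str:
    """Keep the model's answer only if every citation points at a shortlisted section."""
    cited = CITATION.findall(answer)
    if cited and all(section_id in shortlist for section_id in cited):
        return answer
    # No citations at all, or at least one invented source: discard the model's text.
    return fallback_answer(shortlist)


shortlist = {
    "felsham-road/description": "Generous private garden, west-facing, with mature planting and a small patio.",
    "felsham-road/location": "0.3 miles from East Putney Underground (District line).",
}

good = "It has a west-facing private garden [felsham-road/description] near the Tube [felsham-road/location]."
bad = "It has a roof terrace [felsham-road/photos]."

print(validate_answer(good, shortlist))
print(validate_answer(bad, shortlist))
```

Note what this check does and does not cover. It confirms that each cited source exists in the shortlist. It does not confirm that the cited passage actually supports the sentence it is attached to, which would need a further check.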

A real query, walked through

Take the question from the top of the post.

"Three-bedroom flat in Putney, under £600k, with a garden. Walking distance from a station."

Here is what happens to the question at each stage.

The query flows from the buyer's brief through to a cited shortlist. The hard requirements filter the corpus before the soft preferences rank the survivors.

The dashboard the buyer interacts with puts those parts side by side. The brief and the conversation on the left, the matched homes and the supporting snippets on the right.

A stylised view of the built dashboard. Conversation on the left, matched homes top right, supporting snippets beneath.

Each recommendation in the shortlist expands to show the match reason and the passages the system relied on. Two of the cards from the example query are shown below.

Garden Flat, Felsham Road
£575,000 · 3 bed · Putney · Exact match

A three-bedroom flat in Putney within budget, with a private rear garden and a five-minute walk to East Putney station.

Description: "Generous private garden, west-facing, with mature planting and a small patio."
Location: "0.3 miles from East Putney Underground (District line)."

Garden Maisonette, Sheen Lane
£585,000 · 3 bed · East Sheen · Relaxed: location

Just outside Putney but a strong fit on the rest. Three bedrooms, within budget, private rear garden, eight-minute walk to a mainline station.

Features: "Private south-facing rear garden, family-friendly layout, freehold."
Location: "Eight minutes on foot to Mortlake mainline station, fifteen minutes to East Putney Underground."

An illustrative response. The system always tags whether the match is exact or has been relaxed on a specific requirement, and shows the passages it relied on as citations.

Why this isn't just about property

The buyer-and-listings example is the easiest to picture. The architecture beneath it works for any kind of document, in any industry. Swap what counts as a document and the same pipeline applies.

A law firm has a contract library. The hard requirements are dates, counterparties, jurisdictions, monetary values. The soft preferences are clause language, risk profile, deal type. A paralegal asks "show me last year's SaaS contracts with US counterparties where the liability cap is below twelve months." Same shape.

An engineering team has thousands of installation manuals, troubleshooting guides, and service bulletins. The hard requirements are model number and revision date. The soft preferences are the symptom being investigated. A technician asks "the boiler is short-cycling on this model. What is the usual cause and what is the current service procedure." Same shape.

In every case the pattern is the same. The system reads the documents once and fingerprints each section by meaning, pulls the hard requirements out of the question, filters and ranks the survivors, relaxes a requirement if the shortlist is too thin, and answers with the sources attached. The plumbing does not change. What changes is what counts as a document and what counts as a hard requirement.

Where we'd take this next

The build above is a starting point. Two extensions are obvious next steps. Both shape what a real deployment looks like.

Smarter handling of hard versus soft requirements

Knowing what is a deal-breaker and what is a preference is genuinely difficult, and getting it wrong is what makes most search tools frustrating. A buyer who asks for "a three-bedroom flat in Putney" probably will not accept a two-bedroom flat. They might well accept a three-bedroom flat near Putney. Bedroom count is a hard floor. Location is a soft target.

A serious build addresses this in three layers. First, sensible defaults per domain handle the common case. In property, location is softer than bedroom count, which is softer than budget, which is softer than property type. Second, the system should listen for linguistic cues. "Must have a garden" and "ideally near a station" tell it which is which, and so do words like "at least," "open to," "would consider." Third, when the choice is genuinely ambiguous, the system should ask the user before relaxing, rather than guess and apologise afterwards.
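The three layers can be sketched as a single decision function in Python. Everything here is an assumption made for illustration: the cue lists are a tiny sample, the defaults table flattens the ordering above into a simple hard or soft label, and simple phrase matching stands in for what a real build would do with a language model.

```python
# Assumed cue phrases; a real list would be longer and domain-specific.
HARD_CUES = ("must have", "must be", "at least")
SOFT_CUES = ("ideally", "open to", "would consider")

# Assumed per-domain defaults for property search.
PROPERTY_DEFAULTS = {
    "property_type": "hard",
    "budget": "hard",
    "bedrooms": "hard",
    "location": "soft",
}


def classify(requirement: str, clause: str, defaults=PROPERTY_DEFAULTS) -> str:
    """Return 'hard', 'soft', or 'ask' for one requirement in the user's brief."""
    text = clause.lower()
    hard = any(cue in text for cue in HARD_CUES)
    soft = any(cue in text for cue in SOFT_CUES)
    if hard and soft:
        return "ask"   # conflicting cues: check with the user before relaxing
    if hard:
        return "hard"  # an explicit cue overrides the domain default
    if soft:
        return "soft"
    # No cue in the wording: use the domain default, or ask if there is none.
    return defaults.get(requirement, "ask")


print(classify("garden", "Must have a garden"))
print(classify("station", "ideally near a station"))
print(classify("location", "in Putney"))
print(classify("garden", "with a garden"))
```

The last call is the interesting one. "With a garden" carries no cue and has no default in this table, so the sketch returns "ask" rather than guessing.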

The same logic transfers to other domains. A compliance officer searching contracts may treat jurisdiction as a hard floor and clause language as soft. A technician searching service manuals may treat model number as hard and symptom phrasing as soft. The pipeline is the same. The defaults change.

Searching the images, not just the words

Property listings carry photos, and the photos hold information the text does not. The modern kitchen. The south-facing garden. The period features. A future version of the system would fingerprint the photos as well as the prose, so a buyer could ask "the one with the open-plan kitchen" and get a useful answer. The technique is the same as for text. A vision model produces a numerical fingerprint for the image, the fingerprints sit next to the text fingerprints in the same database, and the question is matched against both.

The same idea applies anywhere documents carry meaning that lives outside the words. Contracts with signature blocks, stamps, and embedded diagrams. Technical manuals with schematics. Audit working papers with photographic evidence. Adding image understanding does not replace the pipeline above. It extends it.

What you can trust about a system like this

Three properties matter, and they are worth being clear about.

Every answer is cited. The system is not allowed to produce an answer without sources. If it cannot find supporting passages, it says so. An AI answer with a verifiable citation is a different thing entirely from a confident guess.

Answers come only from your documents. The model is given the relevant passages and told to work only from those, not from its general training data. If the answer is not in your documents, you get "I do not have enough information to answer this from the available material" rather than a plausible-sounding invention.
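One way to picture this is the prompt the model receives. The Python sketch below is an assumption about how such a prompt could be assembled, not the prompt from our build. The wording of the instructions is invented, and call_model is a hypothetical stand-in for whatever function sends text to a language model.

```python
from typing import Callable, Optional

REFUSAL = "I do not have enough information to answer this from the available material"


def build_grounded_prompt(question: str, passages: dict) -> Optional[str]:
    """Assemble a prompt that contains only the retrieved passages, or None if there are none."""
    if not passages:
        return None
    sources = "\n".join(f"[{section_id}] {text}" for section_id, text in passages.items())
    return (
        "Answer the question using only the sources below.\n"
        "Cite the source id in square brackets after each claim.\n"
        f'If the sources do not contain the answer, reply exactly: "{REFUSAL}"\n\n'
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def answer(question: str, passages: dict, call_model: Callable[[str], str]) -> str:
    """Refuse without calling the model when retrieval found nothing to work from."""
    prompt = build_grounded_prompt(question, passages)
    if prompt is None:
        return REFUSAL
    return call_model(prompt)


passages = {"felsham-road/location": "0.3 miles from East Putney Underground (District line)."}
print(build_grounded_prompt("How far is the nearest station?", passages))
print(answer("Is there a garage?", {}, call_model=lambda prompt: "unused"))
```

An instruction in a prompt is a request, not a guarantee, which is why a system like this pairs it with a check on the citations before anything reaches the user.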

It can run entirely on your network. For teams handling confidential data, sensitive IP, or anything bound by data residency rules, the whole system can be deployed on your own servers, including the AI model itself. Your documents never leave your network.

What this would look like for your team

The build above took us a few weeks against open data. A real deployment is shaped by the questions your team actually asks and the documents you actually own. The architecture is the same. The work is in learning the domain well enough to know which requirements are hard, which are soft, and what counts as a good answer.

If your team's institutional knowledge sits in documents and the questions you ask blend soft semantics with hard constraints, this pattern earns its keep. The case study covers what a delivery looks like, how long it takes, and what it costs.

See the code

The full source for the build above is on GitHub. It includes the adapters, the constraint extractor, the retriever with the relaxed-fallback logic, the validated-citation agent, the test suite, and the React dashboard. Clone it, run it locally against the bundled fixtures, and you can poke at every part of the pipeline described in this post.

github.com/qemtek/document-search-property-agent
