Context Engineering for Better Semantic Search Results

In my previous article, I talked about how I cached users' intentions instead of their exact queries to reduce token usage and latency drastically. Context engineering played a HUGE role in doing that. So, I'm here to explain what I did and how context engineering can improve your AI project.

What is Context Engineering?

It sounds complicated... but it's not. Context engineering is adding relevant surrounding information to a piece of text so it carries enough context to stand on its own, whether a model is reading it or a search is matching against it.

Why do you need Context Engineering?

Let's go back to the basics of embeddings. When you embed a piece of text, the resulting vector is a mathematical representation of THE MEANING of that text, based on every token in it. Now, the more of those tokens accurately describe what the text is about, the more specific the region that vector lands in within the embedding space.

For instance, say you're storing restaurants for a search feature. If you embed just "Mama's Kitchen. Lagos. Nigerian."... the embedding model captures the meaning... but loosely. It knows it's a Nigerian spot in Lagos, nothing more. But if you enrich that same restaurant before embedding it — "Mama's Kitchen in Lagos. A budget-friendly Nigerian spot, great for groups and family meals." -- the vector now sits in a far more specific region of the embedding space.

Here's why that matters. The loose version can be found by a query like "Nigerian restaurant Lagos"... but it's unlikely to surface for "cheap place to take the family for dinner" -- even though that's exactly what it is. The enriched version captures the intent, the vibe, the kind of diner it suits. So when a real user query comes in, it lands close to the document instead of far from it.

That's the whole game. You're not just adding words, you're shaping where the vector lands so the right queries can actually find it.
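To make the effect concrete without calling a real embedding model, here is a toy sketch. It swaps in a plain bag-of-words cosine similarity, which only measures shared words, so treat it as a crude stand-in and not as how an embedding model scores meaning. The function names are hypothetical. The direction of the effect is the same, though: the enriched text gives the query something to overlap with.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens (toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


loose = "Mama's Kitchen. Lagos. Nigerian."
enriched = (
    "Mama's Kitchen in Lagos. A budget-friendly Nigerian spot, "
    "great for groups and family meals."
)
query = "cheap place to take the family for dinner"

loose_score = cosine_similarity(bag_of_words(query), bag_of_words(loose))
enriched_score = cosine_similarity(bag_of_words(query), bag_of_words(enriched))

print(f"loose:    {loose_score:.3f}")
print(f"enriched: {enriched_score:.3f}")
```

With word overlap alone, the loose text shares nothing with the query, while the enriched one shares a couple of words. A real embedding model goes further and can also relate "cheap" to "budget-friendly", which this toy version cannot.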

So how do you actually enrich a document?

The obvious move is to embed the facts and call it a day. But first... your data probably isn't text. It's a JSON object, a database row, something structured:

{
  "name": "Mama's Kitchen",
  "city": "Lagos",
  "country": "Nigeria",
  "category": "Nigerian",
  "cost": 15,
  "hours": "9am-10pm"
}

Embedding models work on text, not objects. So step one is turning that object into a string:

"Mama's Kitchen. Lagos, Nigeria. Category: Nigerian. Average cost: $15. Open 9am-10pm."

And that's not wrong... it's just not enough. Because that's not how people search.
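The flattening step itself is mechanical. A minimal sketch, assuming the record is a plain dict with the keys shown in the JSON above; `record_to_text` is a hypothetical helper name, and the label wording is a choice, not a requirement:

```python
def record_to_text(record: dict) -> str:
    """Flatten a structured restaurant record into one plain-text string."""
    return (
        f"{record['name']}. "
        f"{record['city']}, {record['country']}. "
        f"Category: {record['category']}. "
        f"Average cost: ${record['cost']}. "
        f"Open {record['hours']}."
    )


record = {
    "name": "Mama's Kitchen",
    "city": "Lagos",
    "country": "Nigeria",
    "category": "Nigerian",
    "cost": 15,
    "hours": "9am-10pm",
}

facts_text = record_to_text(record)
print(facts_text)
```

This gets the facts into a form the embedding model can read, and nothing more.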

Nobody types "Nigerian restaurant, $15, open 9am-10pm." They type "somewhere cheap to take the family on a Sunday" or "a chill spot for dinner with friends." The intent is all there in their head... budget, vibe, who they're with -- but none of those words exist in the facts above. So the vectors sit far apart, and a perfect match never surfaces.

The fix is to put that intent into the document before you embed it. Not by hand, you derive it from the data you already have.

Take cost. A $15 meal and a $90 meal aren't just different numbers, they signal different levels of affordability. So instead of embedding only the raw price, I turn it into meaning:

under $30 → "budget-friendly, good for students and casual outings"
$30 to $99 → "mid-range, suitable for most diners"
$100 and up → "premium, suited for special occasions"


def _build_budget_signal(restaurant: Restaurant) -> str:
    cost = restaurant.average_cost

    # Infer budget tier from cost
    budget_tier = (
        "budget-friendly, good for students and casual outings"
        if cost < 30
        else (
            "mid-range, suitable for most diners"
            if cost < 100
            else "premium, suited for special occasions"
        )
    )

    return f"Suitability: {budget_tier}."

Now a query like "cheap place to eat" has something to actually land near.

Same with category. The cuisine tells you who the place suits, so I map it out:

Nigerian, Ghanaian → "great for groups, family meals, hearty eating"
Sushi, Fine dining → "good for dates, quiet dinners, couples"
Cafe, Brunch → "casual, solo-friendly, good for working or catching up"
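That mapping could be sketched as a simple lookup. This is an illustration under assumptions, not the exact code from the project: `CATEGORY_SIGNALS` and `build_category_signal` are hypothetical names, the "Recommended for:" prefix follows the final document shown later in this article, the phrases are lightly trimmed to read well after that prefix, and returning an empty string for unknown categories is my own choice.

```python
# Lookup built from the mapping above; keys are lowercased for matching.
CATEGORY_SIGNALS: dict[str, str] = {
    "nigerian": "groups, family meals, hearty eating",
    "ghanaian": "groups, family meals, hearty eating",
    "sushi": "dates, quiet dinners, couples",
    "fine dining": "dates, quiet dinners, couples",
    "cafe": "casual visits, solo diners, working or catching up",
    "brunch": "casual visits, solo diners, working or catching up",
}


def build_category_signal(category: str) -> str:
    """Turn a cuisine category into a 'who is this for' sentence."""
    signal = CATEGORY_SIGNALS.get(category.strip().lower())
    # Unknown categories contribute nothing instead of a guessed label (assumption).
    return f"Recommended for: {signal}." if signal else ""


print(build_category_signal("Nigerian"))
print(repr(build_category_signal("Tapas")))
```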

None of this was in the raw facts. I'm inferring it from what I already have and writing it into the text the model sees. The price stays the same, the cuisine stays the same — but the meaning around them gets richer, and the vector lands somewhere a real query can find it.

Do the same thing to the query

Here's the part that ties it together. Enriching the document only works if the query meets it halfway.

Think about it: you spent all that effort making the stored restaurant rich and intent-aware. But if the incoming query stays raw ("$20, halal, somewhere chill"), you've only fixed one side. The document speaks one language, the query speaks another, and they still don't quite line up.

So I run the query through the same transformation. Same budget-tiering, same phrasing, same logic:


def build_retrieval_query(user_input: UserInput, city: str) -> str:
    parts: list[str] = [f"Location: {city}."]

    if user_input.budget:
        budget_tier = (
            "budget-friendly, cheap eats"
            if user_input.budget < 30
            else (
                "mid-range dining"
                if user_input.budget < 100
                else "premium, fine dining"
            )
        )
        parts.append(f"Budget: {budget_tier}.")

    if user_input.dietary_preferences:
        parts.append(
            "Dietary preferences: "
            f"{', '.join(user_input.dietary_preferences)}."
        )

    if user_input.description:
        parts.append(f"Preferences: {user_input.description}.")

    return " ".join(parts)

A raw budget of $20 doesn't just stay $20. It becomes "budget-friendly, cheap eats", which opens with the same "budget-friendly" label the document uses for a cheap restaurant. Now both sides are built from the same vocabulary, the same way.

And that's the whole point. It's not enough to enrich the document and hope the query finds it. You enrich both, with the same logic, so they land in the same neighbourhood of the embedding space and actually meet.
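One way to keep the two sides from drifting apart is to have both builders call a single tiering helper. The sketch below is an assumption about how that could look, not the project's actual code: `budget_tier`, `document_budget_signal`, and `query_budget_signal` are hypothetical names, while the thresholds and phrases are the ones used in this article.

```python
def budget_tier(amount: float) -> str:
    """Single source of truth for the tier label, shared by documents and queries."""
    if amount < 30:
        return "budget-friendly"
    if amount < 100:
        return "mid-range"
    return "premium"


# Side-specific phrasing, keyed by the shared tier label.
DOCUMENT_PHRASES = {
    "budget-friendly": "budget-friendly, good for students and casual outings",
    "mid-range": "mid-range, suitable for most diners",
    "premium": "premium, suited for special occasions",
}
QUERY_PHRASES = {
    "budget-friendly": "budget-friendly, cheap eats",
    "mid-range": "mid-range dining",
    "premium": "premium, fine dining",
}

# Guard against drift: every phrase must start with the tier label it belongs to.
for phrases in (DOCUMENT_PHRASES, QUERY_PHRASES):
    for tier, phrase in phrases.items():
        assert phrase.startswith(tier), f"{phrase!r} does not start with {tier!r}"


def document_budget_signal(average_cost: float) -> str:
    """Budget sentence written into the stored restaurant text."""
    return f"Suitability: {DOCUMENT_PHRASES[budget_tier(average_cost)]}."


def query_budget_signal(user_budget: float) -> str:
    """Budget sentence written into the enriched user query."""
    return f"Budget: {QUERY_PHRASES[budget_tier(user_budget)]}."


doc_line = document_budget_signal(15)
query_line = query_budget_signal(20)
print(doc_line)
print(query_line)
```

Because the thresholds live in one function, changing a cutoff changes it for documents and queries at the same time. Documents embedded before the change would still need to be re-embedded.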

The trick: bake the questions into the document

Even after all that, there's still a gap. And it's a subtle one.

A document — even a rich one — is a description. "Mama's Kitchen. Budget-friendly, good for groups and family meals." But a user doesn't search in descriptions. They search in questions: "where can I take my family for cheap dinner in Lagos?"

A description and a question are shaped differently, even when they mean the same thing. So they still land a little apart in the embedding space — close, but not as close as they could be.

So here's the trick: I generate the questions the place would answer, and embed them into the document itself.

def _build_hypothetical_queries(restaurant: Restaurant) -> str:
    name = restaurant.name
    city = restaurant.city
    # Keep the original casing so proper adjectives like "Nigerian" stay capitalized
    category = restaurant.category

    questions = [
        f"Where can I get {category} food in {city}?",
        f"What's a good {category} spot in {city}?",
        f"Recommend a place to eat in {city}.",
        f"Is {name} worth visiting in {city}?",
        f"Where should I eat in {city} for a {category} meal?",
    ]

    return "Relevant questions: " + " | ".join(questions)

Now the document literally contains question-shaped language. So when a real question comes in, it isn't matching against a description anymore — it's matching against other questions. And questions sit near questions.

So by the time a restaurant is ready to embed, the string isn't just facts anymore. Here's where we started:

Mama's Kitchen. Lagos, Nigeria. Category: Nigerian. Average cost: $15. Open 9am-10pm.

And here's what actually gets embedded:

Mama's Kitchen. Lagos, Nigeria. Category: Nigerian. Average cost: $15. Open 9am-10pm.

Suitability: budget-friendly, good for students and casual outings. Recommended for: groups, family meals, hearty eating. Location: Lagos, Nigeria.

Relevant questions: Where can I get Nigerian food in Lagos? | What's a good Nigerian spot in Lagos? | Recommend a place to eat in Lagos. | Is Mama's Kitchen worth visiting in Lagos? | Where should I eat in Lagos for a Nigerian meal?

That whole block is what gets embedded — not the bare facts I started with. The facts are still in there, but now they're surrounded by intent, suitability, and the questions a real person would actually ask.
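Putting the pieces together, the assembly step might look like the sketch below. It is a self-contained illustration under assumptions: the `Restaurant` dataclass, its field names beyond those used in the snippets above, and the helper names are hypothetical, and the category lookup is cut down to two entries. The category's casing is kept as-is here so the output matches the block shown above.

```python
from dataclasses import dataclass


@dataclass
class Restaurant:
    # Hypothetical shape; the real model may carry more fields.
    name: str
    city: str
    country: str
    category: str
    average_cost: int
    hours: str


# Trimmed-down category lookup, for illustration only.
CATEGORY_SIGNALS = {
    "nigerian": "groups, family meals, hearty eating",
    "ghanaian": "groups, family meals, hearty eating",
}


def build_facts(r: Restaurant) -> str:
    """The bare facts, flattened into one line."""
    return (
        f"{r.name}. {r.city}, {r.country}. Category: {r.category}. "
        f"Average cost: ${r.average_cost}. Open {r.hours}."
    )


def build_signals(r: Restaurant) -> str:
    """Derived intent: budget tier, who the place suits, and location."""
    if r.average_cost < 30:
        tier = "budget-friendly, good for students and casual outings"
    elif r.average_cost < 100:
        tier = "mid-range, suitable for most diners"
    else:
        tier = "premium, suited for special occasions"
    parts = [f"Suitability: {tier}."]
    audience = CATEGORY_SIGNALS.get(r.category.lower())
    if audience:
        parts.append(f"Recommended for: {audience}.")
    parts.append(f"Location: {r.city}, {r.country}.")
    return " ".join(parts)


def build_questions(r: Restaurant) -> str:
    """Question-shaped text the restaurant would be a good answer to."""
    questions = [
        f"Where can I get {r.category} food in {r.city}?",
        f"What's a good {r.category} spot in {r.city}?",
        f"Recommend a place to eat in {r.city}.",
        f"Is {r.name} worth visiting in {r.city}?",
        f"Where should I eat in {r.city} for a {r.category} meal?",
    ]
    return "Relevant questions: " + " | ".join(questions)


def build_embedding_text(r: Restaurant) -> str:
    """Join facts, derived signals, and questions, separated by blank lines."""
    return "\n\n".join([build_facts(r), build_signals(r), build_questions(r)])


mamas = Restaurant("Mama's Kitchen", "Lagos", "Nigeria", "Nigerian", 15, "9am-10pm")
embedding_text = build_embedding_text(mamas)
print(embedding_text)
```

The returned string is what you would hand to your embedding model in place of the bare facts line.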

Making the text operate by itself

This is what I meant at the start... context engineering is about making a piece of text operate by itself.

A raw restaurant record can't do that. It needs a query phrased just right, in just the right words, to ever get found. But a fully engineered document — the facts, the derived intent, the questions it answers — doesn't depend on the query being perfect anymore. It already knows who it's for, what it's good at, and what people would ask to find it. It carries its own context.

That's the whole job. You're not decorating the text; you're making it self-sufficient. So when a real query shows up, the document is already standing in the right place, waiting.
