Making the Context Across 46 Repositories Semantically Searchable for AI (Part 2)

Contents

  1. The Hint Was in db-graph
  2. Bringing the Same Pattern to code-graph
  3. But API / Event / Page Still Need Meaning — and Annotating Every Function Is Off the Table
  4. Designing the Annotation Graph
  5. An Annotation Example
  6. Running Annotations Without Interfering With the Day-to-Day Dev Workflow
  7. Protecting Cross-Graph Consistency With an SLO
  8. Joining the Static Graph and the Annotation Graph via SAME_ENTITY Bridges
  9. The Result: Entering the Graph from "the subscription-fee calculation"
  10. Real Usage Numbers
  11. MCP as the Single Front Door
  12. April–May Timeline of Trial and Error
  13. April: Expansion and the First Bridges
  14. May: Stabilizing and Expanding
  15. What This Timeline Says
  16. What Still Isn't Solved
  17. 1. Maintaining Annotation Coverage
  18. 2. Bridge Mis-Joins Aren't Fully Eliminated Structurally
  19. 3. No Dynamic Analysis
  20. 4. Onboarding Cost When a New Repo Joins Production
  21. Closing: Not "Thrown Away," but "Evolved"

Hi, I'm Ryan, CTO at airCloset.

In Part 1, I wrote about unifying 46 repositories of production code into a single knowledge graph via static analysis. The graph itself got built, but I closed the post with four open issues: no semantic search, node explosion, having to open the file to actually know what a function does, and the cost of writing a new parser every time a new boundary pattern showed up.

This Part 2 is about how I solved the first one — the entry-point problem (no semantic search). The other three are left exactly as Part 1 described them — I'll come back to them at the end, together with the new issues that surfaced once the entry-point problem was out of the way.

The reason to start with the entry-point problem is simple: if the graph exists but the only way to reach it is grep, the model ends up inferring anyway. The whole point — "give the model verified facts, not inference" — falls apart. So the entry-point problem had to be solved before the others.

The Hint Was in db-graph

Months earlier, I'd already solved the same structural problem in a different domain — the db-graph project.

Internally, we had a large number of DB tables spread across many services, and no single person had the full picture. Different people knew different pieces well, but the whole map didn't fit in anyone's head. So I built db-graph: extract schemas statically from ORM definitions, generate per-table descriptions with Gemini, embed them as 768-dimensional vectors in the graph, and make the whole thing semantically searchable in natural language.

At the time of that article it covered 991 tables. Today it spans 21 schemas / 1,133 tables / 10,815 columns, and finding data in natural language without knowing table names is just how people work now.

The pattern that proved out there:

Static-analysis graph + AI-generated context = natural-language semantic search works.
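
To make the pattern concrete, here is a minimal sketch of the search step, assuming every table description has already been embedded. The table names, the 3-dimensional vectors, and the `search` helper are hypothetical stand-ins: the real embeddings are 768-dimensional and come from an embedding model that this sketch does not call.

```typescript
// Hypothetical node shape: a table plus its AI-generated description and embedding.
interface TableNode {
  name: string;
  description: string;
  embedding: number[]; // 768 dimensions in db-graph; 3 here for readability
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank nodes by similarity to the query embedding and keep the top k.
function search(queryEmbedding: number[], nodes: TableNode[], k: number): TableNode[] {
  return nodes
    .map((node) => ({ node, score: cosineSimilarity(queryEmbedding, node.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.node);
}

// Made-up tables and vectors, for illustration only.
const tables: TableNode[] = [
  { name: "billing_invoices", description: "Monthly charges per member", embedding: [0.9, 0.1, 0.0] },
  { name: "rental_items", description: "Items currently rented out", embedding: [0.1, 0.9, 0.1] },
  { name: "delivery_orders", description: "Shipments to members", embedding: [0.0, 0.2, 0.9] },
];

// Pretend this vector is the embedding of "where are subscription fees stored?"
const hits = search([0.8, 0.2, 0.1], tables, 1);
console.log(hits[0].name);
```

The point of the sketch is that the caller never needs to know a table name: the query and the descriptions meet in vector space, and the graph node comes back as the result.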

Bringing the Same Pattern to code-graph

If it worked for db-graph, it should work for code-graph. The moment that thought landed, I noticed something:

code-graph already contains "DB table nodes" as boundary nodes — they're one of the boundary node types I covered in Part 1.

So if I just join code-graph and db-graph, code-graph automatically inherits db-graph's semantic context. Without writing a single annotation, the existing assets alone make the graph meaningfully richer.

That's where the idea of "joining graphs" first came up — not treating each graph as its own island, but designing the joins between them.
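
A rough sketch of what that join buys, assuming the two graphs can be matched on table name. The interfaces, field names, and the `inheritDbContext` helper are hypothetical; the real node schemas are not shown in this post.

```typescript
// Hypothetical shapes; field names are assumptions, not the real schema.
interface DbBoundaryNode {
  id: string;
  repo: string;
  tableName: string;
}

interface DbGraphTable {
  tableName: string;
  description: string; // AI-generated description from db-graph
}

interface EnrichedBoundary extends DbBoundaryNode {
  description: string | null;
}

// Attach db-graph descriptions to code-graph DB boundary nodes by table name.
function inheritDbContext(boundaries: DbBoundaryNode[], tables: DbGraphTable[]): EnrichedBoundary[] {
  const byName = new Map<string, DbGraphTable>();
  for (const table of tables) {
    byName.set(table.tableName.toLowerCase(), table);
  }
  return boundaries.map((boundary) => {
    const match = byName.get(boundary.tableName.toLowerCase());
    return { ...boundary, description: match ? match.description : null };
  });
}

const enriched = inheritDbContext(
  [
    { id: "b1", repo: "repo-a", tableName: "admin_rental_items" },
    { id: "b2", repo: "repo-b", tableName: "unknown_table" },
  ],
  [{ tableName: "admin_rental_items", description: "Items a member is currently renting (made-up text)" }],
);
console.log(enriched);
```

Boundary nodes with no db-graph counterpart keep a `null` description instead of being dropped, so unmatched joins stay visible.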

But API / Event / Page Still Need Meaning — and Annotating Every Function Is Off the Table

Joining db-graph took care of DB context. But the remaining boundaries (API / Event) and the graph's entry-point type (Page) still need meaning attached. Static analysis alone can't pull intent out of those, so context has to come from somewhere else.

The choice was clear: write the intent directly into the code via annotations (the same approach used by cortex's internal knowledge graph, which I covered in AI Harness Series, Part 2).

The catch: you can't annotate all the functions across 46 repos. There must be tens of thousands of them. Asking established teams running an existing production codebase to retroactively annotate everything is just not realistic.

But here's the second realization:

What matters is just the boundary nodes. So if I only annotate around the boundaries, that's enough.

When an AI agent asks "what breaks if I change this code" or "what other repos call this API," what it needs isn't a per-function logic explanation. It needs boundary intent — what is this screen for, what does this API return, what milestone in the business does this Event mark.

= Minimum annotations, maximum meaning. That became the heart of the design.

Designing the Annotation Graph

Putting it together (internally we call this annotation graph service-product-graph, or SPG):

Three graphs joined as peers form a knowledge graph that carries meaning

Three graphs sit as peers, joined by SAME_ENTITY edges. There's no hierarchy — you can start from any graph and reach the others.

The entry point for AI agents is a single MCP server that traverses all three graphs. AI agents never hit db-graph directly — the annotation graph's MCP server proxies db-graph calls on their behalf.

The annotation graph has 7 node types: Page / Section / Dialog / Field / Action / Api / Task. The early version was screen-focused and called screen-graph, but once it grew to cover backend Api / Task, it was renamed to service-product-graph.

An Annotation Example

Here's what an annotation looks like (fictional, but close in shape to the real ones):

/**
 * @graph-page /home
 * @graph-business Main screen. Members can see what they're currently renting, buy items, and initiate returns.
 * @graph-label Home Screen
 * @graph-has-section banners, wearing-items, wearing-return, delivery-status
 * @graph-has-dialog buying-modal, return-modal
 * @graph-navigates-to /return-procedure, /checkout, /my-karte
 * @graph-calls GET /api/v1/wearing
 * @graph-reads admin_delivery_orders, admin_rental_items
 * @graph-flow styling-loop
 * @graph-status monthly-member
 */

Two things matter here:

There's also @graph-case (the conditional pattern tag that test cases derive from), but that's for another time.
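
For a sense of how such a comment could become a graph node, here is a minimal parsing sketch. The tag names come from the example above; the `parseGraphTags` and `listTag` helpers and the resulting node shape are assumptions for illustration, not the actual extractor.

```typescript
// Collect "@graph-<tag> <value>" lines from a doc comment.
// A repeated tag overwrites the earlier value in this simplified version.
function parseGraphTags(comment: string): Map<string, string> {
  const tags = new Map<string, string>();
  for (const line of comment.split("\n")) {
    const match = line.match(/@graph-([a-z-]+)\s+(.+)$/);
    if (match) {
      tags.set(match[1], match[2].trim());
    }
  }
  return tags;
}

// Comma-separated tags such as @graph-has-section become string arrays.
function listTag(tags: Map<string, string>, name: string): string[] {
  const raw = tags.get(name);
  return raw ? raw.split(",").map((item) => item.trim()) : [];
}

const comment = [
  "/**",
  " * @graph-page /home",
  " * @graph-label Home Screen",
  " * @graph-has-section banners, wearing-items, wearing-return, delivery-status",
  " * @graph-calls GET /api/v1/wearing",
  " */",
].join("\n");

const tags = parseGraphTags(comment);

// Hypothetical node shape built from the parsed tags.
const pageNode = {
  type: "Page",
  path: tags.get("page"),
  label: tags.get("label"),
  sections: listTag(tags, "has-section"),
  calls: listTag(tags, "calls"),
};
console.log(pageNode);
```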

Running Annotations Without Interfering With the Day-to-Day Dev Workflow

This is where it gets practical.

Once I committed to building the annotation graph, here were the constraints:

In other words: don't mix humans and AI inside the same PR.

The solution was to physically separate annotations onto their own branch.

Separate the AI-managed annotation branch from the human-managed main branch

This is the "every line of code passes through an AI gate" ideal from AI Harness Series, Part 6, adapted to the constraints of an existing organization. cortex (the internal AI platform) is a monorepo I assemble from scratch, so "every commit passes the AI gate" actually holds there. For the 46-repo production system, that precondition doesn't hold. So instead of giving up on the ideal, I split it: engineers' workflow on one branch, AI's annotation workflow on another, both running in parallel.

Protecting Cross-Graph Consistency With an SLO

Just running the annotation pipeline doesn't guarantee the quality of the joins between the three graphs (code-graph / db-graph / annotation graph). So there's a set of SLOs that automatically check the consistency across the entire graph.

The main rules:

These are really just a naive question — "shouldn't the boundaries connect to each other?" — turned into an SLO. If anything drops below threshold, an alert fires, and the trustworthiness of the whole graph gets defended every day.

The daily boundary-analysis cron from Part 1 (5% connection-rate drop = alert) was code-graph-only. This is a cross-graph SLO — it guards the joins between graphs themselves. Add a parser to one repo, write a new annotation, change a schema — whatever happens, by the next morning a quality drop in any join becomes visible.
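
The shape of such a check can be sketched in a few lines. Everything specific here is an assumption: the rule names, the counts, and the 90% thresholds are invented for illustration and are not the real SLO definitions.

```typescript
// Hypothetical per-rule statistics gathered by a daily job.
interface JoinStat {
  rule: string;      // which cross-graph join is being measured
  joined: number;    // nodes that found a counterpart in the other graph
  total: number;     // nodes that should have one
  threshold: number; // minimum acceptable join rate, 0..1
}

// Return one message per rule whose join rate fell below its threshold.
function violations(stats: JoinStat[]): string[] {
  return stats
    .filter((s) => s.total > 0 && s.joined / s.total < s.threshold)
    .map((s) => {
      const rate = ((s.joined / s.total) * 100).toFixed(1);
      const limit = (s.threshold * 100).toFixed(0);
      return `${s.rule}: ${rate}% < ${limit}%`;
    });
}

const alerts = violations([
  { rule: "api-bridge", joined: 95, total: 100, threshold: 0.9 },
  { rule: "db-boundary-to-db-graph", joined: 80, total: 100, threshold: 0.9 },
]);
console.log(alerts);
```

In this made-up run only the second rule would raise an alert, which is the property that matters: each join is judged on its own rate, so a regression in one bridge cannot hide behind healthy numbers elsewhere.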

Joining the Static Graph and the Annotation Graph via SAME_ENTITY Bridges

I've been writing "join" casually, but the actual joining wasn't that straightforward.

Static-analysis API / Page / Task nodes and annotation graph API / Page / Task nodes are created as separate nodes. They mean the same thing, but their names / paths / identifiers don't match by themselves — there's nothing automatic about lining them up.
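
To illustrate the mismatch, here is a sketch of the kind of normalization an API join needs before two identifiers can be compared. The specific rules (upper-casing the method, collapsing `:id` and `{id}` path parameters, dropping trailing slashes) are assumptions chosen for the example, not the actual bridge rules.

```typescript
// Reduce an API identifier to a comparable key.
// These rules are illustrative assumptions, not the real normalization.
function normalizeApiKey(method: string, path: string): string {
  const normalizedPath = path
    .replace(/\/+$/, "")              // drop trailing slashes
    .replace(/:[A-Za-z_]\w*/g, "{}")  // Express-style parameter, e.g. :id
    .replace(/\{[^}]*\}/g, "{}")      // OpenAPI-style parameter, e.g. {itemId}
    .toLowerCase();
  return `${method.toUpperCase()} ${normalizedPath}`;
}

// The same endpoint as it might appear on each side of the join.
const fromStaticAnalysis = normalizeApiKey("get", "/api/v1/wearing/:id/");
const fromAnnotation = normalizeApiKey("GET", "/api/v1/wearing/{itemId}");
console.log(fromStaticAnalysis === fromAnnotation);
```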

To connect them, we generate a separate edge type called SAME_ENTITY. There are three bridges:

There was also one operational footgun. The first implementation used an INSERT ... WHERE NOT EXISTS pattern to avoid duplicates. But BigQuery's streaming-buffer visibility lag let duplicates slip in: in one repo the edges roughly doubled overnight, from 106 to 214. We fixed it by rewriting the statement as MERGE INTO, which makes the operation idempotent.
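
The property that fix restores is idempotency: running the bridge job twice must leave the same set of edges. Below is an in-memory analogue of a keyed merge, not BigQuery code; the edge shape, the key format, and the node IDs are hypothetical.

```typescript
// Hypothetical edge shape for a SAME_ENTITY bridge.
interface SameEntityEdge {
  from: string;     // node ID in the static graph
  to: string;       // node ID in the annotation graph
  strategy: string; // which matching strategy produced the edge
}

// The natural key: one edge per (from, to) pair.
const edgeKey = (edge: SameEntityEdge): string => `${edge.from}->${edge.to}`;

// Keyed merge: insert when the key is new, overwrite when it already exists.
function mergeEdges(
  existing: Map<string, SameEntityEdge>,
  incoming: SameEntityEdge[],
): Map<string, SameEntityEdge> {
  const result = new Map(existing);
  for (const edge of incoming) {
    result.set(edgeKey(edge), edge);
  }
  return result;
}

const batch: SameEntityEdge[] = [
  { from: "code:api:1", to: "spg:api:1", strategy: "exact-path" },
  { from: "code:page:7", to: "spg:page:7", strategy: "route-match" },
];

const afterFirstRun = mergeEdges(new Map(), batch);
const afterSecondRun = mergeEdges(afterFirstRun, batch);
console.log(afterFirstRun.size, afterSecondRun.size);
```

An append-only insert would have grown the set on the second run; keying the write on the edge identity is what keeps the count stable.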

The Result: Entering the Graph from "the subscription-fee calculation"

With all of this in place, the entry-point problem from the end of Part 1 was finally solved:

"the subscription-fee calculation for members seems off"

Throw this natural-language query at the annotation graph and vector search returns the related nodes (Page / Api / Function / DB table) as facts. From there, SAME_ENTITY takes you over to code-graph functions, including callers and callees in other repos. From the DB boundaries in code-graph, you can cross into db-graph and pull the relevant columns.

The entry point can be anywhere — "what calls this table?" starts from db-graph, "what's the blast radius of this function?" starts from code-graph, both walk the same connected network. From a single natural-language query, or from a specific node, you can now traverse all three graphs and get every relevant piece of code plus every relevant DB schema.
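
The walk itself is ordinary graph traversal once the bridges exist. A minimal sketch, assuming a flat edge list: only SAME_ENTITY is a real edge type from this post, while `HANDLED_BY`, `READS`, the node IDs, and the function name are invented for the example. It follows edges forward only; finding callers would walk them in reverse.

```typescript
interface Edge {
  from: string;
  to: string;
  type: string; // only SAME_ENTITY comes from the article; other names are made up
}

// Breadth-first walk that follows edges forward and treats SAME_ENTITY as two-way.
function reachable(start: string, edges: Edge[]): string[] {
  const adjacency = new Map<string, string[]>();
  const link = (a: string, b: string): void => {
    adjacency.set(a, [...(adjacency.get(a) ?? []), b]);
  };
  for (const e of edges) {
    link(e.from, e.to);
    if (e.type === "SAME_ENTITY") link(e.to, e.from);
  }

  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of adjacency.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return Array.from(seen);
}

const edges: Edge[] = [
  { from: "spg:api:GET /api/v1/wearing", to: "code:api:GET /api/v1/wearing", type: "SAME_ENTITY" },
  { from: "code:api:GET /api/v1/wearing", to: "code:fn:getWearingItems", type: "HANDLED_BY" },
  { from: "code:fn:getWearingItems", to: "code:table:admin_rental_items", type: "READS" },
  { from: "code:table:admin_rental_items", to: "db:table:admin_rental_items", type: "SAME_ENTITY" },
];

// Enter from the annotation graph and end up at a db-graph table.
const visited = reachable("spg:api:GET /api/v1/wearing", edges);
console.log(visited);
```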

The Part 1 lament — "the graph is there but the entry point is missing" — could finally be put to bed.

Real Usage Numbers

From 2026-04-16 (first production deployment) to the time of writing — about 2.5 months — the annotation graph's MCP server has handled ~50,000 calls from ~73 users. The breakdown:

The interesting line is the second one. "Search the codebase in natural language" is usually an engineer's tool — but once the entry-point problem was solved, people outside engineering started using it too, asking things like "how does this feature actually work?" or "what's in this DB?" in their own words.

This is adjacent to the "non-engineers writing specs with AI" trend I covered in AI Harness Series, Part 5: a graph that can be queried by meaning starts to matter org-wide. Call volume is overwhelmingly dominated by engineers, of course. The interesting thing is the range of job roles starting to pick it up. That's the real impact of solving the entry-point problem.

MCP as the Single Front Door

The MCP server is the cross-graph entry point. It exposes six tools — service search / service detail / API detail / data-flow tracing / impact-radius tracing / business-rule full-text search — and that's the only entry point AI agents ever touch.

One design choice worth calling out: AI agents never talk to db-graph directly. The annotation graph's MCP proxies db-graph calls. From the agent's side, the mental model stays simple: "ask one MCP and get everything back."

That makes the full chain — "Screen → API → Code → DB → Column" — traversable in a single MCP tool call.
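
The proxy arrangement can be sketched without any MCP machinery. This is plain TypeScript, not an MCP SDK: the tool name `trace_data_flow`, its arguments, and the `dbGraph` stub are all hypothetical.

```typescript
type ToolHandler = (args: Record<string, string>) => string;

// Stand-in for db-graph. Agents never receive a reference to it.
const dbGraph = {
  describeTable: (table: string): string => `description of ${table} (stub)`,
};

// The only surface the agent sees: a registry of named tools.
const tools: Record<string, ToolHandler> = {
  // Hypothetical name for the "data-flow tracing" tool.
  trace_data_flow: (args) => {
    const table = args["table"];
    // The db-graph lookup happens inside the handler, on the agent's behalf.
    return `API ${args["api"]} reads ${table}: ${dbGraph.describeTable(table)}`;
  },
};

function callTool(name: string, args: Record<string, string>): string {
  const handler = tools[name];
  if (!handler) {
    throw new Error(`unknown tool: ${name}`);
  }
  return handler(args);
}

const answer = callTool("trace_data_flow", {
  api: "GET /api/v1/wearing",
  table: "admin_rental_items",
});
console.log(answer);
```

Keeping db-graph behind the handler means a change to db-graph's interface touches one server, not every agent that uses it.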

April–May Timeline of Trial and Error

Same approach as Part 1 (pulling commits from Jan–Mar). For Part 2, the key commits are from April–May.

April: Expansion and the First Bridges

May: Stabilizing and Expanding

What This Timeline Says

April 15 was the day "expansion + cross-graph tools + bridges" landed in close succession. Over the next week, "Redis / EventBridge / Task bridges / annotation auto-maintenance" stacked up one after another.

In particular, the annotation auto-maintenance pipeline on April 21 is where the "humans alone can't do this, but AI can" promise from Part 1 got cashed in. From that point on, annotation shifted from "humans grind through writing them" to "design the whole operation assuming AI writes them."

What Still Isn't Solved

Solving the entry-point problem didn't make everything clean. A few issues remain.

1. Maintaining Annotation Coverage

The frontend side is annotated heavily. Backend / Go / batch are still thin. Some nodes will always be missing annotations — that's structural, and you can't drive it to zero. It's an ongoing operational issue.

2. Bridge Mis-Joins Aren't Fully Eliminated Structurally

The Page bridge in particular has cases where multiple annotation Pages map to the same boundary. That's structural and unavoidable. Adding more strategies got coverage to 100%, but full coverage is not full correctness: guaranteeing that every join is correct is still hard.
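
Such many-to-one joins can at least be surfaced for review, even if they can't be eliminated. A sketch of one way to list them, with a hypothetical edge shape and made-up IDs; this is an illustration, not an existing check.

```typescript
// Hypothetical shape of a Page bridge edge.
interface PageBridgeEdge {
  annotationPage: string; // Page node in the annotation graph
  boundaryId: string;     // Page boundary node in code-graph
}

// Group edges by boundary and keep only boundaries claimed by more than one Page.
function ambiguousBoundaries(edges: PageBridgeEdge[]): Map<string, string[]> {
  const byBoundary = new Map<string, string[]>();
  for (const edge of edges) {
    byBoundary.set(edge.boundaryId, [...(byBoundary.get(edge.boundaryId) ?? []), edge.annotationPage]);
  }
  const ambiguous = new Map<string, string[]>();
  byBoundary.forEach((pages, boundaryId) => {
    if (pages.length > 1) {
      ambiguous.set(boundaryId, pages);
    }
  });
  return ambiguous;
}

const suspicious = ambiguousBoundaries([
  { annotationPage: "/home", boundaryId: "code:page:home" },
  { annotationPage: "/home-legacy", boundaryId: "code:page:home" },
  { annotationPage: "/checkout", boundaryId: "code:page:checkout" },
]);
console.log(suspicious);
```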

3. No Dynamic Analysis

The graph only carries the fact that "this edge exists statically." How often that edge actually gets used in production isn't recorded. Piping production execution counts back into the static graph and surfacing dead-code edges as a separate signal — that's still untouched.

4. Onboarding Cost When a New Repo Joins Production

Every time a new repo enters production, the bridge normalization rules and per-repo patterns need adjusting. This is the annotation-graph-side version of Part 1's fourth issue (the cost of adding a new parser for every new boundary pattern).

Closing: Not "Thrown Away," but "Evolved"

In Part 1's closing note, I touched on the fact that the cortex side (the internal AI platform) bailed out of the code-graph approach early and bet on an annotation-based knowledge graph instead. The bail-out was fast enough that calling it "thrown away" wouldn't be wrong — but looking back across this whole series, the more accurate word is "evolved."

What it evolved into, in the end, is three graphs joined as peers:

Joined by SAME_ENTITY, served to the agent through MCP. The thing static analysis alone couldn't deliver — querying by meaning — became workable by reusing the db-graph success pattern and adding minimal annotations only at the boundaries.

And one more framing: paired with the AI Harness Series, Parts 1–6, this series sits as:

= the same philosophy (design without trusting AI), implemented under two different sets of constraints.

Thanks for reading this far.
