June 9, 2026

Your LLM’s JSON Schema Is Already a TypeScript Type — and That Closed an Entire Bug Class

Whiteboard sketch — a JSON Schema document and a TypeScript interface on opposite sides of a big equals sign, with post-it notes citing protobuf 2008, the live-interview crash, and the rule ‘copies drift, generate don’t copy’

Genuinely fun problem to solve. A real bug class, an old pattern hiding in plain sight, and a fix the compiler can prove — problems with all three don’t come around often.

A renamed enum in one LLM prompt crashed an AI briefing during a live interview. The audit found nineteen latent copies of the same bug across the codebase, each waiting for the next prompt or model change to trigger it. The fix: every JSON Schema in our LLM prompt configuration was already a TypeScript type in different syntax, and the schema column in the database was the canonical version. We had been hand-writing the copy. Generating the TS types directly from that row turned the entire drift class into a compile error. Three blocking CI checks now refuse to let the database, the committed snapshot, and the consumer types disagree.

The crash

A candidate sat down for a live interview. The interviewer opened the AI-generated prep briefing. The page rendered as a punch-in-the-face white screen — no app, no fallback, no graceful anything. The interviewer recovered by opening the candidate’s resume directly, but the briefing was the artefact the whole feature existed for and it was gone.

Someone had updated the prompt to emit a new severity value, weak_evidence. The consumer’s hand-written lookup table only knew the old vocabulary (gap, thin, strong). The render path dereferenced undefined.className and the React tree crashed.

Hand-written types and review discipline can’t prevent this class of bug. We audited the codebase the same afternoon and found nineteen more places where a consumer’s pattern-match could drift from the prompt that fed it. With thirty-four prompts in production, the next crash was a prompt rewrite away.

Why LLM apps fail in ways CRUD apps don’t

Conventional QA assumes the data shape your code reads is the data shape your code wrote. LLM apps break that assumption: the model is a probabilistic writer, the consumer is a deterministic reader, and the JSON in between is the contract. When the writer and reader drift apart, the consumer crashes or, worse, silently misrenders. Unit tests catch none of it: the fixture was hand-written by the same developer who hand-wrote the consumer, so it agrees with the consumer by construction.

Distributed systems solved this in 1984.

This is just protobuf with a model as the producer

The pattern is forty years old: ASN.1 (1984), CORBA IDL (1991), Thrift (2007), Protocol Buffers (2008), Avro (2009), OpenAPI codegen (2011), gRPC (2015), sqlc (2019), Prisma (2019). When a contract crosses a boundary between two languages, you generate one side from the other or the two sides drift. JSON Schema in a database row is a .proto file with different syntax. The generated TypeScript interface is what protoc would emit. The only thing the LLM era changed is the producer: a model that can pick a different shape on each call, instead of a service that emits the same one every time.

Three surfaces, two drift gaps

Every LLM-driven feature has three surfaces. Bugs come from drift between them.

   ┌──────────────────┐
   │   LLM request    │  prompt + JSON Schema sent to the model
   └────────┬─────────┘
            │   ⟵  Gap A: does the model emit what the schema asked for?
   ┌────────▼─────────┐
   │   LLM response   │  JSON document cached as JSONB
   └────────┬─────────┘
            │   ⟵  Gap B: does the consumer expect what’s actually in the cache?
   ┌────────▼─────────┐
   │  Consumer code   │  React + TypeScript reads the cache, renders the UI
   └──────────────────┘

Two gaps. Four moves to close them.

Do this

1. Force the model. Set response_format = 'json_schema' on every JSON-emitting prompt. The provider builds forced tool use:

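// Expose the prompt's JSON Schema as the input schema of a single tool...
req.toolConfig = {
  tools: [{ toolSpec: { name, description, inputSchema: { json: schema } } }],
  // ...and force the model to call that tool instead of replying in free text.
  toolChoice: { tool: { name } },
};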

The model can’t emit values outside the schema. Policy: ALWAYS json_schema, NEVER json. “The prompt asks nicely” is not a control.

That pins variance at the model boundary, which is the easy half. The rest is the price of “ask anything, get anything”: prompts evolve, schemas change, and inputs arrive open-ended. The remaining steps discipline the boundaries where that flexibility meets typed code.
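On the response side of step 1, a forced tool call comes back as a tool-use block rather than free text. The following is a minimal sketch of pulling the structured payload out, assuming a Bedrock Converse-style response shape (output.message.content[].toolUse.input). The local types and the extractToolInput helper are hypothetical stand-ins, not the SDK's own types or this codebase's actual code.

```typescript
// Local stand-ins for the parts of a Converse-style response this sketch reads.
// Assumption: real code would import the provider SDK's types instead.
interface ToolUseBlock {
  toolUse: { toolUseId: string; name: string; input: unknown };
}
interface TextBlock {
  text: string;
}
type ContentBlock = ToolUseBlock | TextBlock;

interface ConverseLikeResponse {
  output?: { message?: { content?: ContentBlock[] } };
}

type ToolInputResult = { found: true; input: unknown } | { found: false };

// Hypothetical helper: return the input of the forced tool call, or
// { found: false } when the response holds no tool-use block with that name.
function extractToolInput(res: ConverseLikeResponse, toolName: string): ToolInputResult {
  const blocks = res.output?.message?.content ?? [];
  for (const block of blocks) {
    if ("toolUse" in block && block.toolUse.name === toolName) {
      return { found: true, input: block.toolUse.input };
    }
  }
  return { found: false };
}
```

The input is deliberately typed as unknown: it is still untrusted at this point and only becomes a typed value after validation.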

2. Generate the TypeScript from the DB row. The JSON Schema for the briefing’s gaps array:

{
  "type": "array",
  "items": {
    "type": "object",
    "required": ["deliverable", "deliverableRef", "severity", "reasoning"],
    "properties": {
      "severity": { "enum": ["gap", "weak_evidence", "strong"], "type": "string" },
      "reasoning": { "type": "string" },
      "deliverable": { "type": "string" },
      "deliverableRef": { "type": "string" }
    },
    "additionalProperties": false
  }
}

What a developer would have written by hand:

{
  severity: "gap" | "weak_evidence" | "strong";
  reasoning: string;
  deliverable: string;
  deliverableRef: string;
}[]

Same thing, different syntax. The hand-written type is a copy. Copies drift. Generate, don’t copy.
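To make the equivalence concrete, here is a minimal sketch of the mapping for the small subset of JSON Schema used above (strings, string enums, objects, arrays). It is an illustration only: the stack listed at the end of this article uses json-schema-to-typescript for the real job, and the MiniSchema type and schemaToTs function are hypothetical names.

```typescript
// A deliberately tiny subset of JSON Schema: enough for the gaps array above.
type MiniSchema =
  | { type: "string"; enum?: string[] }
  | { type: "array"; items: MiniSchema }
  | {
      type: "object";
      properties: Record<string, MiniSchema>;
      required?: string[];
      additionalProperties?: boolean;
    };

// Hypothetical helper: render a MiniSchema as TypeScript type syntax.
// Assumes property names are valid identifiers, so keys are emitted unquoted.
function schemaToTs(schema: MiniSchema): string {
  switch (schema.type) {
    case "string":
      // A non-empty enum becomes a union of string literals.
      return schema.enum && schema.enum.length > 0
        ? schema.enum.map((v) => JSON.stringify(v)).join(" | ")
        : "string";
    case "array": {
      const inner = schemaToTs(schema.items);
      // Parenthesise a bare union so ("a" | "b")[] is not read as "a" | "b"[].
      const isUnion =
        schema.items.type === "string" && (schema.items.enum?.length ?? 0) > 1;
      return isUnion ? `(${inner})[]` : `${inner}[]`;
    }
    case "object": {
      const required = new Set(schema.required ?? []);
      const fields = Object.entries(schema.properties).map(
        ([key, value]) => `${key}${required.has(key) ? "" : "?"}: ${schemaToTs(value)};`
      );
      return `{ ${fields.join(" ")} }`;
    }
  }
}

// The gaps schema from above, fed through the sketch.
const gapsSchema: MiniSchema = {
  type: "array",
  items: {
    type: "object",
    required: ["deliverable", "deliverableRef", "severity", "reasoning"],
    properties: {
      severity: { enum: ["gap", "weak_evidence", "strong"], type: "string" },
      reasoning: { type: "string" },
      deliverable: { type: "string" },
      deliverableRef: { type: "string" },
    },
    additionalProperties: false,
  },
};
const gapsType: string = schemaToTs(gapsSchema);
```

Calling schemaToTs on the schema yields the same shape as the hand-written type, which is the point: the type carries no information the schema did not already have.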

Pipeline:

llm_prompts.tool_schema  →  src/lib/llm/schemas/<slug>.schema.json
(production DB row)         (committed JSON snapshot)
                                       │
                                       ▼
                            src/lib/llm/generated/<slug>.ts
                            (generated TS interface)
                                       │
                                       ▼
                            consumer React + TypeScript
                            (imports the generated type)

llm:sync-schemas pulls every row’s tool_schema from prod; llm:codegen emits one TypeScript interface per snapshot. Codegen runs as prebuild — a developer can’t ship a build whose types disagree with the schema. Pattern-matching tables are typed against the generated enum, so any consumer that hasn’t been updated to a new value stops compiling.
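A sketch of what "typed against the generated enum" can look like in a consumer. The Severity type below stands in for the generated one (in the pipeline above it would be imported from the generated file), and the class names, labels, and helper names are hypothetical.

```typescript
// Stand-in for the generated type; assumed to be imported in real code.
type Severity = "gap" | "weak_evidence" | "strong";

interface SeverityStyle {
  className: string;
  label: string;
}

// Record<Severity, ...> requires exactly one entry per enum member. If the
// schema gains a value and codegen regenerates Severity, this object literal
// stops compiling until the new entry is added.
const SEVERITY_STYLES: Record<Severity, SeverityStyle> = {
  gap: { className: "badge-red", label: "Gap" },
  weak_evidence: { className: "badge-amber", label: "Weak evidence" },
  strong: { className: "badge-green", label: "Strong" },
};

function severityClass(severity: Severity): string {
  // No undefined lookup is possible for a value of type Severity.
  return SEVERITY_STYLES[severity].className;
}

// The same guarantee for switch-based pattern matches: an unhandled member
// makes the assertNever call a type error.
function assertNever(value: never): never {
  throw new Error(`Unhandled value: ${String(value)}`);
}

function severityRank(severity: Severity): number {
  switch (severity) {
    case "gap":
      return 0;
    case "weak_evidence":
      return 1;
    case "strong":
      return 2;
    default:
      return assertNever(severity);
  }
}
```

With a lookup table typed this way, the original crash would have surfaced at compile time: a table that only knew gap, thin, and strong is not assignable to a Record keyed by the regenerated enum.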

3. Degrade stale cached rows, don’t crash on them. Wrap every cached read in validateCached<T>(slug, cached), which returns a tagged union:

type ValidatedRead<T> =
  | { ok: true; data: T }
  | { ok: false; reason: "missing" | "invalid"; details: string };

On ok: false, every consumer renders <LLMOutputUnavailable onRegenerate />. The user clicks regenerate; the queue runs a fresh job under the current schema; the new row validates. The cache turns over on its own.
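A minimal sketch of how validateCached could be shaped, under stated assumptions: the guard registry, the "briefing-gaps" slug, and the hand-rolled isGapArray check are all hypothetical. A real implementation would more plausibly run a JSON Schema validator against the committed snapshot rather than a hand-written guard.

```typescript
type ValidatedRead<T> =
  | { ok: true; data: T }
  | { ok: false; reason: "missing" | "invalid"; details: string };

// Stand-in for a generated type.
interface Gap {
  severity: "gap" | "weak_evidence" | "strong";
  reasoning: string;
  deliverable: string;
  deliverableRef: string;
}

const SEVERITIES: readonly string[] = ["gap", "weak_evidence", "strong"];

// Hand-rolled structural check, for illustration only.
function isGapArray(value: unknown): value is Gap[] {
  return (
    Array.isArray(value) &&
    value.every((item: unknown) => {
      if (typeof item !== "object" || item === null) return false;
      const g = item as Record<string, unknown>;
      return (
        typeof g.severity === "string" &&
        SEVERITIES.includes(g.severity) &&
        typeof g.reasoning === "string" &&
        typeof g.deliverable === "string" &&
        typeof g.deliverableRef === "string"
      );
    })
  );
}

// Hypothetical registry mapping a prompt slug to its validator.
const guards = new Map<string, (value: unknown) => boolean>([
  ["briefing-gaps", isGapArray],
]);

function validateCached<T>(slug: string, cached: unknown): ValidatedRead<T> {
  if (cached === null || cached === undefined) {
    return { ok: false, reason: "missing", details: `no cached row for ${slug}` };
  }
  const guard = guards.get(slug);
  if (guard === undefined) {
    return { ok: false, reason: "invalid", details: `no validator registered for ${slug}` };
  }
  if (!guard(cached)) {
    return { ok: false, reason: "invalid", details: `cached row for ${slug} does not match the current schema` };
  }
  // The cast trusts that the registry pairs each slug with a guard for T.
  return { ok: true, data: cached as T };
}
```

A row cached under the old vocabulary (severity "thin") comes back as ok: false with reason "invalid", so the consumer renders the fallback instead of dereferencing an undefined style.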

4. Lock the contract in CI. Four checks:

  • Snapshot ↔ DB parity: the committed schema must deep-equal the live tool_schema.
  • Schema sanity: every schema is asserted against Bedrock’s structured-output subset.
  • Production-sample roundtrip: parse the five most recent rows (advisory).
  • Consumer-wrap coverage: grep src/ for direct .select() reads outside the validator.

Three of the four block the merge; the production-sample roundtrip is advisory. The committed snapshot is the source of truth for the build; the live database is the source of truth for the snapshot; CI reconciles the two.
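The snapshot ↔ DB parity check can be sketched as a key-order-insensitive comparison, since Postgres JSONB does not preserve object key order and a naive string comparison would report false drift. The findDrift and canonicalize names, the finding labels, and the in-memory maps standing in for the database rows and snapshot files are all assumptions for illustration.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialise with object keys sorted, so only real differences show up.
// Array order is kept as-is, so a reordered enum still counts as a change.
function canonicalize(value: Json): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    return `{${entries.map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

interface DriftFinding {
  slug: string;
  problem: "missing_snapshot" | "missing_row" | "schema_mismatch";
}

// Hypothetical check: compare live tool_schema values against committed
// snapshots, both keyed by prompt slug. An empty result means parity.
function findDrift(live: Map<string, Json>, snapshots: Map<string, Json>): DriftFinding[] {
  const allSlugs = Array.from(live.keys()).concat(Array.from(snapshots.keys()));
  const findings: DriftFinding[] = [];
  for (const slug of Array.from(new Set(allSlugs)).sort()) {
    const liveSchema = live.get(slug);
    const snapshot = snapshots.get(slug);
    if (snapshot === undefined) {
      findings.push({ slug, problem: "missing_snapshot" });
    } else if (liveSchema === undefined) {
      findings.push({ slug, problem: "missing_row" });
    } else if (canonicalize(liveSchema) !== canonicalize(snapshot)) {
      findings.push({ slug, problem: "schema_mismatch" });
    }
  }
  return findings;
}
```

In a CI job, a non-empty findings list would fail the run and print the offending slugs.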

The catch rate

The migration surfaced one latent drift bug per prompt — nineteen findings across nineteen prompts, none caught by the previous tests. The highest-impact catches never appeared in monitoring:

  • LinkedIn parsing had a fetch_status enum mismatch where terminal failure values weren’t in the consumer’s union. The UI showed a forever-spinner instead of an error. The bug had been live for months.
  • Quote-sort extraction was silently dropping the LLM-emitted rationale field — the consumer didn’t read it, the producer paid the tokens to write it, and the reviewers downstream of the feature had no grounding for any of the scores they saw.
  • A coach-editor flow had a “use what you’ve got” universal path that fell through to success copy on a failure case the consumer didn’t recognise. Users were told their entry had been saved when it had not.

None of these had been flagged by tests, monitoring, or QA. They surfaced only because the migration made the schema and the consumer compare themselves to each other.

What’s still open

Three problems remain open. UI inputs flowing into prompts have no equivalent enforcement yet. Streaming partial JSON needs incremental validation we haven’t built. Schema versioning is deferred until the first v2 schema has to coexist with v1 rows that must stay readable.

The crash class can’t ship

The interviewer at the top of this article opened a briefing that wasn’t there. The prompt change that crashed it now fails tsc --noEmit before the build runs. We didn’t fix the bug — we fixed the condition that produced it.

Stack: Next.js 16 App Router, Supabase Postgres, AWS Bedrock direct routing, vitest, json-schema-to-typescript. Implementation captured by Claude Code (Opus 4.7) across a single PR plus a multi-batch migration trail. Article fact-checked against the live llm_prompts table the morning before publication. Related reading: Prisma Is the Right Default Until Your Authz Lives in Postgres covers the same generate-types-from-the-database discipline at the Postgres layer; Building Agentic Workflows: What Nobody Tells You covers the structural protections that come before and after this one.