Enterprise AI narrative · Spec-Driven Context Engineering

Part 01 · Framing

The enterprise AI paradox

Enterprises have spent heavily on AI and seen little operational impact. The gap is not a model problem. It is a governance and context problem.

Most enterprise deployments fail before the AI model is ever the bottleneck. They fail in how context is assembled, how organizational data is structured and prioritized, and how agent access is governed.

This site presents an original perspective: build a governable system custom-tailored to the organization, and let agents work inside it.

Working thesis

The gap is not model capability. It is the governed translation from enterprise intent into executable context for agents.

The paradox

  • The Wall Street Journal: Companies Are Struggling to Drive a Return on AI. It Doesn’t Have to Be That Way.
  • The New York Times: Companies Are Pouring Billions Into A.I. It Has Yet to Pay Off.
  • Forbes: The Context Engine: Compounding Productivity For AI
  • Fortune: MIT report: 95% of generative AI pilots at companies are failing

Part 02 · Problem & landscape

Context is the bottleneck, not the model

Context is not just extra information. It is the structure that determines whether an agent can act safely, consistently, efficiently, and credibly inside an organization.

Context handling, done seriously, is responsible for:

  • Information integrity — agents need context that is current, complete, and consistent.
  • Authority and precedence — when sources conflict, the system must know what outranks what.
  • Access control — agents should only receive the context they are permitted to use.
  • Consistency and reliability — the same workflow should produce repeatable outcomes.
  • Efficiency — disorganized context wastes tokens, time, and money.
  • Auditability — every important determination should be traceable to governed inputs.

More context is not uniformly better. Beyond a threshold, it degrades performance. The enterprise fix is not a bigger window — it is a governed, scoped, authority-aware retrieval layer.

CSR ≠ ISR

An agent that follows 9 of 10 constraints scores a 90% Constraint Success Rate — and an Instruction Success Rate of zero.

Tsinghua · AGENTIF
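
The distinction is easy to make concrete. A minimal sketch of the two metrics, with function names of my own choosing (AGENTIF's exact formulation may differ):

```python
def csr(constraints_met: list[bool]) -> float:
    """Constraint Success Rate: fraction of individual constraints satisfied."""
    return sum(constraints_met) / len(constraints_met)

def isr(constraints_met: list[bool]) -> float:
    """Instruction Success Rate: all-or-nothing; one miss zeroes the score."""
    return 1.0 if all(constraints_met) else 0.0

followed = [True] * 9 + [False]      # 9 of 10 constraints satisfied
print(csr(followed), isr(followed))  # 0.9 0.0
```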

Middle loss

LLMs weight information at the start and end of long prompts more than information buried in the middle — the "lost in the middle" effect, commonly probed with "needle in a haystack" tests.

Chroma · Context Rot

Landscape

There are many perspectives on solving the enterprise context problem.

Spec-driven layer

GitHub Spec Kit, Kiro, and adjacent tools show that specs matter. This project extends that idea into governed enterprise context.

Context engine layer

Unblocked, Augment, Sourcegraph, Tabnine, and others focus on organizational memory, permissions, and contextual recall.

Retrieval layer

Hybrid search, graph approaches, long-context strategies, and instructed retrieval all compete to improve evidence quality.

Runtime layer

MCP, A2A, and orchestration frameworks shift the question from pure search to controlled access and execution flow.

The 2026 context-engine boom

A wave of credible products coalesced around a new layer in early 2026: organizational context delivery for AI systems. Most are strongest on code retrieval, repository understanding, or developer workflow context. Far fewer appear focused on the full governed translation from enterprise intent into scoped, authority-aware agent context.
  • Unblocked

    Organizational context and authority-aware signals.

  • Augment Code

    Semantic dependency analysis across 400K+ files, SOC 2 Type II.

  • Qodo Context Engine

    Deep research agent; high DeepCodeBench accuracy.

  • Context Hub (Ng)

    Open-source CLI; agents annotate and "remember" workarounds.

  • Kayba ACE

    Self-improving via reflection; evolving "Skillbook" as system prompt.

  • Microsoft Agent Skills

    126 modular knowledge packages for Azure / Foundry.

  • Tabnine ECE

    Privacy-first; on-prem or air-gapped deployments.

  • Greptile

    Graph-first semantic code graph; cross-service propagation detection.

  • Sourcegraph Amp

    Most established; enterprise knowledge graph; SOC 2 + ISO 27001.

  • Hyland ECE

    Broad content/process/people linking across ERP, CRM, EHR.

  • Faros Clara

    Enhanced AGENTS.md built from years of coding assistant usage.

  • Graphiti (Zep)

    Temporal knowledge graphs; time-aware fact management.

Where this work sits

This work is not another context engine. It is the governance layer that sits above the retrieval layer — the part that decides what the agent is allowed to see, under whose authority, and with what audit obligation. It is customizable to fit an organization’s structure, governance needs, and intended agentic use cases.

Some approaches to better context

Common pattern

Naive RAG

  • Everything is pushed into one shared retrieval bucket.
  • Differences between source types and source authority get flattened.
  • The retriever ends up making governance decisions by accident.
  • Good answers depend too much on retrieval luck rather than system design.

Data structure approach

GraphRAG

  • Knowledge is organized as entities and relationships rather than only flat chunks.
  • Strong for multi-hop reasoning across people, systems, documents, and dependencies.
  • Useful when the problem depends on traversing a connected knowledge structure.
  • Requires graph construction and maintenance before retrieval can work well.

Context management approach

Context Engine

  • A middleware layer assembles context from multiple enterprise source types before it reaches an agent.
  • Retrieval, authority, permissions, and bundle composition are treated as governed delivery decisions.
  • Strongest when organizations need a reusable context layer across many tools, repos, and workflows.
  • Focuses on supplying the right context to agents, not necessarily on governing the full runtime workflow.

This project

Governed agentic retrieval

  • A spec-driven document stack governs context rules, retrieval behavior, and orchestration before runtime begins.
  • Source-aware retrieval feeds a deterministic supervisor that governs steps, gates, and state transitions.
  • Escalations, blockers, and unresolved evidence are preserved explicitly instead of being smoothed over.
  • The system is designed not just to deliver context, but to produce auditable, stakeholder-ready workflow outcomes.

Part 03 · Introduction

Fit the system to the enterprise, not the enterprise to the system

Platform AI solutions often treat enterprises as interchangeable. Real enterprise data is not. Security models, authority hierarchies, internal systems, and compliance obligations are specific to the organization, and generic context engines or orchestration layers tend to flatten differences that actually matter.

My argument

A hand-built orchestration layer is not architecture for its own sake. It is the only path that forces the enterprise to confront security, reliability, and governance — rather than inheriting someone else’s answers.
[Figure: Three pillars of enterprise AI. Three classical columns labeled I Security, II Reliability, and III Governance, standing on a shared stylobate.]

Part 04 · Approach

Governable systems are built before agents are deployed.

The system is designed so that stakeholders, engineers, and domain owners each contribute to how agents are governed, deployed, and evaluated.

My approach has three load-bearing pieces:

  1. A governed document stack with non-overlapping jurisdictions — PRD, Design Doc, Context Contract, Orchestration Plan, Agent Specs.
  2. Hybrid agentic retrieval — enterprise sources are chunked, indexed, and retrieved through the method that best fits their structure: semantic search, lexical search, or structured lookups as appropriate. Results are re-ranked and assembled into scoped context bundles with source authority and permissions intact.
  3. A deterministic orchestration layer that runs a multi-agent pipeline with explicit gate conditions. Orchestration is code; reasoning is LLM.
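
Piece 2, the hybrid merge of semantic and lexical rankings, can be sketched with reciprocal rank fusion. The document IDs below are illustrative, and k = 60 is the conventional default rather than a tuned value:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked lists into one, rewarding
    documents that rank highly in any of the underlying retrievers."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["policy-v3", "dpa-template", "intake-form"]  # embedding search
lexical = ["policy-v3", "audit-memo", "dpa-template"]    # keyword search
fused = rrf([semantic, lexical])  # "policy-v3" wins: top-ranked in both
```

A production version would also carry each document's source, authority rank, and permission tags through the fusion step, so the bundle assembler downstream can enforce them.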

The spec-driven development document stack

  1. PRD (Product Requirements Document) — stakeholder intent; used by business stakeholders; defines what the system should do, why it exists, and who it is for.
  2. Design Doc — system architecture; used by engineers; defines how the system is designed and what technical shape it takes.
  3. Context Contract — context governance; used to build the retrieval/governance layer; defines source authority, retrieval rules, and agent visibility boundaries.
  4. Orchestration Plan — runtime workflow; used to build the orchestration system; defines execution order, gates, state changes, and escalation flow.
  5. Agent Specs — agent behavior; used by the agents; defines DOs, DON’Ts, scope boundaries, and output contracts.
This hierarchy is cumulative: higher-level documents guide and authorize the documents beneath them, often supplying information directly. To keep the system reliable, these documents must be maintained as the system evolves.
GitHub Spec Kit centers the spec. Kiro centers the workflow. This model centers the governed translation from enterprise intent into agent-usable context.

School I

GitHub Spec Kit

Centers the · Spec

4-phase CLI, 14+ agents, MIT-licensed. Greenfield, single-repo, agent-agnostic. The spec is the artifact.

School II

AWS Kiro

Centers the · Workflow

Code-OSS IDE; EARS notation; Requirements → Design → Tasks. AWS-native, formal documentation stack.

School III

This work

Centers the · Governed translation

Specs translate stakeholder intent into scoped, authority-aware agent context. Retrieval and orchestration are first-class citizens of the spec stack, not afterthoughts.

The retrieval process

What is the Supervisor?

The Supervisor is my programmatic orchestration layer: governable guidance for AI agents. It does not replace the agents’ reasoning — it governs the workflow around them, deciding what must be known, what happens next, and when the system should pause, escalate, or continue.

What is a context bundle?


A context bundle is the package of information an agent receives before it does its work. Instead of giving the agent unrestricted access to everything, the system assembles a scoped set of instructions, documents, rules, and relevant evidence so the agent can reason within the right boundaries.
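
In code, a bundle might look like the sketch below, assuming a simple role-based permission model; every name here is illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceDoc:
    doc_id: str
    authority_rank: int       # lower number outranks higher on conflict
    allowed_roles: frozenset  # roles permitted to see this source

@dataclass(frozen=True)
class ContextBundle:
    agent: str
    instructions: str
    evidence: tuple           # SourceDocs the agent may read and cite

def build_bundle(agent: str, role: str, instructions: str,
                 candidates: list) -> ContextBundle:
    """Scope the bundle: drop sources the role may not see, then order
    by authority so conflicts resolve deterministically."""
    visible = sorted((d for d in candidates if role in d.allowed_roles),
                     key=lambda d: d.authority_rank)
    return ContextBundle(agent, instructions, tuple(visible))

docs = [
    SourceDoc("hr-memo", 3, frozenset({"hr"})),
    SourceDoc("security-policy", 1, frozenset({"security", "legal"})),
]
bundle = build_bundle("it-security-agent", "security",
                      "Assess vendor integration risk.", docs)
# bundle.evidence holds only security-policy; hr-memo was filtered out
```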

The orchestration layer

Operating principles

  1. Center the translation layer — GitHub Spec Kit centers the spec. Kiro centers the workflow. This model centers the governed translation from enterprise intent into agent-usable context.
  2. Keep orchestration deterministic — the supervisor is a deterministic state machine that owns step order, gate logic, and state mutations so failures remain visible, inspectable, and governable.
  3. Make retrieval source-aware — enterprise context is not one undifferentiated knowledge blob. Retrieval should preserve source boundaries, authority, permissions, and freshness instead of collapsing everything into a single opaque index.
  4. Evaluate behavior, not fluency — what matters is whether agents escalate correctly, stay within scope, and ground their outputs in traceable evidence.
  5. Let domain agents reason within constraints — the LLM is allowed to reason freely inside its assigned domain, but only within a scoped context bundle and a controlled orchestration environment. Flexibility lives inside the guardrails, not outside them.

Bird’s-eye view of my spec-driven approach to enterprise context engineering

Part 05 · Business scenario

Lichen Manufacturing onboards OptiChain

Putting the architecture to work on a realistic scenario.

Lichen Manufacturing is evaluating OptiChain, a demand-forecasting platform, with a 30-day contract window for a go/no-go decision. Their current vendor-onboarding process is ad hoc and takes 3–5 weeks — the window can’t reliably be met. And a recent internal audit flagged two vendors that had been granted system access before security review completed. That can’t happen again.

What makes vendor onboarding difficult?

Document-heavy process

Teams must cross-reference policies, matrices, prior guidance, and intake materials just to determine the correct review path.

Conflicting signals across sources

Policies, matrices, intake materials, and discussion threads do not always point in the same direction, so teams have to resolve contradictions before onboarding can move forward.

Too much repeated review

Security, legal, and procurement often re-check overlapping material in parallel, creating redundant work and inconsistent summaries instead of one governed process.

Late discovery of required reviews

Security, legal, or compliance obligations are often discovered after work has already started, creating rework and delay.

Escalations reverse progress

When evidence is incomplete or conflicting, work stalls and prior assumptions must be revisited rather than quietly pushed forward.

No clear next-step guidance

Business owners often cannot tell why onboarding is blocked, what is missing, or which stakeholder needs to act next.


Who are the stakeholders?

IT Security

Owns policy enforcement and technical risk.

Determines risk posture, integration sensitivity, and whether the request can qualify for streamlined review.


Legal

Owns contractual and regulatory posture.

Determines legal triggers such as DPA requirements and reviews unresolved compliance obligations.


Procurement

Owns the vendor relationship and commercial routing.

Controls intake, approval path determination, and the overall business process around onboarding.


What a vendor onboarding process looks like

  • Vendor questionnaire submission

    Procurement

  • Data classification — regulated or not

    IT Security, evaluated against IT Security policy

  • ERP integration scope

    From questionnaire — IT Security evaluates

  • NDA status

    From questionnaire — Legal verifies against IT Security policy

  • EU personal data handling

    From questionnaire — Legal evaluates

  • DPA requirement

    Legal, against DPA contract

  • Fast-track eligibility

    IT Security — regulated data disqualifies

  • Approval path determination

    Procurement — needs IT Security + Legal outputs first

  • Executive approval threshold

    Procurement — based on deal size and vendor class
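
The ordering above matters because later checks consume earlier outputs. As one hedged illustration, the fast-track gate could look like this; the questionnaire field names are pure assumptions, not the actual schema:

```python
def fast_track_eligible(q: dict) -> tuple[bool, list[str]]:
    """Gate for expedited approval: any disqualifier forces the standard
    path, and the reasons are recorded so the outcome stays auditable."""
    reasons = []
    if q.get("handles_regulated_data"):
        reasons.append("regulated data in scope")
    if q.get("erp_integration") == "write":
        reasons.append("ERP write-level integration")
    if not q.get("nda_signed"):
        reasons.append("NDA not yet executed")
    return (not reasons, reasons)

optichain = {"handles_regulated_data": True,
             "erp_integration": "read",
             "nda_signed": True}
eligible, reasons = fast_track_eligible(optichain)
# eligible is False: regulated data disqualifies, as the table above states
```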

What this agentic workflow does

One vendor. One pipeline. Six acts.

Act I

Nothing starts without the questionnaire

Until OptiChain submits a complete vendor questionnaire, no review can begin. The pipeline halts, flags the gap, and asks Procurement to chase the submission.

R-01 · Vendor questionnaire intake

Act II

The workflow must classify the path early

The system first determines whether the request belongs on a regulated or streamlined path. If the vendor touches ERP-connected, sensitive, or regulated data, the workflow moves into a stricter review posture before anything else continues.

R-02 · Onboarding path classification

Act III

Legal and compliance triggers must be made explicit

Once the path is known, the workflow identifies what legal or compliance obligations are actually in play. That includes questions like whether a DPA is required and whether contractual review must block progress before implementation can move forward.

R-03 · Legal and compliance trigger determination

Act IV

Fast-track is earned, not assumed

The system then determines whether the vendor can move through an expedited approval path or must stay on the standard route. If the request involves regulated data or even borderline risk, the workflow defaults to the standard path and records why.

R-04 · Fast-track eligibility

Act V

The output must become a real approval checklist

The workflow does not stop at an assessment. It must produce a structured checklist showing every required sign-off, the responsible stakeholder, blockers that remain open, and the expected timeline before implementation can begin.

R-05 · Complete approval checklist generation
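
One plausible shape for that output, with every field name an assumption rather than the system's actual schema:

```python
checklist = {
    "vendor": "OptiChain",
    "path": "standard",
    "items": [
        {"signoff": "security_review", "owner": "IT Security",
         "status": "blocked", "blocker": "SOC 2 report not yet provided"},
        {"signoff": "dpa_execution", "owner": "Legal",
         "status": "open", "blocker": None},
        {"signoff": "approval_routing", "owner": "Procurement",
         "status": "waiting", "blocker": "depends on security_review"},
    ],
}

# Open blockers surface directly instead of being buried in prose.
open_blockers = [i["signoff"] for i in checklist["items"] if i["blocker"]]
```

Because each item names an owner and an explicit blocker, the handoff in Act VI can tell every team what it owns without re-deriving the analysis.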

Act VI

Stakeholders need next-step guidance, not just analysis

The final handoff must tell each team what they own, why their review is required, and what must happen next. The goal is to return something people can act on immediately, without forcing the business owner to interpret policies and approval rules on their own.

R-06 · Stakeholder guidance and handoff support

The document stack for my mock business scenario

  1. PRD (Product Requirements Document) — stakeholder intent; used by business stakeholders; defines what the system should do, why it exists, and who it is for.

  2. Design Doc — system architecture; used by engineers; defines how the system is designed and what technical shape it takes.

  3. Context Contract — context governance; used to build the retrieval/governance layer; defines source authority, retrieval rules, and agent visibility boundaries.

  4. Orchestration Plan — runtime workflow; used to build the orchestration system; defines execution order, gates, state changes, and escalation flow.

  5. Agent Specs — agent behavior; used by the agents; defines DOs, DON’Ts, scope boundaries, and output contracts. See domain agent specs in the next section.

Domain agents in the pipeline

IT Security

Interprets policy, integration posture, and evidence sufficiency. Security is where policy precedence becomes operational.

Legal

Maps vendor facts to legal triggers such as DPA requirements, NDA gating, and contract execution ownership.

Procurement

Owns approval routing, commercial posture, and the difference between standard and exceptional purchase paths.

Checklist Assembler

Packages domain outputs without swallowing uncertainty. Escalated upstream items stay escalated here.

Checkoff

Emits the stakeholder-ready package: blockers, resolved items, owners, citations, and next actions.