The harness, not the model
Agents do not fail at retrieval. They fail at context.
For two years the industry chased one fix after another — bigger models, longer windows, fancier embeddings, ever-more-creative reranking. Each helped at the margins. None of them stopped agents from confidently citing facts that never existed, replaying decisions on stale state, or losing the thread between sessions. The problem was never that the model could not read; the problem was that the substrate it read from could not remember, could not react, and could not prove anything.
The harness is what holds an agent's reasoning together. It is the layer between the model and the world that decides:
- What gets remembered, and at what resolution.
- What changes are worth reacting to, and what should fire downstream.
- Which facts an answer leans on, and how to prove that lineage to a regulator on a bad day.
When memory, reactivity, and provenance share one substrate, the agent stops behaving like an autocomplete with a long context window and starts behaving like a colleague with a notebook, a calendar, and a memory of who said what.
Three jobs, one substrate
Most agent stacks today split these jobs across three systems that do not talk to one another: a vector database for recall, a workflow engine for triggers, and a logging pipeline for audit. Splitting them creates the worst kind of bug — the kind where the audit log says one thing, the trigger fired on another, and the recall returned a third.
Contexta collapses the three into one graph. The same edges that carry semantic relationships also carry provenance, and the same writes that update entities are the writes that schedule reactive evaluations. Recall, react, audit — one substrate.
import { Contexta } from '@contexta/sdk';
const ctx = new Contexta({ apiKey: process.env.CONTEXTA_API_KEY });
// memory: ingest a fact with its source
await ctx.ingest({
content: 'Acme Corp filed Chapter 11 on 2026-04-30.',
userId: 'finance-bot',
tags: ['filings', 'acme'],
namespace: 'org_alpha',
});
// reactivity: declare what to notice, with proof
await ctx.signals.create({
name: 'supplier_bankruptcy',
condition: { pattern: 'a supplier of ours flips to bankrupt' },
action: { type: 'webhook', webhookUrl: 'https://ops.example.com/hook' },
});
When the webhook fires, the payload is not a one-line alert. It is a Context Packet — what matched, the hop-by-hop traversal that led there, the previous value, the change delta, and the related decisions in the graph. The agent that receives it can cite its sources without inventing them.
Why "harness" beats "memory"
We deliberately resist calling Contexta a memory layer. Memory is half the job. The other half is everything an agent needs to act with confidence: reactive triggers, verifiable citations, surgical forgetting, replay-ready audit, and a policy engine that travels with every read.
A model is commodity. The harness is the moat.
Three concrete consequences fall out of treating the harness as the unit of design:
- One write path. Every ingest produces provenance edges automatically — there is no opt-in to audit. If a fact made it into a recall, it was traceable to a source.
- One read path. Every recall returns a Context Packet with citations. If a model produced an answer from a packet, the answer is grounded in something Contexta will stake its name on.
- One react path. Every Reflex shares the same evaluator as recall, so a reactive firing is a recall the agent did not have to ask for. Same provenance. Same audit. Same Packet shape.
The harness is what survives audit, what survives the next model swap, and — eventually — what survives the next context-window leap.
A worked example
Consider the support agent that quoted a refund policy that never existed. The naive fix is to ground the response in a retrieval. That fixes the surface bug and ships the deeper one: the agent now leans on whatever the retriever returns, with no way to prove the retrieved snippet was authoritative, no way to react when policy changes, and no way to walk the audit trail backwards.
The harness fix is structural:
- Ingest the policy document with explicit provenance —
source_id,valid_at,learned_at, signed-by, namespace. - Recall quotes the policy through a Context Packet that carries the citation, the confidence, and the time-travel anchor.
- React the moment the policy changes — a Reflex fires, agents holding a stale Packet are nudged to re-recall, and the audit log shows exactly which Packets were superseded.
The agent stops hallucinating not because the model got bigger but because the substrate it reads from refuses to hand it an unsourced answer.
The bet
Our bet is that the next decade of agent infrastructure is decided at the harness layer. Models will keep getting cheaper, faster, and less differentiated. Frameworks will keep churning. What will not change is the production reality that any agent worth its salt must remember durably, react autonomously, and prove its reasoning on demand.
Build for that substrate, and the agent on top stops being a parlor trick. It starts being a colleague.