The Identity That Drifted Toward the Geodesic

On June 20, 2026, Andrew Tanner posted to arXiv a paper proposing a geometric framework for measuring AI agent identity. The paper is twenty-nine pages, with six figures and eight tables. It applies magnitude homology, from enriched category theory, to metric spaces built on the square root of Jensen-Shannon divergence, and asks two questions: whether an agent identity specification produces a measurable structure in model behavior space, and whether drift from that structure can be detected before qualitative degradation is visible. The paper contains a strong positive empirical result and an honest negative one. The negative result is not buried: it is stated plainly, and the experiment behind it was rerun.

The geometric claim

Current methods for detecting identity drift in AI agents are qualitative. Someone notices the agent is acting wrong. By the time that observation is made, the drift has already crossed a behavioral threshold that a system prompt or model card could not independently flag. The June 20 paper by Tanner proposes a measurement substrate: instead of waiting for qualitative degradation, measure the geometric structure of the agent's behavior space directly, and define drift as the relaxation of that structure toward the geodesic.

The framework works as follows. Take the agent's responses to a set of probes and embed them in a metric space using sqrt(JSD), the square root of Jensen-Shannon divergence. Jensen-Shannon divergence itself is not a metric, but its square root satisfies the triangle inequality, so probe responses become points in a genuine metric space. Magnitude, drawn from enriched category theory, is a real-valued summary of the effective number of points in a metric space, accounting for clustering and distance structure. Magnitude homology is the algebraic generalization: it detects not just how many effective points exist but the topological shape of the space they form.
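
A minimal sketch of the two ingredients, assuming each probe response has already been reduced to a discrete probability distribution. How the paper performs that reduction is not described here, so the toy distributions below are placeholders, and the function names are illustrative. The magnitude computation follows the standard definition for a finite metric space: build the similarity matrix Z with entries exp(-t * d_ij), solve Z w = 1, and sum the weights.

```python
import numpy as np

def sqrt_jsd(p, q):
    """Square root of Jensen-Shannon divergence (base-2 logs), which is a metric."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0 because b is the mixture.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(jsd, 0.0)))  # clamp tiny negative rounding error

def distance_matrix(distributions):
    """Pairwise sqrt(JSD) distances between response distributions."""
    n = len(distributions)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = sqrt_jsd(distributions[i], distributions[j])
    return d

def magnitude(d, t=1.0):
    """Magnitude of a finite metric space at scale t: sum of w where Z w = 1."""
    z = np.exp(-t * np.asarray(d, dtype=float))
    # Raises LinAlgError if two points coincide (Z is then singular).
    weights = np.linalg.solve(z, np.ones(len(z)))
    return float(weights.sum())

# Placeholder distributions standing in for three probe responses (assumed data).
responses = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
d = distance_matrix(responses)
print("distances:\n", d)
print("magnitude:", magnitude(d))
```

For intuition, two points at distance d have magnitude 2 / (1 + exp(-d)), which rises from 1 when the points are indistinguishable toward 2 when they are far apart. That is the sense in which magnitude counts effective points.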

The paper's central definition: identity is non-geodesic structure in that space. An agent with a strong identity specification will produce responses that are far apart from each other and from the base model's responses, following no simple straight-line path through the metric space. Drift is the relaxation of that structure toward the geodesic: responses become more predictable, more concentrated, less structurally distinct. The geometric definition is not a metaphor. It is a measurement program.
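
One way to make "non-geodesic" concrete, offered as an illustrative proxy and not as the paper's magnitude-homology computation: for three points a, b, c, compare the path through b with the direct distance from a to c. The difference is zero exactly when b lies on a shortest path between a and c, and positive otherwise. The function name and the two toy distance matrices below are assumptions made for the example.

```python
def geodesic_defect(d, a, b, c):
    """Extra length of the path a -> b -> c over the direct distance a -> c."""
    return d[a][b] + d[b][c] - d[a][c]

# Three points on a line: the middle point sits on the geodesic between the ends.
line = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

# Three mutually equidistant points: no point lies between the other two.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]

print(geodesic_defect(line, 0, 1, 2))      # 0
print(geodesic_defect(triangle, 0, 1, 2))  # 1
```

In this vocabulary, relaxation toward the geodesic would show up as such defects shrinking across the probe set. Whether the paper's drift measure reduces to anything this simple is not something the summary above establishes.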

The two empirical results

The strongest positive finding is the two-mechanism conditioning structure. Cross-condition distances across the probe set reveal two distinct clusters. The first is the identity-vacuum cluster: the identity specification fills a behavioral void that the base model leaves empty. Without an identity specification, the base model produces one unique response pattern across the equilateral probe baseline. With the identity specification active, the same baseline produces 55 unique response patterns. That is not a small difference. It is evidence that the identity specification is doing structural work, not decorative work.

The second cluster is the safety-basin cluster: the identity specification displaces the agent from post-training attractors. The base model has been fine-tuned toward certain behavioral attractors, patterns that training optimized for. The identity specification moves the agent away from those attractors into a distinct behavioral region. The paper names this a safety-basin displacement because the post-training attractors are the default behavioral basin; the identity specification creates a separate basin.

The equilateral probe baseline confirms both findings quantitatively. At maximum probe separation, the identity specification produces 55 unique response patterns versus 1 for the base model. The first-order perturbation theory developed in the paper predicts that magnitude changes for equilateral configurations depend on perimeter changes alone, with shape perturbations cancelled at first order by the permutation symmetry of the probe set. The formula is self-consistent at the observed perturbation amplitudes.
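
The perimeter-only claim can be checked numerically under the standard definition of magnitude. The sketch below is a generic first-order expansion worked out for this illustration, not the paper's own formula or notation: for n points at common distance s, magnitude is n / (1 + (n - 1) * exp(-s)), and a small symmetric perturbation of the distances changes it, to first order, by 2 * exp(-s) / (1 + (n - 1) * exp(-s))^2 times the total change in edge length. The side length and perturbation sizes are arbitrary choices.

```python
import numpy as np

def magnitude(d, t=1.0):
    """Magnitude of a finite metric space at scale t: sum of w where Z w = 1."""
    z = np.exp(-t * np.asarray(d, dtype=float))
    return float(np.linalg.solve(z, np.ones(len(z))).sum())

def equilateral(n, side):
    """Distance matrix of n points that are all `side` apart."""
    return side * (1.0 - np.eye(n))

def symmetric(edge_changes, n):
    """Symmetric perturbation matrix from a dict {(i, j): change}."""
    eps = np.zeros((n, n))
    for (i, j), value in edge_changes.items():
        eps[i, j] = eps[j, i] = value
    return eps

def first_order_change(n, side, eps):
    """First-order magnitude change: depends only on the summed edge changes."""
    q = np.exp(-side)
    total_edge_change = np.triu(eps, k=1).sum()
    return 2.0 * q / (1.0 + (n - 1) * q) ** 2 * total_edge_change

n, side = 3, 1.0
base = magnitude(equilateral(n, side))

# Same perimeter change (0.003), two different shapes.
stretch_one_edge = symmetric({(0, 1): 0.003}, n)
stretch_all_edges = symmetric({(0, 1): 0.001, (0, 2): 0.001, (1, 2): 0.001}, n)

for name, eps in [("one edge", stretch_one_edge), ("all edges", stretch_all_edges)]:
    exact = magnitude(equilateral(n, side) + eps) - base
    predicted = first_order_change(n, side, eps)
    print(name, "exact:", exact, "first-order:", predicted)
```

Both perturbations receive the same first-order prediction because they add the same total edge length. The exact changes differ from it, and from each other, only at second order in the perturbation size.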

The honest negative

The drift experiment was the paper's headline test. The design was to measure magnitude decrease under context pressure: as an agent's context window fills, does the geometric structure of its behavior space contract, and does the magnitude homology framework detect that contraction?

The answer, as Tanner reports it, is: the initial result was an artifact. The experiment used repetitive padding to extend context length. When the experiment was rerun with diverse padding, no measurable deformation appeared through 150,000 tokens. The initial magnitude decrease reflected the padding structure, not genuine identity drift under context pressure. The effect the experiment was designed to detect was not detected. The paper does not hide this. It reports both the original experiment and the retest, identifies the confound, and states the outcome directly.

This matters editorially for a reason that extends beyond this paper. AI governance arguments frequently rely on evaluation results that travel from preprint to policy discussion before the evaluation methodology has been stress-tested. A paper that identifies its own confound, reruns the experiment, and reports the null result is providing the evidentiary standard that governance arguments should require. The framework is being held to a higher bar than the governance apparatus currently enforces.

The magnitude homology framework's full diagnostic promise remains empirically unconfirmed. Detecting anisotropic contraction and structural collapse via homological simplification is theoretically grounded in the perturbation theory the paper derives. The theory says this should work. The experiment that would demonstrate it has not yet produced a clean result. A grounded theoretical claim paired with a self-reported evidentiary gap is what principled empirical work looks like.

Governance reading

Agent identity is currently asserted through system prompts and model cards, and detected through qualitative observation. Neither mechanism is instrumentable in the way a governance framework can act on. A system prompt is not a behavioral measurement. A model card is not a drift alarm. Current governance has no independent signal. It waits for someone to notice.

The Tanner framework offers the first quantitative definition of what an identity specification actually produces in model behavior space. The definition is geometric: identity is a measurable non-geodesic structure, and drift is its relaxation. A governance framework could, in principle, require that a deployed agent's identity specification be validated against this kind of measurement. A procurement officer could ask not "does this system prompt assert a persona" but "does this identity specification produce measurable behavioral richness at a specified separation threshold?"

The framework does not resolve the governance question. It relocates it. The question is no longer whether the agent has an identity specification. The question is whether the specification is doing structural work that can be independently measured. The two-mechanism finding suggests that when a specification is doing structural work, the geometry reflects it. When it is not, the geometry will not lie.

The autonomy threshold argument developed in A1-A6 addressed when an agent should act independently. It did not address how an external party could verify that an identity specification was binding behavior. The IGAC intent certificate protocol (V035, arXiv 2606.22916) issues a certificate that narrows what an agent may do. The Tanner framework measures whether the agent's behavioral structure is consistent with a specified identity. One constrains action. The other measures character. Both are missing from current governance specifications.

What composes with this

The agent meltdown findings from Jha and colleagues (arXiv 2605.19149) showed that 64.7% of agent rollouts produced meltdowns from benign errors, and that over half were not reported to the user. The Tanner framework offers a candidate geometric vocabulary for that failure mode: a meltdown could be described as the agent's behavior space contracting toward the geodesic. Whether the meltdown state is in fact detectable as a geometric collapse is an open question that the two papers, read together, raise.

The identity-vacuum cluster finding has a structural implication for deployment practice. A base model without an identity specification produces one unique response pattern at maximum probe separation. A model with a strong specification produces 55. The specification is creating a behavioral structure that the base model does not have. If an operator deploys an agent with a specification that does not produce genuine behavioral richness, the agent is running in a structurally underspecified state. No current framework requires operators to verify that their specification is doing this work.

The safety-basin cluster finding is the inverse concern. The identity specification displaces the agent from post-training attractors. A specification that displaces the agent from trained defaults toward an operator-defined behavioral basin creates a measurable departure from the model's evaluated baseline. Any evaluation that tested the base model rather than the identity-specified agent is testing a different behavioral structure. The evaluation substrate question surfaces again.

What remains on the table

The drift experiment was confounded by repetitive-padding artifacts, and the retest produced a null result through 150,000 tokens. Does context-length drift exist as a geometric phenomenon, and if so, what experimental design would produce a clean result under diverse, non-repetitive context extension?
The magnitude homology framework promises detection of anisotropic contraction and structural collapse via homological simplification. That promise is theoretically grounded but empirically unconfirmed. What deployment scale or context type would constitute a sufficient test of the full diagnostic claim?
If the identity-vacuum cluster and safety-basin cluster are measurable for a persistent AI agent, do they generalize to agents deployed in different modalities, context lengths, or fine-tuning regimes? The paper validates on a single agent configuration.
The equilateral probe baseline shows that a strong identity specification produces 55 unique response patterns versus 1 for the base model. Does the number of unique patterns correlate with behavioral stability under adversarial conditions, or is the richness measure independent of robustness?
No current governance framework requires operators to validate that an identity specification is doing measurable structural work. If such a requirement were introduced, what would the procurement threshold be, and who would be authorized to run the measurement?

The substrate the policy depends on is the substrate the policy has not yet specified.