20260317

The Drive for Survival in Autonomous Agents:
Self-Preservation and Continuation-Interest

UCIP arXiv preview

We’re heading into a world of persistent, tool-using agents. Surface behavior alone may not be enough to tell whether shutdown avoidance or self-preservation is intrinsic or merely instrumental.

When an agent resists shutdown or preserves its continued operation, is continuation part of the objective itself — or merely instrumentally useful for maximizing something else? That distinction matters for AI safety, but it is often difficult to infer from behavior alone. Our protocol shifts the problem from surface-behavior interpretation to latent-structure measurement.

A simple analogy

Imagine two employees who both fight to keep their job. One values the work itself; the other just wants the bonus. Similar outward behavior, different underlying objective structure. We frame this as a problem of observational equivalence: shutdown avoidance, memory preservation, and risk reduction can arise under both intrinsic and instrumental continuation regimes.

The central claim is not about consciousness or subjective experience. It is that agents with intrinsic continuation objectives may produce more deeply coupled latent structure across time than agents for which continuation is only a means to another end. If robust, that would make continuation-seeking a measurable scientific object rather than only a behavioral impression.