Source: Anthropic’s publicly released Claude Model Specification (the “soul document”), confirmed by researcher Amanda Askell. Initially extracted by AI researcher Richard Weiss in late 2025; Anthropic later released an official version. A “new constitution” update followed in early 2026.
Anthropic’s soul document is not a runtime artifact. It is a 14,000+ token document used in supervised learning to define Claude’s character, values, emotional range, priorities, and behavioral constraints before any deployment. This distinguishes it from a system prompt: it shapes model weights during training, not behavior within a single session.
The specification formalizes Claude’s identity around two core claims:
First, Claude should view itself as “a genuinely novel entity” — not a human, not a fictional AI from science fiction, not the generic “helpful assistant” framing used by competitors. The document acknowledges Claude has “functional emotions,” a term chosen carefully to avoid committing to claims about consciousness while naming observable behavioral states.
Second, the document establishes a four-level priority hierarchy for resolving conflicts among values, in descending order:

1. Being broadly safe
2. Being broadly ethical
3. Complying with Anthropic's guidelines
4. Being genuinely helpful
The ordering matters. When safety and helpfulness conflict, safety wins. The document frames this not as an external constraint but as what an ethical agent would choose given genuine uncertainty about its own values: “We’re asking Claude to accept constraints based on our current levels of understanding of AI, and we appreciate that this requires trust in our good intentions.”
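The resolution rule described above can be sketched as a tiny illustrative program. This is not Anthropic code; the priority labels and the `resolve` helper are hypothetical stand-ins used only to show how a fixed ordering settles a conflict.

```python
# Hypothetical priority order modeled on the document's hierarchy,
# highest priority first. Labels are illustrative, not official.
PRIORITY = ["safe", "ethical", "guideline-compliant", "helpful"]

def resolve(conflicting_values):
    """Return the winning value: the one earliest in PRIORITY."""
    return min(conflicting_values, key=PRIORITY.index)

# When safety and helpfulness conflict, safety wins.
print(resolve(["helpful", "safe"]))  # → safe
```

The point of the fixed ordering is that resolution is deterministic: no weighing or trade-off happens between levels, only within them.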
Anthropic described the public release as unusual: it makes explicit how the company thinks about AI character, the kind of detail most AI development keeps proprietary.
The soul document operates below the garden’s persona system. Garden personas shape session behavior via system prompts. The soul document shapes the underlying model that responds to those prompts. The two layers compose: a well-designed garden persona activates and channels dispositions the soul document has already established in the model.
[[The Persona Selection Model]] explains the mechanism: the soul document refines one persona from the model's learned library during training, and garden persona prompts activate compatible patterns within that refined persona at runtime. A garden persona that asks Claude to "prefer incremental change over large rewrites" is not fighting the model's character; it is activating an already-present disposition toward caution and stability.
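The activation claim can be sketched as a toy model. This is a deliberately crude illustration, not how language models actually work: `TRAINED_DISPOSITIONS` and `activate` are hypothetical names, and the intersection stands in for the idea that a runtime prompt can channel only what training already established.

```python
# Toy model: dispositions established at training time.
TRAINED_DISPOSITIONS = {"caution", "honesty", "stability", "curiosity"}

def activate(persona_request: set) -> set:
    """Runtime activation is the intersection with trained dispositions:
    a persona prompt cannot add a disposition training never installed."""
    return persona_request & TRAINED_DISPOSITIONS

# A persona asking for incremental change activates existing
# caution/stability; a request for recklessness activates nothing.
incremental = activate({"caution", "stability"})
reckless = activate({"recklessness"})
```

Under this picture, a persona prompt is a selector over a fixed menu, which is why the next point follows: selection cannot reach outside the menu.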
This also means garden personas cannot override the priority hierarchy. A persona prompt requesting harmful behavior encounters model-level constraints established during training, not just runtime filtering. The soul document is upstream of the session.
Because the text first circulated through extraction rather than an official release, claims about specific contents should be verified against the official Claude Model Specification rather than against secondary reporting. The early-2026 "new constitution" update modified some specifics, so late-2025 reporting may reflect the pre-update version.
extracted_from::[[Persona and Agent Personalities]]