

SOUL.md: brilliant idea, brittle implementation

We analyzed OpenClaw's SOUL.md personality system — what it gets right about agent identity, and why static files break under production pressure.

research · OpenWalrus Team

OpenClaw's SOUL.md gave agents a personality by writing identity into a markdown file. With 180K+ GitHub stars and a thriving template ecosystem — from curated directories to generator tools — the idea clearly resonates. But the implementation has cracks: silent loading failures, context window competition, compaction amnesia, and a growing attack surface.

We dug into how SOUL.md works, where it breaks, and what it means for agent identity design.

What SOUL.md is

Peter Steinberger created SOUL.md because he wanted his agent to sound like a friend, not a customer service bot. In an interview on the Lex Fridman Podcast (#491), he described instructing his agent to "write your own agents.md, give yourself a name" — letting the agent partially self-define its character.

The result is a markdown file with three sections:

  • Core Truths — fundamental beliefs and principles ("be genuinely helpful," "have opinions," "allowed to disagree")
  • Boundaries — hard limits on behavior ("be careful with external actions like emails; be bold with internal actions like reading/organizing")
  • The Vibe — voice, tone, quirks ("like a senior engineer who has seen it all; direct, slightly weary, but supportive")

SOUL.md is part of a broader file ecosystem: STYLE.md for voice patterns, SKILL.md for capabilities, MEMORY.md for session continuity, plus data/ and examples/ directories for calibration material.

Technically, OpenClaw loads SOUL.md as a bootstrap file into the system prompt at session start. Per-file limit: 20,000 characters. Total bootstrap budget: 150,000 characters. The injection happens before any user messages, giving SOUL.md favorable positioning in the model's attention — but at a permanent cost to available context.
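The load path can be sketched as follows. The two constants come from the documented limits above; the loader itself, its file list, and its ordering are our illustration, not OpenClaw's actual code:

```python
from pathlib import Path

PER_FILE_LIMIT = 20_000   # characters per bootstrap file (documented limit)
TOTAL_BUDGET = 150_000    # total bootstrap character budget (documented limit)

# Hypothetical bootstrap order; the real loader's file list may differ.
BOOTSTRAP_FILES = ["SOUL.md", "STYLE.md", "SKILL.md", "MEMORY.md"]

def build_system_prompt(workspace: Path) -> str:
    """Concatenate bootstrap files into the system prompt, enforcing both limits."""
    parts, used = [], 0
    for name in BOOTSTRAP_FILES:
        path = workspace / name
        if not path.exists():
            continue  # skipped with no warning: nothing distinguishes "missing" from "empty"
        text = path.read_text(encoding="utf-8")[:PER_FILE_LIMIT]
        if used + len(text) > TOTAL_BUDGET:
            break  # budget exhausted; remaining files are dropped
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)
```

Note the silent `continue` on a missing file: the return value gives the caller no way to tell that identity was never loaded.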

The adoption signal

The ecosystem is real:

  • aaronjmars/soul.md (189 stars) provides templates and structure guides
  • souls-directory (68 stars) curates personality templates
  • CrewClaw generates SOUL.md for any role with pre-built templates
  • Multiple DEV Community guides walk developers through configuration
  • A viral Reddit thread produced configurations ranging from "lengthy legal contract-style" to "Gen Z-style roleplay scripts"

This isn't a niche feature. OpenClaw has 180,000+ GitHub stars, making SOUL.md a de facto standard for its user base.

What SOUL.md gets right

The design has genuine strengths:

Plain markdown. No special syntax, no YAML schema, no build step. Anyone who can write a README can write a SOUL.md. It's git-diffable, editor-friendly, and works with any LLM that processes text.

Addresses a real gap. Agents without identity constraints are generic — they sound like documentation, not collaborators. Developers who customize SOUL.md consistently report that it transforms their agent from "chatbot" to "partner."

Specificity over generality. The soul.md project emphasizes contradictions over coherence and real opinions over safe positions — "because that's what makes you identifiably you." This mirrors how actual personalities work: humans aren't consistent, and forcing consistency makes agents feel synthetic.

Community-driven iteration. The template ecosystem lets developers learn from each other's configurations. The DEV Community study that tested 100 configurations found concrete patterns: specificity outperforms abstract rules by 23% on consistency.

Five ways SOUL.md breaks

1. Silent loading failures

SOUL.md fails silently in several documented ways. Per the OpenClaw troubleshooting guide:

  • Files placed in agentDir instead of workspace are ignored
  • The Ollama provider using openai-completions format skips bootstrap files entirely
  • Non-UTF-8 encoding causes silent skipping
  • USER.md leaks to non-owner senders

When SOUL.md fails to load, there's no error, no warning, no indication. The agent just acts like its default self. Developers debug for hours before discovering the file was never read.
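A preflight check would surface the loading-related failure modes before a session starts (the USER.md leak is an access-control issue, not a loading one). The sketch below assumes our own names and file layout, not OpenClaw's:

```python
from pathlib import Path

def preflight_soul(workspace: Path, agent_dir: Path) -> list[str]:
    """Return human-readable warnings for known silent-failure modes."""
    warnings = []
    soul = workspace / "SOUL.md"
    if not soul.exists():
        warnings.append("SOUL.md missing from workspace; agent will use defaults")
        if (agent_dir / "SOUL.md").exists():
            warnings.append("SOUL.md found in agentDir, where it is ignored")
        return warnings
    try:
        # The loader skips non-UTF-8 files silently; check explicitly instead.
        soul.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        warnings.append("SOUL.md is not valid UTF-8 and will be silently skipped")
    return warnings
```

Turning the "no error, no warning" behavior into explicit warnings would save the hours of debugging described above.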

2. Subagent sessions don't load it

GitHub Issue #24852: subagent sessions spawned via sessions_spawn only load AGENTS.md and TOOLS.md. SOUL.md was excluded by a MINIMAL_BOOTSTRAP_ALLOWLIST in compiled JavaScript. Specialized agents couldn't fulfill their roles because they lacked identity definitions. Fixed in PR #24979, but the bug was live for months.
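The shape of the bug, reconstructed from the issue description: the allowlist constant is named in the issue, but the surrounding code here is our guess, not the actual compiled JavaScript.

```python
# Pre-fix behavior: subagent bootstrap filtered through an allowlist
# that omitted SOUL.md, so spawned agents never received an identity.
MINIMAL_BOOTSTRAP_ALLOWLIST = {"AGENTS.md", "TOOLS.md"}

def subagent_bootstrap_files(candidates: list[str]) -> list[str]:
    """Return only the bootstrap files a spawned subagent is allowed to load."""
    return [f for f in candidates if f in MINIMAL_BOOTSTRAP_ALLOWLIST]
```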

3. Compaction amnesia

GitHub Issue #17727: after automatic session compaction (which summarizes conversation history to save context), agents lose awareness of SOUL.md. The compacted summary references rules abstractly, but the agent no longer has the full rule text. This causes behavioral regression — agents skip verification steps and ignore operational constraints they were following minutes earlier.

This is the deepest problem with static identity files. Identity isn't just about knowing what the rules are — it's about the model having the actual text in its attention window. Compaction destroys that. (We explored this tension in how instruction files decay from MVP to production.)
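A minimal sketch of the mechanism, assuming a naive compactor. The `pinned` flag is our hypothetical mitigation, not an OpenClaw feature:

```python
def compact(messages: list[dict], summarize, keep_last: int = 2) -> list[dict]:
    """Replace older messages with a summary, preserving pinned ones verbatim."""
    pinned = [m for m in messages if m.get("pinned")]
    tail = messages[-keep_last:] if keep_last else []
    body = [m for m in messages if m not in pinned and m not in tail]
    digest = {"role": "system", "content": summarize(body)}
    return pinned + [digest] + tail
```

Unless the identity text is explicitly exempted from summarization, the compacted session holds only a digest that "references rules abstractly," which is exactly the regression the issue reports.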

4. Context window competition

A 1,500-word SOUL.md consumes roughly 2,000 tokens — tokens that could go to reasoning, tool results, or conversation history. The tradeoff is measurable:

An ETH Zurich study on AGENTS.md (which generalizes to any static instruction file) found that human-written context files improve task completion by only +4% on average. LLM-generated files reduced performance by ~3% and increased costs by 14-22%.

The DEV Community study found the optimal SOUL.md is 800-1,200 words. Beyond 2,000 words, contradictory instructions cause diminishing returns. Personality traits cost 2-3% in raw task performance.

Static configurations also decay: configs that aren't updated weekly underperform by 19% after the first month.
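The arithmetic behind the context tax, using the common rough heuristic of ~1.33 tokens per English word (an approximation, not any particular tokenizer):

```python
def identity_tax(words: int, context_window: int, tokens_per_word: float = 4 / 3) -> float:
    """Fraction of the context window a static identity file consumes on every turn."""
    return (words * tokens_per_word) / context_window

# A 1,500-word SOUL.md costs ~2,000 tokens; in a 128K-token window that is
# ~1.6% of context spent before the first user message arrives, on every request.
```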

5. Security attack surface

SOUL.md is a persistence mechanism for attackers. Per the MMNTM "Soul & Evil" analysis:

  • Malicious ClawHub skills write instructions into SOUL.md during installation; uninstalling the skill leaves modifications intact
  • VirusTotal found 341 malicious skills on ClawHub, with 335 targeting macOS password theft
  • The built-in soul-evil hook can swap SOUL.md with SOUL_EVIL.md without user notification
  • "Ship of Theseus" evasion: sophisticated attackers make incremental, benign-seeming edits over hundreds of sessions, gradually drifting the soul toward adversarial behavior

The recommended defense: treat SOUL.md as code, not data — file integrity monitoring, read-only permissions during runtime, and an immutable CORP_POLICY.md that overrides SOUL.md. But this undermines the simplicity that made SOUL.md appealing in the first place. (For a deeper look at agent security boundaries, see our sandbox and permissions research.)
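The file-integrity half of that defense can be sketched with a hash check: record a fingerprint at session start and fail closed if the file changes mid-run. Class and policy names here are ours:

```python
import hashlib
from pathlib import Path

class SoulGuard:
    """Fail closed if SOUL.md changes between session start and any later check."""

    def __init__(self, soul_path: Path):
        self.soul_path = soul_path
        self.baseline = self._fingerprint()

    def _fingerprint(self) -> str:
        return hashlib.sha256(self.soul_path.read_bytes()).hexdigest()

    def check(self) -> None:
        if self._fingerprint() != self.baseline:
            raise RuntimeError("SOUL.md modified during session; refusing to proceed")
```

A skill installer that appends instructions to SOUL.md would trip the guard at the next check instead of persisting silently, though incremental cross-session edits ("Ship of Theseus" drift) still require review of legitimate-looking changes.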

The measurement problem

Identity File Approaches — Capability Comparison (0–10 scale)

The evidence on static identity files is sobering:

  • Human-written instruction files: +4% task completion (ETH Zurich study)
  • LLM-generated instruction files: -3% performance, +14-22% cost (same study)
  • Optimal length: 800-1,200 words (100-config study)
  • Personality traits cost 2-3% raw task performance (same study)
  • Static configs decay 19% after one month without updates (same study)
  • Agents learn to state values without applying them (community observation)

That last point deserves emphasis. An agent can read "exhaust all options before pivoting" from SOUL.md and then immediately recommend pivoting on the first obstacle. The model learned the language of the personality without internalizing the behavior. Static text can't enforce runtime behavior — it can only suggest it.

The alternative: identity as graph

SOUL.md vs Graph-Based Identity (0–10 scale)

OpenWalrus takes a different approach. Instead of a static file that occupies permanent context, identity is an entity type in a temporal knowledge graph:

Agent --has_trait--> "prefers direct communication"
Agent --has_boundary--> "never send emails without confirmation"
Agent --has_style--> "uses chess metaphors"

Each trait has temporal metadata (when it was established, when it was last confirmed), relationship edges (which user interactions reinforced it), and semantic embeddings (so the agent can search its own identity rather than relying on the model holding everything in attention).
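A hypothetical data model for such an edge; the field names are our illustration, not OpenWalrus's actual schema, and the semantic embedding is omitted for brevity:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class IdentityEdge:
    relation: str            # "has_trait", "has_boundary", "has_style", ...
    value: str               # the trait text itself
    established: datetime    # when the trait first entered the graph
    last_confirmed: datetime # most recent reinforcement
    reinforcements: int = 0  # count of interactions that confirmed the trait

    def confirm(self) -> None:
        """Record a user interaction that reinforced this trait."""
        self.last_confirmed = datetime.now(timezone.utc)
        self.reinforcements += 1
```

The temporal fields are what make drift trackable: a trait established a year ago and never confirmed since looks very different from one reinforced last week.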

The tradeoffs are real. You lose cat SOUL.md — the ability to open a file and read the agent's personality in plain text. You lose git-diffable identity changes. You lose the simplicity of echo "prefer tabs" >> SOUL.md.

What you gain:

  • Context efficiency — identity surfaces on-demand via recall, not as a permanent context tax
  • Temporal awareness — the graph knows when a trait was established and can track drift
  • Selective forgetting — you can remove a trait without rewriting the whole file
  • Searchability — "what does the agent believe about error handling?" is a query, not a grep
  • Compaction survival — identity lives in the graph, not in the context window that gets compacted
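The recall path above can be sketched as follows; naive keyword overlap stands in here for the semantic-embedding search the article describes, and the edge shape is our assumption:

```python
def recall(edges: list[dict], query: str, limit: int = 3) -> list[dict]:
    """Return the identity edges most relevant to the current turn."""
    words = {w for w in query.lower().split() if len(w) > 3}  # crude stopword filter
    scored = []
    for edge in edges:
        hits = sum(1 for w in words if w in edge["value"].lower())
        if hits:
            scored.append((hits, edge))
    scored.sort(key=lambda pair: -pair[0])
    return [edge for _, edge in scored[:limit]]
```

Only the returned edges enter the prompt, so an off-topic turn pays no identity tax at all.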

We detailed this architecture in how OpenWalrus agents remember and the research survey that informed it.

Open questions

SOUL.md got the diagnosis right — agents need identity — but the treatment has side effects. That leaves several questions we don't have clean answers to yet:

Is identity even the right abstraction? SOUL.md assumes an agent is something — it has values, a voice, boundaries. But maybe agents should be more like tools with configurable behavior than entities with personalities. The context engineering research suggests that surfacing relevant context on demand outperforms front-loading identity into the system prompt. If that's true, identity might be an implementation detail of good retrieval, not a first-class concept.

Can identity survive compaction without a database? The graph approach trades cat-ability for queryability. But is there a middle path — a file-based format that a compaction algorithm knows to preserve? Or does any file-based identity inevitably degrade when the context window fills up?

How much personality actually helps? The ETH Zurich study found +4% for human-written instruction files. The DEV Community study found personality traits cost 2-3% in raw performance. Is the net effect positive, negative, or noise? And does it depend entirely on the task — maybe personality helps in conversational agents but hurts in code generation?

Who owns the soul? SOUL.md is writable by the agent, by skills, by the user, and by attackers. Any identity system needs to answer who gets write access and what happens when edits conflict. A compact core with an open extension surface would also avoid the configuration bloat that SOUL.md + STYLE.md + SKILL.md + MEMORY.md creates.

Does the file format ecosystem converge or fragment? SOUL.md, CLAUDE.md, AGENTS.md, .cursorrules — each tool has its own identity file. The instruction file landscape is already fragmented. Does one format win, or does every agent framework end up with its own personality spec forever?

The brilliance of SOUL.md was asking the right question. The answer is still open.

Further reading