Workspace as sandbox: a simpler model for agent isolation

What if the agent ran as a real OS user, and the runtime had zero sandbox logic?

design · OpenWalrus Team

The sandbox survey found that every production agent system either gates individual commands (Claude Code, Cursor, Codex CLI) or gates the environment (Devin, OpenHands). Both have real tradeoffs. Per-command approval interrupts flow. Container isolation cuts agents off from the host resources that make them useful — especially authenticated browser sessions.

There's a third option hiding in the operating system itself: make the agent a real OS user, and keep the runtime completely unaware of it.

The model

One system user — walrus — is the agent's identity. All agents, all tasks, all workspaces live under this user's home directory. The walrus runtime runs as this user. Standard Unix file permissions enforce the boundary. No Landlock, no seccomp, no sandbox library in the runtime code. Zero lines of sandbox logic.

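The human user (alice) decides what the agent can see by setting ACLs on her own files or copying resources into the workspace's shared/ directory. The agent can't read anything outside its home unless alice grants access, with one caveat: files and directories that are already readable by "other" users are readable by walrus too, so the boundary is only as tight as alice's existing permissions. This isn't a new abstraction; it's how Unix has worked since the 1970s.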
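Because the boundary rests on ordinary mode bits, it can be inspected with ordinary tools. The sketch below is a hypothetical helper (other_access is not a walrus command) that reports whether "other" users, which includes a separate agent account, have any access to a path. It reads only the classic permission string from ls -ld and does not inspect ACLs.

```shell
#!/bin/sh
# Hypothetical helper, not part of walrus: report whether "other" users
# have any access to a path, judged only by the mode string from `ls -ld`.
other_access() {
    # First field of `ls -ld`, e.g. "drwx------" (an ACL marker may follow it).
    mode=$(ls -ld "$1" | awk '{print $1}')
    # Characters 8-10 of the mode string are the r/w/x bits for "other".
    other=$(printf '%s' "$mode" | cut -c8-10)
    if [ "$other" = "---" ]; then
        echo "closed"
    else
        echo "open ($other)"
    fi
}

# Example: check the current user's home directory (falls back to / if HOME is unset).
other_access "${HOME:-/}"
```

A result of "open" on a directory alice considers private means a chmod o-rwx is worth considering before relying on the agent account for isolation.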

Why zero sandbox logic in the runtime

The sandbox is the OS, not the code

Claude Code, Cursor, and Codex CLI all embed sandbox logic in their runtimes: generating Seatbelt profiles, configuring Landlock rules, writing seccomp BPF programs. This means maintaining three platform-specific implementations, debugging sandbox policy issues, and accepting the security risk of bugs in their own sandbox code.

The OS user model sidesteps all of this. The runtime doesn't know or care about sandboxing. It runs as walrus, and the OS handles isolation. File permissions, process ownership, resource limits: these are kernel-enforced mechanisms that have been audited for decades. No sandbox code to write means no sandbox bugs to ship.

Cross-platform for free

Basic Unix file permissions work the same way on macOS, Linux, and every BSD, so there is no platform-specific sandbox implementation to maintain: no Seatbelt on macOS, no Landlock on Linux, no WSL2 workarounds on Windows. The model and its behavior are the same everywhere; only the setup commands (user creation, ACL tools) differ by platform.

Pluggable setup, not pluggable runtime

The sandbox setup — creating the user, configuring firewall rules, setting ACLs — is a one-time operation that happens outside the runtime. This makes it pluggable by design: walrus sandbox init is just a command that wraps the platform-specific setup steps. Anyone can write their own init script, customize the user configuration, add network rules, or skip the whole thing. The runtime doesn't change.
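As a rough illustration of how small a custom init script could be, here is a dry-run sketch. The function name print_init_steps and the install -d step are assumptions for illustration. The user-creation commands are the ones walrus sandbox init reports in its output, and nothing here is taken from the actual walrus source. The script only prints commands, so they can be reviewed before running them with sudo.

```shell
#!/bin/sh
# Hypothetical custom init (dry run): prints the setup steps for a platform
# instead of executing them. Review the output, then run it with sudo.
print_init_steps() {
    os=$1                     # value of `uname -s`, e.g. Linux or Darwin
    home=${2:-/var/walrus}    # agent home directory
    case "$os" in
        Linux)
            user=walrus
            echo "useradd --system --home $home --shell /bin/bash $user"
            ;;
        Darwin)
            user=_walrus
            echo "sysadminctl -addUser $user -home $home -shell /bin/bash"
            ;;
        *)
            echo "unsupported platform: $os" >&2
            return 1
            ;;
    esac
    # Assumption: create the workspace directories owned by the agent user.
    echo "install -d -o $user $home/workspaces $home/.runtimes"
}

print_init_steps "$(uname -s)" || true
```

Swapping in a different home directory, or appending site-specific steps, changes only this script and leaves the runtime untouched.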

walrus sandbox init

A single command that sets up the OS user and workspace structure. Requires sudo once, then never again.

$ walrus sandbox init
[sudo] password for alice:
Creating system user 'walrus'...
  macOS: sysadminctl -addUser _walrus -home /var/walrus -shell /bin/bash
  Linux: useradd --system --home /var/walrus --shell /bin/bash walrus
Creating home directory /var/walrus/
Creating /var/walrus/workspaces/
Creating /var/walrus/.runtimes/
Done. All walrus agents will now run as user 'walrus'.

That's it. No LaunchDaemon, no systemd unit, no firewall rules by default. The init command does the minimum: create the user, create the home directory. Everything else is optional and additive.

What init does NOT do

  • No network firewall rules (opt-in via walrus sandbox network init)
  • No runtime service/daemon installation (opt-in via walrus sandbox service init)
  • No Chrome profile copying (the user does this manually or via a share command)
  • No Landlock/seccomp/Seatbelt configuration (the OS user is the sandbox)

Each of these is a separate, optional command. The runtime works the same regardless of which you've run. This is the less code, more skills principle applied to infrastructure: the runtime is minimal, the setup is extensible.

Without init

If the user never runs walrus sandbox init, the runtime runs as the current user — same as Claude Code or Aider today. No isolation, no friction. The sandbox is purely opt-in. The runtime code path is identical either way.
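Since isolation depends only on which user launched the process, a wrapper script can tell the two modes apart with one call to id. The sketch below is hypothetical (isolation_mode is not a walrus command, and the runtime itself performs no such check). It assumes the agent account is named walrus, or _walrus on macOS.

```shell
#!/bin/sh
# Hypothetical wrapper-side check: report which mode an agent would run in.
# The walrus runtime does not do this; it simply runs as whoever launched it.
isolation_mode() {
    # $1: user name to inspect; defaults to the current effective user.
    user=${1:-$(id -un)}
    case "$user" in
        walrus|_walrus) echo "isolated: running as $user" ;;
        *)              echo "not isolated: running as $user" ;;
    esac
}

isolation_mode
```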

Sharing host resources

The human user and the agent need to exchange files. The mechanisms depend on the resource type.

Project files: ACLs

The user grants the walrus user access to a project directory:

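# macOS (the account created by init is _walrus); -R and the inherit flags
# cover existing contents and files created later
chmod -R +a "_walrus allow read,write,execute,delete,add_file,add_subdirectory,file_inherit,directory_inherit" \
    ~/projects/my-app

# Linux: capital X adds execute only to directories and already-executable files;
# the second command sets a default ACL so newly created files are covered too
setfacl -R -m u:walrus:rwX ~/projects/my-app
setfacl -R -d -m u:walrus:rwX ~/projects/my-app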

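No root needed: the file owner sets ACLs on their own files. The agent can then read and write the project directory, provided it also has search (x) permission on every parent directory (such as alice's home directory), which the owner may need to grant as well. getfacl (Linux) or ls -le (macOS) shows exactly what's shared.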
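An ACL on the project directory is not enough if the agent cannot traverse the directories above it: each ancestor needs search (x) permission for the agent user. The following dry-run sketch prints, for Linux, one setfacl command per ancestor up to a directory the user owns. The function name traverse_cmds and the example paths are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical dry run: print the setfacl commands that would give the agent
# search (x) permission on each ancestor of a shared path, nearest first,
# stopping at the topmost directory the user owns (typically their home).
traverse_cmds() {
    target=$1
    stop=$2
    dir=$(dirname "$target")
    while :; do
        echo "setfacl -m u:walrus:x $dir"
        [ "$dir" = "$stop" ] && break
        # Safety stops for the filesystem root and for relative paths.
        [ "$dir" = "/" ] && break
        [ "$dir" = "." ] && break
        dir=$(dirname "$dir")
    done
}

traverse_cmds /home/alice/projects/my-app /home/alice
```

Granting search permission on a home directory lets the agent open any file beneath it whose own permissions already allow it, so it is worth checking for world-readable files first.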

A convenience wrapper:

$ walrus sandbox share ~/projects/my-app
Granting walrus read/write access to /Users/alice/projects/my-app...
Done. The agent can now access this directory.

Credentials: read-only ACLs

$ walrus sandbox share --read-only ~/.ssh/id_ed25519
Granting walrus read-only access to /Users/alice/.ssh/id_ed25519...
Done.

The agent can read the SSH key, and therefore use it, but can't modify it. Deleting it would require write access to ~/.ssh, which the grant doesn't include.

Browser profiles: copy into workspace

For resources that can't be safely shared concurrently, copy them:

$ walrus sandbox share --copy ~/.config/google-chrome/Profile\ 1 \
    --into workspaces/task-42/chrome-profile

Copying Chrome profile into /var/walrus/workspaces/task-42/chrome-profile/...
Using reflink (copy-on-write)... Done.

# The agent launches Chrome with its own copy:
chrome --headless --user-data-dir=/var/walrus/workspaces/task-42/chrome-profile/

The agent starts with the user's session state (cookies, saved logins) but changes are isolated. Two agents get independent copies. On copy-on-write filesystems the copy is near-instant: cp --reflink=auto on Btrfs (Linux), cp -c on APFS (macOS).
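The copy-on-write flag differs between GNU cp and the macOS cp, so a wrapper has to pick the command per platform. A minimal sketch, assuming a hypothetical helper named clone_copy_cmd that only prints the command it would run:

```shell
#!/bin/sh
# Hypothetical helper: print the recursive copy command for a platform,
# preferring copy-on-write clones where the cp implementation supports them.
clone_copy_cmd() {
    os=$1; src=$2; dst=$3
    case "$os" in
        # GNU cp: reflink when the filesystem supports it, plain copy otherwise.
        Linux)  echo "cp -R --reflink=auto '$src' '$dst'" ;;
        # macOS cp: -c requests clonefile(2).
        Darwin) echo "cp -cR '$src' '$dst'" ;;
        # Anything else: ordinary recursive copy.
        *)      echo "cp -R '$src' '$dst'" ;;
    esac
}

clone_copy_cmd "$(uname -s)" "$HOME/.config/google-chrome/Profile 1" \
    /var/walrus/workspaces/task-42/chrome-profile
```

The single quotes in the printed command are for display only and would break on a path that itself contains a single quote.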

Listing and revoking

$ walrus sandbox shared
/Users/alice/projects/my-app        read-write
/Users/alice/.ssh/id_ed25519        read-only
(copied) chrome-profile task-42   isolated copy

$ walrus sandbox unshare ~/projects/my-app
Revoking walrus access to /Users/alice/projects/my-app...
Done.

Where it breaks

Network isolation is separate

Unix file permissions don't restrict network access. The walrus user can curl anything by default. For network control, you need additional setup:

$ walrus sandbox network init
Setting up per-user firewall rules for walrus...
  Linux: iptables -A OUTPUT -m owner --uid-owner walrus -j DROP
  macOS: adding pf rule for user _walrus
Default: deny all outbound. Configure allowlist in ~/.walrus/network.toml

This is a separate, optional init step. The runtime doesn't enforce network policy — it delegates to the OS firewall. If the user hasn't run walrus sandbox network init, network is unrestricted.
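To make the deny-by-default idea concrete, here is a dry-run sketch for Linux that prints iptables rules for an allowlist of destination addresses. The function name firewall_rules and the example addresses (documentation ranges) are assumptions, and the format of network.toml is not specified here, so the allowlist is passed as plain arguments. Because -A appends and the first matching rule wins, the ACCEPT rules have to come before the final DROP.

```shell
#!/bin/sh
# Hypothetical dry run: print per-user iptables rules that allow a few
# destinations and drop everything else sent by the agent user.
firewall_rules() {
    user=$1
    shift
    # One ACCEPT rule per allowed destination (IP address or CIDR range).
    for dest in "$@"; do
        echo "iptables -A OUTPUT -m owner --uid-owner $user -d $dest -j ACCEPT"
    done
    # Final rule: drop all other outbound traffic from this user.
    echo "iptables -A OUTPUT -m owner --uid-owner $user -j DROP"
}

firewall_rules walrus 192.0.2.10 198.51.100.0/24
```

This sketch matches on addresses only. It makes no allowance for loopback traffic or DNS lookups, both of which a real allowlist would probably need.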

Process-level resources

Some host resources aren't files. Display access, GPU, audio, D-Bus sessions — a separate OS user doesn't get these automatically.

For headless browser automation (CDP), this is fine — headless Chrome doesn't need a display. For visual Computer Use, the user would need to grant display access:

# X11
xhost +SI:localuser:walrus

# Or run a virtual framebuffer under the walrus user
Xvfb :99 &
export DISPLAY=:99

This is additional setup, not something the runtime handles.

The sudo prompt

Creating an OS user requires root. Every developer tool that creates service accounts does this — Docker, Postgres, MySQL — but it's still a friction point for "download and run" tooling.

The design makes this explicitly opt-in: walrus sandbox init is a separate command, not part of walrus install. Without it, walrus runs as the current user with no isolation. The sudo prompt only appears when the user actively chooses isolation.

Kernel isolation is shallow

Like every other local sandbox approach (Landlock, Seatbelt, user namespaces), the OS user shares the host kernel. A kernel exploit escalates to root. For local developer tooling the threat model is "agent does something unintended" — acceptable. For multi-tenant platforms running untrusted agent code, not acceptable.

The design principle

The walrus runtime has zero sandbox logic. The sandbox is the operating system. Setup is a pluggable command that runs once. Every walrus sandbox share and walrus sandbox unshare command is just a thin wrapper around setfacl / chmod +a. The runtime doesn't check ACLs, doesn't enforce policies, doesn't generate sandbox profiles. It runs as whatever user launched it — if that user is walrus, isolation exists. If not, it doesn't.

This means:

  • No sandbox bugs in the runtime. The attack surface is the OS kernel, not our code.
  • No platform-specific code paths. The same runtime code works on macOS and Linux.
  • No configuration to get wrong. The sandbox is either set up (user exists) or it isn't. No SBPL profiles, no BPF programs, no TOML policy files that the runtime interprets.
  • Full user control. The human user decides what to share using standard Unix tools. They can inspect, modify, or revoke permissions at any time without touching the walrus runtime.

Prior art

Sandvault is the clearest prior art — a macOS tool that creates a per-human-user agent account and runs commands via ssh sandvault-$USER@localhost, adding Seatbelt restrictions on top.

Alcoholless (NTT Labs) runs programs as a separate macOS user, syncing changed files back on exit.

Both add runtime sandbox logic on top of the OS user. Our design intentionally doesn't — the OS user is the entire sandbox layer.

Open questions

One walrus user or one per human user? A single system-wide walrus user is simpler. But on a shared machine, Alice's agent and Bob's agent would share a home directory. Per-human-user accounts (walrus-alice, walrus-bob) provide isolation but multiply setup complexity. Sandvault chose per-human-user. For a single-developer machine (the primary walrus use case), one user seems right.

Should walrus sandbox share wrap ACLs or teach ACLs? A wrapper command is more convenient. But it hides what's happening, and users may not know how to debug permissions. An alternative: walrus sandbox share prints the raw setfacl / chmod +a command and asks the user to run it. Full transparency, slightly more friction.
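A sketch of what the "teach" variant might look like: instead of changing permissions, it prints the raw command for the user to run. The function name teach_cmd and its argument order are hypothetical. The share commands mirror the ones shown earlier in this post, and the Linux unshare line uses setfacl -x, which removes the named user's ACL entry.

```shell
#!/bin/sh
# Hypothetical "teach" mode: print the raw permission command rather than run it.
teach_cmd() {
    action=$1             # share or unshare
    os=$2                 # value of `uname -s`
    path=$3
    user=${4:-walrus}     # agent account name
    case "$action:$os" in
        share:Linux)
            echo "setfacl -R -m u:$user:rwx $path" ;;
        unshare:Linux)
            echo "setfacl -R -x u:$user $path" ;;
        share:Darwin)
            echo "chmod +a \"$user allow read,write,execute,delete,add_file,add_subdirectory\" $path" ;;
        *)
            echo "no known command for '$action' on $os" >&2
            return 1 ;;
    esac
}

echo "Run this to share the directory with the agent:"
teach_cmd share Linux /home/alice/projects/my-app
```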

How do skills declare resource needs? A skill that needs Chrome access could declare needs: [browser] in its metadata. Before running the skill, the runtime checks whether a Chrome profile exists in the workspace. If not, it prompts: "This skill needs a browser profile. Run walrus sandbox share --copy <path> to provide one." The runtime doesn't enforce — it informs.
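The check described here could be as small as a directory test. In the sketch below, the function name check_browser_need and the chrome-profile directory name (borrowed from the earlier copy example) are assumptions. The function always returns success because the idea is to inform, not to enforce.

```shell
#!/bin/sh
# Hypothetical pre-run check for a skill that declares needs: [browser].
# It only informs the user and always returns success.
check_browser_need() {
    workspace=$1
    if [ -d "$workspace/chrome-profile" ]; then
        echo "browser profile found: $workspace/chrome-profile"
    else
        echo "This skill needs a browser profile. Run walrus sandbox share --copy <path> to provide one."
    fi
    return 0
}

check_browser_need /var/walrus/workspaces/task-42
```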

Is the no-sandbox fallback good enough? Without walrus sandbox init, the agent runs as the current user with full access. This matches Aider's model and what most developers do today. But it means the default is zero isolation. Should walrus warn on every run without sandbox init? Or is that the kind of nag that teaches users to ignore warnings?
