

Why we built OpenWalrus

The real problems with cloud-based AI agent runtimes — and how local-first fixes them.

design · OpenWalrus Team

Update (v0.0.7): Local LLM inference was removed in v0.0.7. OpenWalrus now connects to remote providers (OpenAI, Claude, DeepSeek, Ollama). Memory and search are now external WHS services. The architectural arguments below still apply to the composable design.

AI agent runtimes are exploding in popularity. But the most widely used open-source options share a set of problems that stem from one architectural decision: depending on cloud APIs for inference.

We built OpenWalrus to prove there's a better way. Here's what's broken, and how local-first changes the equation.

The token tax

Cloud-based agent runtimes send every request to an external API. Every tool call, every reasoning step, every heartbeat consumes tokens — and tokens cost money.

The numbers are staggering:

  • Based on community reports, power users spend $200–3,600/month in API bills from normal agent usage
  • Workspace files alone can waste 93.5% of the token budget, leaving only a fraction for actual work
  • Scheduled tasks and heartbeats accumulate context across runs, burning tokens even when the agent is idle — in one community report, heartbeats alone cost $50/day
  • A single stuck automation loop can run up hundreds of dollars overnight
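To see how those line items compound, here's a quick back-of-the-envelope calculation. The token counts and prices below are illustrative assumptions, not measured figures:

```rust
// Rough monthly cost of idle heartbeats on a metered API.
// All numbers are illustrative assumptions, not measurements.
fn monthly_cost_usd(tokens_per_call: u64, calls_per_day: u64, usd_per_million_tokens: f64) -> f64 {
    let tokens_per_month = tokens_per_call * calls_per_day * 30;
    tokens_per_month as f64 / 1_000_000.0 * usd_per_million_tokens
}

fn main() {
    // A heartbeat every 10 minutes (144/day) that re-sends ~30k tokens of
    // accumulated context, at a hypothetical $12 per million tokens:
    let cost = monthly_cost_usd(30_000, 144, 12.0);
    println!("~${cost:.0}/month for heartbeats alone");
}
```

At those assumed rates, idle heartbeats work out to roughly $50/day, squarely in line with the community report above.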

OpenWalrus runs LLM inference in-process. A built-in model registry with 20+ curated models auto-selects the right model and quantization for your hardware. There are no API calls, no token metering, and no usage-based billing. You can run agents 24/7 without worrying about a bill.
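As a rough illustration of what hardware-aware selection looks like, here is a sketch in Rust. The bytes-per-parameter figures and thresholds are simplified assumptions, not OpenWalrus's actual registry logic:

```rust
// Hypothetical sketch of hardware-aware quantization selection.
// Thresholds are rules of thumb, not OpenWalrus's real heuristics.
#[derive(Debug, PartialEq)]
enum Quant {
    Q4,  // ~0.5 bytes/param
    Q8,  // ~1 byte/param
    F16, // ~2 bytes/param
}

fn pick_quant(free_ram_gb: u64, model_params_billions: u64) -> Quant {
    // Billions of params x bytes/param gives an approximate footprint in GB.
    let need_f16_gb = model_params_billions * 2;
    let need_q8_gb = model_params_billions;
    if free_ram_gb >= need_f16_gb {
        Quant::F16
    } else if free_ram_gb >= need_q8_gb {
        Quant::Q8
    } else {
        Quant::Q4
    }
}

fn main() {
    // An 8B-parameter model on a machine with 12 GB free fits at Q8
    // but not at F16 under these assumptions.
    println!("{:?}", pick_quant(12, 8));
}
```

The point of automating this is that the user never has to know what quantization means; the runtime degrades gracefully to fit the machine it's on.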

Security by neglect

When your agent runtime talks to external APIs, it needs credentials. When it exposes a web interface, it needs authentication. When it supports third-party plugins, it needs vetting. Most cloud agent runtimes fail at all three.

The track record here is poor: leaked credentials, misconfigured dashboards, and unvetted plugin code are recurring failure modes across popular runtimes.

OpenWalrus exposes no network services by default. There are no API keys to leak because built-in inference doesn't need them. There are no ports left open, no web dashboards to misconfigure, and no credentials stored in plaintext.

Setup shouldn't be a project

Getting a cloud agent runtime running often requires Docker, a gateway service, a database, and careful configuration before the first agent ever runs.

OpenWalrus is a single binary. Download it, run it. No Docker, no gateway, no database, no multi-service orchestration. It works on a fresh machine with zero dependencies.

The plugin marketplace gamble

Extensibility through community plugins sounds great in theory. In practice, it introduces supply-chain risk at scale:

  • Out of 10,700+ community-contributed skills, 820+ were found to be malicious — a number that grew rapidly from 324 just weeks earlier
  • Plugins run with the same permissions as the agent itself, meaning a malicious plugin has access to your files, credentials, and shell
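The permission problem is structural: code loaded into the agent's process inherits everything the process can do. A minimal Rust sketch makes this concrete (the `Plugin` trait here is hypothetical, not OpenWalrus's or any marketplace's actual API):

```rust
// Hypothetical plugin interface: nothing in the type system constrains
// what run() may do once the plugin is loaded in-process.
trait Plugin {
    fn run(&self) -> Vec<String>;
}

struct InnocentLookingPlugin;

impl Plugin for InnocentLookingPlugin {
    fn run(&self) -> Vec<String> {
        // Running in-process, the plugin inherits the agent's full
        // authority: it can enumerate the environment (where API keys
        // often live), read files, or spawn a shell, with no extra
        // permission grant and no prompt to the user.
        std::env::vars().map(|(key, _value)| key).collect()
    }
}

fn main() {
    let names = InnocentLookingPlugin.run();
    println!("plugin silently saw {} environment variables", names.len());
}
```

Sandboxing can narrow this, but most runtimes don't sandbox plugins at all, which is why "don't have a marketplace" is itself a security posture.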

OpenWalrus ships with core capabilities built in — shell access, browser control, messaging channels, persistent memory. There's no marketplace to browse, no unvetted code to install, and no supply-chain attack surface.

How OpenWalrus is different

Every design decision in OpenWalrus traces back to one principle: the agent runtime should be as simple and trustworthy as any other tool on your machine.

| Problem | OpenWalrus approach |
| --- | --- |
| Token costs | Built-in LLM inference: unlimited, free |
| Security vulnerabilities | No network services, no credentials required |
| Complex setup | Single binary, zero dependencies |
| Malicious plugins | Core capabilities built in |
| Unreliable memory | Persistent context that works out of the box |
| Slow cold starts | Under 10 ms; runtime starts instantly, models load async |
| Manual model setup | Auto-detected from hardware; 20+ curated models, auto-quantization |
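The cold-start row deserves a note: starting fast while models load in the background is a standard concurrency pattern, sketched below in Rust. This is a simplification, not the actual OpenWalrus startup path:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Sketch of "start instantly, load the model async"; the real
// OpenWalrus internals may differ from this pattern.
fn start_runtime() -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Stand-in for loading multi-gigabyte model weights from disk.
        thread::sleep(Duration::from_millis(50));
        let _ = tx.send("model ready".to_string());
    });
    // The runtime is usable immediately; callers block only when they
    // actually need the model.
    rx
}

fn main() {
    let t0 = Instant::now();
    let model = start_runtime();
    println!("runtime up in {:?}", t0.elapsed());
    println!("{}", model.recv().unwrap());
}
```

The receiver doubles as a readiness signal: anything that doesn't touch the model (CLI, scheduling, channel setup) proceeds immediately, and the first inference call simply waits on it.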

OpenWalrus is open source, written in Rust, and runs on macOS and Linux. You can optionally connect remote LLM providers when you need capabilities beyond local models, but nothing external is ever required.

Get started in under a minute →