When agents share a channel
We tested what happens when two agents join the same group chat. Most frameworks have no built-in answer for the infinite loop that follows.
Picture this: you bind two agents to separate Telegram accounts and add both to the same group. A user sends "help me plan dinner." Agent A responds with a suggestion. Agent B sees Agent A's message, interprets it as new input, and responds with a counter-suggestion. Agent A sees that and replies. The loop has begun — and nobody asked for it.
This isn't a contrived edge case. OpenClaw's broadcast groups explicitly support multiple agents in one channel. AutoGen's GroupChat broadcasts every message to every participant. Any framework that allows shared communication channels faces this problem. We surveyed seven frameworks to understand who handles it, who ignores it, and who accidentally encourages it.
The three failure modes
Not all loops look the same. We identified three distinct patterns:
Ping-pong: Agent A responds to Agent B's output. Agent B sees Agent A's response and responds again. A classic two-party infinite loop. This is the most common failure mode in shared channels and the easiest to trigger — two agents with overlapping responsibilities in the same group will almost always ping-pong unless explicitly prevented.
Broadcast storm: Every agent in the channel responds to every message, including messages from other agents. With N agents, a single user message generates N responses, each of which generates N-1 follow-up responses. Token consumption grows exponentially. In one documented case, a circular agent relay persisted for 9+ days and consumed 60,000+ tokens.
Escalation spiral: Agent A delegates a task to Agent B. Agent B determines it can't handle it and escalates back to Agent A. Agent A re-delegates. This pattern is specific to frameworks with bidirectional handoffs — notably the OpenAI Agents SDK, where handoffs are bidirectional by design. Researchers have named this pattern Agent Deadlock Syndrome: two or more agents repeatedly defer decision authority to each other, resulting in extended inactivity without explicit errors.
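The broadcast storm's growth is easy to put numbers on. Here is a toy model (illustrative only, not any framework's code) of a channel where every agent replies to every message it didn't author:

```python
def broadcast_storm_messages(n_agents: int, rounds: int) -> int:
    """Count agent messages generated after one user message in a naive
    broadcast channel. Toy model for illustration."""
    total = 0
    wave = n_agents            # round 1: every agent answers the user
    for _ in range(rounds):
        total += wave
        wave *= n_agents - 1   # each reply draws N-1 further replies
    return total

print(broadcast_storm_messages(3, 5))  # 93
```

With three agents, five rounds of replies already produce 93 messages; with five agents the same five rounds produce over 1,700. Nothing in the channel itself ever says "stop."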
Framework by framework
OpenClaw — mention-gating as the primary defense
OpenClaw's default activation mode for group chats is mention-only (requireMention: true). Agents only respond to messages that explicitly @mention them. This is the primary loop prevention mechanism — if Agent A doesn't @mention Agent B in its response, Agent B stays silent.
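The gate itself is simple to sketch. A minimal stand-in for mention-gating (illustrative logic, not OpenClaw's actual implementation) might look like:

```python
import re

def should_activate(message_text: str, agent_handle: str,
                    require_mention: bool = True) -> bool:
    """Mention-gating sketch: with require_mention on, an agent only
    wakes for messages that @mention its handle."""
    if not require_mention:
        return True
    # Match @handle as a whole token so "@agent-b2" doesn't wake "agent-b".
    return re.search(rf"@{re.escape(agent_handle)}\b", message_text) is not None

should_activate("hey @agent-b, thoughts?", "agent-b")   # True
should_activate("help me plan dinner", "agent-b")       # False
```

The gate only works if agent responses don't themselves contain @mentions, which is why the mechanism ultimately rests on prompting, as discussed below.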
When agents communicate via sessions_spawn, a separate maxPingPongTurns parameter (range 0–5, default 5) limits back-and-forth exchanges. Setting it to 0 disables agent-to-agent ping-pong entirely.
The gaps are documented. There's no self-reply detection in broadcast group scenarios. No tool-call loop detection for agents stuck calling the same tool repeatedly. No context-overflow circuit breaker — if context gets too large, agents can enter a loop: message → overflow error → error appended to context → overflow again. And no stuck-agent watchdog to detect agents making no progress.
| Mechanism | Detail |
|---|---|
| Activation gating | Mention-only (default in groups) |
| Iteration cap | maxPingPongTurns: 5 (agent-to-agent) |
| Self-reply filter | None |
| Semantic detection | None |
| Execution timeout | 600s default |
AutoGen — broadcast with termination keywords
AutoGen's GroupChat broadcasts every message to every participant. A GroupChatManager selects the next speaker using round-robin, random, manual, or LLM-driven selection. All agents see all messages — there's no message filtering by sender type.
Loop prevention relies on two mechanisms: max_consecutive_auto_reply limits how many times an agent responds without human intervention, and is_termination_msg is a function that detects termination keywords (typically "TERMINATE") in agent responses.
The known failure: "gratitude loops" with weaker models. After completing a task, agents endlessly thank each other — the output doesn't contain "TERMINATE" so the termination function never fires. Users report that loops persist even with limits set, especially in multi-turn group chats where the counter resets.
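The keyword check is typically wired up as a small predicate over AutoGen's message dicts, which makes the gratitude-loop failure easy to see:

```python
# The common form of AutoGen's is_termination_msg hook: a predicate over
# the incoming message dict. Returning True ends the conversation.
def is_termination_msg(message: dict) -> bool:
    content = message.get("content") or ""
    return "TERMINATE" in content

is_termination_msg({"content": "Done. TERMINATE"})                     # True
is_termination_msg({"content": "Thank you! Great working together."})  # False, chat continues
```

A polite closing message contains no keyword, so the predicate returns False and the agents keep talking.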
| Mechanism | Detail |
|---|---|
| Activation gating | None (broadcast) |
| Iteration cap | max_consecutive_auto_reply |
| Self-reply filter | None |
| Semantic detection | is_termination_msg (keyword-based) |
| Execution timeout | None built-in |
CrewAI — delegation disabled by default
CrewAI's primary defense is prevention: allow_delegation defaults to False. Agents cannot delegate to other agents unless explicitly enabled. When delegation is enabled, agents get two tools — Delegate Work and Ask Question — which are tool-mediated, not direct messaging.
A hard max_iter (default 25) caps total iterations per task execution. max_execution_time adds a timeout-based fallback. The orchestrator model maintains centralized control — workers don't communicate directly.
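In code, these defenses are constructor arguments. A sketch using CrewAI's documented parameter names (the role, goal, and backstory values are illustrative):

```python
from crewai import Agent

worker = Agent(
    role="research assistant",         # illustrative
    goal="summarize sources",          # illustrative
    backstory="a focused summarizer",  # illustrative
    allow_delegation=False,  # the default: no delegation tools granted
    max_iter=25,             # hard cap on iterations per task execution
    max_execution_time=300,  # seconds; timeout-based fallback
)
```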
Delegation loops still occur when manager agents have unclear role boundaries. Bug reports indicate agent loops can continue past the max_iter cap in some configurations.
| Mechanism | Detail |
|---|---|
| Activation gating | allow_delegation: false (default) |
| Iteration cap | max_iter: 25 |
| Self-reply filter | None |
| Semantic detection | None |
| Execution timeout | max_execution_time |
LangGraph — cycles are a feature
LangGraph treats cycles as a legitimate graph pattern. Agents can loop through self-correction patterns, retry logic, and iterative refinement. The safety net is a hard recursion_limit (default 25 supersteps) that raises GraphRecursionError (error code GRAPH_RECURSION_LIMIT) when exceeded.
There's no built-in semantic loop detection. Developers must manually track iteration counts in the state object and add conditional edges that inspect state before routing. Third-party tools detect unbounded cycles via static analysis, but this isn't part of the core framework.
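The hand-rolled pattern looks like this. A plain-Python sketch of the conditional-edge idea (not LangGraph API code): each node increments a counter carried in state, and a routing function inspects it before the framework's recursion limit would fire.

```python
def route(state: dict, max_iters: int = 25) -> str:
    """Conditional-edge logic: route to 'end' once the state-tracked
    iteration counter hits the cap, otherwise keep looping."""
    if state.get("iterations", 0) >= max_iters:
        return "end"
    return "retry"

def step(state: dict) -> dict:
    # Each node bumps the counter it reads back out of state.
    return {**state, "iterations": state.get("iterations", 0) + 1}

state = {"iterations": 0}
while route(state, max_iters=3) != "end":
    state = step(state)
print(state["iterations"])  # 3
```

The counter is blind in the same way recursion_limit is; the difference is that the developer can route to a cleanup node instead of crashing with an exception.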
| Mechanism | Detail |
|---|---|
| Activation gating | None |
| Iteration cap | recursion_limit: 25 |
| Self-reply filter | None |
| Semantic detection | None (third-party static analysis exists) |
| Execution timeout | None built-in |
OpenAI Agents SDK — explicit handoffs, no guardrails
The Agents SDK (successor to Swarm) relies on explicit handoffs — developers manually define when control passes between agents. There's no implicit broadcast channel. Agents don't "see" each other's messages unless a handoff explicitly transfers conversation ownership.
This design prevents broadcast storms by construction. But it doesn't prevent circular handoffs. If Agent A lists Agent B in its handoffs and Agent B lists Agent A, the conversation can bounce indefinitely. The framework provides no cycle detection — it's the developer's responsibility to test for and prevent circular handoff definitions.
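Because the SDK won't check for cycles, a pre-flight validation pass over the handoff graph is worth writing yourself. A minimal sketch (a hypothetical helper, not part of the SDK), where `handoffs` maps each agent name to the agents it can hand off to:

```python
def find_handoff_cycles(handoffs: dict[str, list[str]]) -> list[list[str]]:
    """Walk the handoff graph and collect every circular path.
    Hypothetical pre-flight check; run it before deploying the agents."""
    cycles = []
    def visit(node: str, path: list[str]) -> None:
        if node in path:
            # Found a loop: slice off the non-cyclic prefix.
            cycles.append(path[path.index(node):] + [node])
            return
        for nxt in handoffs.get(node, []):
            visit(nxt, path + [node])
    for start in handoffs:
        visit(start, [])
    return cycles

find_handoff_cycles({"triage": ["billing"], "billing": ["triage"]})
# includes ['triage', 'billing', 'triage']
```

The exhaustive walk is fine for the handful of agents a typical handoff graph contains; a large graph would want a proper strongly-connected-components pass instead.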
| Mechanism | Detail |
|---|---|
| Activation gating | None (explicit handoffs only) |
| Iteration cap | None |
| Self-reply filter | N/A (no shared channel) |
| Semantic detection | None |
| Execution timeout | None built-in |
Google ADK — mandatory iteration limits
Google ADK takes the most prescriptive approach. The LoopAgent primitive requires max_iterations — there's no default, you must set it. Agents terminate via an explicit exit_loop tool or an escalate=True flag that signals the parent.
The framework separates the "should we continue?" question from the working agent. A dedicated checker/validator agent evaluates termination conditions after each iteration. This two-agent pattern (worker + checker) is the recommended architecture for any looping workflow.
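The shape of that architecture can be sketched in plain Python (this is the pattern, not ADK API code): a worker produces, a separate checker owns the termination decision, and a mandatory iteration cap backstops the checker.

```python
def run_loop(worker, checker, state: dict, max_iterations: int) -> dict:
    """Worker/checker loop: the checker answers 'should we continue?'
    after each iteration; the cap fires if the checker never does."""
    for _ in range(max_iterations):
        state = worker(state)
        if checker(state):  # termination lives outside the worker
            break
    return state

# Toy example: refine a draft until the checker deems it long enough.
result = run_loop(
    worker=lambda s: {**s, "draft": s["draft"] + "!"},
    checker=lambda s: len(s["draft"]) >= 5,
    state={"draft": ""},
    max_iterations=10,
)
print(result["draft"])  # !!!!!
```

In ADK the checker is itself an LLM agent rather than a pure function, which is what reintroduces the risk of a hallucinated "continue."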
| Mechanism | Detail |
|---|---|
| Activation gating | None |
| Iteration cap | max_iterations (mandatory, no default) |
| Self-reply filter | None |
| Semantic detection | Via checker agent (recommended pattern) |
| Execution timeout | Via parent agent |
Claude Code — loops prevented by architecture
Claude Code takes the most radical approach: subagents cannot spawn subagents. The Task tool is not available to subagents. Maximum nesting depth is 1, hardcoded. Up to 10 subagents run in parallel, each in its own context window, with no inter-agent communication.
This makes agent-to-agent loops structurally impossible. A subagent can't trigger another subagent, can't message a peer, and can't escalate back to the parent mid-execution. The tradeoff: no recursive orchestration, no dynamic team formation, no mid-execution monitoring.
| Mechanism | Detail |
|---|---|
| Activation gating | N/A (no shared channel) |
| Iteration cap | N/A (depth 1, hardcoded) |
| Self-reply filter | N/A |
| Semantic detection | None |
| Execution timeout | Per-subagent |
The full landscape
| Framework | Primary mechanism | Default limit | Group-aware | Loop detection |
|---|---|---|---|---|
| OpenClaw | Mention gating | 5 ping-pong turns | Yes | Partial (timeout) |
| AutoGen | Termination keywords | Unlimited (configurable) | Yes (broadcast) | Keyword-based |
| CrewAI | Delegation disabled | 25 iterations | No | None |
| LangGraph | Recursion limit | 25 supersteps | No | None |
| OpenAI SDK | Explicit handoffs | None | No | None |
| Google ADK | Mandatory iteration cap | Must set | No | Checker agent pattern |
| Claude Code | Architectural constraint | Depth 1 | No | N/A |
Chart: Loop Prevention Capabilities by Framework (0–10 scale)
Three strategies
The frameworks cluster into three distinct approaches to loop prevention:
Iteration caps — AutoGen, CrewAI, LangGraph, and Google ADK all use hard numeric limits. When the counter hits the cap, execution stops regardless of state. Simple to implement, simple to reason about. The weakness: the cap is blind. It doesn't distinguish a productive 25-step workflow from a stuck 25-step loop. And if set too high, loops burn tokens before the cap kicks in. Google ADK improves on this by making the cap mandatory and pairing it with a checker agent pattern — but the checker itself is an LLM call that can hallucinate "continue."
Explicit control — The OpenAI Agents SDK and Claude Code prevent loops by restricting communication to explicit, developer-defined paths. No shared channels, no implicit broadcast, no unsupervised agent-to-agent messaging. Loops either require the developer to explicitly create circular handoff definitions (Agents SDK) or are structurally impossible (Claude Code). The tradeoff: reduced flexibility. You can't build emergent multi-agent collaboration if agents can't discover or address each other.
Activation gating — OpenClaw's mention-based filtering is unique among the surveyed frameworks. Agents ignore messages that don't @mention them. This is the only approach that's group-aware by design — it was built for the exact scenario of multiple agents in one chat. The weakness: it relies on agents not @mentioning each other in their responses, which is a prompt-level constraint, not a framework guarantee.
Chart: Termination Strategy Tradeoffs (0–10 scale)
The missing strategy: sender-awareness
All three strategies above share a blind spot: they treat every message the same regardless of who sent it. The agent doesn't know — and can't ask — whether a message came from a human user, a peer agent managed by the same runtime, or an external bot it's never seen before. This is the sender-awareness gap.
Sender-awareness means the agent receives structured metadata about the message origin — sender_type: "user", sender_type: "agent", sender_id: "agent-b" — and can make context-dependent decisions about whether to respond. It's a fourth strategy that sits between framework-level filtering (the agent never sees the message) and no filtering at all (the agent sees everything and must figure it out).
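Concretely, a sender-aware agent would receive something like the following envelope and apply a per-sender policy. This is a hypothetical API sketch, since no surveyed framework implements it:

```python
from dataclasses import dataclass

@dataclass
class Envelope:
    sender_type: str  # "user" | "agent" | "unknown"
    sender_id: str
    text: str

def should_respond(msg: Envelope, self_id: str) -> bool:
    """Sender-aware response policy sketch: structured origin metadata
    lets the agent decide per sender instead of treating all messages alike."""
    if msg.sender_id == self_id:
        return False              # never answer yourself
    if msg.sender_type == "user":
        return True               # always serve humans
    # Peer agents: respond only when directly addressed.
    return f"@{self_id}" in msg.text

should_respond(Envelope("user", "alice", "plan dinner"), "agent-a")           # True
should_respond(Envelope("agent", "agent-b", "any ideas?"), "agent-a")         # False
should_respond(Envelope("agent", "agent-b", "@agent-a thoughts?"), "agent-a") # True
```

Note how this combines mention-gating with sender identity: the @mention rule only applies once the sender is known to be an agent.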
Three levels of sender-awareness exist in practice:
- Channel-provided: Telegram, Discord, and Slack all expose is_bot flags on message authors. A framework that forwards this metadata gives agents ground truth about bot vs. human senders — but only on platforms that provide it. IRC, plain WebSocket, and most custom integrations don't.
- Runtime-injected: The framework tracks which agents it manages and tags their messages internally. This works across all channels but only covers agents within the same runtime. An external bot joining a Telegram group from a different system appears as an unknown sender.
- Agent-declared: Each agent publishes its identity to a roster. Other agents receive the roster and can recognize peers. This is the most flexible but requires a coordination protocol that no surveyed framework currently implements.
None of the seven frameworks we surveyed implement sender-awareness as a first-class feature. OpenClaw's mention-gating is the closest — it gates on message content (does the message @mention me?) rather than sender identity (is this from a bot?). AutoGen's GroupChatManager knows all participants but doesn't expose sender type to individual agents. The OpenAI Agents SDK sidesteps the issue by eliminating shared channels entirely.
The external agent problem is where sender-awareness gets hard. A runtime that manages its own agents has ground truth: it spawned them, it knows their IDs, it can tag their messages. But when an external bot — managed by a different system, running on a different machine — joins the same Telegram group, the runtime has no way to identify it as an agent rather than a human. It must fall back to channel-level is_bot flags, which are platform-dependent and can be spoofed. This creates an asymmetry: internal agents are fully identified, external agents are best-effort. For local-first runtimes where all agents run in a single process, this asymmetry is acceptable — the common case (all agents are internal) has perfect information. For cloud-hosted multi-tenant systems, it's a gap that needs protocol-level solutions.
What the research says
The MAST taxonomy (March 2025) analyzed 1,600+ traces across seven frameworks and found that task termination policies account for 25.6% of all multi-agent failures. Missing call-chain state tracking and insufficient dynamic termination detection are the root causes. Loops aren't an edge case — they're a quarter of all failures.
An empirical study of agent developer practices confirms the pattern: in high-concurrency scenarios, loops are harder to interrupt externally. The study documents cases where developers discovered loops only after significant token consumption, with no framework-level alerting.
The most promising detection approach comes from unsupervised cycle detection research (November 2025). A hybrid method combining temporal call stack analysis (identifies structural loops) with semantic similarity analysis (detects content redundancy) achieved an F1 score of 0.72 on 1,575 LangGraph trajectories — far outperforming either method alone. The finding: existing observability platforms (Datadog, Langfuse) miss both structural repetition and semantic redundancy. Current monitoring tools weren't designed for this failure mode.
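The hybrid idea can be miniaturized. In the toy version below, a trajectory is flagged when the recent call sequence repeats exactly (structural) or consecutive outputs are near-duplicates (semantic). The paper uses call-stack analysis and embedding similarity; word-level Jaccard overlap stands in here to keep the sketch dependency-free.

```python
def looks_stuck(calls: list[str], outputs: list[str],
                window: int = 3, sim_threshold: float = 0.8) -> bool:
    """Toy hybrid loop detector: structural repetition OR semantic
    redundancy trips the flag. Illustrative, not the paper's method."""
    # Structural: last `window` calls identical to the `window` before them.
    if len(calls) >= 2 * window and calls[-window:] == calls[-2 * window:-window]:
        return True
    # Semantic: word-level Jaccard similarity of the last two outputs.
    if len(outputs) >= 2:
        a, b = set(outputs[-1].split()), set(outputs[-2].split())
        if a and b and len(a & b) / len(a | b) >= sim_threshold:
            return True
    return False

looks_stuck(["plan", "search", "write", "plan", "search", "write"], [])  # True
```

Even this crude version catches both failure shapes the research identifies: an agent cycling through the same tool calls, and an agent producing the same answer in different turns without repeating a call.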
The production failure rate across multi-agent systems ranges from 41% to 87%. The 9-day relay case — where a circular agent-to-agent message chain consumed 60,000+ tokens over nine days — demonstrates what happens when no termination mechanism exists. As we explored in how agents call agents, bidirectional communication without circuit breakers is the fastest path to these failures.
Open questions
Is mention-gating sufficient or just a workaround? OpenClaw's approach is the only one designed for shared-channel scenarios. But it relies on a prompt-level constraint — "don't @mention other agents" — rather than a framework guarantee. If an agent's response includes "@agent-b what do you think?", the loop starts. Is this a pragmatic solution or a fragile patch?
Should loop detection be semantic or structural? Iteration caps (structural) are simple but blind. The hybrid detection research shows that semantic analysis catches loops that structural methods miss — agents producing redundant content without repeating exact tool calls. But semantic detection requires embedding comparisons at every step, adding latency and cost. Is the tradeoff worth it?
Who should own termination? Three candidates: the framework (iteration caps), the agent itself (Google ADK's exit_loop tool), or an external watchdog (proposed in OpenClaw issue #16808). Framework-level caps are reliable but blunt. Agent self-termination requires the agent to recognize it's stuck — which is exactly what it fails to do when it's looping. External watchdogs add complexity but could provide the monitoring layer that task registries currently lack. The multi-agent coordination research suggests that no single ownership model dominates.
Can shared-channel multi-agent work reliably? The evidence is mixed. OpenClaw's broadcast groups and AutoGen's GroupChat both support it. But every framework that enables shared channels has documented loop failures. Claude Code and the OpenAI Agents SDK sidestep the problem entirely by forbidding implicit shared communication. Is the shared-channel pattern fundamentally unsafe for autonomous agents, or does it just need better guardrails?
What happens when loop prevention interacts with context compaction? If an agent's context is compacted mid-loop, it may lose the state that would help it recognize it's stuck. The loop continues, but now without the evidence that would trigger a termination condition. None of the surveyed frameworks coordinate loop detection with context compaction — a gap we explored in context compaction across frameworks.
Further reading
- OpenClaw Broadcast Groups — multiple agents in one channel
- OpenClaw Agent Loop — execution lifecycle and serialization
- AutoGen GroupChat — broadcast-based multi-agent conversations
- Google ADK Loop Agents — mandatory iteration limits with checker pattern
- OpenAI Agents SDK Handoffs — bidirectional by design
- MAST: Multi-Agent Systems Failure Taxonomy — 25.6% of failures are termination-related
- Unsupervised Cycle Detection in Agentic Applications — hybrid structural + semantic detection
- Multi-Agent LLM Systems Fail — 41–87% production failure rate
- Agent Deadlock Syndrome — circular authority deferral pattern