Source from repo
Agent Skills for Context Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.
muratcankoylan (GitHub)
Files: 241 | Skill: n/a | Size: 2.6 MB | Entrypoint: SKILL.md | Format: git-repo
skills/multi-agent-patterns/SKILL.md

---
name: multi-agent-patterns
description: This skill should be used when the user asks to "design multi-agent system", "implement supervisor pattern", "create swarm architecture", "coordinate multiple agents", or mentions multi-agent patterns, context isolation, agent handoffs, sub-agents, or parallel agent execution.
---

# Multi-Agent Architecture Patterns

Multi-agent architectures distribute work across multiple language model instances, each with its own context window. When designed well, this distribution enables capabilities beyond single-agent limits. When designed poorly, it introduces coordination overhead that negates benefits. The critical insight is that sub-agents exist primarily to isolate context, not to anthropomorphize role division.

## When to Activate

Activate this skill when:
- Single-agent context limits constrain task complexity
- Tasks decompose naturally into parallel subtasks
- Different subtasks require different tool sets or system prompts
- Building systems that must handle multiple domains simultaneously
- Scaling agent capabilities beyond single-context limits
- Designing production agent systems with multiple specialized components

## Core Concepts

Use multi-agent patterns when a single agent's context window cannot hold all task-relevant information. Context isolation is the primary benefit: each agent operates in a clean context without accumulated noise from other subtasks, preventing the telephone game problem where information degrades through repeated summarization.

Choose among three dominant patterns based on coordination needs, not organizational metaphor:

- **Supervisor/orchestrator**: Use for centralized control when tasks have clear decomposition and human oversight matters. A single coordinator delegates to specialists and synthesizes results.
- **Peer-to-peer/swarm**: Use for flexible exploration when rigid planning is counterproductive. Any agent can transfer control to any other through explicit handoff mechanisms.
- **Hierarchical**: Use for large-scale projects with layered abstraction (strategy, planning, execution). Each layer operates at a different level of detail with its own context structure.

Design every multi-agent system around explicit coordination protocols, consensus mechanisms that resist sycophancy, and failure handling that prevents error propagation cascades.

## Detailed Topics

### Why Multi-Agent Architectures

**The Context Bottleneck**
Reach for multi-agent architectures when a single agent's context fills with accumulated history, retrieved documents, and tool outputs to the point where performance degrades. Recognize three degradation signals: the lost-in-middle effect (attention weakens for mid-context content), attention scarcity (too many competing items), and context poisoning (irrelevant content displaces useful content).

Partition work across multiple context windows so each agent operates in a clean context focused on its subtask. Aggregate results at a coordination layer without any single context bearing the full burden.

**The Token Economics Reality**
Budget for substantially higher token costs. Production data shows multi-agent systems run at approximately 15x the token cost of a single-agent chat:

| Architecture | Token Multiplier | Use Case |
|--------------|------------------|----------|
| Single agent chat | 1x baseline | Simple queries |
| Single agent with tools | ~4x baseline | Tool-using tasks |
| Multi-agent system | ~15x baseline | Complex research/coordination |

Research on the BrowseComp evaluation found that three factors explain 95% of performance variance: token usage (80% of variance), number of tool calls, and model choice. This validates distributing work across agents with separate context windows to add capacity for parallel reasoning.

Prioritize model selection alongside architecture design, because upgrading to better models often provides larger performance gains than doubling token budgets. BrowseComp data shows that model quality improvements frequently outperform raw token increases. Treat model selection and multi-agent architecture as complementary strategies.

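The multipliers in the table can be turned into a rough budget check. The following is a minimal sketch, assuming the approximate 1x/4x/15x figures above; the function names and the per-million-token price are hypothetical and only illustrate the arithmetic.

```python
# Approximate multipliers from the table above, relative to a single-agent chat.
TOKEN_MULTIPLIER = {
    "single_agent_chat": 1,
    "single_agent_tools": 4,
    "multi_agent": 15,
}


def estimate_tokens(baseline_tokens: int, architecture: str) -> int:
    """Return a rough token budget for one task under the given architecture."""
    return baseline_tokens * TOKEN_MULTIPLIER[architecture]


def estimate_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token budget into dollars at a hypothetical blended price."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    budget = estimate_tokens(2_000, "multi_agent")
    print(budget, estimate_cost_usd(budget, usd_per_million_tokens=10.0))
```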
**The Parallelization Argument**
Assign parallelizable subtasks to dedicated agents with fresh contexts rather than processing them sequentially in a single agent. A research task requiring searches across multiple independent sources, analysis of different documents, or comparison of competing approaches benefits from parallel execution. Total real-world time approaches the duration of the longest subtask rather than the sum of all subtasks.

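The wall-time claim can be illustrated with `asyncio.gather`. This is a sketch only: `run_subagent` is a hypothetical stand-in that sleeps instead of calling a model, and the subtask names and durations are made up for illustration.

```python
import asyncio
import time


async def run_subagent(name: str, seconds: float) -> str:
    """Stand-in for a sub-agent call; sleeps instead of calling a model."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_parallel(subtasks: dict[str, float]) -> tuple[list[str], float]:
    """Run all subtasks concurrently; return results plus elapsed wall time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(run_subagent(name, secs) for name, secs in subtasks.items())
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(
        run_parallel({"search": 0.3, "analyze": 0.2, "compare": 0.1})
    )
    # Elapsed time tracks the longest subtask rather than the sum of all three.
    print(results, f"{elapsed:.2f}s")
```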
**The Specialization Argument**
Configure each agent with only the system prompt, tools, and context it needs for its specific subtask. A general-purpose agent must carry all possible configurations in context, diluting attention. Specialized agents carry only what they need, operating with lean context optimized for their domain. Route from a coordinator to specialized agents to achieve specialization without combinatorial explosion.

### Architectural Patterns

**Pattern 1: Supervisor/Orchestrator**
Deploy a central agent that maintains global state and trajectory, decomposes user objectives into subtasks, and routes to appropriate workers.

```
User Query -> Supervisor -> [Specialist, Specialist, Specialist] -> Aggregation -> Final Output
```

Choose this pattern when: tasks have clear decomposition, coordination across domains is needed, or human oversight is important.

Expect these trade-offs: strict workflow control and easier human-in-the-loop interventions, but the supervisor context becomes a bottleneck, supervisor failures cascade to all workers, and the "telephone game" problem emerges where supervisors paraphrase sub-agent responses incorrectly.

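The supervisor flow above can be sketched in a few lines. This is an illustrative outline, not a framework API: the `supervise` function, the `plan` mapping, and the specialist callables are all hypothetical, and a real supervisor would generate the plan with a model call rather than receive it as an argument.

```python
from typing import Callable

# Hypothetical specialists: each is a plain function standing in for a
# sub-agent that would run in its own context window.
Specialist = Callable[[str], str]


def supervise(query: str, plan: dict[str, str], specialists: dict[str, Specialist]) -> str:
    """Route each subtask to its specialist, then aggregate the results.

    `plan` maps specialist name -> subtask instruction.
    """
    results = {}
    for name, instruction in plan.items():
        if name not in specialists:
            raise KeyError(f"no specialist registered for {name!r}")
        results[name] = specialists[name](instruction)
    # Aggregation keeps each specialist's output verbatim, labeled by source,
    # so the supervisor does not paraphrase it.
    sections = [f"[{name}] {output}" for name, output in results.items()]
    return f"Query: {query}\n" + "\n".join(sections)
```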
**The Telephone Game Problem and Solution**
Anticipate that supervisor architectures initially perform approximately 50% worse than optimized versions due to the telephone game problem (LangGraph benchmarks). Supervisors paraphrase sub-agent responses, losing fidelity with each pass.

Fix this by implementing a `forward_message` tool that allows sub-agents to pass responses directly to users:

```python
def forward_message(message: str, to_user: bool = True) -> dict:
    """
    Forward sub-agent response directly to user without supervisor synthesis.

    Use when:
    - Sub-agent response is final and complete
    - Supervisor synthesis would lose important details
    - Response format must be preserved exactly
    """
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "supervisor_input", "content": message}
```

Prefer swarm architectures over supervisors when sub-agents can respond directly to users, as this eliminates translation errors entirely.

**Pattern 2: Peer-to-Peer/Swarm**
Remove central control and allow agents to communicate directly based on predefined protocols. Any agent transfers control to any other through explicit handoff mechanisms.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:  # Minimal stand-in for a framework-provided agent class
    name: str
    functions: list[Callable] = field(default_factory=list)

agent_b = Agent(name="Agent B")

def transfer_to_agent_b() -> Agent:
    return agent_b  # Handoff via function return

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])
```

Choose this pattern when: tasks require flexible exploration, rigid planning is counterproductive, or requirements emerge dynamically and defy upfront decomposition.

Expect these trade-offs: no single point of failure and effective breadth-first scaling, but coordination complexity increases with agent count, divergence risk rises without a central state keeper, and robust convergence constraints become essential.

Define explicit handoff protocols with state passing. Ensure agents communicate their context needs to receiving agents.

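One way to make a handoff protocol explicit is to pass a small state object with every transfer. The sketch below assumes a hypothetical `Handoff` schema and a hypothetical `MAX_HOPS` convergence limit; neither comes from a specific framework.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """State passed from one agent to the next (hypothetical schema)."""
    target: str  # name of the receiving agent
    task: str  # what the receiver should do
    facts: dict[str, str] = field(default_factory=dict)  # state to carry over
    hops: int = 0  # number of transfers so far


MAX_HOPS = 5  # assumed convergence constraint to stop endless transfers


def hand_off(current: Handoff, target: str, task: str, **new_facts: str) -> Handoff:
    """Build the next handoff, carrying forward accumulated facts."""
    if current.hops + 1 > MAX_HOPS:
        raise RuntimeError("handoff limit reached; escalate to a human or stop")
    return Handoff(target, task, {**current.facts, **new_facts}, current.hops + 1)
```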
**Pattern 3: Hierarchical**
Organize agents into layers of abstraction: strategy (goal definition), planning (task decomposition), and execution (atomic tasks).

```
Strategy Layer (Goal Definition) -> Planning Layer (Task Decomposition) -> Execution Layer (Atomic Tasks)
```

Choose this pattern when: projects have clear hierarchical structure, workflows involve management layers, or tasks require both high-level planning and detailed execution.

Expect these trade-offs: clear separation of concerns and support for different context structures at different levels, but coordination overhead between layers, potential strategy-execution misalignment, and complex error propagation paths.

### Context Isolation as Design Principle

Treat context isolation as the primary purpose of multi-agent architectures. Each sub-agent should operate in a clean context window focused on its subtask without carrying accumulated context from other subtasks.

**Isolation Mechanisms**
Select the right isolation mechanism for each subtask:

- **Full context delegation**: Share the planner's entire context with the sub-agent. Use for complex tasks where the sub-agent needs complete understanding. The sub-agent has its own tools and instructions but receives full context for its decisions. Note: this partially defeats the purpose of context isolation.
- **Instruction passing**: Create instructions via function call; the sub-agent receives only what it needs. Use for simple, well-defined subtasks. Maintains isolation but limits sub-agent flexibility.
- **File system memory**: Agents read and write to persistent storage. Use for complex tasks requiring shared state. The file system serves as the coordination mechanism, avoiding context bloat from shared state passing. Introduces latency and consistency challenges but scales better than message-passing.

Choose based on task complexity, coordination needs, and acceptable latency. Default to instruction passing and escalate to file system memory when shared state is needed. Avoid full context delegation unless the subtask genuinely requires it.

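The two recommended mechanisms, instruction passing and file system memory, might look like the following. This is a sketch using only the standard library; the function names, the one-file-per-agent layout, and the JSON format are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def build_instruction(task: str, constraints: list[str]) -> str:
    """Instruction passing: the sub-agent sees only this string, not the planner's history."""
    lines = [f"Task: {task}"] + [f"- {c}" for c in constraints]
    return "\n".join(lines)


def write_finding(workspace: Path, agent: str, finding: dict) -> Path:
    """File system memory: each agent writes its own file to avoid write conflicts."""
    path = workspace / f"{agent}.json"
    path.write_text(json.dumps(finding), encoding="utf-8")
    return path


def read_findings(workspace: Path) -> dict[str, dict]:
    """Load every agent's findings, keyed by agent name."""
    return {
        p.stem: json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(workspace.glob("*.json"))
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        write_finding(ws, "researcher", {"sources": 3})
        print(build_instruction("Summarize findings", ["cite sources"]))
        print(read_findings(ws))
```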
### Consensus and Coordination

**The Voting Problem**
Avoid simple majority voting: it treats hallucinations from weak models as equal to reasoning from strong models. Without intervention, multi-agent discussions devolve into consensus on false premises due to inherent bias toward agreement.

**Weighted Voting**
Weight agent votes by confidence or expertise. Agents with higher confidence or domain expertise should carry more weight in final decisions.

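A weighted tally can be sketched as follows. How the weights are obtained (self-reported confidence, a per-domain expertise score, or something else) is left open; the `weighted_vote` function and its input format are hypothetical.

```python
from collections import defaultdict


def weighted_vote(votes: list[tuple[str, float]]) -> tuple[str, float]:
    """Return the winning answer and its share of total weight.

    Each vote is (answer, weight), where weight encodes confidence or expertise.
    """
    if not votes:
        raise ValueError("no votes to tally")
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        if weight < 0:
            raise ValueError("weights must be non-negative")
        totals[answer] += weight
    total_weight = sum(totals.values())
    if total_weight == 0:
        raise ValueError("all weights are zero")
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total_weight


if __name__ == "__main__":
    # Two low-weight agents say "B", one high-weight agent says "A":
    # a simple majority would pick "B", the weighted tally picks "A".
    print(weighted_vote([("A", 0.9), ("B", 0.3), ("B", 0.3)]))
```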
**Debate Protocols**
Structure agents to critique each other's outputs over multiple rounds. Adversarial critique often yields higher accuracy on complex reasoning than collaborative consensus. Guard against sycophantic convergence where agents agree to be agreeable rather than correct.

**Trigger-Based Intervention**
Monitor multi-agent interactions for behavioral markers. Activate stall triggers when discussions make no progress. Detect sycophancy triggers when agents mimic each other's answers without unique reasoning.

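The two triggers can be approximated with simple heuristics over the debate transcript. The sketch below is one possible reading, assuming answers and rationales are available as plain strings per round; the exact-match comparisons are deliberately crude, and a production system would likely need semantic similarity instead.

```python
def is_stalled(round_answers: list[list[str]], window: int = 2) -> bool:
    """Stall trigger: the set of answers has not changed over the last `window` rounds."""
    if len(round_answers) < window + 1:
        return False
    recent = [sorted(r) for r in round_answers[-(window + 1):]]
    return all(r == recent[0] for r in recent)


def is_sycophantic(previous: list[str], current: list[str], rationales: list[str]) -> bool:
    """Sycophancy trigger: agents that disagreed now agree and share identical reasoning."""
    disagreed_before = len(set(previous)) > 1
    agree_now = len(set(current)) == 1
    no_unique_reasoning = len({r.strip().lower() for r in rationales}) <= 1
    return disagreed_before and agree_now and no_unique_reasoning
```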
### Framework Considerations

Different frameworks implement these patterns with different philosophies. LangGraph uses graph-based state machines with explicit nodes and edges. AutoGen uses conversational/event-driven patterns with GroupChat. CrewAI uses role-based process flows with hierarchical crew structures.

## Practical Guidance

### Failure Modes and Mitigations

**Failure: Supervisor Bottleneck**
The supervisor accumulates context from all workers, becoming susceptible to saturation and degradation.

Mitigate by constraining worker output schemas so workers return only distilled summaries. Use checkpointing to persist supervisor state without carrying full history in context.

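A constrained output schema might look like the following. The `WorkerResult` fields and the 500-character cap are assumptions for illustration. The truncation here only enforces the size limit; a real worker would produce the summary with a model call and store the full output at `artifact_path`.

```python
from dataclasses import dataclass

MAX_SUMMARY_CHARS = 500  # assumed cap; tune to the supervisor's context budget


@dataclass(frozen=True)
class WorkerResult:
    """Distilled result a worker returns to the supervisor (hypothetical schema)."""
    worker: str
    status: str  # "ok" or "failed"
    summary: str  # short finding, not the raw transcript
    artifact_path: str  # where the full output lives, for on-demand lookup


def distill(worker: str, raw_output: str, artifact_path: str) -> WorkerResult:
    """Enforce the summary cap; the full text is expected to live at artifact_path."""
    summary = raw_output.strip()
    if len(summary) > MAX_SUMMARY_CHARS:
        summary = summary[: MAX_SUMMARY_CHARS - 3] + "..."
    return WorkerResult(worker, "ok", summary, artifact_path)
```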
**Failure: Coordination Overhead**
Agent communication consumes tokens and introduces latency. Complex coordination can negate parallelization benefits.

Mitigate by minimizing communication through clear handoff protocols. Batch results where possible. Use asynchronous communication patterns. Measure whether multi-agent coordination actually saves time versus a single agent with a longer context.

**Failure: Divergence**
Agents pursuing different goals without central coordination drift from intended objectives.

Mitigate by defining clear objective boundaries for each agent. Implement convergence checks that verify progress toward shared goals. Set time-to-live limits on agent execution to prevent unbounded exploration.

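A time-to-live limit can combine a step budget with a wall-clock deadline. The sketch below is one way to do that; the `TimeToLive` class and `run_agent` loop are hypothetical, and `step` stands in for a single agent iteration that returns `None` until it has a result.

```python
import time


class TimeToLive:
    """Budget that expires after max_steps iterations or max_seconds of wall time."""

    def __init__(self, max_steps: int, max_seconds: float):
        self.max_steps = max_steps
        self.deadline = time.monotonic() + max_seconds
        self.steps = 0

    def tick(self) -> bool:
        """Record one step; return True while the agent may keep running."""
        self.steps += 1
        return self.steps <= self.max_steps and time.monotonic() < self.deadline


def run_agent(step, ttl: TimeToLive) -> str:
    """Call step() until it returns a result or the TTL expires."""
    while ttl.tick():
        result = step()
        if result is not None:
            return result
    return "stopped: time-to-live exceeded"
```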
**Failure: Error Propagation**
Errors in one agent's output propagate to downstream agents that consume that output, compounding into increasingly wrong results.

Mitigate by validating agent outputs before passing to consumers. Implement retry logic with circuit breakers. Use idempotent operations where possible. Consider adding a verification agent that cross-checks critical outputs before they enter the pipeline.

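Validation, retries, and a circuit breaker can be combined in one small wrapper. This is a simplified sketch: the `CircuitBreaker` class is hypothetical, `agent` and `validate` are caller-supplied callables, and a production breaker would usually also reopen after a cool-down period, which is omitted here.

```python
class CircuitOpen(Exception):
    """Raised when an agent has failed too often and is disabled."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    def call(self, agent, validate, task: str, retries: int = 2):
        """Run agent(task), retrying until validate(output) passes or retries run out."""
        if self.failures >= self.max_failures:
            raise CircuitOpen("agent disabled after repeated invalid outputs")
        for _ in range(retries + 1):
            output = agent(task)
            if validate(output):
                self.failures = 0  # a valid output resets the failure count
                return output
            self.failures += 1
            if self.failures >= self.max_failures:
                break
        # Invalid output never reaches downstream consumers.
        raise ValueError("agent output failed validation")
```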
## Examples

**Example 1: Research Team Architecture**
```text
Supervisor
├── Researcher (web search, document retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification, validation)
└── Writer (report generation, formatting)
```

**Example 2: Handoff Protocol**
```python
def handle_customer_request(request):
    # transfer_to, handle_general, and the *_agent objects are assumed to be
    # provided by the surrounding agent framework.
    if request.type == "billing":
        return transfer_to(billing_agent)
    elif request.type == "technical":
        return transfer_to(technical_agent)
    elif request.type == "sales":
        return transfer_to(sales_agent)
    else:
        return handle_general(request)
```

## Guidelines

1. Design for context isolation as the primary benefit of multi-agent systems
2. Choose architecture pattern based on coordination needs, not organizational metaphor
3. Implement explicit handoff protocols with state passing
4. Use weighted voting or debate protocols for consensus
5. Monitor for supervisor bottlenecks and implement checkpointing
6. Validate outputs before passing between agents
7. Set time-to-live limits to prevent infinite loops
8. Test failure scenarios explicitly

## Gotchas

1. **Supervisor bottleneck scaling**: Supervisor context pressure grows non-linearly with worker count. At 5+ workers, the supervisor can spend more tokens processing summaries than workers spend on actual tasks. Set a hard cap on workers per supervisor (3-5) and add a second supervisor tier rather than overloading one.
2. **Token cost underestimation**: Multi-agent runs cost approximately 15x baseline. Teams consistently underbudget because they estimate per-agent costs without accounting for coordination overhead, retries, and consensus rounds. Budget for 15x and treat anything less as a bonus.
3. **Sycophantic consensus**: Agents in debate patterns tend to converge on agreeable answers, not correct ones. LLMs have an inherent bias toward agreement. Counter this by assigning explicit adversarial roles and requiring agents to state disagreements before convergence is allowed.
4. **Agent sprawl**: Adding more agents past 3-5 shows diminishing returns and increases coordination overhead. With n agents that can all talk to each other, there are n(n-1)/2 pairwise communication channels, so channels grow quadratically with agent count (3 agents have 3 channels, 5 agents have 10). Start with the minimum viable number of agents and add only when a clear context isolation benefit exists.
5. **Telephone game in message-passing**: Information degrades through repeated summarization as it passes between agents. Each agent paraphrases and loses nuance. Use filesystem coordination instead of message-passing for state that multiple agents need to access faithfully.
6. **Error propagation cascades**: One agent's hallucination becomes another agent's "fact." Downstream agents have no way to distinguish upstream hallucinations from genuine information. Add validation checkpoints between agents and never trust upstream output without verification.
7. **Over-decomposition**: Splitting tasks too finely creates more coordination overhead than the task itself. A 10-step pipeline with 10 agents can spend more tokens on handoffs than on actual work. Decompose only when subtasks genuinely benefit from separate contexts.
8. **Missing shared state**: Agents operating without a shared filesystem or state store duplicate work, produce inconsistent outputs, and lose track of what has already been accomplished. Establish shared persistent storage before building multi-agent workflows.

## Integration

This skill builds on context-fundamentals and context-degradation. It connects to:

- memory-systems - Shared state management across agents
- tool-design - Tool specialization per agent
- context-optimization - Context partitioning strategies
- latent-briefing - KV-cache trajectory handoff between orchestrator and worker when models align

## References

Internal reference:
- [Frameworks Reference](./references/frameworks.md) - Read when: implementing a specific multi-agent pattern in LangGraph, AutoGen, or CrewAI and needing framework-specific code examples

Related skills in this collection:
- context-fundamentals - Read when: needing to understand context window mechanics before designing agent partitioning
- memory-systems - Read when: agents need to share state across context boundaries or persist information between runs
- context-optimization - Read when: individual agent contexts are too large and need partitioning or compression strategies

External resources:
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/) - Read when: building graph-based multi-agent workflows with explicit state machines
- [AutoGen Framework](https://microsoft.github.io/autogen/) - Read when: implementing conversational GroupChat patterns or event-driven agent coordination
- [CrewAI Documentation](https://docs.crewai.com/) - Read when: designing role-based hierarchical agent processes
- [MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework](https://arxiv.org/abs/2308.00352) - Read when: needing academic grounding on multi-agent system design and evaluation

---

## Skill Metadata

**Created**: 2025-12-20
**Last Updated**: 2026-03-17
**Author**: Agent Skills for Context Engineering Contributors
**Version**: 2.0.0