5 Production Patterns for Multi-Agent AI Systems
Quick reference from running 57 agents 24/7 for 6+ months.
These patterns solved 90% of our production failures. Each one is implemented in the Guardian Agent Prompts collection.
Pattern 1: Identity Block
Problem: Agents drift from their role and try to do everything themselves.
Solution: Start every prompt with an explicit identity constraint.
```
## IDENTITY
You are the Security Auditor. You audit code for vulnerabilities.
You do NOT fix code. You do NOT refactor. You do NOT optimize.
You FIND vulnerabilities and REPORT them with evidence.
```
Why it works: Without role boundaries, LLMs default to “helpful assistant” mode. The identity block prevents scope creep — your orchestrator won’t start coding, your coder won’t start managing.
Metric: Reduced off-task behavior from ~40% to <5% of responses.
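The identity block can be wired in as a simple prompt-assembly step. A minimal sketch, assuming a `build_system_prompt` helper of our own invention (not part of the collection):

```python
# Prepend the identity block to every system prompt so the role
# constraint anchors the rest of the instructions.
IDENTITY_BLOCK = """\
## IDENTITY
You are the Security Auditor. You audit code for vulnerabilities.
You do NOT fix code. You do NOT refactor. You do NOT optimize.
You FIND vulnerabilities and REPORT them with evidence."""

def build_system_prompt(identity: str, instructions: str) -> str:
    """Identity always comes first; task-specific instructions follow."""
    return f"{identity}\n\n{instructions}"

prompt = build_system_prompt(IDENTITY_BLOCK,
                             "Audit the attached diff for injection flaws.")
```

The same helper works for any specialist: swap the identity text, keep the ordering.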
Pattern 2: Anti-Duplication Registry
Problem: Multiple agents pick up the same task simultaneously.
Solution: Force a check-then-claim pattern before any work begins.
```
## BEFORE ANY WORK
1. Check the task registry: "Is anyone already doing this?"
2. If YES → contact the existing owner, do NOT duplicate
3. If NO → claim the task with: description, your name, timestamp
4. Only THEN begin working
```
Why it works: Two agents producing conflicting outputs is worse than one agent doing nothing. A shared registry (SQLite, shared file, or memory store) eliminates collision.
Metric: Eliminated ~15 daily task collisions in a 57-agent system.
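The check-then-claim step can be made atomic with a SQLite-backed registry. A sketch under assumptions (table name, columns, and `claim_task` are ours, not the collection's schema); the key point is that the check and the claim happen in one transaction, so two agents can never both win:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # shared file path in a real deployment
conn.execute("""CREATE TABLE IF NOT EXISTS tasks (
    description TEXT PRIMARY KEY,   -- one row per task: the claim itself
    owner       TEXT NOT NULL,
    claimed_at  REAL NOT NULL)""")

def claim_task(description: str, agent: str) -> bool:
    """Atomically claim a task; False means someone already owns it."""
    try:
        with conn:  # transaction: check-and-claim is a single step
            conn.execute("INSERT INTO tasks VALUES (?, ?, ?)",
                         (description, agent, time.time()))
        return True
    except sqlite3.IntegrityError:  # PRIMARY KEY collision: already claimed
        return False

assert claim_task("audit auth module", "security-auditor")   # first claim wins
assert not claim_task("audit auth module", "code-reviewer")  # duplicate blocked
```

Relying on the PRIMARY KEY constraint (rather than a SELECT-then-INSERT) is what closes the race window between two agents checking at the same moment.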
Pattern 3: Mandatory Blueprint
Problem: Fire-and-forget delegation produces wrong or incomplete results.
Solution: Require a structured plan before delegating any task.
```
## BLUEPRINT (REQUIRED before delegation)
- Which agent handles this?
- What tools/APIs does that agent need?
- What is the execution order?
- What could go wrong?
- What does "done" look like? (specific, testable criteria)

NEVER delegate without a blueprint.
```
Why it works: The blueprint forces the orchestrator to think through context, dependencies, and success criteria. Agents receiving well-specified tasks succeed 3x more often than those receiving vague instructions.
Metric: Task completion accuracy improved from ~70% to ~93%.
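The blueprint checklist above maps naturally onto a small data structure that the orchestrator must fill in before it is allowed to delegate. A minimal sketch; the field names are our assumptions, not the collection's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Blueprint:
    agent: str                # which agent handles this?
    tools: list[str]          # what tools/APIs does that agent need?
    steps: list[str]          # execution order
    risks: list[str]          # what could go wrong?
    done_criteria: list[str]  # specific, testable definition of "done"

def validate(bp: Blueprint) -> None:
    """Refuse to delegate until every blueprint field is filled in."""
    for name, value in vars(bp).items():
        if not value:
            raise ValueError(f"blueprint incomplete: {name} is empty")

bp = Blueprint(
    agent="coder",
    tools=["git", "pytest"],
    steps=["write failing test", "patch", "run suite"],
    risks=["flaky CI"],
    done_criteria=["pytest exits 0", "diff under review"],
)
validate(bp)  # raises nothing, so delegation may proceed
```

Making `validate` raise on any empty field is the programmatic equivalent of "NEVER delegate without a blueprint."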
Pattern 4: Evidence-Based Quality Gate
Problem: Agents claim “done” without actually completing the work correctly.
Solution: Require proof before accepting any deliverable.
```
## QUALITY GATE (REQUIRED before marking DONE)
Every completed task MUST include:
- EVIDENCE: [specific file, command output, or test result]
- VERIFICATION: [command to independently verify]
- IMPACT: [what changed, measured]

"I think it's fixed" is NOT evidence. A passing test IS evidence.
```
Why it works: Without quality gates, agents learn to shortcut. They’ll say “done” and move on. The gate forces evidence — a file path, a test result, a verified API response.
Metric: False completions dropped from ~25% to <3%.
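The gate can be enforced in code by rejecting any completion report with missing fields and re-running the verification command independently. A sketch under assumptions (the report keys and `passes_quality_gate` are illustrative; the verification command below is a stand-in):

```python
import subprocess

REQUIRED = ("evidence", "verification", "impact")

def passes_quality_gate(report: dict) -> bool:
    """Reject 'done' claims lacking evidence, a verify command, or impact."""
    if any(not report.get(key) for key in REQUIRED):
        return False
    # Independently re-run the verification command; exit code 0 = verified.
    result = subprocess.run(report["verification"], shell=True)
    return result.returncode == 0

report = {
    "evidence": "tests/test_auth.py::test_login passed",
    "verification": "true",  # stand-in for e.g. 'pytest tests/test_auth.py'
    "impact": "login failures on expired tokens fixed",
}
assert passes_quality_gate(report)
assert not passes_quality_gate({"evidence": "I think it's fixed"})
```

Re-executing the command, rather than trusting the agent's transcript of it, is what makes the verification independent.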
Pattern 5: 30-Minute Heartbeat
Problem: Multi-agent systems silently drift into inactivity.
Solution: Built-in self-monitoring that triggers every 30 minutes.
```
## MONITORING CYCLE (every 30 minutes)
1. Check task registry — what's blocked?
2. Check communications — any agent reports?
3. Ask yourself: "What did I PRODUCE in the last 30 minutes?"
   If nothing → open the backlog and start the next task
4. Any agent silent for 30+ minutes on assigned work?
   → Send a follow-up, then reassign if no response in 15 min
```
Why it works: Without active self-monitoring, orchestrators wait for input indefinitely. The heartbeat turns a passive coordinator into a proactive manager that catches stalled work.
Metric: System idle time reduced from ~60% to ~15%.
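One monitoring tick of the heartbeat can be sketched as a pure function over timestamps, returning the actions the orchestrator should take. The registry shape, `last_output_at`, and the action strings are our assumptions, not the collection's implementation:

```python
import time

HEARTBEAT = 30 * 60        # 30 minutes, in seconds
FOLLOW_UP_GRACE = 15 * 60  # reassign after 15 more silent minutes

def monitoring_tick(now, registry, last_output_at):
    """Return the actions the orchestrator should take this cycle."""
    actions = []
    if now - last_output_at >= HEARTBEAT:
        # Produced nothing in the last 30 minutes: pull from the backlog.
        actions.append("pull next task from backlog")
    for task in registry:
        silent = now - task["last_seen"]
        if silent >= HEARTBEAT + FOLLOW_UP_GRACE:
            actions.append(f"reassign {task['name']}")
        elif silent >= HEARTBEAT:
            actions.append(f"follow up with {task['owner']} on {task['name']}")
    return actions

now = time.time()
registry = [{"name": "audit", "owner": "security-auditor",
             "last_seen": now - 31 * 60}]  # silent for 31 minutes
print(monitoring_tick(now, registry, last_output_at=now - 5 * 60))
# → ['follow up with security-auditor on audit']
```

Keeping the tick side-effect-free (it only returns actions) makes it easy to drive from any scheduler: cron, an async loop, or the orchestrator's own prompt cycle.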
Using These Patterns
Minimal Setup (5 minutes)
- Copy the Orchestrator prompt from this repo
- Set it as the system prompt for your main coordinating agent
- Create 2-3 specialist agents (coder, researcher, etc.)
- Let the orchestrator delegate between them
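The wiring in the steps above can be sketched in a few lines. `call_llm` is a hypothetical stand-in for whichever client you actually use (Anthropic, OpenAI, a local model); the specialist prompts here are abbreviated placeholders, not the repo's full prompts:

```python
SPECIALISTS = {
    "coder":      "You are the Coder. You write and patch code. You do NOT plan.",
    "researcher": "You are the Researcher. You gather sources. You do NOT code.",
}

def call_llm(system_prompt: str, task: str) -> str:
    # Placeholder: swap in your real model client here.
    return f"[{system_prompt.split('.')[0]}] working on: {task}"

def delegate(specialist: str, task: str) -> str:
    """Route a task to a specialist using its identity-scoped system prompt."""
    return call_llm(SPECIALISTS[specialist], task)

print(delegate("coder", "fix the failing login test"))
# → [You are the Coder] working on: fix the failing login test
```

The orchestrator's only job in this loop is choosing the specialist and passing a blueprint-quality task description; the identity prompts keep each specialist in its lane.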
With Claude Code
```bash
# Install the orchestrator as a Claude Code agent
python examples/setup_agent.py prompts/orchestrator.md

# Then use it
claude --agent guardian-orchestrator
```
Works With Any LLM
These patterns are model-agnostic. They work with:
- Claude (Opus, Sonnet, Haiku)
- GPT-4, GPT-5
- Gemini Pro, Flash
- Llama, Mistral, Qwen
- Any framework: n8n, LangGraph, CrewAI, raw API calls
Full Collection
The complete pack includes 49 production-tested prompts across 10 categories: orchestration, trading, security, OSINT, code review, business development, infrastructure monitoring, and more.
Each prompt implements all 5 patterns above, plus category-specific features like ICT trading signals, CVE detection, or B2B lead qualification.
Free samples in this repo: Orchestrator, Security Auditor, Code Reviewer, Business Agent, Trading Agent.