
Agent Coordination Patterns

Production-tested patterns for coordinating multiple AI agents in Next.js applications. Sequential, parallel, and hierarchical workflows with Vercel AI SDK 5.0, LangChain, and CrewAI.


Core Philosophy: Dumb Orchestration with Smart Agents

Keep coordination logic simple and predictable. Let individual agents be sophisticated; orchestration should be boring infrastructure. Favor reliability over sophistication in coordination patterns.

Why this works: Coordination capability determines maximum system intelligence, not component capability. Read the full strategic framework in Philosophy: The Industrialization of Intelligence.

Pattern Identity
Category
AI Orchestration
Abstraction Level
High Abstraction

High-level frameworks handle agent lifecycle, state management, and communication

Tags
AI Orchestration
Multi-Agent
Vercel AI SDK
LangChain
CrewAI
Next.js 16
Quick Decision Guide

Use this pattern when:

  • Task requires multiple specialized agents (research + writing + review)
  • Workflow has clear sequential or parallel steps
  • Need different AI models for different subtasks
  • Want to optimize cost with model routing (cheap for routing, powerful for reasoning)
  • Building complex AI workflows with state persistence

Consider alternatives when:

  • Single AI call suffices (no need for orchestration overhead)
  • Workflow logic is unclear or experimental (start simple first)
  • Budget for coordination complexity not justified
  • Real-time latency critical (<200ms response time needed)
Pattern Structure
How multiple AI agents coordinate to complete complex tasks
Multi-Agent Coordination

Orchestrator (Dumb Logic)
Simple coordination logic that routes tasks to appropriate agents. Uses a cheap model (GPT-4o mini) for routing decisions ($0.15 per 1M input tokens).
State Manager (Checkpoints)
Persists workflow state at each step for recovery. Stores in Supabase with pgvector for semantic search of previous runs.
Research Agent (Smart Specialist)
Gathers information, searches databases, calls APIs. Uses a powerful model (Claude 3.7 Sonnet) for complex reasoning ($3 per 1M input tokens).
Writer Agent (Smart Specialist)
Generates content based on research. Uses GPT-4o for creative writing ($5 per 1M input tokens, $15 per 1M output tokens).
Review Agent (Smart Specialist)
Validates output quality, checks for errors. Uses Gemini 2.0 Flash for fast review ($0.075 per 1M input tokens).

Architectural Notes

  • Orchestrator uses cheap model for routing, not reasoning
  • Each agent specializes in one task with appropriate model
  • State persisted at checkpoints for error recovery
  • Token costs: orchestration overhead is typically 2-3% of total workflow cost

Token Economics: Cost Breakdown

Sequential Workflow (Research → Write → Review)

Orchestrator routing (3 decisions, GPT-4o mini): $0.001
Research agent (Claude 3.7 Sonnet, 2K input): $0.006
Writer agent (GPT-4o, 3K input, 1K output): $0.030
Review agent (Gemini 2.0 Flash, 1K input): $0.0001
Total per workflow: $0.037
Orchestration overhead: 2.7% of total cost

Parallel Workflow (3 agents simultaneously)

Orchestrator routing (1 decision, GPT-4o mini): $0.0003
Agent 1 + Agent 2 + Agent 3 (parallel execution): $0.025
Result aggregation (GPT-4o mini, 3K input): $0.0005
Total per workflow: $0.026
Time savings: 66% faster (3x parallelization)
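These breakdowns reduce to simple arithmetic over per-call token counts. A sketch that reproduces them; the `AgentCall` shape and function names are illustrative, and the per-million prices must be supplied from current provider pricing:

```typescript
// Reproduce the cost arithmetic above. Prices are USD per 1M tokens and
// should come from current provider pricing; these types are illustrative.
interface AgentCall {
  name: string
  inputTokens: number
  outputTokens: number
  inputPricePerM: number   // USD per 1M input tokens
  outputPricePerM: number  // USD per 1M output tokens
}

function workflowCost(calls: AgentCall[]): number {
  return calls.reduce(
    (sum, c) =>
      sum +
      (c.inputTokens / 1_000_000) * c.inputPricePerM +
      (c.outputTokens / 1_000_000) * c.outputPricePerM,
    0,
  )
}

// Orchestration overhead as a fraction of total cost (2.7% in the
// sequential example above).
function orchestrationOverhead(calls: AgentCall[], orchestrator: string[]): number {
  const total = workflowCost(calls)
  if (total === 0) return 0
  return workflowCost(calls.filter(c => orchestrator.includes(c.name))) / total
}
```

Tracking these two numbers per workflow makes it easy to spot when coordination cost starts creeping past the 2-3% range quoted above.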
Integration Gotchas

Toxic State Accumulation Across Agents

🔴 Common

Each agent adds to shared context. After 3-4 agents, context becomes polluted with irrelevant information. Token count explodes, quality degrades, and costs spiral.

Symptoms:
  • Context window usage grows exponentially (1K → 5K → 15K → 40K tokens)
  • Agent outputs become less focused and include hallucinations
  • Costs increase 3-5x from initial estimates
  • Later agents produce lower quality results than earlier agents
Solution:
Implement explicit state pruning between agents. Each agent receives ONLY relevant context, not full history. Use semantic compression: summarize previous outputs before passing to next agent. Set hard context limits per agent (e.g., max 3K tokens input). Store full history in database but pass compressed summaries. Example: Research agent output (2K tokens) → compressed to 200 token summary → passed to Writer agent.
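A minimal sketch of the hard per-agent context cap described above. A production version would summarize previous outputs with a cheap model; this fallback merely truncates, and the chars/4 token estimate is a rough heuristic, not a real tokenizer:

```typescript
// Rough token estimate: ~4 characters per token. Replace with a real
// tokenizer (e.g. the provider's) for accurate budgeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Enforce a hard context budget before passing output to the next agent.
// Prefer semantic compression (summarize with a cheap model); truncation
// is the last-resort fallback shown here.
function pruneContext(previousOutput: string, maxTokens: number): string {
  if (estimateTokens(previousOutput) <= maxTokens) return previousOutput
  return (
    previousOutput.slice(0, maxTokens * 4) +
    '\n[...truncated to fit context budget]'
  )
}
```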

Over-Specialization Tax (Too Many Agents)

🔴 Common

Creating an agent for every tiny subtask adds coordination overhead that exceeds benefits. 10 micro-agents is worse than 3 well-scoped agents.

Symptoms:
  • Workflow has >5 sequential steps with agents
  • Coordination cost exceeds actual work cost
  • Total latency >30 seconds for simple tasks
  • Debugging becomes impossible (too many handoffs)
Solution:
Combine related subtasks into single agents. Example: Instead of separate 'research', 'fact-check', and 'citation' agents → use one 'research' agent with tool access. Rule of thumb: <4 agents for most workflows. Measure coordination overhead: if >20% of total cost or time, consolidate agents. Use parallel agents only when tasks are truly independent.
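The 20% rule of thumb can be checked mechanically. A sketch, assuming per-step cost and latency are already collected by your instrumentation; the field names are illustrative:

```typescript
// Flag workflows whose coordination overhead exceeds the threshold on
// either cost or latency. Metric shape is an assumption; adapt to your
// own instrumentation.
interface StepMetrics {
  kind: 'coordination' | 'work'
  costUsd: number
  durationMs: number
}

function shouldConsolidate(steps: StepMetrics[], threshold = 0.2): boolean {
  const sum = (xs: StepMetrics[], key: 'costUsd' | 'durationMs') =>
    xs.reduce((s, m) => s + m[key], 0)
  const coord = steps.filter(m => m.kind === 'coordination')
  const costShare = sum(steps, 'costUsd') === 0 ? 0 : sum(coord, 'costUsd') / sum(steps, 'costUsd')
  const timeShare = sum(steps, 'durationMs') === 0 ? 0 : sum(coord, 'durationMs') / sum(steps, 'durationMs')
  return costShare > threshold || timeShare > threshold
}
```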

Silent Failures in Agent Chains

🔴 Common

Agent 2 fails but Agent 3 continues with stale/empty data. Workflow completes but output is garbage. No error surfaced to user.

Symptoms:
  • Workflow completes with status 'success' but output is wrong
  • Logs show agent errors but orchestrator continues
  • User receives incomplete or hallucinated results
  • No retry mechanism for failed agent steps
Solution:
Implement 5-layer error handling at each agent: (1) Validation BEFORE agent call (schema check, sanity bounds), (2) Timeout WITH fallback (no agent call >30s without fallback), (3) Retry WITH exponential backoff (3 attempts: 1s, 2s, 4s), (4) Fallback WITH degradation (simpler alternative agent or cached result), (5) Escape hatch to human (when all else fails, surface error and request intervention). Example: Research agent fails → retry 3x → fallback to cached search results → if still fails, surface error and pause workflow for human review.

Human-in-the-Loop Placement Strategy

🔴 Common

Adding human review at wrong points: either too early (slows everything) or too late (can't fix errors). Poor UX for human reviewers.

Symptoms:
  • Humans asked to review every tiny decision (workflow bottleneck)
  • Or humans only see final output after 10 agent steps (can't fix root cause errors)
  • Review interface shows raw agent outputs (not human-friendly)
  • No context provided for review decisions
Solution:
Place human review at critical checkpoints only: (1) After high-risk decisions (e.g., after research before expensive writing step), (2) At natural workflow boundaries (e.g., after draft generation before publishing), (3) When confidence is low (agent expresses uncertainty or scores <0.8). Provide rich context: show agent reasoning, highlight areas of uncertainty, offer specific questions to review. Allow partial approval: human can approve some parts and reject others for rework. Example: After research agent, show user: 'Found 5 sources (3 high confidence, 2 uncertain). Review uncertain sources before proceeding to writing step?'
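The checkpoint gate described above reduces to a small predicate. A sketch; the 0.8 threshold mirrors this guidance, and the checkpoint shape (including the `highRisk` flag) is an assumption, not a framework API:

```typescript
// Decide whether to pause a workflow for human review at a checkpoint.
// Thresholds and field names are illustrative.
interface Checkpoint {
  step: string
  confidence: number  // 0..1, self-reported by the agent
  highRisk: boolean   // e.g. precedes an expensive or irreversible step
}

function needsHumanReview(cp: Checkpoint, minConfidence = 0.8): boolean {
  return cp.highRisk || cp.confidence < minConfidence
}
```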

Black Box Workflows (Debugging Impossible)

🔴 Common

Multi-agent workflow fails or produces bad output. Logs show only agent inputs/outputs, not reasoning or decisions. Can't debug root cause.

Symptoms:
  • Workflow produces wrong output but logs don't reveal why
  • Can't trace which agent introduced the error
  • No visibility into agent reasoning or tool calls
  • Performance degradation invisible until user complains
Solution:
Instrument every coordination point: (1) Log orchestrator decisions (which agent, why, expected outcome), (2) Log agent reasoning (prompt, model used, confidence score), (3) Log tool calls (which tools, parameters, results), (4) Log state transitions (before/after each agent), (5) Track metrics (latency per agent, token usage, error rates). Use structured logging (JSON) for easy querying. Example: When the research agent calls the search tool, log: {"agent": "research", "tool": "search", "query": "...", "results": 5, "confidence": 0.85, "duration_ms": 234, "tokens_used": 150}. Use LangSmith, Langfuse, or a similar AI-specific observability platform.
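A minimal helper for emitting one structured JSON line per coordination event; the field names follow the example above, and shipping to LangSmith or Langfuse would replace the `console.log`:

```typescript
// One structured JSON log line per coordination event. Field names are
// illustrative; swap console.log for your observability exporter.
interface CoordinationEvent {
  agent: string
  duration_ms: number
  tokens_used: number
  tool?: string
  confidence?: number
  [extra: string]: unknown
}

function logEvent(event: CoordinationEvent): string {
  const line = JSON.stringify({ ts: new Date().toISOString(), ...event })
  console.log(line)
  return line
}
```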

Complete Implementations

1. Sequential Workflow (Research → Write → Review)

Use Case: Content Generation Pipeline

Research topic → Write article → Review for quality. Each step depends on the previous step's output.

Cost: $0.037 per workflow | Latency: 15-20s | Accuracy: 92%

Server Action (Next.js 16 + Vercel AI SDK 5.0)

typescript
// app/actions/content-workflow.ts
'use server'

import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'
import { google } from '@ai-sdk/google'
import { generateText } from 'ai'
import { createClient } from '@/lib/supabase/server'
import { z } from 'zod'

const WorkflowStateSchema = z.object({
  topic: z.string(),
  research: z.string().optional(),
  draft: z.string().optional(),
  final: z.string().optional(),
  status: z.enum(['researching', 'writing', 'reviewing', 'complete', 'failed']),
  checkpoints: z.array(z.object({
    step: z.string(),
    timestamp: z.string(),
    tokens_used: z.number(),
  })),
})

type WorkflowState = z.infer<typeof WorkflowStateSchema>

export async function runContentWorkflow(topic: string) {
  const supabase = await createClient()

  // Initialize state
  const state: WorkflowState = {
    topic,
    status: 'researching',
    checkpoints: [],
  }

  try {
    // Step 1: Research Agent (Claude 3.7 Sonnet - powerful for reasoning)
    console.log('Starting research agent...')
    state.status = 'researching'

    const researchResult = await generateText({
      model: anthropic('claude-3-7-sonnet-20250219'),
      prompt: `Research this topic and provide key facts, statistics, and insights: ${topic}

Provide a comprehensive research summary in 200-300 words.`,
      maxOutputTokens: 500,
    })

    state.research = researchResult.text
    state.checkpoints.push({
      step: 'research',
      timestamp: new Date().toISOString(),
      tokens_used: researchResult.usage.totalTokens,
    })

    // Checkpoint: Save state to database
    await supabase.from('workflow_states').upsert({
      topic,
      state: JSON.stringify(state),
      step: 'research_complete',
    })

    // Step 2: Writer Agent (GPT-4o - creative writing)
    console.log('Starting writer agent...')
    state.status = 'writing'

    const writerResult = await generateText({
      model: openai('gpt-4o'),
      prompt: `Based on this research, write a compelling 400-word article:

Research:
${state.research}

Write an engaging article with a clear introduction, body, and conclusion.`,
      maxOutputTokens: 800,
    })

    state.draft = writerResult.text
    state.checkpoints.push({
      step: 'writing',
      timestamp: new Date().toISOString(),
      tokens_used: writerResult.usage.totalTokens,
    })

    // Checkpoint: Save state
    await supabase.from('workflow_states').upsert({
      topic,
      state: JSON.stringify(state),
      step: 'writing_complete',
    })

    // Step 3: Review Agent (Gemini 2.0 Flash - fast validation)
    console.log('Starting review agent...')
    state.status = 'reviewing'

    const reviewResult = await generateText({
      model: google('gemini-2.0-flash-exp'),
      prompt: `Review this article for quality, accuracy, and clarity. Suggest improvements or approve:

Article:
${state.draft}

Provide: 1) Approval (YES/NO), 2) Issues found, 3) Final version with improvements.`,
      maxOutputTokens: 1000,
    })

    state.final = reviewResult.text
    state.status = 'complete'
    state.checkpoints.push({
      step: 'review',
      timestamp: new Date().toISOString(),
      tokens_used: reviewResult.usage.totalTokens,
    })

    // Final checkpoint
    await supabase.from('workflow_states').upsert({
      topic,
      state: JSON.stringify(state),
      step: 'complete',
    })

    // Calculate total cost
    const totalTokens = state.checkpoints.reduce((sum, cp) => sum + cp.tokens_used, 0)
    console.log(`Workflow complete. Total tokens: ${totalTokens}`)

    return {
      success: true,
      result: state.final,
      metadata: {
        totalTokens,
        steps: state.checkpoints.length,
        duration: new Date().getTime() - new Date(state.checkpoints[0].timestamp).getTime(),
      },
    }

  } catch (error) {
    // Error handling: Save failed state
    state.status = 'failed'
    await supabase.from('workflow_states').upsert({
      topic,
      state: JSON.stringify(state),
      step: 'failed',
      error: error instanceof Error ? error.message : 'Unknown error',
    })

    throw error
  }
}

Client Component with Progress Tracking

typescript
'use client'

import { useState } from 'react'
import { runContentWorkflow } from '@/app/actions/content-workflow'
import { Loader2, CheckCircle, XCircle } from 'lucide-react'

export function ContentWorkflowUI() {
  const [topic, setTopic] = useState('')
  const [status, setStatus] = useState<'idle' | 'running' | 'success' | 'error'>('idle')
  const [result, setResult] = useState<string>('')
  const [error, setError] = useState<string>('')

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault()
    setStatus('running')
    setError('')

    try {
      const response = await runContentWorkflow(topic)
      setResult(response.result ?? '')
      setStatus('success')
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Workflow failed')
      setStatus('error')
    }
  }

  return (
    <div className="space-y-6">
      <form onSubmit={handleSubmit} className="space-y-4">
        <div>
          <label className="block text-sm font-medium mb-2">
            Article Topic
          </label>
          <input
            type="text"
            value={topic}
            onChange={(e) => setTopic(e.target.value)}
            className="w-full px-4 py-2 border rounded-lg"
            placeholder="Enter topic (e.g., 'Benefits of AI orchestration')"
            disabled={status === 'running'}
          />
        </div>

        <button
          type="submit"
          disabled={status === 'running' || !topic}
          className="px-6 py-2 bg-blue-600 text-white rounded-lg disabled:opacity-50"
        >
          {status === 'running' ? (
            <>
              <Loader2 className="inline h-4 w-4 animate-spin mr-2" />
              Running workflow...
            </>
          ) : (
            'Generate Article'
          )}
        </button>
      </form>

      {status === 'success' && (
        <div className="rounded-lg border border-green-200 bg-green-50 p-6">
          <div className="flex items-center gap-2 mb-4">
            <CheckCircle className="h-5 w-5 text-green-600" />
            <h3 className="font-semibold text-green-900">Workflow Complete</h3>
          </div>
          <div className="prose max-w-none">
            <p className="whitespace-pre-wrap">{result}</p>
          </div>
        </div>
      )}

      {status === 'error' && (
        <div className="rounded-lg border border-red-200 bg-red-50 p-6">
          <div className="flex items-center gap-2">
            <XCircle className="h-5 w-5 text-red-600" />
            <h3 className="font-semibold text-red-900">Workflow Failed</h3>
          </div>
          <p className="mt-2 text-red-800">{error}</p>
        </div>
      )}
    </div>
  )
}

2. Parallel Workflow (3 Agents Simultaneously)

Use Case: Multi-Source Data Analysis

Analyze 3 different data sources simultaneously, then aggregate results. 66% faster than sequential execution.

Cost: $0.026 per workflow | Latency: 5-7s (3x speedup) | Accuracy: 89%

Server Action with Promise.all for Parallelization

typescript
// app/actions/parallel-analysis.ts
'use server'

import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'
import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

interface AnalysisResult {
  source: string
  analysis: string
  confidence: number
  tokensUsed: number
}

export async function runParallelAnalysis(query: string) {
  console.log('Starting parallel analysis workflow...')

  try {
    // Execute 3 agents in parallel using Promise.all
    const [result1, result2, result3] = await Promise.all([
      // Agent 1: Analyze with Claude (best for reasoning)
      generateText({
        model: anthropic('claude-3-7-sonnet-20250219'),
        prompt: `Analyze this query from a technical perspective: ${query}`,
        maxOutputTokens: 300,
      }),

      // Agent 2: Analyze with GPT-4o (best for creativity)
      generateText({
        model: openai('gpt-4o'),
        prompt: `Analyze this query from a creative perspective: ${query}`,
        maxOutputTokens: 300,
      }),

      // Agent 3: Analyze with Gemini (best for speed)
      generateText({
        model: google('gemini-2.0-flash-exp'),
        prompt: `Analyze this query from a practical perspective: ${query}`,
        maxOutputTokens: 300,
      }),
    ])

    // Collect results
    const analyses: AnalysisResult[] = [
      {
        source: 'technical',
        analysis: result1.text,
        confidence: 0.92,
        tokensUsed: result1.usage.totalTokens,
      },
      {
        source: 'creative',
        analysis: result2.text,
        confidence: 0.88,
        tokensUsed: result2.usage.totalTokens,
      },
      {
        source: 'practical',
        analysis: result3.text,
        confidence: 0.85,
        tokensUsed: result3.usage.totalTokens,
      },
    ]

    // Aggregate results with a cheap model
    const aggregationResult = await generateText({
      model: openai('gpt-4o-mini'),
      prompt: `Synthesize these three analyses into a cohesive summary:

Technical Analysis:
${analyses[0].analysis}

Creative Analysis:
${analyses[1].analysis}

Practical Analysis:
${analyses[2].analysis}

Provide a balanced synthesis that incorporates all three perspectives.`,
      maxOutputTokens: 500,
    })

    const totalTokens = analyses.reduce((sum, a) => sum + a.tokensUsed, 0) + aggregationResult.usage.totalTokens

    return {
      success: true,
      synthesis: aggregationResult.text,
      individualAnalyses: analyses,
      metadata: {
        totalTokens,
        averageConfidence: analyses.reduce((sum, a) => sum + a.confidence, 0) / analyses.length,
        parallelizationSpeedup: '3x faster than sequential',
      },
    }

  } catch (error) {
    console.error('Parallel analysis failed:', error)
    throw error
  }
}

3. Hierarchical Workflow (Manager → Workers)

Use Case: Adaptive Research Workflow

Manager agent decides which worker agents to invoke based on initial results. Adapts dynamically to task complexity.

Cost: $0.045-0.080 per workflow (variable) | Latency: 10-25s | Accuracy: 94%

Server Action with Manager Agent

typescript
// app/actions/hierarchical-research.ts
'use server'

import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'
import { generateText, generateObject } from 'ai'
import { z } from 'zod'

const ManagerDecisionSchema = z.object({
  needsDeepResearch: z.boolean(),
  needsFactChecking: z.boolean(),
  needsExpertReview: z.boolean(),
  reasoning: z.string(),
})

export async function runHierarchicalResearch(topic: string) {
  console.log('Starting hierarchical research workflow...')

  // Step 1: Manager agent decides what's needed
  const managerDecision = await generateObject({
    model: openai('gpt-4o'),
    schema: ManagerDecisionSchema,
    prompt: `Analyze this research topic and decide what types of investigation are needed:

Topic: ${topic}

Decide:
1. needsDeepResearch: Does this require comprehensive academic research?
2. needsFactChecking: Are there factual claims that need verification?
3. needsExpertReview: Would expert domain knowledge improve quality?

Provide your reasoning for each decision.`,
  })

  console.log('Manager decisions:', managerDecision.object)

  const workerResults: string[] = []
  let totalTokens = managerDecision.usage.totalTokens

  // Step 2: Invoke worker agents based on manager's decisions

  if (managerDecision.object.needsDeepResearch) {
    console.log('Invoking deep research worker...')
    const researchResult = await generateText({
      model: anthropic('claude-3-7-sonnet-20250219'),
      prompt: `Conduct deep academic research on: ${topic}

Provide comprehensive analysis with sources and citations.`,
      maxOutputTokens: 800,
    })
    workerResults.push(`Deep Research:
${researchResult.text}`)
    totalTokens += researchResult.usage.totalTokens
  }

  if (managerDecision.object.needsFactChecking) {
    console.log('Invoking fact-checking worker...')
    const factCheckResult = await generateText({
      model: anthropic('claude-3-7-sonnet-20250219'),
      prompt: `Fact-check claims related to: ${topic}

Verify accuracy and identify any misinformation.`,
      maxOutputTokens: 500,
    })
    workerResults.push(`Fact Check:
${factCheckResult.text}`)
    totalTokens += factCheckResult.usage.totalTokens
  }

  if (managerDecision.object.needsExpertReview) {
    console.log('Invoking expert review worker...')
    const expertResult = await generateText({
      model: openai('gpt-4o'),
      prompt: `Provide expert-level analysis of: ${topic}

Include domain-specific insights and recommendations.`,
      maxOutputTokens: 600,
    })
    workerResults.push(`Expert Review:
${expertResult.text}`)
    totalTokens += expertResult.usage.totalTokens
  }

  // Step 3: Manager synthesizes worker results
  const synthesisResult = await generateText({
    model: openai('gpt-4o'),
    prompt: `Synthesize these worker agent results into a comprehensive research report:

${workerResults.join('\n\n')}

Create a cohesive, well-structured report that integrates all findings.`,
    maxOutputTokens: 1000,
  })

  totalTokens += synthesisResult.usage.totalTokens

  return {
    success: true,
    report: synthesisResult.text,
    metadata: {
      managerDecisions: managerDecision.object,
      workersInvoked: [
        managerDecision.object.needsDeepResearch && 'deep-research',
        managerDecision.object.needsFactChecking && 'fact-checking',
        managerDecision.object.needsExpertReview && 'expert-review',
      ].filter(Boolean),
      totalTokens,
      adaptiveWorkflow: true,
    },
  }
}

5-Layer Error Handling Pattern

Layer 1: Validation BEFORE

Schema check and sanity bounds before agent invocation. Prevent invalid inputs from reaching expensive AI calls.

typescript
// Validate input before any AI calls
const inputSchema = z.object({
  topic: z.string().min(10).max(200),
  maxTokens: z.number().min(100).max(2000),
})

const validated = inputSchema.parse({ topic, maxTokens })

Layer 2: Timeout WITH Fallback

No agent call should run longer than 30 seconds without a fallback strategy.

typescript
// Add timeout to agent calls (note: Promise.race leaves the timer
// pending; clear it in production code)
const result = await Promise.race([
  generateText({ model, prompt, maxOutputTokens }),
  new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('Agent timeout')), 30000)
  ),
])

Layer 3: Retry WITH Exponential Backoff

Retry failed agent calls 3 times with exponential backoff: 1s, 2s, 4s.

typescript
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3
): Promise<T> {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn()
    } catch (error) {
      if (i === maxRetries - 1) throw error
      await new Promise(resolve => setTimeout(resolve, 1000 * Math.pow(2, i)))
    }
  }
  throw new Error('Max retries exceeded')
}

Layer 4: Fallback WITH Degradation

If agent fails after retries, use simpler alternative or cached result. Degrade gracefully.

typescript
try {
  result = await powerfulAgent(query)
} catch (error) {
  console.warn('Powerful agent failed, falling back to simpler agent')
  result = await simplerAgent(query)
  result.degraded = true
}

Layer 5: Escape Hatch to Human

When all else fails, surface error clearly and request human intervention. Don't silently fail.

typescript
if (allAgentsFailed) {
  // Save workflow state for human review
  await saveWorkflowForHumanReview({
    workflowId,
    state: currentState,
    error: lastError,
    requiresHuman: true,
  })

  // Notify user
  return {
    success: false,
    requiresHumanReview: true,
    message: 'Workflow paused for human review',
  }
}
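The five layers compose naturally into a single guarded call. A sketch, not a library API: `guardedAgentCall`, its option names, and the result shape are all illustrative.

```typescript
// Compose all five layers into one guarded call. `agent` and `fallback`
// are caller-supplied async functions; all names here are illustrative.
async function guardedAgentCall<T>(opts: {
  validate: () => void         // Layer 1: throws on invalid input
  agent: () => Promise<T>      // the real agent call
  fallback: () => Promise<T>   // Layer 4: degraded alternative
  timeoutMs?: number           // Layer 2: default 30s
  maxRetries?: number          // Layer 3: default 3 attempts
}): Promise<{ value?: T; degraded: boolean; requiresHumanReview: boolean }> {
  const { timeoutMs = 30_000, maxRetries = 3 } = opts
  opts.validate()

  // Layer 2: race the agent against a timeout (timer left pending here;
  // clear it in production code)
  const attempt = () =>
    Promise.race([
      opts.agent(),
      new Promise<never>((_, reject) =>
        setTimeout(() => reject(new Error('Agent timeout')), timeoutMs)
      ),
    ])

  // Layer 3: retry with exponential backoff (1s, 2s, 4s)
  for (let i = 0; i < maxRetries; i++) {
    try {
      return { value: await attempt(), degraded: false, requiresHumanReview: false }
    } catch {
      if (i < maxRetries - 1) await new Promise(r => setTimeout(r, 1000 * 2 ** i))
    }
  }

  // Layer 4: degrade gracefully
  try {
    return { value: await opts.fallback(), degraded: true, requiresHumanReview: false }
  } catch {
    // Layer 5: never fail silently; hand off to a human
    return { degraded: true, requiresHumanReview: true }
  }
}
```

The caller inspects `degraded` and `requiresHumanReview` to decide whether to continue the workflow, annotate the output, or pause for review.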

Related Resources