Parallel Agent Execution
Many real-world tasks benefit from running multiple agents simultaneously. Whether you are gathering information from different sources, generating multiple drafts, or executing independent tool calls, parallel execution can dramatically reduce end-to-end latency. This lesson covers the patterns and pitfalls of concurrent agent operations.
Promise.allSettled for Agents
When running multiple agents in parallel, prefer Promise.allSettled over Promise.all. Promise.all rejects as soon as any task fails, discarding the outcomes of the remaining tasks, while allSettled waits for every task and reports each result individually. This matters for AI workflows, where partial results are usually better than no results.
import { createAgent } from '@waymakerai/aicofounder-core';
const agent = createAgent({ model: 'gpt-4o' });
async function gatherResearch(topic: string) {
  const tasks = [
    agent.run(`Summarize recent developments in ${topic}`),
    agent.run(`List key players and companies in ${topic}`),
    agent.run(`What are the main challenges facing ${topic}?`),
    agent.run(`Predict future trends for ${topic}`),
  ];

  const results = await Promise.allSettled(tasks);

  const research = results.map((result, index) => {
    if (result.status === 'fulfilled') {
      return { section: index, content: result.value.text, success: true };
    }
    return { section: index, error: result.reason.message, success: false };
  });

  // Use successful results even if some failed
  const successfulSections = research.filter((r) => r.success);
  console.log(`${successfulSections.length}/${tasks.length} sections completed`);
  return research;
}

Worker Pools for Agent Tasks
Running unlimited parallel agents can overwhelm your LLM provider's rate limits and spike costs. A worker pool pattern limits concurrency while still processing tasks as fast as possible. CoFounder provides an AgentPool that manages a fixed number of concurrent agent executions with automatic queuing.
import { createAgent, AgentPool } from '@waymakerai/aicofounder-core';

const pool = new AgentPool({
  agent: createAgent({ model: 'gpt-4o' }),
  concurrency: 5, // Max 5 simultaneous LLM calls
  queueLimit: 100, // Max 100 pending tasks
  timeoutMs: 30000, // Per-task timeout
  onQueueFull: () => {
    console.warn('Agent pool queue is full, requests will be rejected');
  },
});

// Submit tasks - they execute up to 5 at a time
const promises = documents.map((doc) =>
  pool.submit(`Summarize this document: ${doc.content}`)
);
const summaries = await Promise.allSettled(promises);

// Graceful shutdown waits for in-flight tasks
await pool.drain();

Concurrent Tool Execution
When an agent needs to call multiple tools that are independent of each other, executing them concurrently saves significant time. CoFounder's tool executor detects independent tool calls and runs them in parallel automatically. You can also explicitly mark tools as parallelizable in your tool definitions.
The key consideration is dependency analysis: if tool B needs the output of tool A, they must run sequentially. CoFounder builds a dependency graph from the agent's tool call plan and executes the maximum number of tools in parallel at each step.
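The dependency-graph approach can be sketched without any framework: group tool calls into levels where every call in a level depends only on calls from earlier levels, then run each level with Promise.allSettled. The ToolCall shape, planLevels, and executePlan below are illustrative helpers, not CoFounder APIs.

```typescript
// Hypothetical tool-call plan: each call lists the ids it depends on.
type ToolCall = {
  id: string;
  run: () => Promise<string>;
  dependsOn: string[];
};

// Group calls into levels: a call joins the first level in which all of
// its dependencies have already completed (a simple topological layering).
function planLevels(calls: ToolCall[]): ToolCall[][] {
  const levels: ToolCall[][] = [];
  const done = new Set<string>();
  let remaining = [...calls];
  while (remaining.length > 0) {
    const ready = remaining.filter((c) => c.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error('Cycle in tool dependencies');
    levels.push(ready);
    ready.forEach((c) => done.add(c.id));
    remaining = remaining.filter((c) => !ready.includes(c));
  }
  return levels;
}

// Execute: maximum parallelism within each level, sequential across levels.
async function executePlan(calls: ToolCall[]): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  for (const level of planLevels(calls)) {
    const settled = await Promise.allSettled(level.map((c) => c.run()));
    settled.forEach((s, i) => {
      if (s.status === 'fulfilled') results.set(level[i].id, s.value);
    });
  }
  return results;
}
```

With two independent searches and a summarizer that depends on both, the searches land in level one and run concurrently, while the summarizer runs alone in level two.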
Resource Throttling
Beyond concurrency limits, you need to throttle based on tokens per minute, requests per minute, and cost budgets. CoFounder's throttling layer integrates with the rate limiter to ensure parallel execution stays within your provider's limits. As you approach a rate limit, the pool slows down automatically instead of running into rate-limit errors.
A good rule of thumb: set your pool concurrency to 50-70% of your provider's rate limit to leave headroom for other parts of your application. Monitor the pool's queue depth as a leading indicator of capacity problems.
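One way to turn that rule of thumb into a number is Little's law: the steady-state concurrency a provider can sustain is roughly its request rate multiplied by the average request duration. The helper below is a hypothetical sketch of that calculation, with the headroom factor defaulting to the lower end of the 50-70% range.

```typescript
// Estimate a safe pool concurrency from a provider rate limit using
// Little's law: sustainable concurrency = arrival rate x avg duration.
// The 0.6 headroom factor reflects the 50-70% rule of thumb.
function safeConcurrency(
  requestsPerMinute: number,
  avgRequestSeconds: number,
  headroom = 0.6,
): number {
  const maxSustainable = (requestsPerMinute / 60) * avgRequestSeconds;
  return Math.max(1, Math.floor(maxSustainable * headroom));
}

// 500 RPM limit with ~6s average LLM calls supports about 50 in-flight
// requests; with headroom, configure the pool at 30.
console.log(safeConcurrency(500, 6)); // 30
```

Recompute this whenever your average request duration shifts, for example after switching models or enabling streaming, and watch the pool's queue depth to see when even the throttled rate falls behind demand.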