Evolve

Self-improving agents through iterations.

Coding agents repeat the same mistakes because they start fresh every session. Evolve gives agents memory — they learn from what worked and what didn't, so each session is better than the last.

Evolve is a system designed to help agents improve over time by learning from their trajectories. It uses a combination of an MCP server for tool integration, vector storage for memory, and LLM-based conflict resolution to refine its knowledge base.

On the AppWorld benchmark, Evolve improved agent reliability by +8.9 points overall, with a 74% relative increase on hard multi-step tasks.


Lite, Full

When setting up API keys and extra services is too much

General Installation

Claude Code, IBM Bob, Codex

Total Control

Under Development

MCP Server: Exposes tools to get guidelines and save trajectories.
Conflict Resolution: Intelligently merges new insights with existing guidelines using LLMs.
Trajectory Analysis: Automatically analyzes agent trajectories to generate guidelines and best practices.
Milvus Integration: Uses Milvus (or Milvus Lite) for efficient vector storage and retrieval.
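
To make the interplay between retrieval and conflict resolution concrete, here is a minimal, self-contained sketch. It is not Evolve's API: every name (`embed`, `merge_guideline`, `recall`, `keep_longer`) is hypothetical, the bag-of-words `embed` stands in for real model embeddings stored in Milvus, and `keep_longer` stands in for the LLM that would decide how to merge two overlapping guidelines.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: a real setup stores model vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keep_longer(old: str, new: str) -> str:
    """Stand-in resolver: keep the more detailed wording (an LLM would go here)."""
    return new if len(new) > len(old) else old


def merge_guideline(
    store: list[str],
    new: str,
    resolve: Callable[[str, str], str] = keep_longer,
    threshold: float = 0.6,
) -> list[str]:
    """Return a new store with `new` merged in.

    If the most similar stored guideline scores at or above `threshold`,
    the two are passed to `resolve`; otherwise `new` is appended.
    """
    if store:
        scores = [cosine(embed(g), embed(new)) for g in store]
        best = max(range(len(store)), key=scores.__getitem__)
        if scores[best] >= threshold:
            merged = store.copy()
            merged[best] = resolve(store[best], new)
            return merged
    return store + [new]


def recall(store: list[str], query: str, k: int = 2) -> list[str]:
    """Return the `k` stored guidelines most similar to `query`."""
    ranked = sorted(store, key=lambda g: cosine(embed(g), embed(query)), reverse=True)
    return ranked[:k]
```

Passing the resolver in as a callable keeps the storage logic independent of whichever model performs the merge, so the rule-based stand-in can be swapped for an LLM call without touching `merge_guideline`.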

Guides

Configuration: Configure models, backends, and environment variables.
Low-Code Tracing: Instrument agents with Phoenix and verify end-to-end tracing.
Phoenix Sync: Pull trajectories from Phoenix and generate stored guidelines.
Extract Trajectories: Export Phoenix traces into an OpenAI-style message format.

Reference

CLI Reference: Manage namespaces, entities, and sync jobs from the command line.
Policies: Structured policy entities and how to retrieve them with MCP tools.

Demos

Claude Code, IBM Bob, Codex

How It Works

Evolve analyzes agent trajectories to extract guidelines and best practices, then recalls them in future sessions. It supports both a lightweight file-based mode (Evolve Lite) and a full mode backed by an MCP server with vector storage and LLM-based conflict resolution.
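
The analyze, store, and recall loop described above can be illustrated with a small file-based sketch in the spirit of the lightweight mode. Everything here is an assumption made for illustration: the JSON file layout, the function names, and especially `extract_guidelines`, which uses a crude rule (a tool message starting with "error") where a real analyzer would presumably use an LLM. Trajectories are assumed to be lists of OpenAI-style `{"role", "content"}` messages.

```python
import json
from pathlib import Path


def extract_guidelines(trajectory: list[dict]) -> list[str]:
    """Placeholder analysis: turn each failed tool call into a cautionary note."""
    notes = []
    # Pair every message with the one before it to see what led to the failure.
    for prev, msg in zip(trajectory, trajectory[1:]):
        if msg["role"] == "tool" and msg["content"].lower().startswith("error"):
            notes.append(f"Previously failed: {prev['content']} ({msg['content']})")
    return notes


def load_guidelines(path: Path) -> list[str]:
    """Read stored guidelines, or return an empty list on first run."""
    return json.loads(path.read_text()) if path.exists() else []


def save_guidelines(path: Path, new: list[str]) -> list[str]:
    """Append guidelines not already stored, preserving order."""
    merged = load_guidelines(path)
    for guideline in new:
        if guideline not in merged:
            merged.append(guideline)
    path.write_text(json.dumps(merged, indent=2))
    return merged


def build_system_prompt(path: Path, base: str) -> str:
    """Prepend recalled guidelines to the next session's system prompt."""
    guidelines = load_guidelines(path)
    if not guidelines:
        return base
    bullets = "\n".join(f"- {g}" for g in guidelines)
    return f"{base}\n\nGuidelines from earlier sessions:\n{bullets}"
```

In this sketch, a session would call `build_system_prompt` at start-up and `save_guidelines(path, extract_guidelines(trajectory))` when it ends, so each run begins with the notes left by the previous ones.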