# ALTK Components

We summarize the components currently in ALTK in the table below.

*ALTK lifecycle diagram*

| Lifecycle Step | Component | Problem | Description | Performance | Resources |
|---|---|---|---|---|---|
| Pre-LLM | SpotLight | Agent does not follow instructions in the prompt. | SpotLight lets users emphasize important spans within their prompt and steers the LLM's attention toward those spans. It is an inference-time hook and involves no training or changes to model weights. | 5- to 40-point accuracy improvements | Paper |
| Pre-tool | Refraction | Agent generates inconsistent tool sequences. | Verifies the syntax of tool-call sequences and repairs any errors that would result in execution failures. | 48% error correction | Demo |
| Pre-tool | SPARC | Agent calls incorrect tools (in the wrong order, redundantly, etc.) or uses incorrect or hallucinated arguments. | Evaluates tool calls before execution, identifying potential issues and suggesting corrections with reasoning for tool selection or argument values, including the corrected values. | 88% accuracy in detecting tool-calling mistakes and +15% improvement in end-to-end tool-calling agent pass^k performance across GPT-4o, GPT-4o-mini, and Mistral-Large. | |
| Post-tool | JSON Processor | Agent gets overwhelmed with large JSON payloads in its context. | If the agent calls tools that return complex JSON objects, this component uses LLM-based Python code generation to process those responses and extract the relevant information. | +3 to +50 percentage-point gains observed across 15 models from various families and sizes on a dataset with 1298 samples | Paper, Demo |
| Post-tool | Silent Review | Tool calls return subtle semantic errors that aren't handled by the agent. | A prompt-based approach to identifying silent errors in tool calls (errors that produce no visible or explicit error message); determines whether the tool response is relevant, accurate, and complete given the user's query. | 4% improvement observed in end-to-end agent accuracy | |
| Post-tool | RAG Repair | Agent isn't able to recover from tool-call failures. | Given a failing tool call, this component uses an LLM to repair the call, drawing on domain documents such as documentation or troubleshooting examples via RAG. It requires a set of related documents to ingest. | 8% improvement observed on models such as GPT-4o | Paper |
| Pre-Response | Policy Guard | Agent returns responses that violate policies or instructions. | Checks whether the agent's output adheres to the policy statement and repairs the output if it does not. | +10-point improvement in accuracy | Paper |
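To make the pre-tool checks above concrete, here is a minimal sketch of the kind of validation Refraction and SPARC perform before execution: comparing a planned tool-call sequence against tool schemas and flagging calls that would fail. The tool names, schema shape, and function are illustrative assumptions, not ALTK's actual API.

```python
# Hypothetical pre-tool validation in the spirit of Refraction / SPARC.
# TOOL_SCHEMAS and the example tools are invented for illustration.

TOOL_SCHEMAS = {
    "search_flights": {"required": {"origin", "destination"}, "optional": {"date"}},
    "book_flight": {"required": {"flight_id"}, "optional": set()},
}

def validate_calls(calls):
    """Return (index, problem) pairs for calls that would fail at execution."""
    problems = []
    for i, call in enumerate(calls):
        schema = TOOL_SCHEMAS.get(call["name"])
        if schema is None:
            # Hallucinated tool name: nothing to execute.
            problems.append((i, f"unknown tool {call['name']!r}"))
            continue
        args = set(call.get("args", {}))
        missing = schema["required"] - args
        unknown = args - schema["required"] - schema["optional"]
        if missing:
            problems.append((i, f"missing required args {sorted(missing)}"))
        if unknown:
            problems.append((i, f"hallucinated args {sorted(unknown)}"))
    return problems

plan = [
    {"name": "search_flights", "args": {"origin": "JFK"}},               # missing 'destination'
    {"name": "book_flight", "args": {"flight_id": "F1", "seat": "2A"}},  # 'seat' not in schema
]
print(validate_calls(plan))
```

In ALTK, the repair step goes further than flagging: the components suggest corrected argument values with reasoning, which this sketch omits.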
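The JSON Processor works by having an LLM generate small Python programs that distill a verbose tool response before it enters the agent's context. The hand-written snippet below stands in for such generated code; the payload shape, field names, and threshold are invented for illustration.

```python
# Stand-in for LLM-generated post-tool processing code (JSON Processor idea):
# extract only the fields the agent needs from a large JSON tool response.
import json

large_response = json.dumps({
    "meta": {"request_id": "abc123", "elapsed_ms": 131},
    "results": [
        {"id": 1, "title": "Reset a password", "score": 0.91, "body": "..."},
        {"id": 2, "title": "Close an account", "score": 0.42, "body": "..."},
    ],
})

def extract_relevant(payload: str, min_score: float = 0.5) -> list:
    """Keep only high-scoring results, dropping bulky fields like 'body'."""
    data = json.loads(payload)
    return [
        {"id": r["id"], "title": r["title"]}
        for r in data.get("results", [])
        if r.get("score", 0.0) >= min_score
    ]

print(extract_relevant(large_response))  # [{'id': 1, 'title': 'Reset a password'}]
```

The key design point is that the agent never sees the raw payload: only the distilled result returns to its context, which is what drives the percentage-point gains reported in the table.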