Tool Validation Component
This component validates Python tools by running test cases with a ReAct agent bound to the required tools. After execution, the tool logs are analyzed to identify error types and provide corresponding recommendations.
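For illustration, the sketch below runs tool test cases through a LangGraph prebuilt ReAct agent and collects the tool-call logs that feed the error analysis. The tool, test case, and model shown here are hypothetical placeholders, and the component's actual internals may differ.

```python
# Minimal sketch (not the component's actual implementation): execute tool
# test cases with a ReAct agent bound to the tool, then inspect the tool-call
# logs. Assumes langgraph and langchain-openai are installed.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def get_order_status(order_id: str) -> str:
    """Return the status of an order (hypothetical example tool)."""
    return f"Order {order_id} has shipped."


# Hypothetical test cases: each pairs a user query with the tool and
# arguments the agent is expected to call.
tool_test_cases = [
    {
        "query": "What is the status of order 42?",
        "expected_tool": "get_order_status",
        "expected_args": {"order_id": "42"},
    },
]

llm = ChatOpenAI(model="gpt-4o-mini")
agent_with_tools = create_react_agent(llm, tools=[get_order_status])

for case in tool_test_cases:
    result = agent_with_tools.invoke({"messages": [("user", case["query"])]})
    # Tool calls issued by the agent are recorded on its AI messages; these
    # logs are what the error-type analysis below operates on.
    tool_calls = [
        tc
        for msg in result["messages"]
        for tc in (getattr(msg, "tool_calls", None) or [])
    ]
    called = [tc["name"] for tc in tool_calls]
    if case["expected_tool"] not in called:
        print(f"Incorrect tool identification - expected tool: {case['expected_tool']}")
    else:
        print(f"{case['expected_tool']} called with "
              f"{[tc['args'] for tc in tool_calls if tc['name'] == case['expected_tool']]}")
```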
Features
This component performs tool validation and provides a test report with the following error types and corresponding recommendations (a programmatic sketch of this mapping follows the table):
| Error Type | Recommendation |
|---|---|
| Incorrect tool identification | Agent identified incorrect tool - expected tool: |
| Agent tool calling issue | Agent could not invoke tool calling for the identified tool - |
| Agent tool calling issue - Incorrect Tool Input format | Agent could not invoke tool calling for the identified tool - |
| Incorrect tool input payload | Check tool input payload for: missing or incorrect parameters |
| Incorrect tool inputs - Parameter Type Mismatch | Check tool input parameter type mismatch for: |
| Incorrect tool inputs - Parameter Value Mismatch | Check tool input parameter value mismatch for: |
| Agent Recursive Tool Calling | Agent invoked the tool repeatedly, tools invoked - |
| Tool Output parsing error | Could not parse the tool output, *** please check output returned in tool definition *** |
| Tool Output exceeding LLM token limit | Agent LLM exceeded its token limit, *** please try a different LLM with a larger token limit *** |
| Incorrect tool authorization or access credentials | Please provide valid credentials for the tool |
| Tool back-end server issue | Tool server is down, try after some time |
| Issue with the inputs provided to the tool | Tool input malformed, please provide tool inputs in correct format |
| No results found in the tool output | Please provide relevant tool input values |
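As referenced above, the taxonomy could be represented programmatically as a simple mapping from error type to recommendation template. The sketch below is illustrative only; `ERROR_RECOMMENDATIONS` and `recommend` are hypothetical names, not part of the component's API.

```python
# Illustrative mapping from error type to recommendation template
# (hypothetical representation of the taxonomy table, not the component's API).
ERROR_RECOMMENDATIONS = {
    "Incorrect tool identification":
        "Agent identified incorrect tool - expected tool: {expected_tool}",
    "Incorrect tool input payload":
        "Check tool input payload for: missing or incorrect parameters",
    "Tool back-end server issue":
        "Tool server is down, try after some time",
}


def recommend(error_type: str, **context: str) -> str:
    """Look up and fill in the recommendation for an identified error type."""
    template = ERROR_RECOMMENDATIONS.get(error_type, "No recommendation available")
    return template.format(**context)


print(recommend("Incorrect tool identification", expected_tool="get_order_status"))
```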
Architecture
The figure below shows the flow of tool validation.

Benchmarking
Tool Validation is benchmarked on more than 180 tools related to enterprise applications, with 2400 test cases. The benchmarking results show that tool failures occur due to a variety of errors: approximately 30 percent of test cases pass without any errors, and only 36 percent of tools work without any failures.

Interface
This component expects the following input and generates the following output.
Input
- python_tool_name - the tool that is being validated.
- tool_test_cases - multiple tool test cases used for testing the tool with an agent.
- agent_with_tools - a LangGraph ReAct agent bound to the necessary tools, used to execute the tool test cases.
Output
- test_report - tool test report with the identified tool error taxonomy and a recommendation for each test case (tool execution logs can optionally be added).
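A minimal sketch of this interface is shown below; the `validate_tool` signature and the report fields are assumptions based on the description above, not the component's published API.

```python
# Hypothetical stub of the interface described above (names and report
# fields are assumptions, not the component's published API).
from typing import Any


def validate_tool(
    python_tool_name: str,
    tool_test_cases: list[dict[str, Any]],
    agent_with_tools: Any,
) -> list[dict[str, Any]]:
    """Run each test case through the agent and build a test report."""
    test_report = []
    for case in tool_test_cases:
        result = agent_with_tools.invoke({"messages": [("user", case["query"])]})
        called = [
            tc["name"]
            for msg in result["messages"]
            for tc in (getattr(msg, "tool_calls", None) or [])
        ]
        if python_tool_name in called:
            entry = {"test_case": case["query"], "error_type": None, "recommendation": None}
        else:
            entry = {
                "test_case": case["query"],
                "error_type": "Incorrect tool identification",
                "recommendation": f"Agent identified incorrect tool - "
                                  f"expected tool: {python_tool_name}",
            }
        test_report.append(entry)
    return test_report
```

The resulting report rows could then be rendered as the table of error types and recommendations shown in the Features section.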
Getting Started
Refer to this README for instructions on how to get started with the code. See an example in action here.