Tracing

Tracing, monitoring, and debugging help you understand how an agent or team agent handled a request, which tools were used, what reasoning was applied, and where errors may have occurred.


Tracing agent executions

When you run an agent, the system returns an AgentResponse containing:

  • input: The original input, chat history, and parameters.
  • output: Final generated result.
  • session_id: Used to track conversation sessions.
  • intermediate_steps: A full trace of the agent’s tool usage, responses, and reasoning.
  • execution_stats: Timing, API calls made, credits consumed, runtime breakdown by tools.
AgentResponse(
    status="SUCCESS",
    completed=True,
    data=AgentResponseData(
        input={...},
        output="Summarized news content...",
        session_id="abc123",
        intermediate_steps=[...],
        execution_stats={...}
    ),
    used_credits=0.0003,
    run_time=4.1
)

Example usage:

from pprint import pprint

query = "Identify and qualify EdTech leads for AI-based personalized learning."
response = lead_finder.run(query)

pprint(response)
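
To work with specific parts of the response, you can read its fields directly. A minimal sketch, assuming the attribute and key names shown in the AgentResponse structure above (your SDK version may expose slightly different names):

# Sketch: read top-level fields from the response shown above.
print(response.status)                      # e.g., "SUCCESS"
print(response.data["output"])              # final generated result
print(response.data["session_id"])          # reuse to continue the same session
print(response.used_credits, response.run_time)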

Each step in intermediate_steps includes:

  • agent: the agent name (e.g., Lead Finder, Summarizer, Translator)
  • input: the input provided to the agent
  • output: the output generated by the agent
  • tool_steps: the tools used (if a tool was called)
  • runTime: the runtime for that agent’s step
  • usedCredits: the credits consumed for that step
  • apiCalls: the API calls made (if any)
  • thought: the agent’s internal reasoning about its action
  • task: a field used only inside team agents to define the work assigned to user-defined agents

This makes it easy to review what happened inside the agent and debug if needed.

note

If thought is None or empty, it simply means the agent did not generate any internal reasoning (it moved straight from input to output).

task is not used when the user sends a message directly to an agent.

Example usage:

response.data["intermediate_steps"]
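
A quick way to review a trace is to loop over the steps and print the fields listed above. This is a sketch assuming each step is a dictionary with the documented keys (agent, runTime, usedCredits, thought, tool_steps); some fields may be None or missing depending on the run:

# Sketch: print a one-line summary per step, plus any reasoning and tool usage.
for step in response.data["intermediate_steps"]:
    thought = step.get("thought") or "(no internal reasoning recorded)"
    print(f"{step.get('agent')}: {step.get('runTime')}s, {step.get('usedCredits')} credits")
    print(f"  thought: {thought}")
    if step.get("tool_steps"):
        print(f"  tools: {step['tool_steps']}")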

Tracing team agent executions

Team agents extend this behavior by orchestrating multiple user-defined agents and micro agents (mentalists, inspectors, orchestrators, feedback combiners).

When you run a team agent, the system returns a ModelResponse containing:

  • input: The original input and parameters.
  • output: Final structured answer.
  • session_id: Session ID for multi-step traces.
  • intermediate_steps:
    • Micro agent actions (e.g., mentalist planning, inspector reviews, feedback combinations)
    • Each step run by different agents (Lead Finder, Lead Analyzer, etc.)
    • Tool usage and reasoning
  • execution_stats: Overall run time, credits consumed, API calls made, breakdowns per agent and micro agent.
  • plan: Describes the intended execution flow: a sequence of tasks, assigned agents (workers), and their expected outcomes.
ModelResponse(
    status="SUCCESS",
    data={
        "input": {...},
        "output": "final output...",
        "session_id": "abc-123",
        "intermediate_steps": [...],
        "executionStats": {...},
        "plan": [
            {"step": "Task: find leads", "worker": "Lead Finder"},
            {"step": "Task: analyze leads", "worker": "Lead Analyzer"}
        ]
    },
    completed=True
)

Example usage:

from pprint import pprint

query = "Identify and qualify EdTech leads for AI-based personalized learning."
team_response = team.run(query)

pprint(team_response)
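
The plan describes the intended execution flow before the agents run. A minimal sketch for reading it, assuming the list-of-dicts shape shown in the ModelResponse example above:

# Sketch: print each planned task and the agent assigned to it.
for task in team_response.data["plan"]:
    print(f"{task['worker']}: {task['step']}")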

intermediate_steps

The intermediate_steps for a team agent contain a detailed log of every micro agent and user-defined agent action. Each step includes:

  • agent: the agent name (e.g., orchestrator, mentalist, Lead Finder, inspector, response generator)
  • input: the input provided to that agent
  • output: the output generated by that agent
  • tool_steps: the tools used (if the agent called a tool)
  • runTime: the runtime for that agent’s step
  • usedCredits: the credits consumed for that step
  • apiCalls: the API calls made by the agent (if any)
  • thought: the micro agent’s internal reasoning
    • Mentalist: see how the tasks were planned.
    • Inspector: review validation feedback or issues detected during or after agent execution.
    • Feedback combiner: see how multiple inspector comments were summarized into one.
  • task: assignment information (when available). It helps orchestrate sub-tasks between user-defined agents and ensures each agent knows its role.

Example usage:

team_response.data["intermediate_steps"]
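
To focus on what the micro agents did, you can filter the trace by agent name. The sketch below assumes each step is a dictionary with the agent and thought keys described above, and that micro agents appear under names such as "mentalist" and "inspector" as in this guide's examples; adjust the matching to the names you actually see in your trace:

# Sketch: surface planning and validation reasoning from the team trace.
for step in team_response.data["intermediate_steps"]:
    name = (step.get("agent") or "").lower()
    if "mentalist" in name or "inspector" in name:
        print(f"--- {step.get('agent')} ---")
        print(step.get("thought") or "(no reasoning recorded)")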

Understanding team agent behavior

Mentalist

  • The mentalist started with no input and created a plan to fulfill the user's request: Identify and qualify EdTech leads for AI-based personalized learning.
  • It broke the goal into two tasks:
    • Find leads: Generate a list of EdTech companies with contact information.
    • Analyze leads: Prioritize those companies based on their fit with the AI platform.
  • It assigned:
    • "Lead Finder" agent to handle task 1
    • "Lead Analyzer" agent to handle task 2

Orchestrator

  • The orchestrator took the user's request and the mentalist’s plan, and assigned the first task to the agent: Lead Finder.
  • It injected the user request and instructions into the Lead Finder's input.
  • After the Lead Finder completed the task, the orchestrator collected the results and passed them to the inspector.
  • When the inspector reported that the results were incomplete (missing contact information, not focused enough on AI personalization), the orchestrator created a new assignment to have the Lead Finder redo the task, this time with stricter instructions based on the feedback.
  • After two incomplete attempts by the Lead Finder, the orchestrator reassigned the work to the Lead Analyzer. When the Lead Analyzer could not provide a fully qualified output and the inspector confirmed no further progress could be made, the orchestrator issued a FINISH signal to end the task.

Inspector

  • The inspector received the Lead Finder agent results after the first run.
  • It analyzed the output, not just for existence but against the user's goals:
    • Were the leads qualified?
    • Was complete contact information provided?
    • Was the AI personalization focus clear?
  • The inspector found that:
    • While some leads were relevant, the contact info was often missing.
    • Some companies were not clearly focused on AI personalization.
  • It gave structured feedback: Re-run the step to include comprehensive contact information and ensure each company focuses on AI-based personalized learning.

Monitoring execution

For both agents and team agents, execution stats help you monitor:

  • Session ID: Allows resuming or analyzing multi-turn interactions.
  • API call breakdown: Number of tool/API calls per agent.
  • Credit consumption: Per tool, per agent, and total.
  • Run time breakdown: How much time each step or agent consumed.
  • Tool failures
  • Unexpected behaviors

Monitor agents

response.data["execution_stats"]

Monitor team agents

team_response.data["executionStats"]

Key fields:

  • status: Overall result of the execution (e.g., SUCCESS).
  • apiCalls: Total number of API calls made during execution.
  • credits: Total compute credits consumed by the execution.
  • runtime: Total runtime (in seconds) for the execution.
  • apiCallBreakdown: Number of API calls made by each asset.
  • runtimeBreakdown: Time (in seconds) spent by each asset.
  • creditBreakdown: Compute credits used by each asset.
  • sessionId: Unique identifier for the execution session.
  • environment: Execution environment (e.g., prod, dev).
  • assetsUsed: List of agents, tools, or models utilized.
  • timeStamp: Timestamp when execution finished (UTC).
  • params: Additional parameters like id and sessionId.
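
With these fields you can quickly spot the slowest or most expensive asset in a run. A short sketch, assuming runtimeBreakdown and creditBreakdown are dictionaries keyed by asset name with numeric values, as described above:

# Sketch: rank assets by runtime and credits to find hotspots.
stats = team_response.data["executionStats"]

slowest = max(stats["runtimeBreakdown"].items(), key=lambda kv: kv[1])
costliest = max(stats["creditBreakdown"].items(), key=lambda kv: kv[1])

print(f"Total: {stats['runtime']}s, {stats['apiCalls']} API calls, {stats['credits']} credits")
print(f"Slowest asset: {slowest[0]} ({slowest[1]}s)")
print(f"Most credits:  {costliest[0]} ({costliest[1]})")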

Debugging failures

  • Internal failures: If an agent or micro agent fails (tool timeout, wrong output format), the step will show an error or failure status inside intermediate_steps.
  • External failures: If the final output cannot be produced, the run status will be "FAILED" and the error will appear in the execution_stats.

Important:

  • A single agent failure inside a team agent does not necessarily fail the team.
  • However, if a critical step (e.g., response generation) fails, the entire team agent run can fail.
  • Inspecting the intermediate_steps and thoughts from inspectors will show exactly where and why a failure happened.
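
A simple way to locate a failure is to check the overall run status and then scan the trace for steps that report an error. This sketch assumes failed steps carry an error or status key inside intermediate_steps, as described above; the exact keys may differ in your SDK version:

# Sketch: check the overall status, then look for failed steps in the trace.
if team_response.status == "FAILED":
    print("Run failed:", team_response.data["executionStats"])

for step in team_response.data["intermediate_steps"]:
    if step.get("error") or str(step.get("status", "")).upper() == "FAILED":
        print(f"Failed step from {step.get('agent')}:")
        print(f"  thought: {step.get('thought')}")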

Best practices for tracing, monitoring, and debugging

  • Always check intermediate_steps to understand agent decision-making.
  • Review executionStats to catch expensive tool calls or slow steps.
  • Use inspectors to catch errors early.
  • For complex team agents, trace the plan and each micro-agent individually.
  • Investigate any failures in steps or missing outputs early.