How to build an agent

This guide will walk you through creating and deploying AI agents and multi-agent systems using the aiXplain Python SDK. You'll learn how to instantiate agent tools, configure agents to use those tools, and assemble those agents into a team agent.

You can learn more about aiXplain Agents and Team Agents in the Agent concept page.

Template

from aixplain.modules.agent import ModelTool, PipelineTool
from aixplain.factories import AgentFactory
from aixplain.factories import TeamAgentFactory

# Agent tools
model_tool = ModelTool(
    model="<model_id>"
)

pipeline_tool = PipelineTool(
    pipeline="<pipeline_id>",
    description="<description_of_what_the_pipeline_does>"
)

# Agent
agent = AgentFactory.create(
    name="<agent_name>",
    tools=[
        model_tool,
        pipeline_tool,
        ...
    ],
    llm_id="<model_id_of_the_llm_to_power_the_agent>",
)

# Team Agent
team = TeamAgentFactory.create(
    name="<team_agent_name>",
    agents=[
        agent,
        ...
    ],
    llm_id="<model_id_of_the_llm_to_power_the_team_agent>"
)

1. Agent Tools

You can empower agents by equipping them with models and pipelines as agent tools, which they can use when responding to requests. This section shows you how to instantiate model and pipeline tools once you have identified their model and pipeline IDs.

info

Learn how to search for models and pipelines in the aiXplain marketplace by following the How to search the marketplace guide.

1.1 Models

Use the ModelTool class to create agent tools out of models. You can create them directly or use the AgentFactory.create_model_tool method. You can specify a model tool in two ways:

  1. Specify the exact model you want (using its Model ID - learn how to search for models here).
  2. Specify the function you want the tool to perform and (optionally) the supplier you want the model to come from.

As an example, let's specify:

  • The AWS English speech synthesis - Amy model, which has ID 618ba6e5e2e1a9153ca2a3a5.
  • Any Google Speech Recognition model.
  • Any Microsoft Named Entity Recognition model.
  • Any Optical Character Recognition (OCR) model.
from aixplain.modules.agent import ModelTool
from aixplain.enums import Function, Supplier

speech_synthesis_tool = ModelTool(
    model="618ba6e5e2e1a9153ca2a3a5"
)

speech_recognition_tool = ModelTool(
    function=Function.SPEECH_RECOGNITION,
    supplier=Supplier.GOOGLE
)

ner_tool = ModelTool(
    function=Function.NAMED_ENTITY_RECOGNITION,
    supplier=Supplier.MICROSOFT
)

ocr_tool = ModelTool(function=Function.OCR)

# Display the dictionaries
display(speech_synthesis_tool.__dict__)
display(speech_recognition_tool.__dict__)
display(ner_tool.__dict__)
display(ocr_tool.__dict__)
note

You can optionally add names and descriptions to your tools. These attributes have no impact on agents.

ner_tool.name = 'Microsoft NER'
ner_tool.description = 'Named Entity Recognition model by Microsoft'

display(ner_tool.__dict__)
tip

Use the _member_names_ attribute to see the list of available function types and suppliers.

Function._member_names_
Supplier._member_names_
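
These lists can be long, so it may help to filter the names by keyword. The sketch below assumes Function and Supplier behave like standard Python enums, as the _member_names_ attribute suggests. find_members is a hypothetical helper, and DemoFunction is a stand-in enum with made-up values so that the snippet runs without the SDK.

```python
from enum import Enum


class DemoFunction(Enum):
    """Stand-in for aixplain.enums.Function (assumption: the real enum has many more members)."""
    SPEECH_RECOGNITION = "speech-recognition"
    SPEECH_SYNTHESIS = "speech-synthesis"
    OCR = "ocr"


def find_members(enum_cls, keyword):
    """Return the member names of enum_cls that contain keyword (case-insensitive)."""
    keyword = keyword.upper()
    return [name for name in enum_cls._member_names_ if keyword in name]


print(find_members(DemoFunction, "speech"))
```

With the SDK installed, the same helper could presumably be called as find_members(Function, "speech") or find_members(Supplier, "google").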

1.2 Pipelines

Use the PipelineTool class to create agent tools out of pipelines. You can create them directly or use the AgentFactory.create_pipeline_tool method. You must specify:

  • the exact pipeline you want (using its Pipeline ID - learn how to search for pipelines here),
  • a description for the agent to know what it can use the pipeline for.
note

The pipeline below is private, so its ID will raise an error if you use it from your own account. Create this pipeline on your account using this template, then use the ID of the pipeline you created.

from aixplain.modules.agent import PipelineTool

text_analysis_pipeline_tool = PipelineTool(
    description="Analyses text. It provides outputs for Topic Classification, Sentiment Analysis and Entity Linking.",
    pipeline="666198a06f1b3d64bd8f8dcc"
)

display(text_analysis_pipeline_tool.__dict__)

2. Create and deploy an Agent

Use the AgentFactory class to create and deploy an agent. Specify the following parameters:

  • a unique name for your agent,
  • a brief description stating the agent's purpose,
  • [optional] the tools you want your agent to use (default is []),
  • [optional] the llm_id of the language model you want to power the agent (default is OpenAI's GPT-4o Mini),
  • [optional] the information about the supplier and version of each component.

This example uses GPT-4o, which has ID 6646261c6eb563165658bbb1. You can substitute another LLM, such as the Llama 3.1 70B model hosted by Groq, which has ID 66b2708c6eb5635d1c71f611.

from aixplain.factories import AgentFactory

text_analysis_agent = AgentFactory.create(
    name="Text Analysis Agent",
    description="An agent that analyses text.",
    tools=[
        ner_tool,
        text_analysis_pipeline_tool,
    ],
    llm_id="6646261c6eb563165658bbb1"  # GPT-4o
)

text_analysis_agent.__dict__
tip

You can use any LLM on the aiXplain marketplace to power your agent. Use the code below to list the available LLMs.

from aixplain.enums import Function
from aixplain.factories import ModelFactory

model_list = ModelFactory.list(function=Function.TEXT_GENERATION, page_size=100)

# Sort the models by their names
sorted_models = sorted(model_list['results'], key=lambda model: model.name)

# Print the sorted models
for model in sorted_models:
    print(model.name, model.id)

3. Create and deploy a Team Agent

Use the TeamAgentFactory class to create a team agent. Specify the following parameters:

  • a unique name for your team agent,
  • the agents you want your team agent to consist of,
  • a description of the team agent's purpose and functionality,
  • [optional] the llm_id of the language model you want to power the team agent (default is OpenAI's GPT-4o Mini),
  • [optional] the information about the supplier and version of the agents or tools, for compatibility and reproducibility,
  • [optional] use_mentalist_and_inspector, which sets whether the team agent includes the Mentalist and Inspector, advanced tools for debugging and optimization (default is True).
from aixplain.factories import TeamAgentFactory

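# multimedia_agent is assumed to be a second agent created with AgentFactory.create
# (for example, from the speech and OCR tools in section 1.1); it is not defined in this guide.
team = TeamAgentFactory.create(
    name="Team of Agents for Text Audio and Image Processing",
    agents=[
        text_analysis_agent,
        multimedia_agent,
    ],
    llm_id="6646261c6eb563165658bbb1"  # GPT-4o
)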

team.__dict__

4. Call Agents and Team Agents

The aiXplain SDK allows you to run agents and team agents and gives you the option to use short-term memory (history).

note

Agent and Team Agent inputs can be URLs, file paths, or direct text. The examples below use only direct text.

Use the run method to call an agent or team agent.

agent_response = text_analysis_agent.run(
"""
Analyse the following text.

Bilbo Baggins leaves the One Ring to his heir, Frodo, after his birthday celebration.
Seventeen years later, Gandalf confirms that the Ring belongs to the Dark Lord Sauron
and advises Frodo to leave the Shire. Frodo embarks on a journey with Sam, Merry, and
Pippin, pursued by Black Riders. They encounter various dangers, including Old Man
Willow and a barrow-wight, but are saved by Tom Bombadil. In Bree, they meet Strider,
who helps guide them to Rivendell. Along the way, Frodo is wounded by the Black Riders
but is saved by Strider and Glorfindel. At Rivendell, the Council of Elrond determines
that the Ring must be destroyed in Mordor. Frodo volunteers for the task, and a
fellowship forms to aid him: Sam, Merry, Pippin, Gandalf, Aragorn, Boromir, Legolas,
and Gimli. After an attempt to cross the Misty Mountains fails, they travel through
the Mines of Moria, where Gandalf falls in battle with a Balrog. The remaining members
reach Lothlórien, where Galadriel tests them and offers gifts. As they continue their
journey, Boromir tries to seize the Ring, leading Frodo to decide to go to Mordor alone.
Sam follows him, and together they set off for Mordor.
"""
)

display(agent_response)
tip

You can use the regular expression module re to retrieve a URL in an agent response by searching for the "https://" pattern in the output.

import re

def extract_url(text):
    # Return the first https:// URL found in the text, or None if there is none
    url_pattern = r'https://[^\s]+'
    match = re.search(url_pattern, text)
    if match:
        return match.group(0)
    return None

You will need to adjust this function to handle multiple URL outputs.
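
One possible adjustment is sketched below. It swaps re.search for re.findall so that every match is returned, and it strips trailing punctuation from each match. The name extract_urls and the punctuation handling are illustrative assumptions, not part of the SDK or of the original function.

```python
import re


def extract_urls(text):
    """Return every https:// URL found in text, in order of appearance."""
    url_pattern = r'https://[^\s]+'
    # Assumption: a URL at the end of a sentence may be followed by punctuation,
    # which is not part of the URL, so strip it from the right-hand side.
    return [url.rstrip('.,;:!?)"\'') for url in re.findall(url_pattern, text)]


sample = "Text: https://example.com/summary.txt and audio: https://example.com/speech.mp3."
print(extract_urls(sample))
```

The function returns an empty list when the text contains no URL, so callers can loop over the result without checking for None.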

tip

You can make an HTTP request to retrieve your output, then use IPython's Audio and display to play it.

from IPython.display import Audio, display
import requests

def download_and_display_audio(url):
    # Download the audio file and play it inline in the notebook
    response = requests.get(url, stream=True)
    if response.status_code == 200:
        file_name = "downloaded_audio.mp3"
        with open(file_name, 'wb') as f:
            f.write(response.content)

        display(Audio(file_name))
    else:
        print(f"Failed to download file: {response.status_code}")

# team_agent_response is assumed to be the result of an earlier team.run(...) call
output_text = team_agent_response['data']['output']
print("output_text:", output_text)

url = extract_url(output_text)
print("url:", url)

# extract_url returns None when the output contains no URL
if url:
    download_and_display_audio(url)

4.1 Short-Term Memory

Every agent response includes a session_id value, which identifies the conversation history for that session.

To continue using a history, specify the session_id from the first query as an input parameter for all subsequent queries.

agent_response = text_analysis_agent.run(
"Analyse the following text: The quick brown fox jumped over the lazy dog. Answer in text and audio."
)

display(agent_response)
session_id = agent_response["data"]["session_id"]
print(f"Session id: {session_id}")
agent_response = text_analysis_agent.run(
"What makes this text interesting?",
session_id=session_id,
)

display(agent_response)
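
If you chain many queries, a small wrapper can carry the session_id for you. The sketch below is an illustration only: ChatSession is a hypothetical helper, not part of the SDK, and it assumes that run accepts a session_id argument and that responses can be indexed as response["data"]["session_id"], as in the example above. EchoAgent is a stand-in so that the snippet runs without the SDK.

```python
class ChatSession:
    """Hypothetical wrapper that reuses the session_id across calls to run."""

    def __init__(self, agent):
        self.agent = agent
        self.session_id = None  # set after the first response

    def ask(self, query):
        if self.session_id is None:
            # First query: let the agent start a new history
            response = self.agent.run(query)
        else:
            # Later queries: continue the existing history
            response = self.agent.run(query, session_id=self.session_id)
        # Assumption: responses look like {"data": {"session_id": ..., "output": ...}}
        self.session_id = response["data"]["session_id"]
        return response


class EchoAgent:
    """Stand-in agent for demonstration; a real agent comes from AgentFactory.create."""

    def run(self, query, session_id=None):
        return {"data": {"session_id": session_id or "demo-session", "output": query}}


chat = ChatSession(EchoAgent())
chat.ask("Analyse the following text: The quick brown fox jumped over the lazy dog.")
follow_up = chat.ask("What makes this text interesting?")
print(follow_up["data"]["session_id"])
```

With a real agent, you would pass text_analysis_agent instead of EchoAgent(); whether the response supports dictionary-style indexing may depend on your SDK version.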