How to build an agent
This guide will walk you through creating and deploying AI agents and multi-agent systems using the aiXplain Python SDK. You'll learn how to instantiate agent tools, configure agents to use those tools, and assemble those agents into a team agent.
You can learn more about aiXplain Agents and Team Agents in the Agent concept page.
Template
from aixplain.modules.agent import ModelTool, PipelineTool
from aixplain.factories import AgentFactory
from aixplain.factories import TeamAgentFactory
# Agent tools
model_tool = ModelTool(
    model="<model_id>"
)
pipeline_tool = PipelineTool(
    pipeline="<pipeline_id>",
    description="<description_of_what_the_pipeline_does>"
)
# Agent
agent = AgentFactory.create(
    name="<agent_name>",
    tools=[
        model_tool,
        pipeline_tool,
        ...
    ],
    llm_id="<model_id_of_the_llm_to_power_the_agent>",
)
# Team Agent
team = TeamAgentFactory.create(
    name="<team_agent_name>",
    agents=[
        agent,
        ...
    ],
    llm_id="<model_id_of_the_llm_to_power_the_team_agent>"
)
1. Agent Tools
You can empower Agents by equipping them with models and pipelines as agent tools. They can use these tools when responding to requests. This section shows you how to instantiate model and pipeline tools once you have identified their model and pipeline IDs.
Learn how to search for models and pipelines in the aiXplain marketplace by following the How to search the marketplace guide.
1.1 Models
Use the ModelTool class to create agent tools out of models. You can create them directly or use the AgentFactory.create_model_tool method. You can specify model tools in two ways:
- Specify the exact model you want (using its Model ID - learn how to search for models here).
- Specify the function you want the tool to perform and (optionally) the supplier you want the model to come from.
As an example, let's specify:
- The AWS English speech synthesis - Amy model, which has ID 618ba6e5e2e1a9153ca2a3a5.
- Any Google Speech Recognition model.
- Any Microsoft Named Entity Recognition model.
- Any Optical Character Recognition (OCR) model.
- ModelTool
- AgentFactory.create_model_tool
from aixplain.modules.agent import ModelTool
from aixplain.enums import Function, Supplier
speech_synthesis_tool = ModelTool(
    model="618ba6e5e2e1a9153ca2a3a5"
)
speech_recognition_tool = ModelTool(
    function=Function.SPEECH_RECOGNITION,
    supplier=Supplier.GOOGLE
)
ner_tool = ModelTool(
    function=Function.NAMED_ENTITY_RECOGNITION,
    supplier=Supplier.MICROSOFT
)
ocr_tool = ModelTool(function=Function.OCR)
# Display the dictionaries
display(speech_synthesis_tool.__dict__)
display(speech_recognition_tool.__dict__)
display(ner_tool.__dict__)
display(ocr_tool.__dict__)
from aixplain.factories import AgentFactory
from aixplain.enums import Function, Supplier
speech_synthesis_tool = AgentFactory.create_model_tool(
    model="618ba6e5e2e1a9153ca2a3a5"
)
speech_recognition_tool = AgentFactory.create_model_tool(
    function=Function.SPEECH_RECOGNITION,
    supplier=Supplier.GOOGLE
)
ner_tool = AgentFactory.create_model_tool(
    function=Function.NAMED_ENTITY_RECOGNITION,
    supplier=Supplier.MICROSOFT
)
ocr_tool = AgentFactory.create_model_tool(function=Function.OCR)
# Display the dictionaries
display(speech_synthesis_tool.__dict__)
display(speech_recognition_tool.__dict__)
display(ner_tool.__dict__)
display(ocr_tool.__dict__)
You can optionally add names and descriptions to your tools. These attributes have no impact on agents.
ner_tool.name = 'Microsoft NER'
ner_tool.description = 'Named Entity Recognition model by Microsoft'
display(ner_tool.__dict__)
Use the _member_names_ attribute to see the list of available function types and suppliers.
Function._member_names_
Supplier._member_names_
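The member lists can be long, so filtering them by keyword may help. The following is a minimal sketch, assuming Function and Supplier behave like standard Python Enum classes (which the _member_names_ attribute suggests). DemoFunction is a hypothetical stand-in with made-up values so the snippet runs without the SDK, and find_members is an illustrative helper, not part of aiXplain.

```python
from enum import Enum


class DemoFunction(Enum):
    """Hypothetical stand-in for aixplain.enums.Function (values are made up)."""
    SPEECH_RECOGNITION = "speech-recognition"
    SPEECH_SYNTHESIS = "speech-synthesis"
    NAMED_ENTITY_RECOGNITION = "named-entity-recognition"
    OCR = "ocr"


def find_members(enum_cls, keyword):
    """Return member names containing `keyword` (case-insensitive), in definition order."""
    needle = keyword.upper()
    # __members__ is the public, ordered name -> member mapping of an Enum class.
    return [name for name in enum_cls.__members__ if needle in name.upper()]


print(find_members(DemoFunction, "speech"))
```

If the assumption holds, the same helper could be called as find_members(Function, "speech") or find_members(Supplier, "micro").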
1.2 Pipelines
Use the PipelineTool class to create agent tools out of pipelines. You can create them directly or use the AgentFactory.create_pipeline_tool method. You must specify:
- the exact pipeline you want (using its Pipeline ID - learn how to search for pipelines here),
- a description for the agent to know what it can use the pipeline for.
The pipeline below is Private, so using its ID as-is will raise an error. Create this pipeline on your account using this template.
- PipelineTool
- AgentFactory.create_pipeline_tool
from aixplain.modules.agent import PipelineTool
text_analysis_pipeline_tool = PipelineTool(
    description="Analyses text. It provides outputs for Topic Classification, Sentiment Analysis and Entity Linking.",
    pipeline="666198a06f1b3d64bd8f8dcc"
)
display(text_analysis_pipeline_tool.__dict__)
from aixplain.factories import AgentFactory
text_analysis_pipeline_tool = AgentFactory.create_pipeline_tool(
    description="Analyses text. It provides outputs for Topic Classification, Sentiment Analysis and Entity Linking.",
    pipeline="666198a06f1b3d64bd8f8dcc"
)
display(text_analysis_pipeline_tool.__dict__)
2. Create and deploy an Agent
Use the AgentFactory class to create and deploy an agent. Specify the following parameters:
- a unique name for your agent,
- a brief description stating the agent's purpose,
- [optional] the tools you want your agent to use (default is []),
- [optional] the llm_id of the language model you want to power the agent (default is OpenAI's GPT-4o Mini).
As an example of a non-default LLM, the Llama 3.1 70B model hosted by Groq has ID 66b2708c6eb5635d1c71f611. Note that the Text Analysis Agent code below passes the GPT-4o ID instead, and the Multimedia Agent omits llm_id and so uses the default.
- Text Analysis Agent
- Multimedia agent
from aixplain.factories import AgentFactory
text_analysis_agent = AgentFactory.create(
    name="Text Analysis Agent",
    description="An agent that analyses text.",
    tools=[
        ner_tool,
        text_analysis_pipeline_tool,
    ],
    llm_id="6646261c6eb563165658bbb1"  # GPT-4o
)
text_analysis_agent.__dict__
from aixplain.factories import AgentFactory
multimedia_agent = AgentFactory.create(
    name="Multimedia Agent AVI",
    description="An agent for Audio and Image Processing.",
    tools=[
        speech_recognition_tool,
        speech_synthesis_tool,
        ocr_tool,
    ],
)
multimedia_agent.__dict__
You can use any LLMs on the aiXplain marketplace to power your agent. Use the code below to see our available LLMs.
from aixplain.enums import Function
from aixplain.factories import ModelFactory
model_list = ModelFactory.list(function=Function.TEXT_GENERATION, page_size=100)
# Sort the models by their names
sorted_models = sorted(model_list['results'], key=lambda model: model.name)
# Print the sorted models
for model in sorted_models:
    print(model.name, model.id)
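With up to 100 results, it can be easier to search the list by name than to scan it. Here is a minimal sketch that assumes only what the code above uses: each result has name and id attributes. find_models is an illustrative helper, and demo_results holds hypothetical stand-in objects with made-up IDs so the snippet runs without the SDK.

```python
from types import SimpleNamespace


def find_models(results, keyword):
    """Return (name, id) pairs whose name contains `keyword`, sorted by name."""
    needle = keyword.lower()
    matches = [m for m in results if needle in m.name.lower()]
    return [(m.name, m.id) for m in sorted(matches, key=lambda m: m.name)]


# Hypothetical stand-ins for the objects in model_list['results'].
demo_results = [
    SimpleNamespace(name="Llama 3.1 70B", id="demo-id-2"),
    SimpleNamespace(name="GPT-4o", id="demo-id-1"),
    SimpleNamespace(name="GPT-4o Mini", id="demo-id-3"),
]

for name, model_id in find_models(demo_results, "gpt"):
    print(name, model_id)
```

In practice you would pass model_list['results'] in place of demo_results.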
3. Create and deploy a Team Agent
Use the TeamAgentFactory class to create a team agent. Specify the following parameters:
- a unique name for your team agent,
- the agents you want your team agent to consist of,
- [optional] the llm_id of the language model you want to power the team agent (default is OpenAI's GPT-4o Mini).
from aixplain.factories import TeamAgentFactory
team = TeamAgentFactory.create(
    name="Team of Agents for Text Audio and Image Processing",
    agents=[
        text_analysis_agent,
        multimedia_agent,
    ],
    llm_id="6646261c6eb563165658bbb1"
)
team.__dict__
4. Call Agents and Team Agents
The aiXplain SDK allows you to run agents and team agents and gives you the option to use short-term memory (history).
Agent and Team Agent inputs can be URLs, file paths, or direct text. The examples below use only direct text.
Use the run method to call an agent or team agent.
- Text Analysis Agent
- Multimedia Agent
- Multimedia Analysis Team Agent
agent_response = text_analysis_agent.run(
"""
Analyse the following text.
Bilbo Baggins leaves the One Ring to his heir, Frodo, after his birthday celebration.
Seventeen years later, Gandalf confirms that the Ring belongs to the Dark Lord Sauron
and advises Frodo to leave the Shire. Frodo embarks on a journey with Sam, Merry, and
Pippin, pursued by Black Riders. They encounter various dangers, including Old Man
Willow and a barrow-wight, but are saved by Tom Bombadil. In Bree, they meet Strider,
who helps guide them to Rivendell. Along the way, Frodo is wounded by the Black Riders
but is saved by Strider and Glorfindel. At Rivendell, the Council of Elrond determines
that the Ring must be destroyed in Mordor. Frodo volunteers for the task, and a
fellowship forms to aid him: Sam, Merry, Pippin, Gandalf, Aragorn, Boromir, Legolas,
and Gimli. After an attempt to cross the Misty Mountains fails, they travel through
the Mines of Moria, where Gandalf falls in battle with a Balrog. The remaining members
reach Lothlórien, where Galadriel tests them and offers gifts. As they continue their
journey, Boromir tries to seize the Ring, leading Frodo to decide to go to Mordor alone.
Sam follows him, and together they set off for Mordor.
"""
)
display(agent_response)
agent_response = multimedia_agent.run(
"""
Convert the following text to audio.
Bilbo Baggins leaves the One Ring to his heir, Frodo, after his birthday celebration.
Seventeen years later, Gandalf confirms that the Ring belongs to the Dark Lord Sauron
and advises Frodo to leave the Shire. Frodo embarks on a journey with Sam, Merry, and
Pippin, pursued by Black Riders. They encounter various dangers, including Old Man
Willow and a barrow-wight, but are saved by Tom Bombadil. In Bree, they meet Strider,
who helps guide them to Rivendell. Along the way, Frodo is wounded by the Black Riders
but is saved by Strider and Glorfindel. At Rivendell, the Council of Elrond determines
that the Ring must be destroyed in Mordor. Frodo volunteers for the task, and a
fellowship forms to aid him: Sam, Merry, Pippin, Gandalf, Aragorn, Boromir, Legolas,
and Gimli. After an attempt to cross the Misty Mountains fails, they travel through
the Mines of Moria, where Gandalf falls in battle with a Balrog. The remaining members
reach Lothlórien, where Galadriel tests them and offers gifts. As they continue their
journey, Boromir tries to seize the Ring, leading Frodo to decide to go to Mordor alone.
Sam follows him, and together they set off for Mordor.
"""
)
display(agent_response)
team_agent_response = team.run(
"""
Analyse the following text and provide your response as text and audio.
Bilbo Baggins leaves the One Ring to his heir, Frodo, after his birthday celebration.
Seventeen years later, Gandalf confirms that the Ring belongs to the Dark Lord Sauron
and advises Frodo to leave the Shire. Frodo embarks on a journey with Sam, Merry, and
Pippin, pursued by Black Riders. They encounter various dangers, including Old Man
Willow and a barrow-wight, but are saved by Tom Bombadil. In Bree, they meet Strider,
who helps guide them to Rivendell. Along the way, Frodo is wounded by the Black Riders
but is saved by Strider and Glorfindel. At Rivendell, the Council of Elrond determines
that the Ring must be destroyed in Mordor. Frodo volunteers for the task, and a
fellowship forms to aid him: Sam, Merry, Pippin, Gandalf, Aragorn, Boromir, Legolas,
and Gimli. After an attempt to cross the Misty Mountains fails, they travel through
the Mines of Moria, where Gandalf falls in battle with a Balrog. The remaining members
reach Lothlórien, where Galadriel tests them and offers gifts. As they continue their
journey, Boromir tries to seize the Ring, leading Frodo to decide to go to Mordor alone.
Sam follows him, and together they set off for Mordor.
"""
)
display(team_agent_response)
You can use the regular expression module re to retrieve a URL in an agent response by searching for the "https://" pattern in the output.
import re
def extract_url(text):
    # Match "https://" followed by any run of non-whitespace characters
    url_pattern = r'https://[^\s]+'
    match = re.search(url_pattern, text)
    if match:
        return match.group(0)
    return None
You will need to adjust this function to handle multiple URL outputs.
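One possible adjustment is sketched below. It keeps the same pattern but uses re.findall to collect every match. The extract_urls name and the trailing-punctuation cleanup are illustrative additions, not part of the original guide, and the sample string is made up.

```python
import re


def extract_urls(text):
    """Return every https:// URL found in `text`, in order of appearance."""
    # Same pattern as extract_url: "https://" followed by non-whitespace.
    urls = re.findall(r"https://[^\s]+", text)
    # Strip trailing punctuation that often follows a URL in prose.
    return [url.rstrip(".,;:)\"'") for url in urls]


sample = "Audio: https://example.com/a.mp3, transcript: https://example.com/t.txt."
print(extract_urls(sample))
```

The function returns an empty list when no URL is present, so callers can loop over the result without a None check.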
You can make an HTTP request to retrieve your output, then use IPython's Audio and display to play it.
from IPython.display import Audio, display
import requests
def download_and_display_audio(url):
    response = requests.get(url, stream=True)
    if response.status_code == 200:
        file_name = "downloaded_audio.mp3"
        with open(file_name, 'wb') as f:
            f.write(response.content)
        display(Audio(file_name))
    else:
        print(f"Failed to download file: {response.status_code}")
output_text = team_agent_response['data']['output']
print("output_text:", output_text)
url = extract_url(output_text)
print("url:", url)
download_and_display_audio(url)
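The download helper above always saves to downloaded_audio.mp3. If the output URL carries its own file name, you may prefer to reuse it. The sketch below shows one way to do that with the standard library. filename_from_url is a hypothetical helper, the example URLs are made up, and whether aiXplain output URLs end in a meaningful file name is an assumption you should check against your own responses.

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(url, default="downloaded_audio.mp3"):
    """Return the last path segment of `url`, or `default` if there is none."""
    # urlparse separates the path from the query string and fragment.
    path = urlparse(url).path
    name = posixpath.basename(path)
    return name or default


print(filename_from_url("https://example.com/outputs/speech.wav?token=abc"))
print(filename_from_url("https://example.com/"))
```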
4.1 Short-Term Memory
Every agent response includes a session_id value, which corresponds to a history.
To continue using a history, specify the session_id from the first query as an input parameter for all subsequent queries.
agent_response = text_analysis_agent.run(
    "Analyse the following text: The quick brown fox jumped over the lazy dog. Answer in text and audio."
)
display(agent_response)
# Read the session ID from the first response
session_id = agent_response["data"]["session_id"]
print(f"Session id: {session_id}")
agent_response = text_analysis_agent.run(
    "What makes this text interesting?",
    session_id=session_id,
)
display(agent_response)
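For longer conversations, passing the session_id by hand on every call can get repetitive. The sketch below wraps that bookkeeping in a small class. It assumes only what the example above shows: run accepts a session_id keyword, and the response exposes response["data"]["session_id"]. Conversation and EchoAgent are hypothetical names; EchoAgent is a stand-in so the snippet runs without the SDK.

```python
class Conversation:
    """Carry a session_id across calls to an agent's run method."""

    def __init__(self, agent):
        self.agent = agent
        self.session_id = None  # unknown until the first response arrives

    def ask(self, query):
        if self.session_id is None:
            response = self.agent.run(query)
        else:
            response = self.agent.run(query, session_id=self.session_id)
        # Layout assumed from the example above: response["data"]["session_id"].
        self.session_id = response["data"]["session_id"]
        return response


class EchoAgent:
    """Hypothetical stand-in so the sketch runs without the SDK."""

    def run(self, query, session_id=None):
        return {"data": {"output": query, "session_id": session_id or "demo-session"}}


chat = Conversation(EchoAgent())
chat.ask("Analyse the following text: The quick brown fox jumped over the lazy dog.")
follow_up = chat.ask("What makes this text interesting?")
print(follow_up["data"]["session_id"])
```

With a real agent you would construct it as Conversation(text_analysis_agent) and call ask for each turn.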