
Overview

Agents are autonomous systems designed to understand user instructions and perform actions to fulfill them. In artificial intelligence, agents can leverage the capabilities of Large Language Models (LLMs) combined with external tools, allowing them to go beyond processing language to perform tasks such as retrieving information, generating insights, automating workflows, or executing multi-step operations that require planning and decision-making.

In aiXplain, any LLM available in the marketplace can serve as an agent and be equipped with any available models and pipelines as tools.

Agents vs. LLMs

While LLMs are powerful, they are limited by their training data and lack of access to external tools or up-to-date information. Agents extend LLM capabilities by incorporating specialized tools to address such limitations, making them more effective in handling tasks like calculations or querying current databases.

Agents vs. Pipelines

Pipelines follow a pre-defined sequence of operations, ensuring consistency in execution, but they are less flexible than agents. Agents use LLMs to dynamically determine the best approach to a task, adapting to different situations and providing more variability in their outputs.
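
The contrast can be sketched in a few lines of Python. This is a minimal illustration, not aiXplain SDK code: the tools `to_upper` and `count_words`, and the keyword rule in `choose_tool` that stands in for the LLM's decision, are all assumptions made to keep the example self-contained.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in tools; a real agent would call marketplace models.
def to_upper(text: str) -> str:
    return text.upper()

def count_words(text: str) -> str:
    return str(len(text.split()))

def run_pipeline(text: str, steps: List[Callable[[str], str]]) -> str:
    """Pipeline: always applies the same steps in the same order."""
    for step in steps:
        text = step(text)
    return text

def choose_tool(instruction: str) -> str:
    """Stand-in for the LLM's decision; a keyword rule keeps the sketch runnable."""
    return "count_words" if "how many" in instruction.lower() else "to_upper"

def run_agent(instruction: str, text: str,
              tools: Dict[str, Callable[[str], str]]) -> str:
    """Agent: picks a tool per request instead of following a fixed order."""
    return tools[choose_tool(instruction)](text)

TOOLS = {"to_upper": to_upper, "count_words": count_words}
```

Calling `run_pipeline("a b c", [to_upper, count_words])` always runs both steps, while `run_agent` runs only the tool that matches the instruction it receives.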

Core Features

  • Built-In Cognitive Capabilities: Agents can have advanced features like task planning, reasoning, and feedback-based improvements for more sophisticated actions.

  • Multi-Agent Orchestration: Supports multiple agents working together on distributed problem-solving tasks.

  • Configurable Architecture: Developers can tailor agent components or activate specific cognitive features as needed.

  • Integration with aiXplain Marketplace: Agents can leverage models and tools from the marketplace for faster development.

  • Workflow Management: Agents can incorporate pipelines when deterministic behavior is required.

  • Explainable Debugging: Provides transparency in agent decision-making to aid in debugging and quality control.

  • Managed Agent Lifecycle: The framework simplifies the deployment, operation, and scaling of agents.

Agent Architecture

The agent architecture in aiXplain is composed of four main building blocks, shared across all agent types. Depending on the agent's configuration, some components are disabled or pre-configured. The architecture combines the supervisor-worker framework with the Plan-and-Solve model described by Wang et al. (2023).

Below is a breakdown of each building block and its role in the agent's operations.

A. Cognitive Components

Cognitive components are pre-configured agents and algorithms that manage the cognitive tasks of the agent, enhancing quality, security, and long-term performance. These components include advanced functionalities such as task planning, task decomposition, reasoning, self-reflection, and self-improvement. Users can activate or deactivate components as needed and can customize them by fine-tuning or replacing them with tailored versions.

Cognitive Components Overview

| Component Name | Cognitive Capability | Added Value |
| --- | --- | --- |
| Mentalist Agent | Processes user input, asks clarifying questions if needed, decomposes complex tasks into plans, stores them in memory, and replans as necessary. | Allows agents to handle multi-step decision-making and planning to tackle complex problems. |
| Inspector Agent | Verifies tool outputs against user criteria (quality, time, cost). | Helps ensure high-quality responses that appropriately address user requests. |
| Bodyguard Agent (Coming soon) | Enforces data access permissions and privacy guardrails. | Maintains data security and enforces access and privacy policies for sensitive data interactions. |
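
To make the Inspector's role concrete, the check against quality, time, and cost criteria might look like the sketch below. The `Criteria` and `ToolResult` types, their field names, and the 0-to-1 quality score are hypothetical; the source does not specify how the Inspector Agent measures any of these.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Criteria:
    min_quality: float   # hypothetical score threshold in [0, 1]
    max_seconds: float   # time budget
    max_cost: float      # cost budget, in arbitrary units

@dataclass
class ToolResult:
    output: str
    quality: float
    seconds: float
    cost: float

def inspect(result: ToolResult, criteria: Criteria) -> List[str]:
    """Return the names of violated criteria; an empty list means the result passes."""
    problems: List[str] = []
    if result.quality < criteria.min_quality:
        problems.append("quality")
    if result.seconds > criteria.max_seconds:
        problems.append("time")
    if result.cost > criteria.max_cost:
        problems.append("cost")
    return problems
```

Returning the list of violations, rather than a single pass/fail flag, gives a planner enough detail to decide whether to retry the step or choose a different tool.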

Advantages of Cognitive Component Design

  • Efficient Execution: The architecture uses smaller, specialized LLMs to handle specific tasks rather than a single large model. This approach improves execution efficiency and response quality.
  • Cross-Validation for Improved Quality: Multiple specialized models validate each other’s outputs, improving accuracy and reliability.
  • Expandable and Modular Design: The modular structure allows for the addition of new cognitive components and easy updates, providing flexibility and adaptability to changing requirements.

B. Orchestrator

The Orchestrator is the core LLM responsible for task execution. It follows the task plan generated by the cognitive components, selecting and invoking the appropriate tools, pipelines, or agents to fulfill the task's requirements.

Role of the Orchestrator:

  • Task Execution: Interprets the agent’s task plan and carries it out by invoking appropriate tools.
  • Coordination: Manages tool invocation and task completion in a logical and efficient sequence.
  • Adaptation: Reassesses the task plan based on real-time outcomes and adjusts by replanning if necessary.
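
The three roles above can be sketched as a single execution loop. This is an illustrative sketch under stated assumptions, not the aiXplain implementation: the plan format (a list of tool-name and input pairs), the `replan` callback, and the `max_replans` limit are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name, tool input): a hypothetical plan format

def orchestrate(
    plan: List[Step],
    tools: Dict[str, Callable[[str], str]],
    replan: Callable[[Step, Exception], List[Step]],
    max_replans: int = 3,
) -> List[str]:
    """Run plan steps in order; when a step fails, ask `replan` for substitutes."""
    results: List[str] = []
    queue = list(plan)
    replans = 0
    while queue:
        step = queue.pop(0)
        tool_name, tool_input = step
        try:
            # Task execution: invoke the tool named by the current step.
            results.append(tools[tool_name](tool_input))
        except Exception as exc:
            # Adaptation: replace the failed step, up to a fixed limit.
            if replans >= max_replans:
                raise
            replans += 1
            queue = replan(step, exc) + queue
    return results
```

The replacement steps are placed at the front of the queue so that the remaining plan still runs in its original order once the failed step has been recovered.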

C. Memory Structure

The memory structure facilitates communication between different components through a shared memory system, ensuring coordination within the agent.

Memory System Features:

  • Task Management: Stores task plans, intermediate results, and final outputs to maintain smooth interaction between components.
  • Contextual Awareness: Tracks past inputs and interactions, allowing for contextually relevant responses.
  • Component Coordination: Shares data and results across cognitive components and tools for cohesive operation.
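
One way to picture such a shared memory is a small data structure that every component reads from and writes to. The class and field names below are assumptions for illustration; the source does not describe the actual storage layout.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class SharedMemory:
    plan: List[str] = field(default_factory=list)                 # task plan steps
    results: Dict[str, Any] = field(default_factory=dict)         # step -> output
    history: List[Dict[str, str]] = field(default_factory=list)   # past interactions

    def record_turn(self, role: str, content: str) -> None:
        """Contextual awareness: keep past inputs and responses."""
        self.history.append({"role": role, "content": content})

    def store_result(self, step: str, value: Any) -> None:
        """Task management: save an intermediate or final result for a step."""
        self.results[step] = value

    def pending_steps(self) -> List[str]:
        """Component coordination: steps in the plan that have no result yet."""
        return [step for step in self.plan if step not in self.results]
```

Because planner, executor, and reviewer would all consult the same `pending_steps()` view, none of them needs to track progress separately.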

D. Toolkit

The toolkit contains models, pipelines, tools, and agents specialized for predefined tasks. The Orchestrator uses these tools to execute specific steps in the task plan.

Toolkit Features:

  • Predefined Tasks: Each tool is specialized for specific actions, providing precise task execution.
  • Dynamic Tool Use: The Orchestrator selects appropriate tools based on task requirements, using the most effective resources.
  • Pipeline Integration: Includes pre-configured pipelines for deterministic processes, supporting reliable execution of tasks.
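
A toolkit can be thought of as a registry that the Orchestrator queries. The sketch below is hypothetical: `Tool`, `Toolkit`, and the word-overlap scoring in `select` are stand-ins, since in a real agent the LLM chooses the tool from its description.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]
    deterministic: bool = False  # True for pipeline-style tools

class Toolkit:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool: {tool.name}")
        self._tools[tool.name] = tool

    def select(self, requirement: str) -> Tool:
        """Pick the tool whose description shares the most words with the
        requirement. This is a crude stand-in for an LLM's tool choice."""
        words = set(requirement.lower().split())

        def score(tool: Tool) -> int:
            return len(words & set(tool.description.lower().split()))

        if not self._tools:
            raise LookupError("toolkit is empty")
        best = max(self._tools.values(), key=score)
        if score(best) == 0:
            raise LookupError(f"no tool matches: {requirement}")
        return best
```

The `deterministic` flag marks entries that wrap pipelines, so a caller can prefer them when a step needs repeatable behavior.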

Summary of Architecture Components

| Building Block | Function |
| --- | --- |
| Cognitive Components | Handle cognitive tasks like planning, task decomposition, reasoning, and self-improvement. |
| Orchestrator | The LLM that executes tasks by selecting and coordinating the appropriate tools. |
| Memory Structure | A shared system that facilitates communication and coordination between components, routing information as needed. |
| Toolkit | A collection of specialized models, tools, pipelines, and agents used to perform specific tasks in the execution plan. |

Types of aiXplain Agents

  • Single-Task Agent: A lightweight agent with a specific focus, consisting of an LLM, a set of tools, and memory, without built-in cognitive capabilities.
  • Team Agent: A multi-agent system for handling complex tasks, utilizing specialized components like the Mentalist (task planning), Inspector (quality control), and Bodyguard (security). Each agent in this system works with one or more LLMs to tackle different aspects of a problem collaboratively. The Mentalist breaks down user instructions into a plan, which the Orchestrator assigns to worker agents. Once a task is completed, the Inspector reviews it and updates the plan, triggering the Orchestrator to handle any remaining steps until all tasks are finished and the final response is ready.
[Figure: Agent]
[Figure: Team Agent]
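
The Team Agent flow described above (plan, assign, review, continue) can be sketched as a loop. Everything here is a simplified assumption rather than aiXplain's implementation: `mentalist` splits the instruction on the word "then", the first word of each step names the worker, and `inspector` accepts any non-empty output.

```python
from typing import Callable, Dict, List

Worker = Callable[[str], str]

def mentalist(instruction: str) -> List[str]:
    """Stand-in planner: split the instruction into steps on ' then '."""
    return [part.strip() for part in instruction.split(" then ") if part.strip()]

def inspector(output: str) -> bool:
    """Stand-in review: accept any non-empty output."""
    return bool(output.strip())

def run_team(instruction: str, workers: Dict[str, Worker],
             max_attempts: int = 2) -> List[str]:
    pending = mentalist(instruction)          # Mentalist builds the plan
    attempts: Dict[str, int] = {}
    outputs: List[str] = []
    while pending:                            # Orchestrator handles remaining steps
        step = pending.pop(0)
        worker = workers[step.split()[0]]     # hypothetical routing: first word
        output = worker(step)
        if inspector(output):                 # Inspector reviews the result
            outputs.append(output)
            continue
        attempts[step] = attempts.get(step, 0) + 1
        if attempts[step] >= max_attempts:
            raise RuntimeError(f"step failed review: {step}")
        pending.insert(0, step)               # rejected step goes back in the plan
    return outputs
```

With workers registered under `"research"` and `"write"`, the instruction `"research agents then write summary"` yields a two-step plan that is routed to each worker in turn, and any step the reviewer rejects is retried before the loop moves on.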