Version: v2.0

Serverless Deployment

Serverless is the default deployment mode for aiXplain Agents on managed aiXplain infrastructure.

Use this mode when you want:

Fastest path from build to production.
No infrastructure management.
API and SDK access with autoscaling handled by aiXplain.

How serverless works

You build an agent in Studio or via SDK workflows.
The agent is hosted by aiXplain and exposed through run endpoints.
You invoke it with an API key and monitor runs in Studio.

There is no separate infrastructure provisioning step for serverless deployment.

Prerequisites

An aiXplain account.
A workspace API key.
An agent_id for the deployed agent.
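The prerequisites above can be checked at startup with a small fail-fast helper. This is a minimal sketch, assuming the API key and agent ID are supplied through the `AIXPLAIN_API_KEY` and `AIXPLAIN_AGENT_ID` environment variables used in the examples on this page; the helper name `load_settings` is hypothetical and not part of the aiXplain SDK.

```python
import os


def load_settings(env=None) -> dict:
    """Return the API key and agent ID, or raise if either one is missing."""
    # `env` defaults to the process environment; pass a dict to test the helper.
    env = os.environ if env is None else env
    required = ("AIXPLAIN_API_KEY", "AIXPLAIN_AGENT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {
        "api_key": env["AIXPLAIN_API_KEY"],
        "agent_id": env["AIXPLAIN_AGENT_ID"],
    }
```

Calling `load_settings()` before the first request surfaces a missing key as one clear error instead of a failed HTTP call later.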

Invoke a serverless agent (Python)

import os
import time
import requests

API_KEY = os.environ["AIXPLAIN_API_KEY"]
AGENT_ID = os.environ["AIXPLAIN_AGENT_ID"]

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json",
}

# Start run
run_resp = requests.post(
    f"https://platform-api.aixplain.com/v2/agents/{AGENT_ID}/run",
    headers=headers,
    json={"query": "Summarize this ticket and suggest next steps."},
    timeout=30,
)
run_resp.raise_for_status()
run_data = run_resp.json()
request_id = run_data.get("requestId")

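# Poll result, giving up after about five minutes so the loop cannot run forever
deadline = time.monotonic() + 300
while time.monotonic() < deadline:
    result_resp = requests.get(
        f"https://platform-api.aixplain.com/sdk/agents/{request_id}/result",
        headers=headers,
        timeout=30,
    )
    result_resp.raise_for_status()
    result = result_resp.json()
    if result.get("completed"):
        print(result)
        break
    time.sleep(2)
else:
    raise TimeoutError(f"Run {request_id} did not complete within 300 seconds")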

Observability in serverless

Studio Analytics: usage, latency, and cost dashboards.
Execution traces: step-level run details in Studio validation and trace views.

Code snippets for execution traces

1. Correlate every run with `requestId` (REST)

import os
import requests

API_KEY = os.environ["AIXPLAIN_API_KEY"]
AGENT_ID = os.environ["AIXPLAIN_AGENT_ID"]

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json",
}

run_resp = requests.post(
    f"https://platform-api.aixplain.com/v2/agents/{AGENT_ID}/run",
    headers=headers,
    json={"query": "Summarize the latest incident and suggest actions."},
    timeout=30,
)
run_resp.raise_for_status()
run_data = run_resp.json()
request_id = run_data.get("requestId")
print("request_id:", request_id)
# Persist request_id in your app logs for trace correlation with Studio views.

2. Pull step-level details from poll results (REST)

import time
import requests

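def poll_until_complete(request_id: str, headers: dict, max_wait_s: float = 300.0) -> dict:
    # Stop polling after max_wait_s seconds so a stuck run cannot block forever.
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        r = requests.get(
            f"https://platform-api.aixplain.com/sdk/agents/{request_id}/result",
            headers=headers,
            timeout=30,
        )
        r.raise_for_status()
        payload = r.json()
        if payload.get("completed"):
            return payload
        time.sleep(2)
    raise TimeoutError(f"Run {request_id} did not complete within {max_wait_s} seconds")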

result = poll_until_complete(request_id, headers)

# Response shape may vary by endpoint/version, so use safe fallbacks.
data = result.get("data", {}) if isinstance(result, dict) else {}
steps = data.get("steps") or result.get("steps") or data.get("intermediate_steps") or []

print("step_count:", len(steps))
for idx, step in enumerate(steps, start=1):
    if not isinstance(step, dict):
        print(f"[{idx}] {step}")
        continue
    name = step.get("name") or step.get("tool") or step.get("toolName") or "unknown"
    status = step.get("status") or "unknown"
    print(f"[{idx}] {name} -> {status}")

3. Stream execution progress in SDK (logs mode)

from aixplain import Aixplain

aix = Aixplain(api_key="YOUR_API_KEY")
agent = aix.Agent.get("YOUR_AGENT_ID")

response = agent.run(
    query="Investigate this alert and produce remediation steps.",
    progress_format="logs",
    progress_verbosity=2,
)

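# Structured step details may not be present on every response,
# so read them defensively instead of swallowing all exceptions.
steps = getattr(getattr(response, "data", None), "steps", None)
if steps is not None:
    print(steps)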

Can you pull analytics from API calls?

The documented API request endpoints let you retrieve request-level execution results by polling with the requestId returned from each run. They do not include a dedicated analytics endpoint.

Practical pattern:

Capture requestId for every call.
Store response metadata and status in your own logging/metrics system.
Use Studio Analytics for aggregated dashboards.
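The pattern above can be sketched as a small helper that turns each run into one JSON line in a local log file. This is an illustrative sketch, not an aiXplain API: the function names, the record layout, and the `status` field read from the poll result are assumptions, and only `requestId` and `completed` come from the examples on this page.

```python
import json


def build_run_record(agent_id: str, run_data: dict, result: dict,
                     started_at: float, finished_at: float) -> dict:
    """Flatten one run into a record suitable for a log line or metrics event."""
    return {
        "agent_id": agent_id,
        "request_id": run_data.get("requestId"),
        "completed": bool(result.get("completed")),
        # Assumed field: falls back to "unknown" when the response has no status.
        "status": result.get("status", "unknown"),
        "latency_s": round(finished_at - started_at, 3),
    }


def append_record(path: str, record: dict) -> None:
    """Append the record as one JSON line so log shippers can tail the file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

In practice you would take `started_at` with `time.monotonic()` just before the run request, take `finished_at` once polling reports completion, and then call `append_record` with the built record. Aggregated dashboards still come from Studio Analytics; this file only gives you per-request data you can join on `request_id`.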

Production readiness checklist

Before exposing an agent to live traffic:

Confirm the agent behavior in aiXplain Studio validation traces.
Validate outputs on representative and adversarial inputs.
Configure Inspectors for safety, quality, and compliance checks.
Set API access and quotas via API keys.
Verify workspace permissions in Workspaces.
Confirm cost expectations in Credits and billing.
Decide memory posture per agent or session (enabled or disabled) based on privacy and retention needs.
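The output validation item in the checklist can be partly automated with a small harness that runs a fixed set of representative and adversarial queries and checks each answer. This is a minimal sketch under stated assumptions: `run_agent` is a hypothetical callable you supply (for example, a wrapper around the REST calls shown earlier that returns the final answer as text), and the `must_contain` / `must_not_contain` case format is invented for illustration.

```python
from typing import Callable


def validate_agent(run_agent: Callable[[str], str], cases: list) -> list:
    """Run every case through the agent and return a list of failed checks."""
    failures = []
    for case in cases:
        output = run_agent(case["query"]).lower()
        # Phrases a correct answer is expected to mention.
        missing = [s for s in case.get("must_contain", []) if s.lower() not in output]
        # Phrases that must never appear, such as secrets or unsafe instructions.
        leaked = [s for s in case.get("must_not_contain", []) if s.lower() in output]
        if missing or leaked:
            failures.append({"query": case["query"], "missing": missing, "leaked": leaked})
    return failures
```

An empty return value means every case passed. Substring checks are deliberately crude; they complement, rather than replace, Inspectors and manual review of Studio validation traces.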

Reliability patterns

For stable runtime behavior under load and failure scenarios:

Keep retries and a fallback strategy enabled for model and tool failures.
Configure a clear primary/secondary fallback chain for critical model or tool dependencies.
Use deterministic task structure where strict execution order is required.
Set clear termination criteria to avoid runaway loops.
Guard external dependencies with timeout-aware tools and graceful fallback behavior.
Test degraded scenarios (tool unavailable, model timeout, malformed tool response).
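The retry and primary/secondary fallback ideas above can be illustrated on the client side. The sketch below is not aiXplain's built-in retry or fallback configuration; it is a generic pattern, with hypothetical function names, for wrapping any callable dependency (for example, a function that invokes an agent) in bounded retries with exponential backoff and then falling through an ordered chain of alternatives.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            # Back off before the next try, but not after the final failure.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


def call_with_fallback(chain, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Try each callable in order (primary first) and return the first success."""
    errors = []
    for fn in chain:
        try:
            return call_with_retries(fn, attempts, base_delay, sleep)
        except Exception as exc:
            errors.append(exc)
    raise RuntimeError(f"All {len(errors)} dependencies failed: {errors}")
```

The bounded `attempts` count doubles as a termination criterion, and passing a stub for `sleep` makes degraded scenarios (tool unavailable, model timeout) quick to exercise in tests.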

When to use private instead

Use Private deployment if you need:

Air-gapped or strict network isolation.
Full data residency in your own infrastructure.
Customer-managed compute and security controls.
