Version: 2.0

Serverless deployment

Serverless is the default deployment mode for aiXplain Agents on managed aiXplain infrastructure.

Use this mode when you want:

  • Fastest path from build to production.
  • No infrastructure management.
  • API and SDK access with autoscaling handled by aiXplain.

How serverless works

  • You build an agent in Studio or via SDK workflows.
  • The agent is hosted by aiXplain and exposed through run endpoints.
  • You invoke it with an API key and monitor runs in Studio.

There is no separate infrastructure provisioning step for serverless deployment.

Prerequisites

  • An aiXplain account.
  • A workspace API key.
  • An agent_id for the deployed agent.
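
Before making any calls, it can help to check that the prerequisites are actually configured. The sketch below assumes the key and agent ID are stored in the AIXPLAIN_API_KEY and AIXPLAIN_AGENT_ID environment variables, as in the examples on this page; load_settings is a hypothetical helper, not part of the aiXplain SDK.

```python
import os


def load_settings(env=None):
    """Return (api_key, agent_id), or raise a clear error if either is missing.

    Hypothetical helper: the variable names follow the examples on this page.
    """
    env = os.environ if env is None else env
    required = ("AIXPLAIN_API_KEY", "AIXPLAIN_AGENT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AIXPLAIN_API_KEY"], env["AIXPLAIN_AGENT_ID"]
```

Calling load_settings() at startup surfaces a missing variable as one readable error instead of a KeyError in the middle of a request.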

Invoke a serverless agent (Python)

import os
import time
import requests

API_KEY = os.environ["AIXPLAIN_API_KEY"]
AGENT_ID = os.environ["AIXPLAIN_AGENT_ID"]

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json",
}

# Start run
run_resp = requests.post(
    f"https://platform-api.aixplain.com/v2/agents/{AGENT_ID}/run",
    headers=headers,
    json={"query": "Summarize this ticket and suggest next steps."},
    timeout=30,
)
run_resp.raise_for_status()
run_data = run_resp.json()
request_id = run_data.get("requestId")

# Poll result
while True:
    result_resp = requests.get(
        f"https://platform-api.aixplain.com/sdk/agents/{request_id}/result",
        headers=headers,
        timeout=30,
    )
    result_resp.raise_for_status()
    result = result_resp.json()
    if result.get("completed"):
        print(result)
        break
    time.sleep(2)
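
The polling loop above runs until the run completes, so a stuck run would block the caller indefinitely. A bounded variant with exponential backoff might look like the sketch below. poll_with_deadline and its parameters are hypothetical names; the only assumption about the response is the "completed" field used in the example above.

```python
import time
from typing import Callable


def poll_with_deadline(
    fetch: Callable[[], dict],
    max_wait_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until its payload reports completion or the deadline passes.

    fetch is any zero-argument callable that returns the decoded poll response.
    """
    deadline = time.monotonic() + max_wait_s
    delay = initial_delay_s
    while True:
        payload = fetch()
        if payload.get("completed"):
            return payload
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"run did not complete within {max_wait_s} seconds")
        sleep(delay)
        # Double the wait after each attempt, capped at max_delay_s.
        delay = min(delay * 2, max_delay_s)
```

To use it with the example above, pass a fetch callable that performs the same requests.get call against the result URL, calls raise_for_status(), and returns the decoded JSON.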

Observability in serverless

  • Studio Analytics: usage, latency, and cost dashboards.
  • Execution traces: step-level run details in Studio validation and trace views.

Code snippets for execution traces

1. Correlate every run with requestId (REST)

import os
import requests

API_KEY = os.environ["AIXPLAIN_API_KEY"]
AGENT_ID = os.environ["AIXPLAIN_AGENT_ID"]

headers = {
    "x-api-key": API_KEY,
    "Content-Type": "application/json",
}

run_resp = requests.post(
    f"https://platform-api.aixplain.com/v2/agents/{AGENT_ID}/run",
    headers=headers,
    json={"query": "Summarize the latest incident and suggest actions."},
    timeout=30,
)
run_resp.raise_for_status()
run_data = run_resp.json()
request_id = run_data.get("requestId")
print("request_id:", request_id)
# Persist request_id in your app logs for trace correlation with Studio views.

2. Pull step-level details from poll results (REST)

import time
import requests

# Reuses `request_id` and `headers` from snippet 1.

def poll_until_complete(request_id: str, headers: dict) -> dict:
    """Poll the result endpoint every 2 seconds until the run is completed."""
    while True:
        r = requests.get(
            f"https://platform-api.aixplain.com/sdk/agents/{request_id}/result",
            headers=headers,
            timeout=30,
        )
        r.raise_for_status()
        payload = r.json()
        if payload.get("completed"):
            return payload
        time.sleep(2)

result = poll_until_complete(request_id, headers)

# Response shape may vary by endpoint/version, so use safe fallbacks.
data = result.get("data", {}) if isinstance(result, dict) else {}
steps = data.get("steps") or result.get("steps") or data.get("intermediate_steps") or []

print("step_count:", len(steps))
for idx, step in enumerate(steps, start=1):
    if not isinstance(step, dict):
        print(f"[{idx}] {step}")
        continue
    name = step.get("name") or step.get("tool") or step.get("toolName") or "unknown"
    status = step.get("status") or "unknown"
    print(f"[{idx}] {name} -> {status}")

3. Stream execution progress in SDK (logs mode)

from aixplain import Aixplain

aix = Aixplain(api_key="YOUR_API_KEY")
agent = aix.Agent.get("YOUR_AGENT_ID")

response = agent.run(
    query="Investigate this alert and produce remediation steps.",
    progress_format="logs",
    progress_verbosity=2,
)

# When available, inspect structured step details.
# The attribute may be absent depending on SDK version, so look it up defensively.
data = getattr(response, "data", None)
steps = getattr(data, "steps", None)
if steps is not None:
    print(steps)

Can you pull analytics from API calls?

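Partially. The documented API request endpoints let you retrieve request-level execution results by polling with the requestId of each run. They do not include a dedicated analytics endpoint, so aggregated usage, latency, and cost figures are available only in Studio Analytics.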

Practical pattern:

  • Capture requestId for every call.
  • Store response metadata and status in your own logging/metrics system.
  • Use Studio Analytics for aggregated dashboards.
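
The pattern above can be sketched as one structured log line per run. The helper names, the record fields, and the "status" key are assumptions for illustration; only "requestId" and "completed" appear in the examples on this page, so adapt the fields to the response you actually receive.

```python
import json
import logging

logger = logging.getLogger("agent_runs")


def build_run_record(request_id, result, started_at, finished_at):
    """Collect the fields worth keeping for one run (hypothetical schema)."""
    return {
        "request_id": request_id,
        "completed": bool(result.get("completed")),
        # "status" is an assumed key; fall back when the response lacks it.
        "status": result.get("status", "unknown"),
        "latency_s": round(finished_at - started_at, 3),
    }


def format_run_record(record):
    """Serialize the record as a single JSON line with stable key order."""
    return json.dumps(record, sort_keys=True)


def log_run(record):
    """Emit the record through the standard logging module."""
    logger.info(format_run_record(record))
```

Take started_at and finished_at from time.monotonic() around the run and poll calls. One JSON object per line is easy to ship to most log or metrics pipelines and to join against Studio views by request_id.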

When to use private instead

Use Private deployment if you need:

  • Air-gapped or strict network isolation.
  • Full data residency in your own infrastructure.
  • Customer-managed compute and security controls.