Rate Limiting
Rate limits control how many requests and tokens a given API key can use per minute or per day for specific models. An admin key sets these limits on member keys; member keys cannot modify limits or inspect other keys.
Setup
pip install aixplain
from aixplain import Aixplain
# Member key — for inference and monitoring your own usage
aix = Aixplain(api_key="YOUR_MEMBER_API_KEY")
# Admin key — for creating and managing limits on other keys
aix_admin = Aixplain(api_key="YOUR_ADMIN_API_KEY")
Create an admin key in Console → Settings → API Keys. Admin keys cannot be used for inference.
Search existing keys
Member keys cannot list other keys — calling search() with a member key raises Forbidden. Admin keys can list all keys in the workspace.
# Admin can list all keys
result = aix_admin.APIKey.search()
for key in result["results"]:
    print(key.name, key.id)
Use get_by_access_key() when you have the key string itself (not the ID) and need to inspect or update it.
target = aix_admin.APIKey.get_by_access_key("TARGET_MEMBER_API_KEY")
print("id:", target.id)
print("name:", target.name)
print("is_admin:", target.is_admin)
print("global_limits:", target.global_limits.to_dict() if target.global_limits else None)
for limit in target.asset_limits:
    print("asset limit:", limit.to_dict())
Create a key with rate limits
Create a member key and attach per-model and global limits in one call.
from datetime import datetime
from aixplain.v2.api_key import APIKeyLimits
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
new_key = aix_admin.APIKey(
    name=f"member-key-{timestamp}",
    asset_limits=[
        APIKeyLimits(
            model="openai/gpt-5/openai",
            token_per_minute=10,
            token_per_day=30,
            request_per_minute=2,
            request_per_day=2,
        )
    ],
    global_limits=APIKeyLimits(
        token_per_minute=100,
        token_per_day=1000,
        request_per_minute=100,
        request_per_day=1000,
    ),
    budget=1000,
    expires_at=datetime(2030, 1, 1),
).save()
print(new_key.to_dict())
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| name | str | ✅ | — | Label for the key. |
| asset_limits | list[APIKeyLimits] | — | [] | Per-model limits. |
| global_limits | APIKeyLimits | — | None | Limits across all models on this key. |
| budget | int | — | None | Total credits available. Increase the value to top up. |
| expires_at | datetime | — | None | Expiry date. Omit for a non-expiring key. |
Global limits and asset-specific (per-model) limits are both enforced. Neither overrides the other: a request is blocked as soon as it exceeds whichever limit is stricter.
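The way the two scopes combine can be sketched in a few lines. This is an illustration only: effective_limit is a hypothetical helper, not part of the aixplain SDK, and it assumes that None means "no cap configured" for that scope. The platform's actual enforcement logic is not shown here.

```python
from typing import Optional


def effective_limit(global_cap: Optional[int], asset_cap: Optional[int]) -> Optional[int]:
    """Return the stricter of two caps; None means that scope has no cap.

    Hypothetical helper for illustration only, not an aixplain API.
    """
    caps = [cap for cap in (global_cap, asset_cap) if cap is not None]
    return min(caps) if caps else None


# Values from the example above: global 100 requests/minute, model 2 requests/minute.
print(effective_limit(100, 2))      # the per-model cap is stricter
print(effective_limit(None, 2))     # only a per-model cap is configured
print(effective_limit(None, None))  # no cap in either scope
```

With the limits from the create example, the per-model cap of 2 requests per minute is reached long before the global cap of 100.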
Update rate limits
Modify limits on an existing key and call save(). Updated limits take effect at the start of the next timeframe (next minute for per-minute limits, next day for per-day limits).
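To estimate when a saved change becomes active, a rough sketch follows. It assumes that windows are aligned to clock minutes and to midnight in the timezone of the timestamp you pass in. That alignment is an assumption and is not stated by the platform; next_window_start is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timedelta, timezone


def next_window_start(now: datetime, window: str) -> datetime:
    """Return the start of the next per-minute or per-day window.

    Assumption: windows are aligned to clock minutes and to midnight in the
    timezone of `now`. Hypothetical helper, not an aixplain API.
    """
    if window == "minute":
        return now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    if window == "day":
        return now.replace(hour=0, minute=0, second=0, microsecond=0) + timedelta(days=1)
    raise ValueError(f"unknown window: {window!r}")


saved_at = datetime(2030, 1, 1, 12, 30, 45, tzinfo=timezone.utc)
print(next_window_start(saved_at, "minute"))  # 2030-01-01 12:31:00+00:00
print(next_window_start(saved_at, "day"))     # 2030-01-02 00:00:00+00:00
```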
If you have the member key string (not the ID), use get_by_access_key() to fetch it first:
from aixplain.v2.api_key import APIKeyLimits
key = aix_admin.APIKey.get_by_access_key("TARGET_MEMBER_API_KEY")
key.asset_limits = [
    APIKeyLimits(
        model="669a63646eb56306647e1091",
        request_per_minute=2,
        request_per_day=5,
        token_per_minute=500,
        token_per_day=5000,
    )
]
key.save()
Alternatively, list all keys and pick the one to update. The example takes the first result; match on key.id if you know the ID:
from aixplain.v2.api_key import APIKeyLimits, TokenType
# Fetch the key to update
result = aix_admin.APIKey.search()
key = result["results"][0]
# Update budget
key.budget = 1200
# Update global limits (assumes the key already has them; global_limits can be None)
key.global_limits.token_per_minute = 50
key.global_limits.token_per_day = 500
# Replace asset limits
key.asset_limits = [
    APIKeyLimits(
        model="openai/gpt-5/openai",
        token_per_minute=20 * 15000,
        token_per_day=8 * 60 * (20 * 15000),
        request_per_minute=60,
        request_per_day=8 * 60 * 60,
        token_type=TokenType.OUTPUT,
    ),
    APIKeyLimits(
        model="openai/gpt-5.1/openai",
        token_per_minute=60 * 8000,
        token_per_day=3 * 60 * (60 * 8000),
        request_per_minute=200,
        request_per_day=8 * 60 * 200,
        token_type=TokenType.OUTPUT,
    ),
]
key.save()
print(key.to_dict())
token_type=TokenType.OUTPUT counts only output tokens against the limit. Omit it to count all tokens (input + output).
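The effect of the token type on what is counted can be illustrated with a small sketch. TokenScope and tokens_counted are hypothetical stand-ins written for this example; they are not the SDK's TokenType or its accounting code, and the token numbers are made up.

```python
from enum import Enum


class TokenScope(Enum):
    """Hypothetical stand-in for the SDK's TokenType, for illustration only."""
    OUTPUT = "output"
    TOTAL = "total"


def tokens_counted(input_tokens: int, output_tokens: int, scope: TokenScope) -> int:
    """Return how many tokens of one request count against the limit."""
    if scope is TokenScope.OUTPUT:
        return output_tokens
    return input_tokens + output_tokens


# One request with a 1200-token prompt and a 300-token completion.
print(tokens_counted(1200, 300, TokenScope.OUTPUT))  # 300
print(tokens_counted(1200, 300, TokenScope.TOTAL))   # 1500
```

With output-only counting, long prompts do not consume the token budget, so the same limit allows more requests.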
Monitor usage
get_usage_limits() returns daily consumption and configured limits. Members call it on aix.APIKey to check their own key. Admins call it on any key object returned from search() or get_by_access_key().
# All usage rows for your own key (member)
rows = aix.APIKey.get_usage_limits()
for row in rows:
    print(row)
A row with model=None is the global scope — None counts and limits mean no global cap is configured on this key. Ignore it unless you explicitly set global limits.
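One way to handle the global row is to separate it from the per-model rows before reporting. The sketch below uses SimpleNamespace objects with made-up numbers as stand-ins for the rows returned by get_usage_limits(). The attribute names follow the ones used elsewhere on this page, and split_rows and has_global_cap are hypothetical helpers.

```python
from types import SimpleNamespace


def split_rows(rows):
    """Separate the global row (model is None) from per-model rows."""
    global_rows = [row for row in rows if row.model is None]
    model_rows = [row for row in rows if row.model is not None]
    return global_rows, model_rows


def has_global_cap(row) -> bool:
    """True if the global row carries at least one configured daily limit."""
    return row.daily_request_limit is not None or row.daily_token_limit is not None


# Stand-in rows with made-up numbers; real rows come from get_usage_limits().
rows = [
    SimpleNamespace(model=None, daily_request_count=None, daily_request_limit=None,
                    daily_token_count=None, daily_token_limit=None),
    SimpleNamespace(model="669a63646eb56306647e1091", daily_request_count=1,
                    daily_request_limit=5, daily_token_count=120, daily_token_limit=5000),
]

global_rows, model_rows = split_rows(rows)
print([has_global_cap(row) for row in global_rows])
print([row.model for row in model_rows])
```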
Pass model to filter to one model:
MODEL_ID = "669a63646eb56306647e1091" # use IDs, not paths
rows = aix.APIKey.get_usage_limits(model=MODEL_ID)
for row in rows:
    if row.model is None:
        print("No model-specific limits configured")
    else:
        print(
            f"Model {row.model}: "
            f"{row.daily_request_count}/{row.daily_request_limit} daily requests, "
            f"{row.daily_token_count}/{row.daily_token_limit} daily tokens"
        )
Alert on threshold
Poll usage and fire an alert when consumption crosses a threshold.
import time
def usage_alerts(rows, threshold=0.8):
    alerts = []
    for row in rows:
        req_count = getattr(row, "daily_request_count", 0) or 0
        req_limit = getattr(row, "daily_request_limit", 0) or 0
        tok_count = getattr(row, "daily_token_count", 0) or 0
        tok_limit = getattr(row, "daily_token_limit", 0) or 0
        label = getattr(row, "model", None) or "global"
        if req_limit and req_count / req_limit >= threshold:
            alerts.append(f"{label}: requests {req_count}/{req_limit}")
        if tok_limit and tok_count / tok_limit >= threshold:
            alerts.append(f"{label}: tokens {tok_count}/{tok_limit}")
    return alerts

for _ in range(3):
    rows = aix.APIKey.get_usage_limits(model=MODEL_ID)
    alerts = usage_alerts(rows, threshold=0.8)
    if alerts:
        for alert in alerts:
            print("ALERT:", alert)
    else:
        print("Usage within threshold")
    time.sleep(60)
Detect rate-limit errors
Rate-limit enforcement surfaces as HTTP 497 (aiXplain per-minute limit) or 429 (standard). Check for both when calling models on a rate-limited key.
from aixplain import Aixplain
aix = Aixplain(api_key="YOUR_MEMBER_API_KEY")
model = aix.Model.get("669a63646eb56306647e1091")
try:
    result = model.run(text="Hello")
    print(result.data)
except Exception as exc:
    code = getattr(exc, "status_code", None)
    if code in (429, 497):
        print("Rate limit hit: back off and retry")
    else:
        raise
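The "back off and retry" step could look like the following sketch. It assumes, as the snippet above does, that the raised exception exposes a status_code attribute. call_with_backoff and FakeRateLimitError are hypothetical names used only for this example, not SDK APIs, and the delays are arbitrary starting values.

```python
import time


def call_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on HTTP 429/497 with exponential backoff.

    Assumption: the raised exception exposes a `status_code` attribute.
    Hypothetical helper, not an aixplain API.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            code = getattr(exc, "status_code", None)
            is_last = attempt == max_attempts - 1
            if code not in (429, 497) or is_last:
                raise  # not a rate limit, or retries exhausted
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...


class FakeRateLimitError(Exception):
    """Stand-in error used only to exercise the helper."""
    status_code = 429


attempts = {"count": 0}


def flaky_call():
    # Fails twice with a rate-limit error, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("rate limited")
    return "ok"


delays = []  # record the waits instead of actually sleeping
print(call_with_backoff(flaky_call, sleep=delays.append))
print(delays)
```

With the SDK, call would wrap the model invocation, for example lambda: model.run(text="Hello"). Because a per-day limit does not reset within seconds, retrying is mainly useful for per-minute limits.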
Delete a key
key.delete()
Deletion is immediate and irreversible. Any application using the key will receive authentication errors.