How to configure model rate limiting

This guide explains how to use the Model Rate Limiting feature in aiXplain to manage API usage. Follow the steps below to create rate-limited API keys, then view, monitor, and update their limits.

Overview

The Model Rate Limiting feature enables administrators to:

Set rate limits for text generation models (token-based models).
Monitor and update these limits for specific API keys.
Enforce constraints on API calls to manage resource usage efficiently.

Step 1: Insert Admin Access Key

Admin access keys are required to configure and monitor rate limits. These keys are for management only and cannot be used for inference. Create an admin key on the Integration page in aiXplain Studio.

import os
os.environ["TEAM_API_KEY"] = "ADMIN_API_KEY" # Admin API key
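Hard-coding the admin key in source files makes it easy to leak. As a minimal sketch, assuming the key is stored in a hypothetical environment variable named `AIXPLAIN_ADMIN_KEY` (a name chosen for this example, not an aiXplain convention), the helper below copies it into `TEAM_API_KEY` and fails early if it is missing:

```python
import os

def set_admin_key(source_var="AIXPLAIN_ADMIN_KEY", env=os.environ):
    """Copy the admin key from `source_var` into TEAM_API_KEY.

    `source_var` is a hypothetical variable name used only in this sketch.
    """
    key = env.get(source_var, "").strip()
    if not key:
        raise RuntimeError(f"{source_var} is not set; create an admin key first")
    env["TEAM_API_KEY"] = key
    return key
```

As in the snippet above, call it before importing anything from `aixplain`.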

Step 2: Set Rate Limits

Creating a Member API Key with Rate Limits

Create a member access key with a specific name or label.
Define asset-specific rate limits for individual models accessible by this API key.
Define global rate limits applicable across all accessible assets by this API key.

Supported rate limits include:

Tokens per minute/day
Requests per minute/day

You can optionally add a budget that specifies the total credits available to the key. Update the budget later to add more credits.

Note: Tokens include both input and output tokens. Global limits do not override asset-specific limits; both are enforced together.
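To illustrate how the two kinds of limits combine, here is a simplified sketch in plain Python. It does not call the aiXplain SDK and is not a description of how the platform is implemented; the `fits_limits` helper and its arguments are hypothetical. A request passes only if its total token count (input plus output) fits under both the asset-specific and the global limit.

```python
def fits_limits(input_tokens, output_tokens, used, asset_limit, global_limit):
    """Return True if the request fits under both token limits.

    used: tokens already consumed in the current window,
          as a dict with the keys "asset" and "global".
    """
    total = input_tokens + output_tokens  # input and output tokens both count
    within_asset = used["asset"] + total <= asset_limit
    within_global = used["global"] + total <= global_limit
    return within_asset and within_global

# Asset limit of 10 tokens per minute, global limit of 100 tokens per minute.
print(fits_limits(4, 5, {"asset": 0, "global": 0}, 10, 100))   # True: 9 <= 10
print(fits_limits(4, 5, {"asset": 5, "global": 5}, 10, 100))   # False: 14 > 10
print(fits_limits(4, 5, {"asset": 0, "global": 95}, 10, 100))  # False: 104 > 100
```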

from aixplain.factories import APIKeyFactory
from aixplain.modules import APIKey, APIKeyGlobalLimits
from datetime import datetime

api_key = APIKeyFactory.create(
    name="Test API Key",
    asset_limits=[
        APIKeyGlobalLimits(
            model="6661df926d36df3b878e0697", # The ID of the model to be rate-limited.
            token_per_minute=10,
            token_per_day=30,
            request_per_minute=2,
            request_per_day=2
        )
    ],
    global_limits=APIKeyGlobalLimits(
        token_per_minute=100,
        token_per_day=1000,
        request_per_day=1000,
        request_per_minute=100
    ),
    budget=1000,  # Optional: total credits available to this key
    expires_at=datetime(2024, 11, 29)  # Expiration date; the key expires at midnight (UTC)
)
api_key.__dict__
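A fixed expiration date goes stale quickly. The sketch below computes one relative to the current time instead. The helper name `expiry_at_midnight_utc` is hypothetical, and returning a naive `datetime` that represents midnight UTC is an assumption based on the example above, not a documented SDK requirement.

```python
from datetime import datetime, timedelta, timezone

def expiry_at_midnight_utc(days, now=None):
    """Return midnight UTC of the day `days` days after `now`.

    The result is a naive datetime, matching the style of the example above.
    Because it is truncated to midnight, the key may expire a few hours
    earlier than exactly `days` * 24 hours from now.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    target = now.astimezone(timezone.utc) + timedelta(days=days)
    return datetime(target.year, target.month, target.day)

# 28 days after 2024-11-01 15:30 UTC
print(expiry_at_midnight_utc(28, now=datetime(2024, 11, 1, 15, 30, tzinfo=timezone.utc)))
# 2024-11-29 00:00:00
```

The result can then be passed as `expires_at` when creating the key.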

Step 3: View Rate Limits

To review rate limits for an API key, use the code below.

api_key_info = APIKeyFactory.get(api_key=api_key.access_key).__dict__
api_key_info

global_limits = api_key_info['global_limits']
print("Global limits:", global_limits.__dict__)

# Print the limits of each rate-limited asset
for asset_limit in api_key_info['asset_limits']:
    print("Asset limits:", asset_limit.__dict__)
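Printing `__dict__` shows every attribute of a limits object. To show only the four rate-limit fields used in this guide, a small hypothetical helper such as `limits_row` can pick them out. The sketch uses a `SimpleNamespace` stand-in so that it runs without an API key; in practice you would pass `api_key_info['global_limits']` or one of the asset limits.

```python
from types import SimpleNamespace

LIMIT_FIELDS = (
    "token_per_minute",
    "token_per_day",
    "request_per_minute",
    "request_per_day",
)

def limits_row(limits):
    """Extract the four rate-limit fields from a limits object as a dict."""
    return {field: getattr(limits, field, None) for field in LIMIT_FIELDS}

# Stand-in for a real limits object, so the sketch runs offline.
demo = SimpleNamespace(
    token_per_minute=100,
    token_per_day=1000,
    request_per_minute=100,
    request_per_day=1000,
)
print(limits_row(demo))
```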

Step 4: Monitor API Key Usage

Monitor overall API key usage

usage = api_key.get_usage()
for key in usage:
    print(key.__dict__)

Monitor usage for a specific model

from aixplain.factories import APIKeyFactory

# "Key" is a placeholder; replace it with the access key whose usage you want to check.
api_limit = APIKeyFactory.get_usage_limits(api_key="Key", asset_id="6661df926d36df3b878e0697")
for key in api_limit:
    print(key.__dict__)
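Once you have a consumed count and its limit, a quick percentage makes it easier to spot keys that are close to being throttled. This guide does not show the field names of the usage objects, so inspect the `__dict__` output to find them. The helpers below are a hypothetical sketch that takes plain numbers.

```python
def usage_percent(count, limit):
    """Percentage of a limit already consumed, or None when no limit is set."""
    if not limit:
        return None
    return round(100 * count / limit, 1)

def near_limit(count, limit, threshold=80.0):
    """True when usage has reached `threshold` percent of the limit."""
    pct = usage_percent(count, limit)
    return pct is not None and pct >= threshold

print(usage_percent(25, 30))  # 83.3
print(near_limit(25, 30))     # True
print(near_limit(10, 1000))   # False
```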

Step 5: Update Rate Limits

You can update an existing API key’s limits and budget as needed.

Note: Updated limits take effect at the start of the next time window (minute or day, depending on the limit type).
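To estimate when an updated limit starts to apply, the sketch below computes the start of the next window. It assumes that windows align to clock minutes and to UTC calendar days, which this guide does not confirm. The `next_window_start` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def next_window_start(now, window):
    """Start of the next 'minute' or 'day' window after `now`.

    Assumes windows align to clock minutes and UTC days (not confirmed
    by the guide). `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    if window == "minute":
        return now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    if window == "day":
        midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
        return midnight + timedelta(days=1)
    raise ValueError("window must be 'minute' or 'day'")

now = datetime(2024, 11, 28, 10, 15, 42, tzinfo=timezone.utc)
print(next_window_start(now, "minute"))
print(next_window_start(now, "day"))
```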

api_key_temp = APIKeyFactory.get(api_key.access_key)
print('Updating key limits: ' + api_key_temp.access_key)

# Update budget
api_key_temp.budget = 200

# Update global rate limits
api_key_temp.global_limits.token_per_day = 500
api_key_temp.global_limits.token_per_minute = 50

# Update rate limits of a specific asset
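for i, asset_limit in enumerate(api_key_temp.asset_limits):
    # Match the model ID that was rate-limited when the key was created
    if asset_limit.model.id == "6661df926d36df3b878e0697":
        api_key_temp.asset_limits[i].token_per_minute = 20  # Below the asset's token_per_day of 30
        break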

api_key_temp = APIKeyFactory.update(api_key_temp)


print("Budget: ", api_key_temp.budget)
print("Global limits:", api_key_temp.global_limits.__dict__)
for asset in api_key_temp.asset_limits:
  print("Asset limits:", asset.__dict__)
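A per-minute limit that is larger than the matching per-day limit can never be reached, which usually points to a typo. The guide does not say whether the platform rejects such values, so a local check before calling `APIKeyFactory.update` can help. The `check_limits` helper below is a hypothetical sketch.

```python
def check_limits(token_per_minute, token_per_day,
                 request_per_minute, request_per_day):
    """Return a list of problems found in a set of rate limits."""
    problems = []
    values = (token_per_minute, token_per_day,
              request_per_minute, request_per_day)
    if any(value < 0 for value in values):
        problems.append("limits must not be negative")
    if token_per_minute > token_per_day:
        problems.append("token_per_minute exceeds token_per_day")
    if request_per_minute > request_per_day:
        problems.append("request_per_minute exceeds request_per_day")
    return problems

print(check_limits(100, 1000, 100, 1000))  # []
print(check_limits(500, 50, 2, 2))         # ['token_per_minute exceeds token_per_day']
```

An empty list means the values are internally consistent.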

Step 6: Delete an API Key

To delete an API key that is no longer in use, call its delete method.

api_key.delete()

Tips for Effective Rate Limiting

Use admin keys exclusively for configuration and monitoring.
Check usage regularly and adjust limits to match operational needs.
Apply both global and asset-specific limits for detailed control.

With Model Rate Limiting, you can control API usage and resource allocation for the models behind your agents and workflows.
