aixplain.modules.model.llm_model
Copyright 2024 The aiXplain SDK authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: Thiago Castro Ferreira, Shreyas Sharma and Lucas Pavanelli
Date: June 4th 2024
Description: Large Language Model Class
LLM Objects
class LLM(Model)
Ready-to-use LLM model. This model can be run both synchronously and asynchronously.
Attributes:
- `id` (Text) - ID of the Model.
- `name` (Text) - Name of the Model.
- `description` (Text, optional) - Description of the model. Defaults to "".
- `api_key` (Text, optional) - API key of the Model. Defaults to None.
- `url` (Text, optional) - Endpoint URL to run the model. Defaults to config.MODELS_RUN_URL.
- `supplier` (Union[Dict, Text, Supplier, int], optional) - Supplier of the asset. Defaults to "aiXplain".
- `version` (Text, optional) - Version of the model. Defaults to "1.0".
- `function` (Text, optional) - Model AI function. Defaults to None.
- `backend_url` (str) - URL of the backend.
- `cost` (Dict, optional) - Model price. Defaults to None.
- `function_type` (FunctionType, optional) - Type of the function. Defaults to FunctionType.AI.
- `additional_info` - Any additional Model info to be saved.
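For illustration, a minimal sketch of obtaining a ready-to-use LLM through the SDK's model factory; the model ID is a placeholder, not a real asset:

```python
from aixplain.factories import ModelFactory

# "<llm-model-id>" is a placeholder; use the ID of a text-generation
# model from your aiXplain account.
llm = ModelFactory.get("<llm-model-id>")
print(llm.name, llm.function)
```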
__init__
```python
def __init__(id: Text,
             name: Text,
             description: Text = "",
             api_key: Optional[Text] = None,
             supplier: Union[Dict, Text, Supplier, int] = "aiXplain",
             version: Optional[Text] = None,
             function: Optional[Function] = None,
             is_subscribed: bool = False,
             cost: Optional[Dict] = None,
             temperature: float = 0.001,
             function_type: Optional[FunctionType] = FunctionType.AI,
             **additional_info) -> None
```
Initialize a new LLM instance.
Arguments:
- `id` (Text) - ID of the LLM model.
- `name` (Text) - Name of the LLM model.
- `description` (Text, optional) - Description of the model. Defaults to "".
- `api_key` (Text, optional) - API key for the model. Defaults to None.
- `supplier` (Union[Dict, Text, Supplier, int], optional) - Supplier of the model. Defaults to "aiXplain".
- `version` (Text, optional) - Version of the model. Defaults to "1.0".
- `function` (Function, optional) - Model's AI function. Must be Function.TEXT_GENERATION.
- `is_subscribed` (bool, optional) - Whether the user is subscribed. Defaults to False.
- `cost` (Dict, optional) - Cost of the model. Defaults to None.
- `temperature` (float, optional) - Default temperature for text generation. Defaults to 0.001.
- `function_type` (FunctionType, optional) - Type of the function. Defaults to FunctionType.AI.
- `additional_info` - Any additional model info to be saved.
Raises:
- `ValueError` - If function is not Function.TEXT_GENERATION.
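Most users obtain instances through ModelFactory rather than the constructor, but a direct construction sketch may help; the ID below is a placeholder:

```python
from aixplain.enums import Function
from aixplain.modules.model.llm_model import LLM

llm = LLM(
    id="<llm-model-id>",                # placeholder, not a real asset ID
    name="my-llm",
    function=Function.TEXT_GENERATION,  # anything else raises
    temperature=0.2,
)
```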
run
```python
def run(data: Text,
        context: Optional[Text] = None,
        prompt: Optional[Text] = None,
        history: Optional[List[Dict]] = None,
        temperature: Optional[float] = None,
        max_tokens: int = 128,
        top_p: float = 1.0,
        name: Text = "model_process",
        timeout: float = 300,
        parameters: Optional[Dict] = None,
        wait_time: float = 0.5,
        stream: bool = False) -> Union[ModelResponse, ModelResponseStreamer]
```
Run the LLM model synchronously to generate text.
This method runs the LLM model to generate text based on the provided input. It supports both single-turn and conversational interactions, with options for streaming responses.
Arguments:
- `data` (Text) - The input text or last user utterance for text generation.
- `context` (Optional[Text], optional) - System message or context for the model. Defaults to None.
- `prompt` (Optional[Text], optional) - Prompt template or prefix to prepend to the input. Defaults to None.
- `history` (Optional[List[Dict]], optional) - Conversation history in OpenAI format (e.g., [{"role": "assistant", "content": "Hello!"}, ...]). Defaults to None.
- `temperature` (Optional[float], optional) - Sampling temperature for text generation. Higher values make output more random. If None, uses the model's default. Defaults to None.
- `max_tokens` (int, optional) - Maximum number of tokens to generate. Defaults to 128.
- `top_p` (float, optional) - Nucleus sampling parameter. Only tokens with cumulative probability < top_p are considered. Defaults to 1.0.
- `name` (Text, optional) - Identifier for this model run. Useful for logging. Defaults to "model_process".
- `timeout` (float, optional) - Maximum time in seconds to wait for completion. Defaults to 300.
- `parameters` (Optional[Dict], optional) - Additional model-specific parameters. Defaults to None.
- `wait_time` (float, optional) - Time in seconds between polling attempts. Defaults to 0.5.
- `stream` (bool, optional) - Whether to stream the model's output tokens. Defaults to False.
Returns:
Union[ModelResponse, ModelResponseStreamer]: If stream=False, returns a ModelResponse containing the complete generated text and metadata. If stream=True, returns a ModelResponseStreamer that yields tokens as they're generated.
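A hedged usage sketch, assuming `llm` was obtained as above; reading the generated text from `response.data` follows the return docs, while token-wise iteration over ModelResponseStreamer is an assumption:

```python
# Single-turn, blocking call.
response = llm.run(
    data="What is the capital of France?",
    context="You are a concise assistant.",
    max_tokens=64,
    temperature=0.2,
)
print(response.data)  # complete generated text

# Streaming call: iterate chunks as they arrive (iteration interface
# of ModelResponseStreamer assumed, not confirmed by this page).
for chunk in llm.run(data="Tell me a short story.", stream=True):
    print(chunk.data, end="", flush=True)
```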
run_async
```python
def run_async(data: Text,
              context: Optional[Text] = None,
              prompt: Optional[Text] = None,
              history: Optional[List[Dict]] = None,
              temperature: Optional[float] = None,
              max_tokens: int = 128,
              top_p: float = 1.0,
              name: Text = "model_process",
              parameters: Optional[Dict] = None) -> ModelResponse
```
Run the LLM model asynchronously to generate text.
This method starts an asynchronous text generation task and returns immediately with a response containing a polling URL. The actual result can be retrieved later using the polling URL.
Arguments:
- `data` (Text) - The input text or last user utterance for text generation.
- `context` (Optional[Text], optional) - System message or context for the model. Defaults to None.
- `prompt` (Optional[Text], optional) - Prompt template or prefix to prepend to the input. Defaults to None.
- `history` (Optional[List[Dict]], optional) - Conversation history in OpenAI format (e.g., [{"role": "assistant", "content": "Hello!"}, ...]). Defaults to None.
- `temperature` (Optional[float], optional) - Sampling temperature for text generation. Higher values make output more random. If None, uses the model's default. Defaults to None.
- `max_tokens` (int, optional) - Maximum number of tokens to generate. Defaults to 128.
- `top_p` (float, optional) - Nucleus sampling parameter. Only tokens with cumulative probability < top_p are considered. Defaults to 1.0.
- `name` (Text, optional) - Identifier for this model run. Useful for logging. Defaults to "model_process".
- `parameters` (Optional[Dict], optional) - Additional model-specific parameters. Defaults to None.
Returns:
ModelResponse - A response object containing:
- status (ResponseStatus): Status of the request (e.g., IN_PROGRESS)
- url (str): URL to poll for the final result
- data (str): Empty string (result not available yet)
- details (Dict): Additional response details
- completed (bool): False (task not completed yet)
- error_message (str): Error message if the request failed

Other fields may be present depending on the response.
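A sketch of the full asynchronous flow; it assumes the LLM inherits a `poll(poll_url)` helper from the base Model class, which is not documented on this page:

```python
import time

# Start generation; returns immediately with a polling URL.
start = llm.run_async(data="Summarize the Apache License in one sentence.")

# Poll until the task completes. `poll` is assumed to be inherited
# from Model; adjust to your SDK version's polling helper.
result = llm.poll(start.url)
while not result.completed:
    time.sleep(0.5)  # matches the default wait_time of run()
    result = llm.poll(start.url)

print(result.data)
```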