aixplain.modules.model.llm_model
__author__
Copyright 2024 The aiXplain SDK authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: Thiago Castro Ferreira, Shreyas Sharma and Lucas Pavanelli
Date: June 4th 2024
Description: Large Language Model Class
LLM Objects
class LLM(Model)
Ready-to-use LLM model. This model can be run both synchronously and asynchronously.
Attributes:
- `id` (Text) - ID of the Model.
- `name` (Text) - Name of the Model.
- `description` (Text, optional) - Description of the model. Defaults to "".
- `api_key` (Text, optional) - API key of the Model. Defaults to None.
- `url` (Text, optional) - Endpoint of the model. Defaults to config.MODELS_RUN_URL.
- `supplier` (Union[Dict, Text, Supplier, int], optional) - Supplier of the asset. Defaults to "aiXplain".
- `version` (Text, optional) - Version of the model. Defaults to "1.0".
- `function` (Text, optional) - Model AI function. Defaults to None.
- `url` (str) - URL to run the model.
- `backend_url` (str) - URL of the backend.
- `cost` (Dict, optional) - Model price. Defaults to None.
- `function_type` (FunctionType, optional) - Type of the function. Defaults to FunctionType.AI.
- `additional_info` - Any additional Model info to be saved.
__init__
def __init__(id: Text,
name: Text,
description: Text = "",
api_key: Optional[Text] = None,
supplier: Union[Dict, Text, Supplier, int] = "aiXplain",
version: Optional[Text] = None,
function: Optional[Function] = None,
is_subscribed: bool = False,
cost: Optional[Dict] = None,
temperature: float = 0.001,
function_type: Optional[FunctionType] = FunctionType.AI,
**additional_info) -> None
Initialize a new LLM instance.
Arguments:
- `id` (Text) - ID of the LLM model.
- `name` (Text) - Name of the LLM model.
- `description` (Text, optional) - Description of the model. Defaults to "".
- `api_key` (Text, optional) - API key for the model. Defaults to None.
- `supplier` (Union[Dict, Text, Supplier, int], optional) - Supplier of the model. Defaults to "aiXplain".
- `version` (Text, optional) - Version of the model. Defaults to "1.0".
- `function` (Function, optional) - Model's AI function. Must be Function.TEXT_GENERATION.
- `is_subscribed` (bool, optional) - Whether the user is subscribed. Defaults to False.
- `cost` (Dict, optional) - Cost of the model. Defaults to None.
- `temperature` (float, optional) - Default temperature for text generation. Defaults to 0.001.
- `function_type` (FunctionType, optional) - Type of the function. Defaults to FunctionType.AI.
- `additional_info` - Any additional model info to be saved.
Raises:
`Exception` - If function is not Function.TEXT_GENERATION.
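As an illustration, the sketch below constructs an LLM directly from the signature above. The model ID is a placeholder, not a real asset; in practice, instances are usually retrieved through the SDK's ModelFactory rather than built by hand.

```python
# Minimal illustrative sketch: the ID below is a placeholder, not a real asset ID.
from aixplain.enums import Function
from aixplain.modules.model.llm_model import LLM

llm = LLM(
    id="<llm-asset-id>",                 # placeholder model ID
    name="Example LLM",
    function=Function.TEXT_GENERATION,   # any other function raises an exception
    temperature=0.001,                   # default temperature used by run()/run_async()
)
```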
run
def run(data: Text,
context: Optional[Text] = None,
prompt: Optional[Text] = None,
history: Optional[List[Dict]] = None,
temperature: Optional[float] = None,
max_tokens: int = 128,
top_p: float = 1.0,
name: Text = "model_process",
timeout: float = 300,
parameters: Optional[Dict] = None,
wait_time: float = 0.5,
stream: bool = False) -> Union[ModelResponse, ModelResponseStreamer]
Run the LLM model synchronously to generate text.
This method runs the LLM model to generate text based on the provided input. It supports both single-turn and conversational interactions, with options for streaming responses.
Arguments:
- `data` (Text) - The input text or last user utterance for text generation.
- `context` (Optional[Text], optional) - System message or context for the model. Defaults to None.
- `prompt` (Optional[Text], optional) - Prompt template or prefix to prepend to the input. Defaults to None.
- `history` (Optional[List[Dict]], optional) - Conversation history in OpenAI format (e.g., [{"role": "assistant", "content": "Hello!"}, ...]). Defaults to None.
- `temperature` (Optional[float], optional) - Sampling temperature for text generation. Higher values make output more random. If None, uses the model's default. Defaults to None.
- `max_tokens` (int, optional) - Maximum number of tokens to generate. Defaults to 128.
- `top_p` (float, optional) - Nucleus sampling parameter. Only tokens with cumulative probability < top_p are considered. Defaults to 1.0.
- `name` (Text, optional) - Identifier for this model run. Useful for logging. Defaults to "model_process".
- `timeout` (float, optional) - Maximum time in seconds to wait for completion. Defaults to 300.
- `parameters` (Optional[Dict], optional) - Additional model-specific parameters. Defaults to None.
- `wait_time` (float, optional) - Time in seconds between polling attempts. Defaults to 0.5.
- `stream` (bool, optional) - Whether to stream the model's output tokens. Defaults to False.
Returns:
Union[ModelResponse, ModelResponseStreamer]: If stream=False, returns a ModelResponse containing the complete generated text and metadata. If stream=True, returns a ModelResponseStreamer that yields tokens as they're generated.
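For illustration, a minimal sketch of a synchronous call, assuming `llm` is an LLM instance with a valid API key configured; the streaming variant assumes the returned ModelResponseStreamer is iterated to obtain chunks whose `data` field holds each token.

```python
# Minimal sketch, assuming `llm` is an LLM instance with a valid API key configured.
response = llm.run(
    data="What is the capital of France?",
    context="You are a concise geography assistant.",  # system message
    max_tokens=64,
    temperature=0.2,
)
print(response.data)  # complete generated text from the ModelResponse

# Streaming variant (assumes the streamer yields chunks with a `data` field).
for chunk in llm.run(data="Tell me a short story.", stream=True):
    print(chunk.data, end="", flush=True)
```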
run_async
def run_async(data: Text,
context: Optional[Text] = None,
prompt: Optional[Text] = None,
history: Optional[List[Dict]] = None,
temperature: Optional[float] = None,
max_tokens: int = 128,
top_p: float = 1.0,
name: Text = "model_process",
parameters: Optional[Dict] = None) -> ModelResponse
Run the LLM model asynchronously to generate text.
This method starts an asynchronous text generation task and returns immediately with a response containing a polling URL. The actual result can be retrieved later using the polling URL.
Arguments:
- `data` (Text) - The input text or last user utterance for text generation.
- `context` (Optional[Text], optional) - System message or context for the model. Defaults to None.
- `prompt` (Optional[Text], optional) - Prompt template or prefix to prepend to the input. Defaults to None.
- `history` (Optional[List[Dict]], optional) - Conversation history in OpenAI format (e.g., [{"role": "assistant", "content": "Hello!"}, ...]). Defaults to None.
- `temperature` (Optional[float], optional) - Sampling temperature for text generation. Higher values make output more random. If None, uses the model's default. Defaults to None.
- `max_tokens` (int, optional) - Maximum number of tokens to generate. Defaults to 128.
- `top_p` (float, optional) - Nucleus sampling parameter. Only tokens with cumulative probability < top_p are considered. Defaults to 1.0.
- `name` (Text, optional) - Identifier for this model run. Useful for logging. Defaults to "model_process".
- `parameters` (Optional[Dict], optional) - Additional model-specific parameters. Defaults to None.
Returns:
`ModelResponse` - A response object containing:
- status (ResponseStatus): Status of the request (e.g., IN_PROGRESS)
- url (str): URL to poll for the final result
- data (str): Empty string (result not available yet)
- details (Dict): Additional response details
- completed (bool): False (task not completed yet)
- error_message (str): Error message if the request failed

Other fields may be present depending on the response.
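Below is a minimal sketch of the asynchronous flow, assuming `llm` is an LLM instance and that the inherited Model.poll method is used to fetch the final result from the returned polling URL.

```python
import time

# Start the generation task; run_async returns immediately with a polling URL.
start = llm.run_async(data="Summarize the Apache 2.0 license in one sentence.")
poll_url = start.url                     # URL to poll for the final result

# Poll until the task completes (interval chosen for illustration only).
result = llm.poll(poll_url)
while not result.completed:
    time.sleep(0.5)
    result = llm.poll(poll_url)

if result.error_message:
    print("Request failed:", result.error_message)
else:
    print(result.data)                   # generated text once the task completes
```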