
Finetune an LLM

How to fine-tune a model

This guide walks you through the process of fine-tuning models using the aiXplain SDK. Learn how to select datasets and configure fine-tuning settings.

Generic Example (Template)

from aixplain.factories import DatasetFactory, ModelFactory, FinetuneFactory

dataset = DatasetFactory.get("...") # specify Data ID
model = ModelFactory.get("...") # specify Model ID

finetune = FinetuneFactory.create(
    "finetuned_model",
    [dataset],
    model
)

finetuned_model = finetune.start()

finetuned_model.check_finetune_status()

FineTune Examples

The following examples cover the supported FineTune use cases. View them in this documentation or by opening their corresponding Google Colab notebooks.

Text generation (passthrough)
Text generation (hosted)
Translation

Note

Passthrough: models hosted on third-party infrastructure.
Hosted: models hosted on aiXplain's infrastructure.

Imports

from aixplain.factories import DatasetFactory, ModelFactory, FinetuneFactory
from aixplain.enums import Function, Language # for search
from aixplain.modules.finetune import Hyperparameters # for hosted models

Select Model & Datasets

Info

Datasets are currently private, so to follow along you must first onboard the datasets used in the examples below (or similar ones).
See our guide on How to upload a dataset.

Model

# List the fine-tunable text generation models, then choose exactly one
model_list = ModelFactory.list(
    function=Function.TEXT_GENERATION,
    is_finetunable=True
)["results"]

for model in model_list:
    print(model.__dict__)

selected_model = ModelFactory.get("640b517694bf816d35a59125")
selected_model.__dict__

Dataset

# List the first five text generation datasets, then choose one or more
dataset_list = DatasetFactory.list(
    function=Function.TEXT_GENERATION,
    page_size=5
)["results"]

for dataset in dataset_list:
    print(dataset.__dict__)

selected_dataset = DatasetFactory.get("6501ea64b61fed7fe5976c49")
selected_dataset.__dict__
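
Printing `__dict__` for every result can be hard to scan. The sketch below shows one way to print a compact line per asset instead. It assumes each model or dataset object exposes `id` and `name` attributes, which this guide does not confirm; any missing attribute falls back to "n/a". The helper name `summarize_assets` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def summarize_assets(assets, fields=("id", "name")):
    """Return one short line per asset, using only the requested attributes."""
    lines = []
    for asset in assets:
        # getattr with a default keeps the helper safe if an attribute is absent.
        parts = [f"{field}={getattr(asset, field, 'n/a')}" for field in fields]
        lines.append(", ".join(parts))
    return lines


# Stand-in objects (placeholders, not real assets) so the sketch runs without
# credentials. With the SDK you would pass model_list or dataset_list instead.
demo_assets = [
    SimpleNamespace(id="<ASSET_ID_1>", name="<ASSET_NAME_1>"),
    SimpleNamespace(id="<ASSET_ID_2>", name="<ASSET_NAME_2>"),
]
for line in summarize_assets(demo_assets):
    print(line)
```

With the lists from this section, the call would be `summarize_assets(model_list)` or `summarize_assets(dataset_list)`.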

Create a FineTune

Use FinetuneFactory.create to create a FineTune object from a unique name, a list of datasets and a model. Then use its cost attribute to check the estimated training, hosting and inference costs.

finetune = FinetuneFactory.create(
    "<UNIQUE_FINETUNE_NAME>",
    [selected_dataset],
    selected_model
)

finetune.__dict__
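
Because the name passed to `FinetuneFactory.create` must be unique, it can help to generate it rather than type it by hand. The following is a minimal sketch, not part of the SDK: the helper `make_finetune_name` and the prefix-plus-random-suffix format are illustrative assumptions, and the guide does not state which characters a name may contain.

```python
import uuid


def make_finetune_name(prefix: str = "finetuned_model") -> str:
    """Append a short random suffix so repeated runs get different names."""
    suffix = uuid.uuid4().hex[:8]  # 8 random hexadecimal characters
    return f"{prefix}_{suffix}"


print(make_finetune_name("textgen"))
```

The result can then be passed as the first argument in place of `"<UNIQUE_FINETUNE_NAME>"`.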

Cost

finetune.cost.to_dict()
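
If the cost dictionary is nested, a flat view can be easier to read or log. The sketch below is a generic helper that makes no assumption about the key names returned by `to_dict()`, since the output is not shown in this guide; `flatten_dict` is a hypothetical name, not an SDK function.

```python
def flatten_dict(nested, parent_key=""):
    """Flatten nested dictionaries into one level with dot-separated keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}.{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key path along.
            flat.update(flatten_dict(value, full_key))
        else:
            flat[full_key] = value
    return flat
```

A possible use is `for key, value in flatten_dict(finetune.cost.to_dict()).items(): print(key, value)`.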

Starting a FineTune

Call the start method to begin fine-tuning and the check_finetune_status method to check its status.

finetune_model = finetune.start()
status = finetune_model.check_finetune_status()

Status can be one of the following: onboarding, onboarded, hidden, training, deleted, enabling, disabled, failed, deleting.

Tip

You can poll the status in a loop until the model is onboarded.

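import time

# Poll every 10 seconds until the model is onboarded; stop early on failure.
while status != "onboarded":
    time.sleep(10)
    status = finetune_model.check_finetune_status()
    print(f"Current status: {status}")
    if status == "failed":
        break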
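
For long-running jobs, a polling loop usually also needs a timeout. The sketch below wraps the same idea in a reusable function. It assumes, as in this guide, that the status check returns a plain string; if your SDK version returns a richer object, adapt the comparison. The helper name `wait_until_onboarded`, the one-hour default timeout, and the set of statuses treated as final are all assumptions for illustration.

```python
import time

# Assumption: statuses from which the job will not reach "onboarded" on its own.
STOP_STATUSES = {"failed", "deleted", "deleting", "disabled"}


def wait_until_onboarded(check_status, poll_interval=10.0, timeout=3600.0,
                         sleep=time.sleep):
    """Call check_status() repeatedly until it returns "onboarded".

    Raises RuntimeError if a stop status is seen, and TimeoutError if the
    deadline passes first. `sleep` is injectable to make testing easy.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = check_status()
        print(f"Current status: {status}")
        if status == "onboarded":
            return status
        if status in STOP_STATUSES:
            raise RuntimeError(f"Fine-tuning stopped with status: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        sleep(poll_interval)
```

With the objects from this guide, the call would be `wait_until_onboarded(finetune_model.check_finetune_status)`. Passing the bound method rather than its result lets the helper fetch a fresh status on each iteration.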

Once the status is onboarded, the fine-tuned model is ready to use like any other model, and you can integrate it into your agents to provide customized solutions! 🥳