module aixplain.modules.dataset
class Dataset
Dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, Sentiment Analysis, and more.
Attributes:
id (Text): Dataset ID.
name (Text): Dataset name.
description (Text): Dataset description.
function (Function): Function for which the dataset is intended.
source_data (Dict[Any, Data]): Input Data to the function.
target_data (Dict[Any, List[Data]]): Multi-reference Data expected as output of the function.
onboard_status (OnboardStatus): Onboard status.
hypotheses (Dict[Any, Data], optional): Dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
metadata (Dict[Any, Data], optional): Dataset's metadata. Defaults to {}.
tags (List[Text], optional): Tags that describe the dataset. Defaults to [].
license (Optional[License], optional): Dataset license. Defaults to None.
privacy (Privacy, optional): Dataset privacy. Defaults to Privacy.PRIVATE.
supplier (Text, optional): Dataset supplier. Defaults to "aiXplain".
version (Text, optional): Dataset version. Defaults to "1.0".
method __init__
__init__(
id: str,
name: str,
description: str,
function: Function,
source_data: Dict[Any, Data],
target_data: Dict[Any, List[Data]],
onboard_status: OnboardStatus,
hypotheses: Dict[Any, Data] = {},
metadata: Dict[Any, Data] = {},
tags: List[str] = [],
license: Optional[License] = None,
privacy: Privacy = <Privacy.PRIVATE: 'Private'>,
supplier: str = 'aiXplain',
version: str = '1.0',
length: Optional[int] = None,
**kwargs
) → None
Dataset Class.
Description: Dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, Sentiment Analysis, and more.
Args:
id (Text): Dataset ID.
name (Text): Dataset name.
description (Text): Dataset description.
function (Function): Function for which the dataset is intended.
source_data (Dict[Any, Data]): Input Data to the function.
target_data (Dict[Any, List[Data]]): Multi-reference Data expected as output of the function.
onboard_status (OnboardStatus): Onboard status.
hypotheses (Dict[Any, Data], optional): Dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
metadata (Dict[Any, Data], optional): Dataset's metadata. Defaults to {}.
tags (List[Text], optional): Tags that describe the dataset. Defaults to [].
license (Optional[License], optional): Dataset license. Defaults to None.
privacy (Privacy, optional): Dataset privacy. Defaults to Privacy.PRIVATE.
supplier (Text, optional): Dataset supplier. Defaults to "aiXplain".
version (Text, optional): Dataset version. Defaults to "1.0".
length (Optional[int], optional): Number of rows in the Dataset. Defaults to None.
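The shapes of source_data and target_data are easy to confuse, so the following is a minimal sketch of how the arguments above might fit together. It does not import the aiXplain SDK: StubData and StubDataset are hypothetical stand-ins that mirror only a subset of the documented arguments and defaults, and the dictionary keys ("en", "fr") and all values are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubData:
    """Hypothetical stand-in for the SDK's Data class (not the real API)."""
    id: str
    name: str


@dataclass
class StubDataset:
    """Hypothetical stand-in mirroring a subset of the documented arguments."""
    id: str
    name: str
    description: str
    source_data: Dict[Any, StubData]
    target_data: Dict[Any, List[StubData]]
    # Optional arguments, with the defaults listed in the Args section.
    hypotheses: Dict[Any, StubData] = field(default_factory=dict)
    metadata: Dict[Any, StubData] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)
    supplier: str = "aiXplain"
    version: str = "1.0"
    length: Optional[int] = None


dataset = StubDataset(
    id="example-id",
    name="en-fr-sample",
    description="Illustrative translation dataset",
    # source_data maps each key to a single Data object.
    source_data={"en": StubData(id="d1", name="en")},
    # target_data maps each key to a list of Data objects (multi-reference).
    target_data={
        "fr": [
            StubData(id="d2", name="fr_ref_1"),
            StubData(id="d3", name="fr_ref_2"),
        ]
    },
)

print(len(dataset.target_data["fr"]))  # number of references for the "fr" key
```

The point of the sketch is the difference in value types: each source_data entry holds one Data object, while each target_data entry holds a list so that several reference outputs can be attached to the same key.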
method delete
delete() → None
Deletes the Dataset service.