
module aixplain.modules.dataset


class Dataset

A Dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, Sentiment Analysis, and more.

Attributes:

  • id (Text): Dataset ID
  • name (Text): Dataset Name
  • description (Text): Dataset description
  • function (Function): Function for which the dataset is intended
  • source_data (Dict[Any, Data]): Dictionary of input Data to the function
  • target_data (Dict[Any, List[Data]]): Dictionary of multi-reference Data that the function is expected to output
  • onboard_status (OnboardStatus): Onboard status
  • hypotheses (Dict[Any, Data], optional): Dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
  • metadata (Dict[Any, Data], optional): Dataset's metadata. Defaults to {}.
  • tags (List[Text], optional): tags that describe the dataset. Defaults to [].
  • license (Optional[License], optional): Dataset License. Defaults to None.
  • privacy (Privacy, optional): Dataset Privacy. Defaults to Privacy.PRIVATE.
  • supplier (Text, optional): Dataset Supplier. Defaults to "aiXplain".
  • version (Text, optional): Dataset Version. Defaults to "1.0".

method __init__

__init__(
id: str,
name: str,
description: str,
function: Function,
source_data: Dict[Any, Data],
target_data: Dict[Any, List[Data]],
onboard_status: OnboardStatus,
hypotheses: Dict[Any, Data] = {},
metadata: Dict[Any, Data] = {},
tags: List[str] = [],
license: Optional[License] = None,
privacy: Privacy = Privacy.PRIVATE,
supplier: str = 'aiXplain',
version: str = '1.0',
length: Optional[int] = None,
**kwargs
) -> None

Dataset Class.

Description: A Dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, Sentiment Analysis, and more.

Args:

  • id (Text): Dataset ID
  • name (Text): Dataset Name
  • description (Text): Dataset description
  • function (Function): Function for which the dataset is intended
  • source_data (Dict[Any, Data]): Dictionary of input Data to the function
  • target_data (Dict[Any, List[Data]]): Dictionary of multi-reference Data that the function is expected to output
  • onboard_status (OnboardStatus): Onboard status
  • hypotheses (Dict[Any, Data], optional): Dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
  • metadata (Dict[Any, Data], optional): Dataset's metadata. Defaults to {}.
  • tags (List[Text], optional): tags that describe the dataset. Defaults to [].
  • license (Optional[License], optional): Dataset License. Defaults to None.
  • privacy (Privacy, optional): Dataset Privacy. Defaults to Privacy.PRIVATE.
  • supplier (Text, optional): Dataset Supplier. Defaults to "aiXplain".
  • version (Text, optional): Dataset Version. Defaults to "1.0".
  • length (Optional[int], optional): Number of rows in the Dataset. Defaults to None.
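
The shapes of source_data and target_data are easy to confuse, so the following minimal sketch may help. It does not import aixplain: Data and DatasetSketch are hypothetical stand-ins that mirror only a subset of the arguments listed above (function, onboard_status, license and privacy are omitted), and the dictionary keys ("en", "fr") are assumed for illustration. The point it shows is that each source_data key maps to a single Data object, while each target_data key maps to a list of Data objects, one per reference.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Data:
    """Hypothetical stand-in for aixplain's Data class; the real class has more fields."""
    id: str
    name: str


@dataclass
class DatasetSketch:
    """Hypothetical mirror of a subset of the Dataset constructor arguments."""
    id: str
    name: str
    description: str
    source_data: Dict[Any, Data]          # one Data object per key
    target_data: Dict[Any, List[Data]]    # a list of references per key
    hypotheses: Dict[Any, Data] = field(default_factory=dict)
    metadata: Dict[Any, Data] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)
    supplier: str = "aiXplain"
    version: str = "1.0"
    length: Optional[int] = None


# Illustrative values only; IDs, names and keys are made up for this sketch.
dataset = DatasetSketch(
    id="dataset-id",
    name="en-fr-sample",
    description="English to French translation sample",
    source_data={"en": Data(id="d1", name="en")},
    target_data={
        "fr": [Data(id="d2", name="fr_ref_1"), Data(id="d3", name="fr_ref_2")]
    },
)

# Two reference translations are attached to the single "fr" target.
print(len(dataset.target_data["fr"]))
```

The stand-in uses default_factory for the dictionary and list defaults so that each instance gets its own empty container; whether the real Dataset class copies its default arguments internally is not stated in this reference.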

method delete

delete() -> None

Delete Dataset service