aixplain.modules.dataset
__author__
Copyright 2022 The aiXplain SDK authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: Duraikrishna Selvaraju, Thiago Castro Ferreira, Shreyas Sharma and Lucas Pavanelli
Date: October 28th 2022
Description: Datasets Class
Dataset Objects
class Dataset(Asset)
A dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, and Sentiment Analysis.
Attributes:
id (Text) - Dataset ID
name (Text) - Dataset Name
description (Text) - Dataset description
function (Function) - Function for which the dataset is intended
source_data (Dict[Any, Data]) - List of input Data to the function
target_data (Dict[Any, List[Data]]) - List of Multi-reference Data which is expected to be outputted by the function
onboard_status (OnboardStatus) - onboard status
hypotheses (Dict[Any, Data], optional) - dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
metadata (Dict[Any, Data], optional) - dataset's metadata. Defaults to {}.
tags (List[Text], optional) - tags that describe the dataset. Defaults to [].
license (Optional[License], optional) - Dataset License. Defaults to None.
privacy (Privacy, optional) - Dataset Privacy. Defaults to Privacy.PRIVATE.
supplier (Text, optional) - Dataset Supplier. Defaults to "aiXplain".
version (Text, optional) - Dataset Version. Defaults to "1.0".
__init__
def __init__(id: Text,
             name: Text,
             description: Text,
             function: Function,
             source_data: Dict[Any, Data],
             target_data: Dict[Any, List[Data]],
             onboard_status: OnboardStatus,
             hypotheses: Dict[Any, Data] = {},
             metadata: Dict[Any, Data] = {},
             tags: List[Text] = [],
             license: Optional[License] = None,
             privacy: Privacy = Privacy.PRIVATE,
             supplier: Text = "aiXplain",
             version: Text = "1.0",
             length: Optional[int] = None,
             **kwargs) -> None
Dataset Class.
Description: A dataset is a collection of data intended to be used for a specific function. Unlike a corpus, a dataset is a representative sample of a specific phenomenon for a specific AI task. aiXplain also offers an extensive collection of datasets for training, inference, and benchmarking on tasks such as Translation, Speech Recognition, Diacritization, and Sentiment Analysis.
Arguments:
id (Text) - Dataset ID
name (Text) - Dataset Name
description (Text) - Dataset description
function (Function) - Function for which the dataset is intended
source_data (Dict[Any, Data]) - List of input Data to the function
target_data (Dict[Any, List[Data]]) - List of Multi-reference Data which is expected to be outputted by the function
onboard_status (OnboardStatus) - onboard status
hypotheses (Dict[Any, Data], optional) - dataset's hypotheses, i.e. model outputs based on the source data. Defaults to {}.
metadata (Dict[Any, Data], optional) - dataset's metadata. Defaults to {}.
tags (List[Text], optional) - tags that describe the dataset. Defaults to [].
license (Optional[License], optional) - Dataset License. Defaults to None.
privacy (Privacy, optional) - Dataset Privacy. Defaults to Privacy.PRIVATE.
supplier (Text, optional) - Dataset Supplier. Defaults to "aiXplain".
version (Text, optional) - Dataset Version. Defaults to "1.0".
length (Optional[int], optional) - Number of rows in the Dataset. Defaults to None.
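The difference between the source_data and target_data types can be illustrated with a small sketch. It does not import the aiXplain SDK: DataStub is a hypothetical stand-in for the SDK's Data class, and the keys and helper name are assumptions chosen for illustration only. It shows that each source key maps to a single Data object, while each target key maps to a list of references.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class DataStub:
    """Hypothetical stand-in for the SDK's Data class (illustration only)."""
    name: str


# One Data object per source key, matching Dict[Any, Data].
source_data: Dict[Any, DataStub] = {
    "source_text": DataStub("english sentences"),
}

# A list of references per target key, matching Dict[Any, List[Data]].
target_data: Dict[Any, List[DataStub]] = {
    "target_text": [
        DataStub("reference translation A"),
        DataStub("reference translation B"),
    ],
}


def count_references(targets: Dict[Any, List[DataStub]]) -> Dict[Any, int]:
    """Return how many references each target key carries."""
    return {key: len(refs) for key, refs in targets.items()}


reference_counts = count_references(target_data)
print(reference_counts)
```

In this sketch the single target key carries two references, which is the multi-reference case the target_data description refers to.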
__repr__
def __repr__() -> str
Return a string representation of the Dataset instance.
Returns:
str - A string in the format "<Dataset: name>".
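As a minimal sketch of the documented format, the class below reproduces the "<Dataset: name>" pattern. DatasetStub is a hypothetical stand-in written for this example, not the SDK's Dataset class, and the dataset name is made up.

```python
class DatasetStub:
    """Hypothetical stand-in; not the real aixplain Dataset class."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        # Mirrors the documented format "<Dataset: name>".
        return f"<Dataset: {self.name}>"


example = DatasetStub("my-translation-set")
print(repr(example))
```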
delete
def delete() -> None
Delete this dataset from the aiXplain platform.
This method permanently removes the dataset from the platform. The operation can only be performed by the dataset owner.
Returns:
None
Raises:
Exception - If the deletion fails because:
- The dataset doesn't exist
- The user is not the owner
- There's a network/server error
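Because delete() is documented to raise a plain Exception, callers may want to wrap the call. The following is a sketch under stated assumptions: try_delete is a hypothetical helper, and the two stub classes stand in for a real Dataset so the example runs without the SDK or network access.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def try_delete(dataset: Any) -> bool:
    """Call dataset.delete() and return False instead of propagating a failure."""
    try:
        dataset.delete()
    except Exception as error:  # delete() is documented to raise a plain Exception
        logger.warning("Could not delete %r: %s", dataset, error)
        return False
    return True


class FailingStub:
    """Hypothetical stand-in whose deletion always fails."""

    def delete(self) -> None:
        raise Exception("the user is not the owner")


class WorkingStub:
    """Hypothetical stand-in whose deletion always succeeds."""

    def delete(self) -> None:
        return None


print(try_delete(WorkingStub()))
print(try_delete(FailingStub()))
```

Returning a boolean is only one possible design. Since the operation is permanent, re-raising after logging may be preferable when the caller needs to distinguish the failure causes.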