aixplain.modules.data
__author__
Copyright 2022 The aiXplain SDK authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: aiXplain team
Date: March 20th 2023
Description: Data Class
Data Objects
class Data()
A class representing a collection of data samples of the same type and genre.
This class provides functionality for managing data in the aiXplain platform, supporting various data types, languages, and storage formats. It can handle both structured (e.g., CSV) and unstructured data files.
Attributes:
id (Text) - ID of the data collection.
name (Text) - Name of the data collection.
dtype (DataType) - Type of data (e.g., text, audio, image).
privacy (Privacy) - Privacy settings for the data.
onboard_status (OnboardStatus) - Current onboarding status.
data_column (Optional[Any]) - Column identifier where data is stored in structured files.
start_column (Optional[Any]) - Column identifier for start indexes in structured files.
end_column (Optional[Any]) - Column identifier for end indexes in structured files.
files (List[File]) - List of files containing the data instances.
languages (List[Language]) - List of languages present in the data.
dsubtype (DataSubtype) - Subtype categorization of the data.
length (Optional[int]) - Number of samples/rows in the data collection.
kwargs (dict) - Additional keyword arguments for extensibility.
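To make the column attributes more concrete, here is a minimal sketch of how column identifiers could select values from a structured file. It uses only the Python standard library and does not call the aiXplain SDK. The CSV content, the column names, and the reading of start_column and end_column as per-row offsets are all assumptions made for illustration; this reference does not show how the SDK processes these columns internally.

```python
import csv
import io

# Hypothetical CSV content; the column names are made up for illustration.
csv_text = """audio,text,start,end
a.wav,hello world,0.0,1.2
a.wav,good morning,1.2,2.9
"""

# Hypothetical column identifiers, mirroring the data_column, start_column
# and end_column attributes described above.
data_column = "text"
start_column = "start"
end_column = "end"

# Parse the CSV into one dict per row, keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# The data column holds the samples themselves.
samples = [row[data_column] for row in rows]

# Assumption: start/end columns hold numeric offsets for each sample.
segments = [(float(row[start_column]), float(row[end_column])) for row in rows]

# Number of samples/rows, analogous to the length attribute.
length = len(rows)

print(samples)
print(segments)
print(length)
```

In this reading, a single file can carry several Data objects, each pointing at a different column of the same rows.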
__init__
def __init__(id: Text,
name: Text,
dtype: DataType,
privacy: Privacy,
onboard_status: OnboardStatus,
data_column: Optional[Any] = None,
start_column: Optional[Any] = None,
end_column: Optional[Any] = None,
files: List[File] = [],
languages: List[Language] = [],
dsubtype: DataSubtype = DataSubtype.OTHER,
length: Optional[int] = None,
**kwargs) -> None
Initialize a new Data instance.
Arguments:
id (Text) - ID of the data collection.
name (Text) - Name of the data collection.
dtype (DataType) - Type of data (e.g., text, audio, image).
privacy (Privacy) - Privacy settings for the data.
onboard_status (OnboardStatus) - Current onboarding status of the data.
data_column (Optional[Any], optional) - Column identifier where data is stored in structured files (e.g., CSV). If None, defaults to the value of name.
start_column (Optional[Any], optional) - Column identifier where start indexes are stored in structured files. Defaults to None.
end_column (Optional[Any], optional) - Column identifier where end indexes are stored in structured files. Defaults to None.
files (List[File], optional) - List of files containing the data instances. Defaults to empty list.
languages (List[Language], optional) - List of languages present in the data. Can be provided as Language enums or language codes. Defaults to empty list.
dsubtype (DataSubtype, optional) - Subtype categorization of the data (e.g., age, topic, race, split). Defaults to DataSubtype.OTHER.
length (Optional[int], optional) - Number of samples/rows in the data collection. Defaults to None.
**kwargs - Additional keyword arguments for extensibility.
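The defaults described above can be illustrated with a small stand-in class. This is a sketch, not the SDK's implementation: DataSketch and DataSubtypeSketch are hypothetical names, plain strings stand in for the DataType, Privacy and OnboardStatus enums, and the argument values are invented. The sketch also uses None sentinels instead of the `[]` defaults shown in the signature so that each instance gets its own list; that is a choice of the sketch, not a statement about the library.

```python
from enum import Enum
from typing import Any, List, Optional


class DataSubtypeSketch(Enum):
    """Hypothetical stand-in for DataSubtype (only OTHER is modelled)."""

    OTHER = "other"


class DataSketch:
    """Hypothetical stand-in mirroring the documented defaults of Data."""

    def __init__(
        self,
        id: str,
        name: str,
        dtype: str,
        privacy: str,
        onboard_status: str,
        data_column: Optional[Any] = None,
        start_column: Optional[Any] = None,
        end_column: Optional[Any] = None,
        files: Optional[List[Any]] = None,
        languages: Optional[List[Any]] = None,
        dsubtype: DataSubtypeSketch = DataSubtypeSketch.OTHER,
        length: Optional[int] = None,
        **kwargs: Any,
    ) -> None:
        self.id = id
        self.name = name
        self.dtype = dtype
        self.privacy = privacy
        self.onboard_status = onboard_status
        # Documented rule: data_column falls back to name when it is None.
        self.data_column = name if data_column is None else data_column
        self.start_column = start_column
        self.end_column = end_column
        # Sketch choice: copy into fresh lists so instances never share state.
        self.files = list(files) if files is not None else []
        self.languages = list(languages) if languages is not None else []
        self.dsubtype = dsubtype
        self.length = length
        # Extra keyword arguments are kept as-is for extensibility.
        self.kwargs = kwargs


# Only the five required arguments are given; everything else uses defaults.
data = DataSketch(
    id="data-001",  # invented ID
    name="transcript",
    dtype="text",
    privacy="private",
    onboard_status="onboarded",
)

print(data.data_column)  # falls back to the value of name
print(data.files, data.languages, data.dsubtype, data.length)
```

With the real class, the same five required arguments would be passed as DataType, Privacy and OnboardStatus values rather than strings.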