module aixplain.factories.corpus_factory


class CorpusFactory


classmethod create

create(
name: str,
description: str,
license: License,
content_path: Union[str, Path, List[Union[str, Path]]],
schema: List[Union[Dict, MetaData]],
ref_data: List[Any] = [],
tags: List[str] = [],
functions: List[Function] = [],
privacy: Privacy = <Privacy.PRIVATE: 'Private'>,
error_handler: ErrorHandler = <ErrorHandler.SKIP: 'skip'>,
api_key: Optional[str] = None
) → Dict

Asynchronous call to upload a corpus to the user's dashboard.

Args:

  • name (Text): corpus name
  • description (Text): corpus description
  • license (License): corpus license
  • content_path (Union[Union[Text, Path], List[Union[Text, Path]]]): path to .csv files containing the data
  • schema (List[Union[Dict, MetaData]]): metadata
  • ref_data (Optional[List[Union[Text, Data]]], optional): referencing data which already exists and should be part of the corpus. Defaults to [].
  • tags (Optional[List[Text]], optional): tags that explain the corpus. Defaults to [].
  • functions (Optional[List[Function]], optional): AI functions for which the corpus may be used. Defaults to [].
  • privacy (Optional[Privacy], optional): visibility of the corpus. Defaults to Privacy.PRIVATE.
  • error_handler (ErrorHandler, optional): how to handle failed rows in the data asset. Defaults to ErrorHandler.SKIP.
  • api_key (Optional[Text]): team API key. Defaults to None.

Returns:

  • Dict: response dict
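
Because content_path accepts either a single path or a list of paths, it can help to normalize the value before calling create. The following is a minimal sketch, not part of the aixplain API: normalize_content_paths is a hypothetical helper, and the .csv check assumes every entry points at an individual .csv file, as the argument description suggests.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def normalize_content_paths(content_path: Union[PathLike, List[PathLike]]) -> List[Path]:
    """Return content_path as a list of Path objects, rejecting non-.csv entries."""
    # Accept either a single path or a list of paths, as the signature allows.
    items = content_path if isinstance(content_path, list) else [content_path]
    paths = [Path(item) for item in items]
    for path in paths:
        # Assumption: each entry is a single .csv file, not a directory.
        if path.suffix.lower() != ".csv":
            raise ValueError(f"expected a .csv file, got: {path}")
    return paths


# Hypothetical call site (requires the aixplain package and a team API key):
# response = CorpusFactory.create(
#     name="my-corpus",
#     description="Example corpus",
#     license=...,    # a License value
#     content_path=normalize_content_paths("data/train.csv"),
#     schema=[...],   # List[Union[Dict, MetaData]]
# )
```

The helper only inspects file names; it does not check that the files exist or that their rows are valid, which is what error_handler governs on the service side.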

classmethod create_asset_from_id

create_asset_from_id(corpus_id: str) → Corpus

classmethod get

get(corpus_id: str) → Corpus

Create a 'Corpus' object from a corpus ID.

Args:

  • corpus_id (Text): Corpus ID of required corpus.

Returns:

  • Corpus: Created 'Corpus' object

classmethod get_assets_from_page

get_assets_from_page(
page_number: int = 1,
task: Optional[Function] = None,
language: Optional[str] = None
) → List[Corpus]

Get the list of corpora from a given page. Additional task and language filters can also be provided.

Args:

  • page_number (int, optional): Page from which corpora are to be listed. Defaults to 1.
  • task (Function, optional): Task of listed corpora. Defaults to None.
  • language (Text, optional): language of listed corpora. Defaults to None.

Returns:

  • List[Corpus]: List of corpora based on given filters
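
Since get_assets_from_page returns one page at a time, walking every page needs a small loop. The sketch below is one possible way to do it, under stated assumptions: iter_pages is a hypothetical helper, the page fetcher is passed in as a callable so the example runs without the aixplain package, and an empty list is assumed to mark the end of the results (the documentation above does not say how the last page is signalled).

```python
from typing import Any, Callable, Iterator, List


def iter_pages(
    get_page: Callable[[int], List[Any]],
    start_page: int = 1,
    max_pages: int = 100,
) -> Iterator[Any]:
    """Yield items page by page until an empty page or max_pages is reached."""
    for page_number in range(start_page, start_page + max_pages):
        items = get_page(page_number)
        if not items:
            # Assumption: an empty list marks the end of the results.
            return
        yield from items
```

With aixplain installed, get_page could be something like `lambda n: CorpusFactory.get_assets_from_page(page_number=n)`. The max_pages cap guards against looping forever if the service never returns an empty page.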

classmethod list

list(
query: Optional[str] = None,
function: Optional[Function] = None,
language: Optional[Union[Language, List[Language]]] = None,
data_type: Optional[DataType] = None,
license: Optional[License] = None,
page_number: int = 0,
page_size: int = 20
) → Dict

List corpora that match the given filters.

Args:

  • query (Optional[Text], optional): search query. Defaults to None.
  • function (Optional[Function], optional): function filter. Defaults to None.
  • language (Optional[Union[Language, List[Language]]], optional): language filter. Defaults to None.
  • data_type (Optional[DataType], optional): data type filter. Defaults to None.
  • license (Optional[License], optional): license filter. Defaults to None.
  • page_number (int, optional): page number. Defaults to 0.
  • page_size (int, optional): page size. Defaults to 20.

Returns:

  • Dict: corpora matching the filters, along with the page number, page total, and total number of elements
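
The returned dict carries paging information, so all matches can be gathered by requesting successive pages. This is a rough sketch rather than the library's own approach: collect_all is a hypothetical helper, the listing function is injected as a callable so the code runs standalone, and the key names "results" and "total" are assumptions, because the documentation above describes the dict's contents without naming its keys.

```python
from typing import Any, Callable, Dict, List


def collect_all(
    list_page: Callable[..., Dict[str, Any]],
    page_size: int = 20,
    results_key: str = "results",  # assumed key name, not confirmed by the docs
    total_key: str = "total",      # assumed key name, not confirmed by the docs
    **filters: Any,
) -> List[Any]:
    """Call list_page with increasing page_number until all items are collected."""
    collected: List[Any] = []
    page_number = 0  # list() defaults to page_number=0, so pages are assumed 0-based
    while True:
        response = list_page(page_number=page_number, page_size=page_size, **filters)
        items = response.get(results_key, [])
        collected.extend(items)
        total = response.get(total_key, 0)
        # Stop on an empty page or once the reported total has been reached.
        if not items or len(collected) >= total:
            return collected
        page_number += 1
```

With aixplain installed, a call might look like `collect_all(CorpusFactory.list, query="speech")`. If the real response uses different keys, pass them through results_key and total_key; when the total key is missing, the helper stops after the first page.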