module aixplain.processes.data_onboarding.process_media_files

Global Variables

  • AUDIO_MAX_SIZE
  • IMAGE_TEXT_MAX_SIZE

function compress_folder

compress_folder(folder_path: str)

function run

run(
metadata: MetaData,
paths: List,
folder: Path,
batch_size: int = 100
) → Tuple[List[File], int, int, int, int]

Process a list of local media files, compress them, and upload them to pre-signed URLs in S3.

Explanation: Each media file in "paths" is processed. If the file is at a public link, the link is added to an index CSV file. If the file is at a local path, it is copied into a local folder and its path is added to the index CSV file. Files are processed in batches: after every "batch_size" files, the index CSV file is uploaded to a pre-signed URL in S3 and reset. If the files are stored locally, the local folder is also compressed into a .tgz file and uploaded to S3.
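The batching flow described above can be sketched as follows. This is a minimal illustration and not the module's actual implementation: `process_in_batches`, `is_public_link`, `compress_to_tgz`, the `upload` callback, the `media` subfolder, the `data` column header, and the index file names are all hypothetical names chosen for the example. The sketch also assumes the local folder is compressed once at the end; the real function may compress per batch.

```python
import csv
import shutil
import tarfile
from pathlib import Path
from typing import Callable, List


def is_public_link(path: str) -> bool:
    # Hypothetical check: treat http(s) URLs as public links.
    return path.startswith(("http://", "https://"))


def compress_to_tgz(folder: Path) -> Path:
    # Write <folder>.tgz next to the folder (gzip-compressed tar archive).
    archive = folder.parent / f"{folder.name}.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(folder, arcname=folder.name)
    return archive


def process_in_batches(
    paths: List[str],
    folder: Path,
    upload: Callable[[Path], None],  # stand-in for the pre-signed URL upload
    batch_size: int = 100,
) -> int:
    """Index media in batches; return the number of index files uploaded."""
    media_dir = folder / "media"
    media_dir.mkdir(parents=True, exist_ok=True)
    rows: List[str] = []
    n_batches = 0
    has_local = False

    def flush() -> None:
        # Write the current index CSV, hand it to upload(), then reset it.
        nonlocal rows, n_batches
        if not rows:
            return
        index_path = folder / f"index_{n_batches}.csv"
        with open(index_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["data"])
            writer.writerows([row] for row in rows)
        upload(index_path)
        rows = []
        n_batches += 1

    for path in paths:
        if is_public_link(path):
            rows.append(path)  # public link: index the URL itself
        else:
            target = media_dir / Path(path).name
            shutil.copy(path, target)  # local file: copy, then index its path
            rows.append(str(target.relative_to(folder)))
            has_local = True
        if len(rows) >= batch_size:
            flush()

    flush()  # upload the final, possibly partial, batch
    if has_local:
        upload(compress_to_tgz(media_dir))
    return n_batches
```

Passing the upload step in as a callback keeps the sketch runnable without network access; in practice it would be replaced by whatever performs the upload to the pre-signed URL.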

Args:

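  • metadata (MetaData): metadata of the asset
  • paths (List): list of paths to local files
  • folder (Path): local folder in which compressed files are saved before they are uploaded to S3
  • batch_size (int, optional): number of media files per index CSV file. Defaults to 100.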

Returns:

  • Tuple[List[File], int, int, int, int]: list of S3 links; data column index; start column index; end column index; and number of rows