module aixplain.processes.data_onboarding.process_media_files
Global Variables
- AUDIO_MAX_SIZE
- IMAGE_TEXT_MAX_SIZE
function compress_folder
compress_folder(folder_path: str)
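This page gives only the signature of compress_folder. The description of run states that the local folder is compressed into a .tgz file, so the following is a minimal sketch of that step using the standard tarfile module. The name compress_folder_sketch, the archive location (next to the folder) and the return value are assumptions for illustration, not the library's documented behaviour.

```python
import tarfile
from pathlib import Path


def compress_folder_sketch(folder_path: str) -> str:
    """Hypothetical stand-in: pack a folder into a sibling .tgz archive."""
    folder = Path(folder_path)
    # Assumption: the archive is written next to the folder as "<name>.tgz".
    archive_path = folder.parent / (folder.name + ".tgz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name inside the archive,
        # not the full path on the local machine.
        tar.add(folder, arcname=folder.name)
    return str(archive_path)
```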
function run
run(
metadata: MetaData,
paths: List,
folder: Path,
batch_size: int = 100
) → Tuple[List[File], int, int, int, int]
Process a list of local media files, then compress and upload them to pre-signed URLs in S3.
Explanation: Each media file in "paths" is processed. If the file is at a public link, that link is added to an index CSV file. If the file is at a local path, it is copied into a local folder and its path is added to the index CSV file. The files are processed in batches: after every "batch_size" files, the index CSV file is uploaded to a pre-signed URL in S3 and reset. If the files are stored locally, the local folder is also compressed into a .tgz file and uploaded to S3.
Args:
metadata (MetaData): metadata of the asset
paths (List): list of paths to local files
folder (Path): local folder where compressed files are saved before being uploaded to S3
batch_size (int): number of media files per batch; defaults to 100
Returns:
Tuple[List[File], int, int, int, int]: list of S3 links; data column index, start column index, end column index, and number of rows
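The batching described in the explanation of run can be sketched as follows. This is only an illustration of how "paths" might be split into groups of "batch_size" and how public links might be told apart from local files. The helpers is_public_link and plan_batches are hypothetical names, the http(s) prefix check is an assumption, and the sketch performs no copying, compression or upload.

```python
from typing import Iterator, List, Tuple


def is_public_link(path: str) -> bool:
    """Hypothetical check: treat http(s) URLs as public links."""
    return path.startswith(("http://", "https://"))


def plan_batches(
    paths: List[str], batch_size: int = 100
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (index_rows, local_files) for each batch of at most batch_size paths."""
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # Every media file in the batch gets a row in that batch's index CSV.
        index_rows = list(batch)
        # Only local files would need to be copied and compressed.
        local_files = [p for p in batch if not is_public_link(p)]
        yield index_rows, local_files


paths = ["https://example.com/a.wav", "audio/b.wav", "audio/c.wav"]
for rows, local in plan_batches(paths, batch_size=2):
    print(len(rows), len(local))
# prints "2 1" for the first batch, then "1 1" for the second
```

In the real function, each batch would end with the index CSV being uploaded and reset; here a batch is simply yielded so the grouping is easy to inspect.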