module aixplain.modules.pipeline.pipeline
class ObjectDetectionInputs
method __init__
__init__(node=None)
class ObjectDetectionOutputs
method __init__
__init__(node=None)
class ObjectDetection
Object Detection is a computer vision technology that identifies and locates objects within an image, typically by drawing bounding boxes around the detected objects and classifying them into predefined categories.
InputType: video OutputType: text
class LanguageIdentificationInputs
method __init__
__init__(node=None)
class LanguageIdentificationOutputs
method __init__
__init__(node=None)
class LanguageIdentification
Language Identification is the process of automatically determining the language in which a given piece of text is written.
InputType: text OutputType: text
class OcrInputs
method __init__
__init__(node=None)
class OcrOutputs
method __init__
__init__(node=None)
class Ocr
OCR, or Optical Character Recognition, is a technology that converts different types of documents, such as scanned paper documents, PDFs, or images captured by a digital camera, into editable and searchable data by recognizing and extracting text from the images.
InputType: image OutputType: text
class ScriptExecutionInputs
method __init__
__init__(node=None)
class ScriptExecutionOutputs
method __init__
__init__(node=None)
class ScriptExecution
Script Execution refers to the process of running a set of programmed instructions or code within a computing environment, enabling the automated performance of tasks, calculations, or operations as defined by the script.
InputType: text OutputType: text
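As a rough illustration of what running a script in a computing environment involves, the sketch below runs a snippet of Python source in a child interpreter and captures its result. It uses only the standard library; the helper name `run_script` and its timeout default are hypothetical, and nothing here is meant to describe how the ScriptExecution node itself runs code.

```python
import subprocess
import sys


def run_script(source, timeout_seconds=5.0):
    """Run Python source in a child interpreter.

    Returns a tuple (exit_code, stdout, stderr). `run_script` is a
    hypothetical helper for illustration, not part of the aixplain SDK.
    """
    completed = subprocess.run(
        [sys.executable, "-c", source],  # same interpreter, fresh process
        capture_output=True,             # collect stdout and stderr
        text=True,                       # decode bytes to str
        timeout=timeout_seconds,         # raise TimeoutExpired on overrun
    )
    return completed.returncode, completed.stdout, completed.stderr


exit_code, stdout, stderr = run_script("print(1 + 1)")
print(exit_code, stdout.strip())  # prints: 0 2
```

Running the script in a separate process keeps its failures and output isolated from the caller, which is one common reason to treat script execution as its own step.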
class ImageLabelDetectionInputs
method __init__
__init__(node=None)
class ImageLabelDetectionOutputs
method __init__
__init__(node=None)
class ImageLabelDetection
Image Label Detection is a function that automatically identifies and assigns descriptive tags or labels to objects, scenes, or elements within an image, enabling easier categorization, search, and analysis of visual content.
InputType: image OutputType: label
class ImageCaptioningInputs
method __init__
__init__(node=None)
class ImageCaptioningOutputs
method __init__
__init__(node=None)
class ImageCaptioning
Image Captioning generates a textual description of an image, typically using machine learning models that analyze the visual content and produce coherent, contextually relevant sentences describing the objects, actions, and scenes depicted in it.
InputType: image OutputType: text
class AudioLanguageIdentificationInputs
method __init__
__init__(node=None)
class AudioLanguageIdentificationOutputs
method __init__
__init__(node=None)
class AudioLanguageIdentification
Audio Language Identification is a process that involves analyzing an audio recording to determine the language being spoken.
InputType: audio OutputType: label
class AsrAgeClassificationInputs
method __init__
__init__(node=None)
class AsrAgeClassificationOutputs
method __init__
__init__(node=None)
class AsrAgeClassification
The ASR Age Classification function is designed to analyze audio recordings of speech to determine the speaker's age group by leveraging automatic speech recognition (ASR) technology and machine learning algorithms.
InputType: audio OutputType: label
class BenchmarkScoringMtInputs
method __init__
__init__(node=None)
class BenchmarkScoringMtOutputs
method __init__
__init__(node=None)
class BenchmarkScoringMt
Benchmark Scoring MT is a function designed to evaluate and score machine translation systems by comparing their output against a set of predefined benchmarks, thereby assessing their accuracy and performance.
InputType: text OutputType: label
class AsrGenderClassificationInputs
method __init__
__init__(node=None)
class AsrGenderClassificationOutputs
method __init__
__init__(node=None)
class AsrGenderClassification
The ASR Gender Classification function analyzes audio recordings to determine and classify the speaker's gender based on their voice characteristics.
InputType: audio OutputType: label
class BaseModelInputs
method __init__
__init__(node=None)
class BaseModelOutputs
method __init__
__init__(node=None)
class BaseModel
The Base-Model function serves as a foundational framework designed to provide essential features and capabilities upon which more specialized or advanced models can be built and customized.
InputType: text OutputType: text
class LanguageIdentificationAudioInputs
method __init__
__init__(node=None)
class LanguageIdentificationAudioOutputs
method __init__
__init__(node=None)
class LanguageIdentificationAudio
The Language Identification Audio function analyzes audio input to determine and identify the language being spoken.
InputType: audio OutputType: label
class LoglikelihoodInputs
method __init__
__init__(node=None)
class LoglikelihoodOutputs
method __init__
__init__(node=None)
class Loglikelihood
The Log Likelihood function measures how probable the given data is under a specific statistical model by taking the natural logarithm of the likelihood function. The logarithm turns the product of probabilities into a sum, which simplifies optimization and parameter estimation.
InputType: text OutputType: number
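The product-to-sum property described above can be checked with a few lines of Python. This is a minimal sketch: the function name `log_likelihood` and the probabilities are made up for illustration, and it assumes the observations are independent so that their probabilities multiply.

```python
import math


def log_likelihood(probabilities):
    """Sum of natural logs of per-observation probabilities.

    Assumes independent observations (hypothetical helper, not an SDK call).
    """
    if any(p <= 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("each probability must be in (0, 1]")
    return sum(math.log(p) for p in probabilities)


# Hypothetical probabilities a model assigns to three observations.
observed = [0.5, 0.25, 0.8]

ll = log_likelihood(observed)
print(round(ll, 4))  # prints: -2.3026

# Exponentiating the sum recovers the product 0.5 * 0.25 * 0.8 = 0.1.
print(math.isclose(math.exp(ll), 0.5 * 0.25 * 0.8))  # prints: True
```

Working with the sum also avoids numerical underflow, which the raw product runs into quickly when many small probabilities are multiplied.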
class VideoEmbeddingInputs
method __init__
__init__(node=None)
class VideoEmbeddingOutputs
method __init__
__init__(node=None)
class VideoEmbedding
Video Embedding is a process that transforms video content into a fixed-dimensional vector representation, capturing essential features and patterns to facilitate tasks such as retrieval, classification, and recommendation.
InputType: video OutputType: embedding
class TextSegmenationInputs
method __init__
__init__(node=None)
class TextSegmenationOutputs
method __init__
__init__(node=None)
class TextSegmenation
Text Segmentation is the process of dividing a continuous text into meaningful units, such as words, sentences, or topics, to facilitate easier analysis and understanding.
InputType: text OutputType: text
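For the sentence-level case, segmentation can be sketched with a regular expression. The helper `split_sentences` below is hypothetical and deliberately naive: it splits after `.`, `!`, or `?` followed by whitespace, so abbreviations such as "Dr." would be split incorrectly. It is not how the TextSegmenation node is implemented.

```python
import re


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace.

    Naive illustrative rule; real segmenters handle abbreviations,
    quotations, and languages without whitespace between sentences.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


print(split_sentences("Hello world. How are you? Fine!"))
# prints: ['Hello world.', 'How are you?', 'Fine!']
```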
class ImageEmbeddingInputs
method __init__
__init__(node=None)
class ImageEmbeddingOutputs
method __init__
__init__(node=None)
class ImageEmbedding
Image Embedding is a process that transforms an image into a fixed-dimensional vector representation, capturing its essential features and enabling efficient comparison, retrieval, and analysis in various machine learning and computer vision tasks.
InputType: image OutputType: text
class ImageManipulationInputs
method __init__
__init__(node=None)
class ImageManipulationOutputs
method __init__
__init__(node=None)
class ImageManipulation
Image Manipulation refers to the process of altering or enhancing digital images using various techniques and tools to achieve desired visual effects, correct imperfections, or transform the image's appearance.
InputType: image OutputType: image
class ImageToVideoGenerationInputs
method __init__
__init__(node=None)
class ImageToVideoGenerationOutputs
method __init__
__init__(node=None)
class ImageToVideoGeneration
The Image To Video Generation function transforms a series of static images into a cohesive, dynamic video sequence, often incorporating transitions, effects, and synchronization with audio to create a visually engaging narrative.
InputType: image OutputType: video
class AudioForcedAlignmentInputs
method __init__
__init__(node=None)
class AudioForcedAlignmentOutputs
method __init__
__init__(node=None)
class AudioForcedAlignment
Audio Forced Alignment is a process that synchronizes a given audio recording with its corresponding transcript by precisely aligning each spoken word or phoneme to its exact timing within the audio.
InputType: audio OutputType: audio
class BenchmarkScoringAsrInputs
method __init__
__init__(node=None)
class BenchmarkScoringAsrOutputs
method __init__
__init__(node=None)
class BenchmarkScoringAsr
Benchmark Scoring ASR is a function that evaluates and compares the performance of automatic speech recognition systems by analyzing their accuracy, speed, and other relevant metrics against a standardized set of benchmarks.
InputType: audio OutputType: label
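One widely used accuracy metric for speech recognition is word error rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below is an assumption about what such a score could look like, not a description of the metrics this node actually computes; the function name and the sample sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))
# prints: 0.333
```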
class VisualQuestionAnsweringInputs
method __init__
__init__(node=None)
class VisualQuestionAnsweringOutputs
method __init__
__init__(node=None)
class VisualQuestionAnswering
Visual Question Answering (VQA) is a task in artificial intelligence that involves analyzing an image and providing accurate, contextually relevant answers to questions posed about the visual content of that image.
InputType: image OutputType: video
class DocumentImageParsingInputs
method __init__
__init__(node=None)
class DocumentImageParsingOutputs
method __init__
__init__(node=None)
class DocumentImageParsing
Document Image Parsing is the process of analyzing and converting scanned or photographed images of documents into structured, machine-readable formats by identifying and extracting text, layout, and other relevant information.
InputType: image OutputType: text
class DocumentInformationExtractionInputs
method __init__
__init__(node=None)
class DocumentInformationExtractionOutputs
method __init__
__init__(node=None)
class DocumentInformationExtraction
Document Information Extraction is the process of automatically identifying, extracting, and structuring relevant data from unstructured or semi-structured documents, such as invoices, receipts, contracts, and forms, to facilitate easier data management and analysis.
InputType: image OutputType: text
class DepthEstimationInputs
method __init__
__init__(node=None)
class DepthEstimationOutputs
method __init__
__init__(node=None)
class DepthEstimation
Depth estimation is a computational process that determines the distance of objects from a viewpoint, typically using visual data from cameras or sensors to create a three-dimensional understanding of a scene.
InputType: image OutputType: text
class VideoGenerationInputs
method __init__
__init__(node=None)
class VideoGenerationOutputs
method __init__
__init__(node=None)
class VideoGeneration
Video Generation is the process of creating video content through automated or semi-automated means, often utilizing algorithms, artificial intelligence, or software tools to produce visual and audio elements that can range from simple animations to complex, realistic scenes.
InputType: text OutputType: video
class ReferencelessAudioGenerationMetricInputs
method __init__
__init__(node=None)
class ReferencelessAudioGenerationMetricOutputs
method __init__
__init__(node=None)
class ReferencelessAudioGenerationMetric
The Referenceless Audio Generation Metric is a tool designed to evaluate the quality of generated audio content without the need for a reference or original audio sample for comparison.
InputType: text OutputType: text
class MultiClassImageClassificationInputs
method __init__
__init__(node=None)
class MultiClassImageClassificationOutputs
method __init__
__init__(node=None)
class MultiClassImageClassification
Multi Class Image Classification is a machine learning task where an algorithm is trained to categorize images into one of several predefined classes or categories based on their visual content.
InputType: image OutputType: label
class SemanticSegmentationInputs
method __init__
__init__(node=None)
class SemanticSegmentationOutputs
method __init__
__init__(node=None)
class SemanticSegmentation
Semantic segmentation is a computer vision process that involves classifying each pixel in an image into a predefined category, effectively partitioning the image into meaningful segments based on the objects or regions they represent.
InputType: image OutputType: label
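The per-pixel classification step can be illustrated in plain Python. The sketch assumes a model has already produced one score per class for every pixel; the function `segment`, the class names, and the scores are all hypothetical.

```python
def segment(pixel_scores, class_names):
    """Assign each pixel the class with the highest score.

    pixel_scores[row][col] is a list with one score per class,
    in the same order as class_names.
    """
    label_map = []
    for row in pixel_scores:
        label_row = []
        for scores in row:
            if len(scores) != len(class_names):
                raise ValueError("expected one score per class")
            best = max(range(len(scores)), key=lambda k: scores[k])
            label_row.append(class_names[best])
        label_map.append(label_row)
    return label_map


classes = ["background", "cat"]
# A made-up 2x2 image with two class scores per pixel.
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
print(segment(scores, classes))
# prints: [['background', 'cat'], ['background', 'cat']]
```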
class InstanceSegmentationInputs
method __init__
__init__(node=None)
class InstanceSegmentationOutputs
method __init__
__init__(node=None)
class InstanceSegmentation
Instance segmentation is a computer vision task that involves detecting and delineating each distinct object within an image, assigning a unique label and precise boundary to every individual instance of objects, even if they belong to the same category.
InputType: image OutputType: label
class ImageColorizationInputs
method __init__
__init__(node=None)
class ImageColorizationOutputs
method __init__
__init__(node=None)
class ImageColorization
Image colorization is a process that involves adding color to grayscale images, transforming them from black-and-white to full-color representations, often using advanced algorithms and machine learning techniques to predict and apply the appropriate hues and shades.
InputType: image OutputType: image
class AudioGenerationMetricInputs
method __init__
__init__(node=None)
class AudioGenerationMetricOutputs
method __init__
__init__(node=None)
class AudioGenerationMetric
The Audio Generation Metric is a quantitative measure used to evaluate the quality, accuracy, and overall performance of audio generated by artificial intelligence systems, often considering factors such as fidelity, intelligibility, and similarity to human-produced audio.
InputType: text OutputType: text
class ImageImpaintingInputs
method __init__
__init__(node=None)
class ImageImpaintingOutputs
method __init__
__init__(node=None)
class ImageImpainting
Image inpainting is a process that involves filling in missing or damaged parts of an image in a way that is visually coherent and seamlessly blends with the surrounding areas, often using advanced algorithms and techniques to restore the image to its original or intended appearance.
InputType: image OutputType: image
class StyleTransferInputs
method __init__
__init__(node=None)
class StyleTransferOutputs
method __init__
__init__(node=None)
class StyleTransfer
Style Transfer is a technique in artificial intelligence that applies the visual style of one image (such as the brushstrokes of a famous painting) to the content of another image, effectively blending the artistic elements of the first image with the subject matter of the second.
InputType: image OutputType: image
class MultiClassTextClassificationInputs
method __init__
__init__(node=None)
class MultiClassTextClassificationOutputs
method __init__
__init__(node=None)
class MultiClassTextClassification
Multi Class Text Classification is a natural language processing task that involves categorizing a given text into one of several predefined classes or categories based on its content.
InputType: text OutputType: label
class TextEmbeddingInputs
method __init__
__init__(node=None)
class TextEmbeddingOutputs
method __init__
__init__(node=None)
class TextEmbedding
Text embedding is a process that converts text into numerical vectors, capturing the semantic meaning and contextual relationships of words or phrases, enabling machines to understand and analyze natural language more effectively.
InputType: text OutputType: text
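Once text has been converted to vectors, semantic closeness is commonly measured with cosine similarity. The sketch below assumes the embeddings are already available as lists of floats; the three-dimensional vectors are invented for illustration, and real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for three short texts.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# The related pair scores higher than the unrelated pair.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # prints: True
```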
class MultiLabelTextClassificationInputs
method __init__
__init__(node=None)
class MultiLabelTextClassificationOutputs
method __init__
__init__(node=None)
class MultiLabelTextClassification
Multi Label Text Classification is a natural language processing task where a given text is analyzed and assigned multiple relevant labels or categories from a predefined set, allowing for the text to belong to more than one category simultaneously.
InputType: text OutputType: label
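The difference between multi-class and multi-label classification shows up in how per-label scores are turned into a prediction. In this sketch the scores, the function names, and the 0.5 threshold are all assumptions chosen for illustration.

```python
def predict_multi_class(scores):
    """Multi-class: return exactly one label, the highest-scoring one."""
    return max(scores, key=scores.get)


def predict_multi_label(scores, threshold=0.5):
    """Multi-label: return every label whose score reaches the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical per-label scores for a single text.
scores = {"sports": 0.7, "politics": 0.6, "tech": 0.1}

print(predict_multi_class(scores))  # prints: sports
print(predict_multi_label(scores))  # prints: ['politics', 'sports']
```

The multi-label result can be empty or contain several labels, while the multi-class result always contains exactly one.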
class TextReconstructionInputs
method __init__
__init__(node=None)
class TextReconstructionOutputs
method __init__
__init__(node=None)
class TextReconstruction
Text Reconstruction is a process that involves piecing together fragmented or incomplete text data to restore it to its original, coherent form.
InputType: text OutputType: text
class FactCheckingInputs
method __init__
__init__(node=None)
class FactCheckingOutputs
method __init__
__init__(node=None)
class FactChecking
Fact Checking is the process of verifying the accuracy and truthfulness of information, statements, or claims by cross-referencing with reliable sources and evidence.
InputType: text OutputType: label
class SpeechClassificationInputs
method __init__
__init__(node=None)
class SpeechClassificationOutputs
method __init__
__init__(node=None)
class SpeechClassification
Speech Classification is a process that involves analyzing and categorizing spoken language into predefined categories or classes based on various features such as tone, pitch, and linguistic content.
InputType: audio OutputType: label
class IntentClassificationInputs
method __init__
__init__(node=None)
class IntentClassificationOutputs
method __init__
__init__(node=None)
class IntentClassification
Intent Classification is a natural language processing task that involves analyzing and categorizing user text input to determine the underlying purpose or goal behind the communication, such as booking a flight, asking for weather information, or setting a reminder.
InputType: text OutputType: label
class PartOfSpeechTaggingInputs
method __init__
__init__(node=None)
class PartOfSpeechTaggingOutputs
method __init__
__init__(node=None)
class PartOfSpeechTagging
Part of Speech Tagging is a natural language processing task that involves assigning each word in a sentence its corresponding part of speech, such as noun, verb, adjective, or adverb, based on its role and context within the sentence.
InputType: text OutputType: label
class MetricAggregationInputs
method __init__
__init__(node=None)
class MetricAggregationOutputs
method __init__
__init__(node=None)
class MetricAggregation
Metric Aggregation is a function that computes and summarizes numerical data by applying statistical operations, such as averaging, summing, or finding the minimum and maximum values, to provide insights and facilitate analysis of large datasets.
InputType: text OutputType: text
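The statistical operations named above can be sketched with the standard library. The helper `aggregate_metrics` and the sample scores are hypothetical; the actual inputs and outputs of the MetricAggregation node are not specified here.

```python
from statistics import mean


def aggregate_metrics(scores):
    """Summarize a non-empty list of numeric scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "count": len(scores),
        "sum": sum(scores),
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Made-up per-sample scores from an evaluation run.
summary = aggregate_metrics([1.0, 2.0, 3.0])
print(summary["mean"], summary["min"], summary["max"])  # prints: 2.0 1.0 3.0
```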
class DialectDetectionInputs
method __init__
__init__(node=None)
class DialectDetectionOutputs
method __init__
__init__(node=None)
class DialectDetection
Dialect Detection is a function that identifies and classifies the specific regional or social variations of a language spoken or written by an individual, enabling the recognition of distinct linguistic patterns and nuances associated with different dialects.
InputType: audio OutputType: text
class InverseTextNormalizationInputs
method __init__
__init__(node=None)
class InverseTextNormalizationOutputs
method __init__
__init__(node=None)
class InverseTextNormalization
Inverse Text Normalization converts text in spoken form, such as spelled-out numbers, dates, and abbreviations, into its conventional written form (for example, "twenty three" becomes "23").
InputType: text OutputType: label
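Inverse text normalization is usually understood as turning spoken-form tokens into their written form. The toy sketch below works under that assumption and handles only spelled-out numbers from 0 to 99; the function name and word tables are invented for illustration, and production systems use much richer grammars or learned models.

```python
# Word tables for 0-19 and for the tens 20-90.
UNITS = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {word: 10 * value for value, word in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}


def inverse_normalize(text):
    """Replace spelled-out numbers (0-99) with digits; leave other words as-is."""
    tokens = text.split()
    output = []
    i = 0
    while i < len(tokens):
        word = tokens[i].lower()
        if word in TENS:
            value = TENS[word]
            # Merge a following unit word: "twenty three" -> 23.
            if i + 1 < len(tokens) and 1 <= UNITS.get(tokens[i + 1].lower(), 0) <= 9:
                value += UNITS[tokens[i + 1].lower()]
                i += 1
            output.append(str(value))
        elif word in UNITS:
            output.append(str(UNITS[word]))
        else:
            output.append(tokens[i])
        i += 1
    return " ".join(output)


print(inverse_normalize("meet me at twenty three main street"))
# prints: meet me at 23 main street
```

A rule this simple also rewrites number words that should stay as words (for example "one" used as a pronoun), which is one reason real systems rely on context.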
class TextToAudioInputs
method __init__
__init__(node=None)
class TextToAudioOutputs
method __init__
__init__(node=None)
class TextToAudio
The Text to Audio function converts written text into spoken words, allowing users to listen to the content instead of reading it.
InputType: text OutputType: audio
class FillTextMaskInputs
method __init__
__init__(node=None)
class FillTextMaskOutputs
method __init__
__init__(node=None)
class FillTextMask
The Fill Text Mask function takes text containing masked placeholder tokens and replaces each placeholder with a specified or contextually appropriate word or token, producing complete, coherent text.
InputType: text OutputType: text
class VideoContentModerationInputs
method __init__
__init__(node=None)
class VideoContentModerationOutputs
method __init__
__init__(node=None)
class VideoContentModeration
Video Content Moderation is the process of reviewing, analyzing, and filtering video content to ensure it adheres to community guidelines, legal standards, and platform policies, thereby preventing the dissemination of inappropriate, harmful, or illegal material.
InputType: video OutputType: label