aixplain.modules.pipeline.pipeline
ObjectDetection Objects
class ObjectDetection(AssetNode[ObjectDetectionInputs,
ObjectDetectionOutputs])
Object Detection is a computer vision technology that identifies and locates objects within an image, typically by drawing bounding boxes around the detected objects and classifying them into predefined categories.
InputType: video OutputType: text
TextEmbedding Objects
class TextEmbedding(AssetNode[TextEmbeddingInputs, TextEmbeddingOutputs])
Text embedding converts text into numerical vectors that capture the semantic meaning and contextual relationships of words or phrases, so that texts can be compared and analyzed numerically.
InputType: text OutputType: text
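As a rough illustration of why vector representations are useful, the sketch below compares embeddings with cosine similarity. It does not call the aiXplain SDK: the three vectors are made-up values, and it assumes the node's output has already been parsed into lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings, invented for this example only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Semantically related texts should score higher than unrelated ones.
related = cosine_similarity(cat, kitten)
unrelated = cosine_similarity(cat, invoice)
```

With these values, `related` is larger than `unrelated`, which is the property retrieval and clustering rely on.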
SemanticSegmentation Objects
class SemanticSegmentation(AssetNode[SemanticSegmentationInputs,
SemanticSegmentationOutputs])
Semantic segmentation is a computer vision process that classifies each pixel in an image into a predefined category, partitioning the image into meaningful segments based on the objects or regions the pixels belong to.
InputType: image OutputType: label
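The per-pixel classification described above can be sketched in plain Python. This is only an illustration of the idea, not the node's implementation: the score grid, the label names, and the `segment` helper are all hypothetical, and a real model would produce the per-class scores.

```python
def segment(scores, labels):
    """Assign each pixel the label with the highest score.

    scores[row][col] is a list with one score per class, in the same
    order as `labels`. Returns a grid of label strings.
    """
    label_map = []
    for row in scores:
        label_row = []
        for pixel_scores in row:
            if len(pixel_scores) != len(labels):
                raise ValueError("each pixel needs one score per label")
            # Index of the best-scoring class for this pixel.
            best = max(range(len(pixel_scores)), key=pixel_scores.__getitem__)
            label_row.append(labels[best])
        label_map.append(label_row)
    return label_map


# Hypothetical 2x2 image with two classes.
labels = ["background", "road"]
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
label_map = segment(scores, labels)
```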
ReferencelessAudioGenerationMetric Objects
class ReferencelessAudioGenerationMetric(
BaseMetric[ReferencelessAudioGenerationMetricInputs,
ReferencelessAudioGenerationMetricOutputs])
The Referenceless Audio Generation Metric evaluates the quality of generated audio content without requiring a reference or original audio sample for comparison.
InputType: text OutputType: text
ScriptExecution Objects
class ScriptExecution(AssetNode[ScriptExecutionInputs,
ScriptExecutionOutputs])
Script Execution runs a set of programmed instructions or code within a computing environment, automating the tasks, calculations, or operations defined by the script.
InputType: text OutputType: text
ImageImpainting Objects
class ImageImpainting(AssetNode[ImageImpaintingInputs,
ImageImpaintingOutputs])
Image inpainting fills in missing or damaged parts of an image so that the result is visually coherent and blends seamlessly with the surrounding areas, restoring the image to its original or intended appearance.
InputType: image OutputType: image
ImageEmbedding Objects
class ImageEmbedding(AssetNode[ImageEmbeddingInputs, ImageEmbeddingOutputs])
Image Embedding is a process that transforms an image into a fixed-dimensional vector representation, capturing its essential features and enabling efficient comparison, retrieval, and analysis in various machine learning and computer vision tasks.
InputType: image OutputType: text
MetricAggregation Objects
class MetricAggregation(BaseMetric[MetricAggregationInputs,
MetricAggregationOutputs])
Metric Aggregation is a function that computes and summarizes numerical data by applying statistical operations, such as averaging, summing, or finding the minimum and maximum values, to provide insights and facilitate analysis of large datasets.
InputType: text OutputType: text
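The statistical operations named above can be sketched with the standard library. The `aggregate` helper and the sample scores are assumptions made for illustration; the actual node may expose different statistics or input formats.

```python
import statistics


def aggregate(scores):
    """Summarize a non-empty list of numeric scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "mean": statistics.fmean(scores),
        "sum": sum(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Hypothetical per-sample metric scores.
summary = aggregate([0.5, 0.75, 1.0])
```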
SpeechTranslation Objects
class SpeechTranslation(AssetNode[SpeechTranslationInputs,
SpeechTranslationOutputs])
Speech Translation is a technology that translates spoken language from one language to another in real time, enabling communication between speakers of different languages.
InputType: audio OutputType: text
DepthEstimation Objects
class DepthEstimation(AssetNode[DepthEstimationInputs,
DepthEstimationOutputs])
Depth estimation is a computational process that determines the distance of objects from a viewpoint, typically using visual data from cameras or sensors to create a three-dimensional understanding of a scene.
InputType: image OutputType: text
NoiseRemoval Objects
class NoiseRemoval(AssetNode[NoiseRemovalInputs, NoiseRemovalOutputs])
Noise Removal identifies and eliminates unwanted random variations or disturbances from an audio signal to improve the clarity and quality of the underlying information.
InputType: audio OutputType: audio
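To make the idea of suppressing random variations concrete, here is a minimal moving-average filter. It is a deliberately simple smoothing technique chosen for illustration, not the method used by the models behind this node, and the sample signal is invented.

```python
def moving_average(signal, window=3):
    """Smooth a list of samples with a centered moving average.

    At the edges the window is truncated to the samples that exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# A silent signal with one spike standing in for a burst of noise.
noisy = [0.0, 0.0, 3.0, 0.0, 0.0]
smoothed = moving_average(noisy, window=3)
```

The spike is spread over its neighbors and its peak drops, which is the basic trade-off of any smoothing filter: less noise, but also less sharp detail.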
Diacritization Objects
class Diacritization(AssetNode[DiacritizationInputs, DiacritizationOutputs])
Adds diacritical marks to text, which is essential for languages where meaning can change based on diacritics.
InputType: text OutputType: text
AudioTranscriptAnalysis Objects
class AudioTranscriptAnalysis(AssetNode[AudioTranscriptAnalysisInputs,
AudioTranscriptAnalysisOutputs])
Analyzes transcribed audio data for insights, patterns, or specific information extraction.
InputType: audio OutputType: text
ExtractAudioFromVideo Objects
class ExtractAudioFromVideo(AssetNode[ExtractAudioFromVideoInputs,
ExtractAudioFromVideoOutputs])
Isolates and extracts audio tracks from video files, aiding in audio analysis or transcription tasks.
InputType: video OutputType: audio
AudioReconstruction Objects
class AudioReconstruction(BaseReconstructor[AudioReconstructionInputs,
AudioReconstructionOutputs])
Audio Reconstruction is the process of restoring or recreating audio signals from incomplete, damaged, or degraded recordings to achieve a high-quality, accurate representation of the original sound.
InputType: audio OutputType: audio
ClassificationMetric Objects
class ClassificationMetric(BaseMetric[ClassificationMetricInputs,
ClassificationMetricOutputs])
A Classification Metric is a quantitative measure used to evaluate the quality and effectiveness of classification models.
InputType: text OutputType: text
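As an example of what such a measure computes, the sketch below derives accuracy, precision, and recall for one positive class. The `classification_metrics` helper and the labels are hypothetical, and which metrics the node actually reports is not stated here.

```python
def classification_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, and recall for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Fall back to 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical gold labels and model predictions.
y_true = ["spam", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "spam"]
metrics = classification_metrics(y_true, y_pred, positive="spam")
```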