Metrics

aiXplain provides a library of metrics for machine learning tasks such as Translation, Speech Recognition, Diacritization, and Sentiment Analysis. Metrics can be used in Benchmark and in Design (via metric nodes).

The library includes reference similarity metrics, human evaluation estimation metrics, and referenceless metrics, covering many tasks and modalities. Below are some examples.

Text Generation

BLEU (Papineni et al., 2002)
WER (Woodard and Nelson)
chrF (Popović, 2015)
Comet DA (Rei et al., 2020)
Nisqa (Mittag et al., 2021)
Comet QE (Rei et al., 2021)
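
To make the idea of a reference similarity metric concrete, here is a minimal sketch of WER computed as a word-level edit distance divided by the number of reference words. It is an illustration only and not aiXplain's implementation: the function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization (casing and punctuation are compared as-is).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for explanation only; assumes whitespace tokenization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return prev_row[-1] / len(ref_words)


if __name__ == "__main__":
    # One word ("the") is missing from the hypothesis: 1 deletion over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the score can exceed 1.0 when the hypothesis is much longer than the reference. Referenceless metrics differ in that they score an output without any `reference` argument at all.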

Speech Recognition

WIL, MER (Morris et al., 2004)

Machine Translation

TER (Snover et al., 2006)
METEOR (Banerjee and Lavie, 2005)

Speech Synthesis

PESQ (Rix et al., 2001)
DNSMOS (Reddy et al., 2021)