---
title: README
emoji: 🤗
colorFrom: green
colorTo: purple
sdk: static
pinned: false
tags:
- evaluate
- measurement
---
🤗 Evaluate provides access to a wide range of evaluation tools. It covers modalities such as text, computer vision, and audio, and includes tools to evaluate both models and datasets.
It has three types of evaluations:
- **Metric**: measures the performance of a model on a given dataset, usually by comparing the model's predictions to some ground truth labels.
- **Comparison**: compares the performance of two or more models on a single test dataset, e.g. by comparing their predictions to ground truth labels and computing their agreement.
- **Measurement**: provides insight into datasets and model predictions based on their properties and characteristics.
All three types of evaluation supported by the 🤗 Evaluate library are meant to complement one another and to help our community carry out more mindful and responsible evaluation!
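
As a rough illustration of how the three evaluation types differ, here is a minimal pure-Python sketch. It does not use the library's own API: the function names (`accuracy`, `agreement`, `average_word_count`) and the toy data are hypothetical and chosen only to show one simple example of a metric, a comparison, and a measurement.

```python
def accuracy(predictions, references):
    """Metric: fraction of predictions that match the ground truth labels."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def agreement(predictions_a, predictions_b):
    """Comparison: fraction of examples on which two models predict the same label."""
    if not predictions_a or len(predictions_a) != len(predictions_b):
        raise ValueError("both prediction lists must be non-empty and equal in length")
    return sum(a == b for a, b in zip(predictions_a, predictions_b)) / len(predictions_a)


def average_word_count(texts):
    """Measurement: average number of whitespace-separated words per text."""
    if not texts:
        raise ValueError("texts must be non-empty")
    return sum(len(text.split()) for text in texts) / len(texts)


# Toy data (hypothetical): labels, two models' predictions, and a tiny text dataset.
references = [0, 1, 1, 0]
model_a = [0, 1, 0, 0]
model_b = [0, 1, 1, 1]
texts = ["a short sentence", "another slightly longer sentence"]

print("accuracy (model A):", accuracy(model_a, references))
print("accuracy (model B):", accuracy(model_b, references))
print("agreement (A vs B):", agreement(model_a, model_b))
print("average word count:", average_word_count(texts))
```

In this toy data both models reach the same accuracy while agreeing with each other on only half of the examples, which is the kind of difference a comparison can surface and a single metric cannot.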