metadata
language: en
pipeline_tag: zero-shot-classification
tags:
- distilbert
datasets:
- multi_nli
metrics:
- accuracy
DistilBERT base model (uncased)
This is the uncased DistilBERT model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset for the zero-shot classification task. The model is not case-sensitive, i.e., it does not distinguish between "english" and "English".
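Zero-shot classification with an NLI model works by turning each candidate label into a hypothesis sentence and scoring how strongly the input text entails it. The sketch below illustrates this with the `transformers` zero-shot pipeline. `build_hypotheses` and `classify` are hypothetical helper names, and `model_path` is an assumption: pass whatever local directory or hub id holds this checkpoint.

```python
from typing import Dict, List

# Default hypothesis template of the transformers zero-shot pipeline.
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_hypotheses(labels: List[str], template: str = HYPOTHESIS_TEMPLATE) -> List[str]:
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def classify(text: str, labels: List[str], model_path: str) -> Dict[str, float]:
    """Score `text` against `labels` with an MNLI fine-tuned checkpoint.

    `model_path` is an assumption: the location of this model's weights.
    """
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_path)
    result = classifier(
        text,
        candidate_labels=labels,
        hypothesis_template=HYPOTHESIS_TEMPLATE,
    )
    # The pipeline returns labels sorted by descending score.
    return dict(zip(result["labels"], result["scores"]))
```

For example, `classify("The team won the championship last night.", ["sports", "politics"], "/tmp/distilbert-base-uncased_mnli/")` would return a score per label, assuming the checkpoint was saved to the output directory used in the training command.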
Training
Training was done on an AWS EC2 p3.2xlarge instance (1 NVIDIA Tesla V100 GPU) with the following hyperparameters:
$ python run_glue.py \
--model_name_or_path distilbert-base-uncased \
--task_name mnli \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 2e-5 \
--num_train_epochs 5 \
--output_dir /tmp/distilbert-base-uncased_mnli/
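To get a feel for the length of this run, the number of optimizer steps can be estimated from the batch size and epoch count above. This is a rough sketch under stated assumptions: a single GPU (as described), no gradient accumulation, the last partial batch kept, and the MNLI training split size of 392,702 examples, which is not stated in this card. `total_training_steps` is a hypothetical helper name.

```python
import math

# Assumption: size of the MNLI training split (not stated in this card).
MNLI_TRAIN_EXAMPLES = 392_702


def total_training_steps(
    num_examples: int,
    per_device_batch_size: int,
    num_epochs: int,
    num_devices: int = 1,
    gradient_accumulation_steps: int = 1,
) -> int:
    """Estimate optimizer steps, keeping the last partial batch of each epoch."""
    examples_per_step = per_device_batch_size * num_devices * gradient_accumulation_steps
    steps_per_epoch = math.ceil(num_examples / examples_per_step)
    return steps_per_epoch * num_epochs


# Batch size 16 and 5 epochs, as in the command above.
steps = total_training_steps(MNLI_TRAIN_EXAMPLES, per_device_batch_size=16, num_epochs=5)
print(steps)
```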
Evaluation results
Task | MNLI | MNLI-mm |
---|---|---|
Accuracy | 82.0 | 82.0 |