Pengcheng He committed on
Commit
26cca7c
1 Parent(s): d3a3d9c

Create deberta base mnli fine-tuned model

Files changed (5)
  1. README.md +37 -0
  2. bpe_encoder.bin +3 -0
  3. config.json +18 -0
  4. pytorch_model.bin +3 -0
  5. tokenizer_config.json +3 -0
README.md ADDED
@@ -0,0 +1,37 @@
+ ---
+ thumbnail: https://huggingface.co/front/thumbnails/microsoft.png
+ license: mit
+ ---
+
+ ## DeBERTa: Decoding-enhanced BERT with Disentangled Attention
+
+ [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. With those two improvements, DeBERTa outperforms RoBERTa on a majority of NLU tasks using 80GB of training data.
+
+ Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
+
+ This model is the base DeBERTa model fine-tuned on the MNLI task.
+
+ #### Fine-tuning on NLU tasks
+
+ We present the dev results on the SQuAD 1.1/2.0 and MNLI tasks.
+
+ | Model             | SQuAD 1.1 | SQuAD 2.0 | MNLI-m |
+ |-------------------|-----------|-----------|--------|
+ | RoBERTa-base      | 91.5/84.6 | 83.7/80.5 | 87.6   |
+ | XLNet-Large       | -/-       | -/80.2    | 86.8   |
+ | **DeBERTa-base**  | 93.1/87.2 | 86.2/83.1 | 88.8   |
+
+ ### Citation
+
+ If you find DeBERTa useful for your work, please cite the following paper:
+
+ ```latex
+ @misc{he2020deberta,
+     title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
+     author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
+     year={2020},
+     eprint={2006.03654},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
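For reference, here is a minimal sketch of querying the fine-tuned checkpoint with the `transformers` library. The Hub id `microsoft/deberta-base-mnli` and the example sentence pair are assumptions; the label order is read from `model.config.id2label` rather than hardcoded, since it varies between MNLI checkpoints.

```python
# Minimal sketch: score an NLI (premise, hypothesis) pair with this checkpoint.
# The Hub id below is an assumption; adjust it if the model is hosted elsewhere.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "microsoft/deberta-base-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "A man is playing a guitar on stage."
hypothesis = "Someone is performing music."

# MNLI models take the premise and hypothesis as a single joint input.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Read the label mapping from the config instead of hardcoding it,
# since label order differs between MNLI checkpoints.
probs = logits.softmax(dim=-1).squeeze(0)
for idx, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {p:.3f}")
```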
bpe_encoder.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e7c6f9eecb461c01e09c00656ccf3e27944b9e74bfe29e51632b13d3cd9d6c8e
+ size 3917897
config.json ADDED
@@ -0,0 +1,18 @@
+ {
+   "attention_probs_dropout_prob": 0.1,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "max_position_embeddings": 512,
+   "relative_attention": true,
+   "pos_att_type": "c2p|p2c",
+   "layer_norm_eps": 1e-7,
+   "max_relative_positions": -1,
+   "position_biased_input": false,
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "type_vocab_size": 0,
+   "vocab_size": 50265
+ }
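A short sketch of what these settings mean when loaded through `transformers` (assuming a local copy of this `config.json`): `relative_attention` together with `pos_att_type` of `c2p|p2c` enables the content-to-position and position-to-content disentangled-attention terms, and `position_biased_input: false` keeps absolute positions out of the input embeddings.

```python
# Sketch: inspect the disentangled-attention settings from this commit's
# config.json (assumes the file has been downloaded locally).
from transformers import DebertaConfig

config = DebertaConfig.from_json_file("config.json")

print(config.relative_attention)     # True -> disentangled (relative) attention is on
print(config.pos_att_type)           # recent transformers versions split the
                                     # "|"-separated string into ["c2p", "p2c"]
print(config.position_biased_input)  # False -> no absolute positions in the input
```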
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9f82b3a368bf628768567c1ce03b354eea1299b87c42ded71a56e4cdd3ea0039
+ size 556811547
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "do_lower_case": false
+ }
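The only tokenizer setting pinned here is case sensitivity: `do_lower_case: false` means input text is not lowercased before BPE tokenization. A quick sketch of the effect, again assuming the `microsoft/deberta-base-mnli` Hub id:

```python
# Sketch: the tokenizer preserves case, so "DeBERTa" is not folded to "deberta".
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/deberta-base-mnli")  # assumed Hub id
print(tok.tokenize("DeBERTa"))  # case-sensitive BPE pieces, not lowercased
```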