jupyterjazz committed on
Commit
e3fca02
1 Parent(s): 3b9e730

Signed-off-by: jupyterjazz <[email protected]>

Files changed (1)
  1. README.md +9 -4
README.md CHANGED
@@ -1,7 +1,12 @@
  Core implementation of Jina XLM-RoBERTa
 
- # Converting Weights
+ This implementation is adapted from [XLM-Roberta](https://huggingface.co/docs/transformers/en/model_doc/xlm-roberta). In contrast to the original implementation, this model uses Rotary positional encodings and supports flash-attention 2.
 
- ```
- python3 -m "xlm-roberta-flash-implementation".convert_roberta_weights_to_flash --output pytorch_model_xlmr_flash.bin
- ```
+ ### Models that use this implementation
+
+ to be added soon
+
+
+ ### Converting weights
+
+ Weights from an [original XLMRoberta model](https://huggingface.co/FacebookAI/xlm-roberta-large) can be converted using the `convert_roberta_weights_to_flash.py` script in the model repository.
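
Reviewer note: the new README text says the model uses rotary positional encodings. For readers unfamiliar with the idea, the following is a minimal pure-Python sketch of the standard rotary formulation, not code taken from this repository. The function names `rotary_embed` and `dot`, the interleaved pairing of dimensions, and the default `base=10000.0` are assumptions made for illustration; the repository's own implementation may pair dimensions differently or rely on fused kernels.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by a position-dependent angle.

    Hypothetical helper for illustration only; assumes interleaved pairing.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        # The pair starting at element i uses frequency base^(-i/dim),
        # so later pairs rotate more slowly as position grows.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Usage: attention scores between rotated queries and keys depend only on
# the offset between their positions, not on the absolute positions.
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
score_near_start = dot(rotary_embed(q, 5), rotary_embed(k, 3))
score_shifted = dot(rotary_embed(q, 12), rotary_embed(k, 10))
```

Because each pair is rotated rather than shifted, vector norms are preserved, and the two scores above agree up to floating-point error since both query/key pairs are two positions apart.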