napolitan committed
Commit 7957d35
1 Parent(s): 424c359

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +73 -0
README.md ADDED

# LS-LLaMA: Label Supervised LLaMA Finetuning

<h2>📢 For convenience, we have built a bidirectional LLM toolkit, <a href='https://github.com/WhereIsAI/BiLLM'>BiLLM</a>, for language understanding. You are welcome to use it.</h2>

<p align="center">

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/label-supervised-llama-finetuning/named-entity-recognition-on-conll03-4)](https://paperswithcode.com/sota/named-entity-recognition-on-conll03-4?p=label-supervised-llama-finetuning)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/label-supervised-llama-finetuning/named-entity-recognition-on-ontonotes-5-0-1)](https://paperswithcode.com/sota/named-entity-recognition-on-ontonotes-5-0-1?p=label-supervised-llama-finetuning)
</p>

<p align='center'>
<img src='./docs/lsllama.png'/>
</p>

## Usage

Our implementation currently supports the following sequence classification benchmarks:
1. SST2 (2 classes) / SST5 (5 classes)
2. AGNews (4 classes)
3. Twitter Financial News Sentiment (twitterfin, 3 classes)

as well as token classification benchmarks for named entity recognition (NER): CoNLL2003 and OntoNotesV5.
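
For reference, these benchmarks are publicly available on the Hugging Face Hub. Below is a minimal loading sketch; the identifiers `ag_news` and `conll2003` are standard Hub datasets and are used here as assumptions for illustration, not necessarily the exact sources used by the training scripts:
```python
from datasets import load_dataset

# AGNews: 4-class topic classification (sequence-level labels).
agnews = load_dataset('ag_news')
print(agnews['train'][0])  # {'text': ..., 'label': ...}

# CoNLL2003: token-level NER with integer-encoded BIO tags.
conll03 = load_dataset('conll2003')
sample = conll03['train'][0]
print(sample['tokens'][:5], sample['ner_tags'][:5])
```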

Commands for training LS-LLaMA and LS-unLLaMA on different tasks follow the template below:
```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python file_name.py dataset_name model_size
```

`file_name.py` can be one of `unllama_seq_clf.py`, `unllama_token_clf.py`, `llama_seq_clf.py`, and `llama_token_clf.py`, for training LS-LLaMA and LS-unLLaMA on sequence- and token-level classification.

`dataset_name` can be one of `sst2`, `sst5`, `agnews`, `twitterfin`, `conll03`, and `ontonotesv5`.

`model_size` can be `7b` or `13b`, corresponding to LLaMA-2-7B and LLaMA-2-13B.

For example, the following command will train LS-unLLaMA based on LLaMA-2-7B on AGNews for sequence classification:
```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python unllama_seq_clf.py agnews 7b
```
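
Similarly, following the same template, the command below trains LS-LLaMA based on LLaMA-2-13B on CoNLL2003 for token classification:
```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python llama_token_clf.py conll03 13b
```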

## Implementations

### Load Pretrained Models

```python
from transformers import AutoTokenizer
from modeling_llama import (
    LlamaForSequenceClassification, LlamaForTokenClassification,
    UnmaskingLlamaForSequenceClassification, UnmaskingLlamaForTokenClassification,
)

model_id = 'meta-llama/Llama-2-7b'
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Pick the head that matches your task; each line below loads one variant.
model = LlamaForSequenceClassification.from_pretrained(model_id).bfloat16()            # LS-LLaMA, sequence-level
model = LlamaForTokenClassification.from_pretrained(model_id).bfloat16()               # LS-LLaMA, token-level
model = UnmaskingLlamaForSequenceClassification.from_pretrained(model_id).bfloat16()   # LS-unLLaMA, sequence-level
model = UnmaskingLlamaForTokenClassification.from_pretrained(model_id).bfloat16()      # LS-unLLaMA, token-level
```
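
As a rough illustration of using a loaded model, the sketch below runs a batched forward pass with the LS-unLLaMA sequence classification head. The `num_labels=4` value (AGNews), the EOS-as-pad workaround, and the example sentences are assumptions for illustration, not settings taken from the training scripts:

```python
import torch
from transformers import AutoTokenizer
from modeling_llama import UnmaskingLlamaForSequenceClassification

model_id = 'meta-llama/Llama-2-7b'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# LLaMA tokenizers ship without a pad token; reuse EOS so batched inputs can be padded (assumption).
tokenizer.pad_token = tokenizer.eos_token

# num_labels=4 matches AGNews; change it for other datasets.
model = UnmaskingLlamaForSequenceClassification.from_pretrained(model_id, num_labels=4).bfloat16()
model.config.pad_token_id = tokenizer.pad_token_id
model.eval()

texts = [
    "Stocks rallied after the quarterly earnings report.",
    "The home team clinched the championship in overtime.",
]
inputs = tokenizer(texts, return_tensors='pt', padding=True)
with torch.no_grad():
    logits = model(**inputs).logits   # shape: (batch_size, num_labels)
print(logits.argmax(dim=-1).tolist())  # predicted class indices
```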

For more usage, please refer to `unllama_seq_clf.py`, `unllama_token_clf.py`, `llama_seq_clf.py`, and `llama_token_clf.py`.

## Citation

```bibtex
@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}
```