sylwia-kuros committed
Commit 6d32b9d
1 Parent(s): dc44eb4

Update README.md

Files changed (1)
  1. README.md +105 -26

README.md CHANGED

This commit adds `license: apache-2.0` to the YAML front matter and replaces the auto-generated model card, titled "Joint magnitude pruning, quantization and distillation on BERT-base/SST-2", with the expanded card below; the old card's setup commands, training script link, run command, and framework versions are carried over into the new card's sections.

# bert-base-uncased-sst2-unstructured80-int8-ov

* Model creator: [Google](https://huggingface.co/google-bert)
* Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)

## Description

This model applies unstructured magnitude pruning, quantization, and distillation jointly to [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) while fine-tuning on the GLUE SST-2 dataset.
It achieves the following results on the evaluation set:

- Torch accuracy: **0.9128**
- OpenVINO IR accuracy: **0.9128**
- Sparsity in transformer block linear layers: **0.80**

The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
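
A minimal inference sketch with `optimum-intel` is shown below. The repository id is an assumption based on this card's title; substitute the actual id when loading the model.

```python
# Run SST-2 sentiment classification on the INT8 OpenVINO IR via optimum-intel.
# The model id below is hypothetical; replace it with the real repository id.
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov"  # assumed id
model = OVModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("This movie was surprisingly good!"))
```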

## Optimization Parameters

Optimization was performed using `nncf` with the following `nncf_config.json` file:

```json
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
```

For more information on optimization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).
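
For reference, a configuration like this is consumed through NNCF's PyTorch API. The sketch below is illustrative only: the data loader is a stand-in for the tokenized SST-2 training set, and NNCF also needs the model's input metadata (`input_info`), which the real training script supplies before wrapping.

```python
# Illustrative only: applying an nncf_config.json like the one above with NNCF.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Stand-in for the tokenized SST-2 training set used during real training.
batch = dict(tokenizer("a placeholder sentence", return_tensors="pt"), labels=torch.tensor([1]))
train_loader = DataLoader([batch], batch_size=None)

nncf_config = NNCFConfig.from_json("nncf_config.json")
# Register samples for quantizer range initialization ("num_init_samples": 300).
nncf_config = register_default_init_args(nncf_config, train_loader)

# Wrap BERT with fake-quantize ops and magnitude-sparsity masks; the returned
# controller steps the polynomial sparsity schedule once per optimizer step.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
```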

## Compatibility

The provided OpenVINO™ IR model is compatible with:

* Transformers 4.26.0
* PyTorch 1.13.1+cu116
* Datasets 2.8.0
* Tokenizers 0.13.2
* Optimum 1.6.3
* Optimum-intel 1.7.0
* NNCF 2.4.0
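
If exact reproduction matters, the versions above can be pinned at install time; a sketch (the CUDA build of PyTorch comes from its own channel, as in the install step below, and is omitted here):

```bash
pip install transformers==4.26.0 datasets==2.8.0 tokenizers==0.13.2 \
    optimum==1.6.3 optimum-intel==1.7.0 nncf==2.4.0
```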

## Running Model Training

1. Install the required packages:

```bash
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install optimum[openvino,nncf]
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb  # optional
```

2. Run model training:

```bash
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
  --lr_scheduler_type cosine_with_restarts \
  --cosine_lr_scheduler_cycles 11 6 \
  ...
  --seed 1
```

The diff elides most of the command's arguments; for the full command and more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).
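
Once training finishes, the sparsity figure reported above can be sanity-checked by counting zero weights in the transformer block linear layers of the fine-tuned PyTorch checkpoint. A hedged sketch follows; the checkpoint path is a placeholder, and the scope filter mirrors the `ignored_scopes` of the `magnitude_sparsity` section above.

```python
# Hypothetical verification of the reported 0.80 sparsity: count zero weights
# in the encoder's linear layers (embeddings, LayerNorm, pooler and classifier
# are excluded, matching the sparsity algorithm's ignored scopes).
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("/path/to/finetuned-checkpoint")
zeros = total = 0
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear) and "encoder.layer" in name:
        zeros += int((module.weight == 0).sum())
        total += module.weight.numel()
print(f"Sparsity in transformer block linear layers: {zeros / total:.2f}")  # expect ~0.80
```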

## Usage examples

* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
  - [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)
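
The speedup described in that notebook comes from hinting the CPU plugin to exploit the model's weight sparsity. A hedged sketch, assuming the runtime property name used in the notebook (`CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE`) and an illustrative repository id:

```python
# Enable sparse-weight decompression on 4th Gen Intel Xeon CPUs.
# Property name and value follow the linked notebook; the model id is assumed.
from optimum.intel import OVModelForSequenceClassification

model = OVModelForSequenceClassification.from_pretrained(
    "OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov",  # hypothetical id
    ov_config={"CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE": "0.8"},
)
```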

## Limitations

Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).

## Legal information

The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.

## Disclaimer

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.