Commit 7a57de2 by yujiepan
1 parent: e93b11f

upload readme

Files changed (1)
  1. README.md +103 -0
README.md ADDED
@@ -0,0 +1,103 @@
+ ---
+ license: apache-2.0
+ tags:
+ - image-classification
+ - vision
+ - generated_from_trainer
+ datasets:
+ - food101
+ metrics:
+ - accuracy
+ model-index:
+ - name: swin-food101-jpqd
+   results:
+   - task:
+       name: Image Classification
+       type: image-classification
+     dataset:
+       name: food101
+       type: food101
+       config: default
+       split: validation
+       args: default
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.9055049504950495
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # swin-food101-jpqd
+
+ This model is a fine-tuned version of [microsoft/swin-base-patch4-window7-224](https://huggingface.co/microsoft/swin-base-patch4-window7-224) on the food101 dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.3497
+ - Accuracy: 0.9055
+
+ This model is quantized. Structured sparsity in transformer linear layers: 40%.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 128
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 10.0
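
The hyperparameters listed above are related to each other: a per-device batch of 16 with 4 gradient accumulation steps gives the reported total batch of 64 only if a single device was used, which is an inference and not something the card states. The sketch below reproduces that arithmetic and shows one common form of a linear schedule with a 0.1 warmup ratio. The function names and `HYPOTHETICAL_TOTAL_STEPS` are illustrative assumptions, not taken from the training script.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_schedule_lr(step: int, total_steps: int, base_lr: float = 5e-5,
                       warmup_ratio: float = 0.1) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 16 samples per device x 4 accumulation steps, assuming one device.
TOTAL_BATCH = effective_batch_size(16, 4)

# Placeholder run length for illustration only; not the length of the actual run.
HYPOTHETICAL_TOTAL_STEPS = 1000
PEAK_LR = linear_schedule_lr(100, HYPOTHETICAL_TOTAL_STEPS)
```

With these values the learning rate peaks at the end of warmup (10% of the run) and then falls linearly to zero at the last step.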
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:-----:|:-----:|:---------------:|:--------:|
+ | 2.2676 | 0.42 | 500 | 2.1087 | 0.7947 |
+ | 0.6823 | 0.84 | 1000 | 0.5127 | 0.8818 |
+ | 0.816 | 1.27 | 1500 | 0.3944 | 0.8954 |
+ | 0.5272 | 1.69 | 2000 | 0.3310 | 0.9050 |
+ | 12.263 | 2.11 | 2500 | 12.0040 | 0.9057 |
+ | 48.9519 | 2.54 | 3000 | 48.4500 | 0.8597 |
+ | 75.576 | 2.96 | 3500 | 75.5765 | 0.6951 |
+ | 93.7523 | 3.38 | 4000 | 93.3753 | 0.5992 |
+ | 103.7155 | 3.8 | 4500 | 103.5301 | 0.5622 |
+ | 107.7993 | 4.23 | 5000 | 108.0881 | 0.5636 |
+ | 109.6831 | 4.65 | 5500 | 109.2205 | 0.5844 |
+ | 1.8848 | 5.07 | 6000 | 0.9807 | 0.8315 |
+ | 1.0668 | 5.49 | 6500 | 0.6050 | 0.8740 |
+ | 0.7951 | 5.92 | 7000 | 0.5151 | 0.8838 |
+ | 0.7402 | 6.34 | 7500 | 0.4843 | 0.8906 |
+ | 0.7319 | 6.76 | 8000 | 0.4494 | 0.8933 |
+ | 0.5683 | 7.19 | 8500 | 0.4378 | 0.8953 |
+ | 0.496 | 7.61 | 9000 | 0.4115 | 0.8981 |
+ | 0.6174 | 8.03 | 9500 | 0.3952 | 0.9005 |
+ | 0.4921 | 8.45 | 10000 | 0.3765 | 0.9026 |
+ | 0.5843 | 8.88 | 10500 | 0.3678 | 0.9035 |
+ | 0.5485 | 9.3 | 11000 | 0.3576 | 0.9039 |
+ | 0.4337 | 9.72 | 11500 | 0.3512 | 0.9057 |
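
The table above can be read programmatically for two quick checks: estimating how many optimizer steps make up one epoch, and locating the stretch where the reported loss is far above its usual range. The sketch below uses a handful of rows copied from the table. The helper names and the loss threshold of 10.0 are arbitrary choices for illustration, and the card itself does not explain why the loss rises between steps 2500 and 5500.

```python
# (epoch, step, validation_loss, accuracy) copied from selected rows of the table.
ROWS = [
    (0.42, 500, 2.1087, 0.7947),
    (1.69, 2000, 0.3310, 0.9050),
    (2.11, 2500, 12.0040, 0.9057),
    (4.65, 5500, 109.2205, 0.5844),
    (5.07, 6000, 0.9807, 0.8315),
    (9.72, 11500, 0.3512, 0.9057),
]


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged row.

    The Epoch column is rounded to two decimals, so this is approximate.
    """
    epoch, step, _, _ = rows[-1]
    return step / epoch


def steps_above_loss(rows, threshold):
    """Return the steps whose validation loss exceeds the threshold."""
    return [step for _, step, val_loss, _ in rows if val_loss > threshold]


EST_STEPS_PER_EPOCH = steps_per_epoch(ROWS)
# num_epochs is 10.0 in the hyperparameter list.
EST_TOTAL_STEPS = round(EST_STEPS_PER_EPOCH * 10)
# The threshold is an arbitrary illustrative cut-off, not a value from the card.
HIGH_LOSS_STEPS = steps_above_loss(ROWS, threshold=10.0)
```

The estimate comes out near 1,183 steps per epoch, so the final logged row at step 11500 falls shortly before the end of the tenth epoch.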
+
+
+ ### Framework versions
+
+ - Transformers 4.26.0
+ - PyTorch 1.13.1+cu116
+ - Datasets 2.8.0
+ - Tokenizers 0.13.2