---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---
# Base model: [westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->
This model predicts protein stability changes (ΔΔG) for mutant amino acid sequences.

### Task type
Protein-level regression

### Dataset description
The dataset is from [Mega-scale experimental analysis of protein folding stability in biology and design](https://www.nature.com/articles/s41586-023-06328-6).
We collected all protein sequences that have a ΔΔG value.

The label is the ΔΔG value (kcal/mol). A positive value means stable and a negative value means unstable. The label is unbounded, ranging from negative infinity to positive infinity.

### Model input type
Amino acid sequence

### Performance
test_loss: 0.18

test_spearman: 0.92
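
As a rough illustration of the metric above, the following sketch computes a Spearman rank correlation in plain Python. It assumes `test_spearman` is the Spearman correlation between predicted and experimental ΔΔG values on the test split; the function names and the example numbers are made up for illustration and are not taken from the dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(predicted, measured):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical ΔΔG values (kcal/mol), for illustration only.
predicted = [-1.2, 0.3, -0.5, 1.1]
measured = [-1.0, 0.1, -0.7, 0.9]
rho = spearman(predicted, measured)
```

Because the metric depends only on rank order, a model can score well on it while still being offset in absolute ΔΔG, which is why the loss is reported alongside it.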

### LoRA config
lora_dropout: 0.0

lora_alpha: 16

target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]

modules_to_save: ["classifier"]
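
The values above map onto the keyword arguments of `peft.LoraConfig`. The sketch below is one way to express them, not the authors' training script. The card does not list the LoRA rank `r`, so it is left at the library default here, which is an assumption. The import is guarded so the snippet still runs where `peft` is not installed.

```python
# LoRA settings copied from this model card.
lora_kwargs = {
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}

try:
    from peft import LoraConfig

    # Assumption: the rank `r` is not stated in the card, so the peft default is used.
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    # peft is not available; keep only the plain dictionary.
    lora_config = None
```

Listing `classifier` under `modules_to_save` means the regression head is trained and stored in full alongside the adapter weights, rather than being adapted through low-rank matrices.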

### Training config
optimizer class: AdamW

betas: (0.9, 0.98)

weight_decay: 0.01

learning rate: 1e-4

epoch: 20

batch size: 64

precision: 16-mixed
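
To show what the optimizer settings above control, here is a minimal pure-Python sketch of a single AdamW update on one scalar parameter, using the listed learning rate, betas, and weight decay. The `eps` value is an assumption (the card does not state it), and `adamw_step` is a hypothetical helper written for illustration; actual training would use a framework optimizer.

```python
import math

# Hyperparameters from the training config in this card.
LR = 1e-4
BETAS = (0.9, 0.98)
WEIGHT_DECAY = 0.01
EPS = 1e-8  # Assumption: not listed in the card.


def adamw_step(param, grad, m, v, step):
    """Apply one AdamW update to a scalar and return (param, m, v)."""
    beta1, beta2 = BETAS
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param = param - LR * WEIGHT_DECAY * param
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    param = param - LR * m_hat / (math.sqrt(v_hat) + EPS)
    return param, m, v


# One step from param = 1.0 with gradient 1.0 (illustrative values).
new_param, m, v = adamw_step(param=1.0, grad=1.0, m=0.0, v=0.0, step=1)
```

On the first step the bias-corrected ratio is close to 1, so the parameter moves by roughly the learning rate plus the small weight-decay shrinkage.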