---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---
# Base model: [westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)
# Model Card for Model ID
This model is a LoRA adapter (PEFT) for SaProt_35M_AF2 that predicts the fitness of GB1 protein variants.
### Task type
Protein-level regression
### Dataset description
The dataset is from:
Nicholas C Wu, Lei Dai, C Anders Olson, James O Lloyd-Smith, Ren Sun (2016). Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 5:e16965.
https://doi.org/10.7554/eLife.16965
The label is the fitness of the mutant protein. The fitness of each variant is expressed relative to the wild type, whose fitness is defined as 1.
All labels are therefore greater than 0, and a label greater than 1 indicates higher fitness than the wild type.
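As a minimal sketch of how these labels read, the hypothetical helper below (the name `describe_fitness` is illustrative and not part of the dataset or model code) compares a label against the wild-type reference of 1:

```python
WILDTYPE_FITNESS = 1.0  # labels are relative to the wild type, which is 1 by definition


def describe_fitness(label: float) -> str:
    """Describe a relative fitness label against the wild-type reference."""
    if label <= 0:
        # The card states that all labels are greater than 0.
        raise ValueError("fitness labels are expected to be greater than 0")
    if label > WILDTYPE_FITNESS:
        return "fitter than wild type"
    if label < WILDTYPE_FITNESS:
        return "less fit than wild type"
    return "equal to wild type"
```

For example, `describe_fitness(2.5)` reports a variant that is fitter than the wild type, while `describe_fitness(0.2)` reports one that is less fit.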
### Model input type
Amino acid sequence
### Performance
test_spearman: 0.54
test_pearson: 0.98
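
The two metrics measure different things: Pearson correlation is computed on the raw values, while Spearman correlation is the Pearson correlation of their ranks. The sketch below implements both in plain Python and applies them to toy numbers (invented for illustration, not taken from the dataset) in which one large value dominates Pearson while the ordering of the small values is wrong. It shows one way the two scores can diverge in general; it is not a diagnosis of this model's results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: four low values in reversed order plus one high value.
toy_true = [0.1, 0.2, 0.3, 0.4, 10.0]
toy_pred = [0.4, 0.3, 0.2, 0.1, 9.0]
print("pearson :", round(pearson(toy_true, toy_pred), 3))
print("spearman:", round(spearman(toy_true, toy_pred), 3))
```

On this toy input the Pearson score is close to 1 because the single large value is matched well, whereas the Spearman score is 0 because the ranks of the remaining values are reversed.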
### LoRA config
lora_dropout: 0.0
lora_alpha: 16
target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]
modules_to_save: ["classifier"]
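
These values map onto the arguments of `peft.LoraConfig`. The sketch below is one way to express them, not the authors' training code. The card does not state the LoRA rank, so the hypothetical helper `build_lora_config` takes `r` as a required argument instead of guessing it.

```python
# Values copied from the "LoRA config" section of this card.
LORA_KWARGS = {
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def build_lora_config(r: int):
    """Build a peft.LoraConfig from the card's values.

    `r` (the LoRA rank) is NOT given in the card, so the caller must supply it.
    """
    # Imported lazily so LORA_KWARGS can be inspected without peft installed.
    from peft import LoraConfig

    return LoraConfig(r=r, **LORA_KWARGS)
```

Listing `classifier` under `modules_to_save` keeps the regression head fully trainable and stores it alongside the adapter weights, while the listed attention and feed-forward projections receive the low-rank updates.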
### Training config
class: AdamW
betas: (0.9, 0.98)
weight_decay: 0.01
learning rate: 1e-3
epoch: 20
batch size: 1000
precision: 16-mixed
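
As a hedged sketch, the optimizer settings above correspond to the following `torch.optim.AdamW` call. The helper `build_optimizer` and the constant names are illustrative assumptions, not taken from the authors' training script.

```python
# Values copied from the "Training config" section of this card.
OPTIMIZER_KWARGS = {
    "lr": 1e-3,
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
}
MAX_EPOCHS = 20
BATCH_SIZE = 1000
PRECISION = "16-mixed"


def build_optimizer(params):
    """Create an AdamW optimizer over `params` with the card's hyperparameters."""
    # Imported lazily so the constants can be inspected without torch installed.
    import torch

    return torch.optim.AdamW(params, **OPTIMIZER_KWARGS)
```

The string `16-mixed` matches the format accepted by the `precision` argument of PyTorch Lightning's `Trainer`. The card does not name the training framework, so treating it as a Lightning setting is an assumption.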