joethequant committed 7a46da1 (parent: dfaaf0f): Update README.md (added additional information)

README.md CHANGED

## How to Use

Instructions on how to use the model, including example prompts and API documentation, are available in the [Code Repository](https://github.com/joethequant/docker_streamlit_antibody_protein_generation).
### Example Code

```python
from models.progen.modeling_progen import ProGenForCausalLM
import torch
from tokenizers import Tokenizer

# Define the model identifier from Hugging Face's model hub
model_path = 'AntibodyGeneration/fine-tuned-progen2-small'

# Load the model and tokenizer
model = ProGenForCausalLM.from_pretrained(model_path)
tokenizer = Tokenizer.from_file('tokenizer.json')

# Define the target sequence and other parameters
target_sequence = 'MQIPQAPWPVVWAVLQLGWRPGWFLDSPDRPWNPPTFSPALLVVTEGDNATFTCSFSNTSESFVLNWYRMSPSNQTDKLAAFPEDRSQPGQDCRFRVTQLPNGRDFHMSVVRARRNDSGTYLCGAISLAPKAQIKESLRAELRVTERRAEVPTAHPSPSPRPAGQFQTLVVGVVGGLLGSLVLLVWVLAVICSRAARGTIGARRTGQPLKEDPSAVPVFSVDYGELDFQWREKTPEPPVPCVPEQTEYATIVFPSGMGTSSPARRGSADGPRSAQPLRPEDGHCSWPL'
number_of_sequences = 2

# Tokenize the sequence (the `tokenizers` Tokenizer returns an Encoding, so take its ids)
input_ids = torch.tensor([tokenizer.encode(target_sequence).ids])

# Move the model and tensors to CUDA if available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
input_ids = input_ids.to(device)

# Generate sequences (pad_token_id=0 is assumed for ProGen2; adjust if the model config differs)
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=1024,
        pad_token_id=0,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
        num_return_sequences=number_of_sequences,
    )

# Decode the output to get the generated sequences
generated_sequences = [tokenizer.decode(seq.tolist(), skip_special_tokens=True) for seq in output]
```
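The decoded list can then be inspected or saved. As a minimal follow-up sketch (not part of the repository's documented example), the snippet below prints the sequences and writes them to a FASTA file for downstream tools such as ANARCI or ESM Fold; the output file name and record IDs are illustrative.

```python
# Illustrative follow-up: print the generated sequences and write them to a FASTA
# file so they can be passed to downstream tools (e.g. ANARCI numbering, ESM Fold).
with open('generated_antibodies.fasta', 'w') as fasta_file:
    for i, sequence in enumerate(generated_sequences):
        print(f'Generated sequence {i}: {sequence}')
        fasta_file.write(f'>generated_{i}\n{sequence}\n')
```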

## Links

- [Hugging Face Model Repository](https://huggingface.co/AntibodyGeneration)
- [Web Demo](https://orca-app-ygzbp.ondigitalocean.app/Demo_Antibody_Generator)
- [Open-source RunPod Serverless REST API](https://github.com/joethequant/docker_protein_generator)
- [The Code for this App](https://github.com/joethequant/docker_streamlit_antibody_protein_generation)
## Additional Resources and Links

- [ProGen Foundation Models](https://github.com/salesforce/progen)
- [ANARCI GitHub](https://github.com/oxpig/ANARCI)
- [ANARCI Webserver](http://opig.stats.ox.ac.uk/webapps/anarci/)
- [TAP: Therapeutic Antibody Profiler](https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred/tap)
- [ESM Fold](https://esmatlas.com/resources?action=fold)
## Limitations and Future Work

- Predictions require experimental validation before practical use.
- Future work will focus on incorporating more diverse training data and improving the accuracy of efficacy predictions for generated antibodies.