# **ESM zero-shot variant prediction**

This tool was inspired by this [paper](https://doi.org/10.1101/2021.07.09.450648) and adapted from [this repo](https://github.com/facebookresearch/esm/tree/main/esm).

#### **Instructions**

- in the 'sequence' text box, enter the full amino acid sequence of the protein to be analysed; wildcard characters (e.g. -X.B) are supported, but at the moment the visualisation does not show the correct results for them
- there are three running modes, selected by the input in the 'substitution' box:
  - if another sequence is given, the positions that differ between the two sequences are evaluated and their scores returned (NB the sequences must be of the same length)
  - if a list of integers is given, a deep mutational scan is performed at those positions in the input sequence, and the scores for every amino acid other than the original one are returned
  - if a single substitution or a list thereof is given (in the form **B008S**), the score of each substitution is returned
- you can choose which ESM model to use for the calculations; the models offered are those available at runtime on the Hugging Face Model Hub
- there are two scoring strategies available: wt-marginals and masked marginals; the first is faster but less accurate, the second considers the sequence context more thoroughly but is noticeably slower (the run time scales linearly with sequence length)
- the results are shown in a table with color coding, sorted by fitness when performing a deep mutational scan
- the output data can be downloaded as a CSV file from the box at the bottom
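
The three running modes can be told apart from the text in the 'substitution' box alone. The sketch below is only an illustration of that dispatch, not the code the app actually runs: the function name `parse_substitutions`, the mode labels, the comma or whitespace separators, and the 1-based position numbering are all assumptions made for this example.

```python
import re

# Assumed substitution format: original residue, position (leading zeros
# allowed), new residue, e.g. "B008S". Positions are assumed to be 1-based.
SUBSTITUTION = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def parse_substitutions(sequence, substitution):
    """Return (mode, payload) for the text typed in the 'substitution' box.

    Hypothetical helper: mode is one of "scan", "substitutions", "sequence".
    """
    # Assumption: list items are separated by commas and/or whitespace.
    tokens = substitution.replace(",", " ").split()
    if not tokens:
        raise ValueError("the 'substitution' box is empty")

    # A list of integers: deep mutational scan at those positions.
    if all(t.isdecimal() for t in tokens):
        return "scan", [int(t) for t in tokens]

    # One or more substitutions such as "B008S".
    matches = [SUBSTITUTION.match(t) for t in tokens]
    if all(matches):
        subs = [(m.group(1).upper(), int(m.group(2)), m.group(3).upper())
                for m in matches]
        return "substitutions", subs

    # Otherwise treat the input as a second sequence of the same length
    # and collect the positions where the two sequences differ.
    if len(tokens) == 1:
        other = tokens[0]
        if len(other) != len(sequence):
            raise ValueError("the two sequences must be of the same length")
        diffs = [(a, i, b)
                 for i, (a, b) in enumerate(zip(sequence, other), start=1)
                 if a != b]
        return "sequence", diffs

    raise ValueError("could not interpret the 'substitution' input")
```

For example, `parse_substitutions("MKTAY", "1, 3")` selects the deep mutational scan at positions 1 and 3, while `parse_substitutions("MKTAY", "MKSAY")` reports the single difference at position 3.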