<aside> <img src="/icons/push-pin_green.svg" alt="/icons/push-pin_green.svg" width="40px" /> Key Links: http://docs.google.com/spreadsheets/d/1AsYRLlrRLd6I8abxNHfuz1OtFTSqYZ87_kefBMsxhMo/edit?gid=0#gid=0

</aside>

Part A (From Pranam)

<aside> ⚠️ Optional for MIT/Harvard Students, mandatory for Committed Listeners. Due at the start of class March 11

</aside>

  1. Sign up for HuggingFace (we will be using PepMLM: https://huggingface.co/ChatterjeeLab/PepMLM-650M)

    1. Once you log in, go to the page (https://huggingface.co/settings/tokens). Click +Create new token.
    2. Make sure you type the full name ChatterjeeLab/PepMLM-650M when searching for repos. Click save token and you will see the newly created token (copy that).
    3. Go to the page (https://huggingface.co/ChatterjeeLab/PepMLM-650M) and find their Colab Notebook (link).
    4. Make a copy to your Google Drive, choose T4 GPU and run each block.
    5. When you reach the block Input HF token, a prompt will show "Enter your token (input will not be visible):". Paste your token, and when asked "Add token as git credential? (Y/n)", choose n.

    (The token wasn't needed in the above process. I completed the rest without issue, as I am well versed in HuggingFace and Google Colab.)

  2. Find the amino acid sequence for SOD1 in UniProt (ID: P00441), a protein that, when mutated, can cause amyotrophic lateral sclerosis (ALS). In fact, the A4V mutation (changing position 4 from alanine to valine) causes the most aggressive form of ALS, so make that change in the sequence.

  3. Enter your mutated SOD1 sequence into the PepMLM inference API and generate 4 peptides of length 12 amino acids (Step 5 takes a while so you can also just pick 1 or 2 peptides)

  4. Add this known SOD1-binding peptide to your list: FLYRWLPSRRGG (from https://genesdev.cshlp.org/content/22/11/1451)

  5. Go to AlphaFold-Multimer (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). This is similar to what you did for homework last week, but this time for a protein-peptide complex.

    1. Set model_type: alphafold2_multimer_v3 (this model has been shown to recapitulate peptide-protein binding accurately: https://www.frontiersin.org/articles/10.3389/fbinf.2022.959160/full).
    2. Add your query sequence. It has the form SOD1Sequence:PeptideSequence.

  6. After running AlphaFold-Multimer with your 5 peptides alongside your mutated SOD1 sequence, plot the ipTM scores, which measure the relative confidence of the binding region.

    image.png

  7. Provide a one-paragraph write-up of your results.

    Ans: The results show that AlphaFold was not very confident in its predictions, which is a reminder that these are only predictions and the real binding behavior may well differ.
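
The bookkeeping in steps 2, 5 and 6 above can be sketched in plain Python. This is a minimal illustration and not part of the course notebooks: the function names (`apply_point_mutation`, `build_query`, `rank_by_iptm`, `iptm_bar`) are hypothetical, the target sequence is a toy string rather than the real SOD1 sequence, and the ipTM numbers are made-up placeholders, not results from this homework. Numbering conventions differ on whether the initiator methionine is counted, so the sketch checks that the residue at the target position is the expected wild-type residue before substituting.

```python
import re

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def apply_point_mutation(seq, mutation, offset=0):
    """Apply a substitution label such as 'A4V' to a 1-based sequence.

    `offset` is added to the position in the label before indexing, for
    cases where the label and the sequence use different numbering.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild, new = match.group(1), match.group(3)
    pos = int(match.group(2)) + offset
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != wild:
        raise ValueError(
            f"expected {wild} at position {pos}, found {seq[pos - 1]}"
        )
    return seq[: pos - 1] + new + seq[pos:]


def build_query(target, peptide):
    """Join target and peptide as 'Target:Peptide', the form used in step 5."""
    for name, chain in (("target", target), ("peptide", peptide)):
        if not chain or set(chain) - STANDARD_AA:
            raise ValueError(f"{name} contains non-standard residues")
    return f"{target}:{peptide}"


def rank_by_iptm(scores):
    """Return (label, ipTM) pairs sorted from highest to lowest ipTM."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def iptm_bar(value, width=40):
    """Render an ipTM value in [0, 1] as a text bar."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("ipTM is expected to lie between 0 and 1")
    return "#" * round(value * width)


if __name__ == "__main__":
    toy_target = "MKTAYIAKQR"  # toy sequence, NOT the real SOD1 sequence
    mutant = apply_point_mutation(toy_target, "A4V")
    print(build_query(mutant, "FLYRWLPSRRGG"))

    # Placeholder values for illustration only, not measured results.
    placeholder_scores = {"peptide_1": 0.30, "peptide_2": 0.45, "known": 0.25}
    for label, value in rank_by_iptm(placeholder_scores):
        print(f"{label:>10} {value:.2f} {iptm_bar(value)}")
```

In practice the toy sequence would be replaced by the sequence downloaded from UniProt, and the placeholder dictionary by the ipTM values reported for each SOD1-peptide run.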

Part B (Final Project: L-Protein Mutants)

<aside> ⚠️ Mandatory for MIT/Harvard Students and Committed Listeners. Due at the start of class March 12

</aside>

<aside> <img src="/icons/exclamation-mark_red.svg" alt="/icons/exclamation-mark_red.svg" width="40px" /> This homework requires computation that might take you a while to run. So please get started early.

</aside>

<aside> <img src="/icons/push-pin_green.svg" alt="/icons/push-pin_green.svg" width="40px" /> Key Links: You can read more about the final project in the Final Project Page.

</aside>

Ans:

Group Project → Chimeric E-L

Bacteriophage MS2 is a single-stranded RNA virus whose genome encodes only four proteins: the maturation protein (A-protein), the lysis protein (L-protein), the coat protein (cp), and the replicase (rep). MS2 infects E. coli. Upon infection, the L-protein forms pores in the E. coli cell membrane, which eventually leads to breakdown of the membrane (lysis). DnaJ is a chaperone protein in E. coli (chaperones are proteins that assist with protein folding), and it is thought to be involved in the lysis mechanism. In this homework, we will explore whether the computational models we learnt about in the last class are useful for designing variants/mutants of the lysis protein sequence. We will study the effects of L-protein mutants on bacteriophage infectivity.

source - https://www.oaepublish.com/articles/mrr.2023.28