Assignment: Identify potential protein-protein interactions

4200 Paper Outline Assignment

Identification-Identify a “hypothetical protein” in the genome of an organism of interest

Homology-Using BLAST, identify potential well-characterized homologs

Evolution-Build a phylogenetic tree of appropriate sequences

Structure-Analyze predicted secondary and tertiary structure

Expression-Identify potential transcription dynamics

Interactions-Identify potential protein-protein interactions

Pathways-Identify potential involvement of the protein in cellular pathways

Next steps-Design an experiment to test the hypothesis your data suggest

This project will be submitted as a research paper and also presented to the class. The format of the presentation will mirror the paper. The research paper should be 15-20 pages, not including references. The presentation must include screenshots of the actual data and analysis.

This isn’t the only way to arrange the information. However, it is important to include all of these data.

Citations should be included.

  1. Introduction
    1. Basic bioinformatics
    2. Why gene annotation requires both humans and computers
    3. Quick summary of what we did
      • Picked a gene of interest (Fabry disease) from (
      • Found the hypothetical protein (the unknown) using KEGG (
      • Used NCBI Taxonomy to find my 35 sequences.
      • Performed BLASTX on the 35 sequences to find proteins similar to the gene of interest and its homolog.
      • Made a phylogenetic tree of all the similar proteins.
      • Used EST profiles for expression.
      • Used HHpred and SWISS-MODEL for protein structure.
      • Used BioGRID and STRING to check for protein interactions.


  1. Gene investigated
    1. How you chose the “unknown” or putative protein
      • We worked backward: my known gene was used to find the unknown.
      • My known gene came from OMIM (Fabry disease, GLA gene).
      • The GLA gene was entered into NCBI Protein to look for the gene sequence-
      • KEGG was used to find the unknown (
      • Put on KEGG and I found on


  1. What is the homolog or “known”
    • What does the literature say it does (summarize 3 important papers on this)? Discuss in detail (
  2. Fabry disease: what it is, what the gene does, etc. Discuss in detail, including the crystal structure, protein structure, and protein interactions.
  • Evolution
    1. Describe alignments (MSA alignment from the textbook, already provided in the uploaded file)
      1. What are the strengths and weaknesses of the programs we used
    2. Describe how you chose your 35 sequences (including any problems along the way). I selected my 35 sequences three times. NCBI Taxonomy didn’t work for me: when I ran the multiple sequence alignment, it showed that all my sequences were different. I then used NCBI Protein to look for my unknown protein, searching for N-acetyltransferase 8 (Pan paniscus) and restricting the results to eukaryotes. That didn’t work either; I had the same problem in the MSA, and the sequences didn’t match. Next I used Geneious to delete some of the sequence, which didn’t work, and I had Geneious run BLAST for me, which also didn’t work. Finally, I went back to NCBI Protein and searched for just N-acetyltransferase. I loaded the 35 sequences I found into Geneious, cut some sequence out, and ended up with a clean set of sequences that I used to make a tree. The original sequences were aligned with Clustal Omega and MUSCLE.
    3. Describe your alignment results (what is conserved and what is variable)
      1. Does this support the idea that your putative functions like the homolog? Describe both MSA alignments (Clustal and MUSCLE).

Clustal MView (file will be sent separately)

T-Coffee (file will be sent separately)
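
When describing what is conserved and what is variable, it can help to score each alignment column. The following is a minimal sketch, not part of the course tools: it assumes the aligned sequences have been exported as equal-length strings with `-` for gaps, and the function names and the toy alignment are made up for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """For each alignment column, return the fraction of sequences that
    carry the most common residue. Gaps ('-') never count as a match."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_columns(aligned_seqs, threshold=1.0):
    """Return 0-based indices of columns whose score meets the threshold."""
    scores = column_conservation(aligned_seqs)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Toy alignment (invented, not real data)
toy = ["MKT-LV", "MKTALV", "MRT-LI"]
print(conserved_columns(toy))  # [0, 2, 4]
```

Lowering `threshold` (for example to 0.8) lists columns that are mostly, but not perfectly, conserved, which is usually more informative for 35 sequences.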

  1. Describe the different phylogeny programs we used, with their strengths and weaknesses. I used Jukes-Cantor on Geneious and also Bayesian and parsimony. Talk about the strengths and weaknesses of the three.
  2. Describe your trees
    1. Does this support the idea that your putative functions like the homolog? Answer this question in detail, comparing the three trees.
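
To explain what the Jukes-Cantor option does, a short worked example of the distance correction may be useful. This is a sketch of the standard textbook formula, d = -b ln(1 - p/b) with b = (n - 1)/n, where p is the fraction of differing sites and n is the number of character states (4 for nucleotides, 20 for amino acids). It is not the Geneious implementation, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned, gap-free positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free positions to compare")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def jukes_cantor(seq_a, seq_b, n_states=4):
    """Jukes-Cantor corrected distance (expected substitutions per site).

    n_states is an assumption the caller must set: 4 for DNA, 20 for protein.
    Returns infinity when the sequences are too different for the correction.
    """
    p = p_distance(seq_a, seq_b)
    b = (n_states - 1) / n_states
    if p >= b:
        return math.inf  # saturated: the logarithm is undefined
    return -b * math.log(1 - p / b)


# One difference in eight sites: p = 0.125, corrected distance is about 0.137
print(round(jukes_cantor("ACGTACGT", "ACGTACGA"), 3))
```

The corrected distance is always at least as large as the raw fraction of differences, because the model accounts for multiple substitutions at the same site. That simplicity (equal rates for every change) is both the strength and the weakness to discuss.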


Tree from Geneious


Talk about this tree. Extraction of the sequence was used to make the tree. The smallest sequence was used as an outgroup. This tree should be compared to the Bayesian 28-sequence (no extraction) tree below. Also compare Bayesian (no extraction) to parsimony (no extraction) (4 pages).

Bayesian tree

Parsimony tree


  1. Expression
    1. Describe how expression data is collected

GLA was entered into NCBI UniGene, then EST Profile and Graph were chosen from the GLA Homo sapiens gene profile link. Talk about the graph in detail.

  1. Describe the expression data you got for your homolog. Please use the link for more details about the gene expression and discuss it in detail, using research papers to back it up. Please cite.

After talking about expression, also talk about induction of the GLA gene.

  1. Structure
    1. Describe how the structure prediction programs we used work (describe SWISS-MODEL and HHpred)
    2. Describe your results (regions that match and don’t and what those might do to the function.)
      1. Does this support the idea that your putative functions like the homolog?

Explain the protein structures from SWISS-MODEL and HHpred, compare the real sequence 3tts to the computer suggestion, and describe the differences.

Do this separately for HHpred and SWISS-MODEL.

A) From KEGG unknown


  1. Interactions
    1. Describe the different databases we used and how they get their data (STRING and BioGRID were used)


  1. Describe the interactions your putative would engage in if it functions like the homolog.

Write about the protein interactions of these two.

Protein interaction: explain the protein interactions of the GLA gene in STRING and in BioGRID, comparing both….. unknown protein sequence
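
One way to organize the STRING versus BioGRID comparison is to list which interaction partners appear in both databases and which appear in only one. The sketch below assumes the partner names have been copied by hand from each database's results page. The function name is hypothetical, and the partner names are placeholders, not real results.

```python
def compare_partners(string_partners, biogrid_partners):
    """Split interaction partners into shared and database-specific groups.

    Names are upper-cased and stripped so that 'abc1 ' and 'ABC1' match.
    """
    string_set = {name.strip().upper() for name in string_partners}
    biogrid_set = {name.strip().upper() for name in biogrid_partners}
    return {
        "both": sorted(string_set & biogrid_set),
        "string_only": sorted(string_set - biogrid_set),
        "biogrid_only": sorted(biogrid_set - string_set),
    }


# Placeholder names only; replace with the partners listed for your protein
string_hits = ["PARTNER_A", "PARTNER_B", "partner_c"]
biogrid_hits = ["PARTNER_C", "PARTNER_D"]

result = compare_partners(string_hits, biogrid_hits)
for group, names in result.items():
    print(group, names)
```

Partners found in both databases are the strongest candidates to discuss, since STRING includes predicted and text-mined associations while BioGRID is curated from published experiments.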

  • Summary and experimental
    1. Summarize your data above in a paragraph or two
    2. In order to determine if my putative functions like the homolog, the following experiment should be conducted.