The document provides an overview of protein structure, types, and methods for predicting protein structure, including threading and homology modeling. It details the processes involved in each method, their advantages and disadvantages, and highlights the importance of computational tools like Rosetta for structure prediction. Additionally, it discusses the applications and limitations of these prediction methods in understanding protein functions and interactions.
Computational Prediction of Protein Structure
> Contents
Introduction of Protein
Types of Protein
Methods of Protein Structure Prediction
~ Threading
~ Homology Modelling
Rosetta
Application and Limitation
References
COMPUTATIONAL PREDICTION OF PROTEIN STRUCTURE
> Introduction
* Protein: a polymeric chain of amino acid residues.
* Protein structure: a protein sequence folds into a unique shape that minimizes its free energy.
* A protein's structure is primarily made up of long chains of amino acids.
* The arrangement and placement of amino acids give proteins certain characteristics.
* Polypeptide chains are synthesized by linking amino acids together.
* A protein is created when one or more of these chains fold in a specific way.
> Types of Protein
* The structure of a protein is classified at 4 levels (primary, secondary, tertiary and quaternary).
[Figure: Protein Structure, including α-helices and β-sheets]
> Methods of Protein Structure Prediction
* There are mainly two classes of methods:
1) Computational methods
   * Threading method
   * Homology modeling method
2) Experimental methods
   * X-ray crystallography
   * NMR spectroscopy
   * Cryo-EM
A) Threading Method
* Also called the fold recognition method.
* It searches a structural fold database to find the best-matching structural fold using energy-based criteria.
* It uses dynamic programming and heuristic approaches to calculate the energy of the raw model.
* The lowest-energy fold corresponds to the most structurally compatible fold.
* Steps involved in the threading method:
1) Construction of structure template database
2) The design of the scoring function
3) Threading alignment
4) Threading prediction
1) Construction of structure template database
* Select protein structures from a protein structure database.
* When selecting the structures, avoid including protein structures that have high sequence similarity to one another.
* Databases used include PDB, FSSP, SCOP and CATH.
2) The design of the scoring function
* A scoring function should mainly consist of mutation potential, pairwise potential, secondary structure compatibilities and gap penalties.
* A scoring function should measure the fitness between the template and the target sequence.
3) Threading alignment
* The scoring function has to be optimized so that the target sequence can be aligned with each of the structure templates.
* In threading-based structure prediction programs, both sequences are refined to account for potential errors and inaccuracies.
4) Threading prediction
* Choose the threading alignment that is statistically most probable as the threading prediction.
* A structural model is constructed from the selected structural template by placing the backbone atoms of the target sequence at the aligned backbone positions of the template.
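The four threading steps can be illustrated with a small sketch. It is a toy under stated assumptions, not the scoring scheme of any real threading program: the residue groups `HELIX_FORMERS` and `STRAND_FORMERS`, the energy values, the gap penalty and the two templates in `TEMPLATES` are all hypothetical, and only a secondary structure compatibility term plus a gap penalty are modeled (no mutation or pairwise potential).

```python
# Toy threading: align a target sequence onto template secondary-structure
# strings by dynamic programming and pick the lowest-energy template.
# All numeric values and residue groupings below are illustrative assumptions.

HELIX_FORMERS = set("AELMQK")    # residues assumed to favor helices
STRAND_FORMERS = set("VIYFWT")   # residues assumed to favor strands


def position_energy(residue, state):
    """Toy compatibility term; state is H (helix), E (strand) or C (coil)."""
    if state == "H":
        return -1.0 if residue in HELIX_FORMERS else 0.5
    if state == "E":
        return -1.0 if residue in STRAND_FORMERS else 0.5
    return 0.0  # coil positions are treated as neutral


def thread_energy(sequence, template_states, gap_penalty=1.0):
    """Minimum energy of aligning the sequence onto the template positions."""
    n, m = len(sequence), len(template_states)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        dp[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            fit = position_energy(sequence[i - 1], template_states[j - 1])
            dp[i][j] = min(
                dp[i - 1][j - 1] + fit,        # residue placed on position
                dp[i - 1][j] + gap_penalty,    # residue left unplaced
                dp[i][j - 1] + gap_penalty,    # template position skipped
            )
    return dp[n][m]


def best_fold(sequence, templates):
    """Return the lowest-energy template name and all template energies."""
    energies = {name: thread_energy(sequence, states)
                for name, states in templates.items()}
    return min(energies, key=energies.get), energies


# Hypothetical two-entry template library.
TEMPLATES = {
    "all_alpha": "CHHHHHHHHC",
    "all_beta": "CEEEEEEEEC",
}

if __name__ == "__main__":
    print(best_fold("AELLKKLLEE", TEMPLATES))
```

Real threading programs rank templates with statistical significance scores rather than raw energies; the sketch only shows the align-then-rank pattern.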
Advantages
* It is moderately successful.
* It is good for proteins with fewer than 100 residues.
Disadvantages
* The method assumes that the current structural library contains all the possible conformations that could be determined experimentally.
* Screening a library of thousands of possible folds involves a high computational cost.
* Energy functions are simplified for efficient calculation, and results are therefore compromised.
Threading software: MODELLER, HHpred, RaptorX, Phyre and Phyre2, FALCON, MUSTER
B) Homology Modeling Method
* Also called comparative modeling.
* It predicts protein structure based on sequence homology with a known structure.
* If two proteins share high enough sequence similarity, they are likely to have very similar three-dimensional structures.
* Modeling servers: ModBase, SWISS-MODEL, etc.
* Steps involved in the homology modeling method:
1) Template recognition and initial alignment
2) Alignment correction
3) Loop modeling
4) Side chain modeling
5) Model optimization
6) Model validation
1) Template Recognition and Initial Alignment
* Search the PDB for homologous proteins whose structures have been determined.
* BLAST search (Basic Local Alignment Search Tool)
  It is a search algorithm that compares a query sequence with the sequences in a library or database and returns those that match the query.
  It can search protein amino acid sequences as well as DNA and RNA nucleotide sequences.
* FASTA search (Fast-A)
  It is software used primarily for DNA and protein sequence alignment.
  It runs quickly: it first marks potential matches using the pattern of word hits, that is, word-to-word matches of a given length.
  It is also known for the FASTA format, which has become widely used in bioinformatics.
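The word-hit idea behind these search tools can be sketched in a few lines. This is a simplified illustration, not the actual BLAST or FASTA implementation: it finds only exact matches of fixed-length words, whereas the real tools also score near matches and extend hits into alignments. The function names and the word length `k=3` are assumptions chosen for the example.

```python
from collections import defaultdict


def index_words(sequence, k):
    """Map every k-letter word in the sequence to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        index[sequence[start:start + k]].append(start)
    return index


def word_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches."""
    subject_index = index_words(subject, k)
    hits = []
    for q in range(len(query) - k + 1):
        for s in subject_index.get(query[q:q + k], []):
            hits.append((q, s))
    return hits


def rank_database(query, database, k=3):
    """Rank database entries by their number of word hits, best first."""
    counts = {name: len(word_hits(query, seq, k))
              for name, seq in database.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(word_hits("ACDEF", "XXCDEFY"))
```

Hits that share the same offset between query and subject positions lie on one diagonal, which is where such tools look for longer matching regions.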
2) Alignment Correction
* Aligning two protein sequences can be difficult when the percentage sequence identity is low.
* In that case, other sequences from homologous proteins can be used to find a solution.
* Example: it is nearly impossible to align the sequence LTLTLTLT with the sequence YAYAYAYAYA directly, so we find a third sequence, TYTYTYTYT, which can be easily aligned with both of them.
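The example can be checked numerically with a minimal global alignment score (Needleman-Wunsch style). The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, not values taken from the slides. The two original sequences share no residues, so their direct alignment scores as badly as possible, while the third sequence scores clearly better against each of them.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b by dynamic programming."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                prev[j] + gap,        # gap in b
                curr[j - 1] + gap,    # gap in a
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    first, second, bridge = "LTLTLTLT", "YAYAYAYAYA", "TYTYTYTYT"
    print("direct:", global_alignment_score(first, second))
    print("first vs bridge:", global_alignment_score(first, bridge))
    print("second vs bridge:", global_alignment_score(second, bridge))
```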
3) Loop Modeling
* After the sequence alignment step, insertions and deletions often leave gaps in the alignment. Loop modeling is used to fill in these gaps, but it is comparatively less accurate.
* Two main techniques are used to approach the problem:
a) The database searching method: it searches the database of known protein structures for loops and then superimposes them onto the two stem regions of the target protein.
b) The ab initio method: various loops are generated randomly, and the one with reasonably low energy is then selected.
* ModLoop is a web server for automated modeling of loops in protein structures.
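The ab initio idea (generate random loops, keep a low-energy one) can be sketched as follows. This is a deliberately crude 2D toy, not how ModLoop or any real loop modeler works: the "energy" is only the squared distance between the loop end and the second stem, the 3.8 spacing stands in for the typical distance between consecutive alpha carbons in angstroms, and all function names are hypothetical.

```python
import math
import random


def build_loop(angles, bond_length=3.8):
    """Place one point per residue in 2D, turning by each angle in order."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for angle in angles:
        heading += angle
        x += bond_length * math.cos(heading)
        y += bond_length * math.sin(heading)
        points.append((x, y))
    return points


def closure_energy(points, stem_end):
    """Toy energy: squared distance from the loop end to the second stem."""
    end_x, end_y = points[-1]
    return (end_x - stem_end[0]) ** 2 + (end_y - stem_end[1]) ** 2


def sample_loops(n_residues, stem_end, n_candidates=500, seed=0):
    """Generate random loops and return (energy, angles) for each."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_residues)]
        energy = closure_energy(build_loop(angles), stem_end)
        candidates.append((energy, angles))
    return candidates


def best_loop(candidates):
    """Select the candidate with the lowest energy."""
    return min(candidates, key=lambda candidate: candidate[0])


if __name__ == "__main__":
    loops = sample_loops(n_residues=5, stem_end=(8.0, 0.0))
    energy, _ = best_loop(loops)
    print("lowest toy energy:", round(energy, 3))
```

A realistic energy would also penalize steric clashes and unfavorable torsion angles; the sketch keeps only loop closure so the sample-then-select pattern stays visible.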
4) Side Chain Modeling
* It is used to evaluate protein-ligand interactions at the active sites and protein-protein interactions at the contact interface.
* By searching every possible combination of torsion angles for each side chain, we can select the one that has the lowest interaction energy with neighboring atoms.
5) Model Optimization
* Model optimization is done so that the overall conformation of the molecule has the lowest possible energy potential.
* This is done by adjusting the relative positions of the atoms, which gives a better chance of finding the true structure.
Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interaction Energy
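The energy expression can be made concrete with one common functional form per term. The forms below (harmonic stretching and bending, a cosine torsion term, and a Lennard-Jones term standing in for the non-bonded interactions) are typical of molecular mechanics force fields but are assumptions here; the slides do not say which force field is meant, and electrostatics are left out for brevity. All parameter names are illustrative.

```python
import math


def stretching_energy(r, r0, k_bond):
    """Harmonic bond stretching around the equilibrium length r0."""
    return k_bond * (r - r0) ** 2


def bending_energy(theta, theta0, k_angle):
    """Harmonic angle bending around the equilibrium angle theta0 (radians)."""
    return k_angle * (theta - theta0) ** 2


def torsion_energy(phi, barrier, periodicity, phase=0.0):
    """Periodic torsion term for the dihedral angle phi (radians)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


def nonbonded_energy(r, epsilon, sigma):
    """Lennard-Jones 12-6 term for two atoms separated by distance r."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


def total_energy(bonds, angles, torsions, pairs):
    """Sum the four terms; each argument is a list of parameter tuples."""
    return (sum(stretching_energy(*bond) for bond in bonds)
            + sum(bending_energy(*angle) for angle in angles)
            + sum(torsion_energy(*torsion) for torsion in torsions)
            + sum(nonbonded_energy(*pair) for pair in pairs))


if __name__ == "__main__":
    energy = total_energy(
        bonds=[(1.6, 1.5, 100.0)],
        angles=[(2.0, 1.9, 50.0)],
        torsions=[(math.pi / 3, 2.0, 3)],
        pairs=[(4.0, 0.2, 3.5)],
    )
    print("toy total energy:", round(energy, 4))
```

Model optimization then amounts to adjusting atomic coordinates so that this sum decreases.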
6) Model Validation
* Every homology model contains errors, for which there are two main reasons.
* The first is the sequence identity between the template and the target protein.
* If the sequence identity is more than 90%, the accuracy of the model can be compared with that of structures determined through crystallography.
* If it is less than 30%, major errors can occur.
* The second is errors present in the template itself.
Advantages
* It can find the location of alpha carbons of key residues inside the folded protein.
* It can be used to hypothesize structure-function relationships.
* It can guide mutagenesis experiments.
* Putative active sites, binding pockets and ligands can be identified using the positions of conserved regions on the protein surface.
Disadvantages
* Homology models cannot predict side-chain positions or the conformations of insertions and deletions.
* Homology models are useful for drug design and development only if the sequence identity between the target protein and the template is more than 70%, because templates with less than 70% identity cannot be used in modeling and ligand docking studies.
Error sources in homology modelling:
* Incorrect sequence alignment is among the most devastating errors in homology modelling.
* An incorrect template may be chosen, especially for multidomain proteins.
* The loop regions can be built incorrectly.
* The person doing the modeling can also make errors; these are hard to predict and can be of any kind.
* There can also be errors present in the template itself.
Homology modelling software: IntFOLD, RaptorX, Biskit, MODELLER, SWISS-MODEL.
> Rosetta
* It is a web server for protein 3D structure prediction.
* Rosetta development began in the laboratory of Dr. David Baker at the University of Washington as a structure prediction tool.
* It uses a mini-threading method.
* It breaks the query sequence into many short segments (3 to 9 residues).
* It predicts the secondary structure of the small segments using hidden Markov models (HMMs).
* Segments with assigned secondary structure are subsequently assembled into a 3D configuration.
* By random combination of fragments, a large number of models are built and their overall energy potential is calculated.
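The first step, cutting the query into short overlapping segments, can be sketched as below. This only illustrates the windowing; it is not Rosetta code, and the function name and the example sequence are made up for the illustration.

```python
def fragment_windows(sequence, length):
    """Return (start, fragment) for every window of the given length."""
    return [(start, sequence[start:start + length])
            for start in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    query = "MKTAYIAKQR"  # hypothetical 10-residue query
    for size in (3, 9):
        windows = fragment_windows(query, size)
        print(size, len(windows), windows[0])
```

Each window would then be matched against fragments of known structure, and candidate models would be assembled by combining fragments at random and scoring the result.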
> Application and Limitation of Methods for Structure Prediction
* These methods provide structural details to different extents.
* Homology modeling can provide atomic-level details of the target protein, whereas threading can only help to judge the fold of the protein. High- and medium-accuracy homology models with sequence identity > 30% are useful in refining functional predictions such as ligand binding.
* Folds predicted by threading can be used to support site-directed mutagenesis experiments, to design stable, crystallizable variants, and to refine NMR structures.
* They are also used in predicting functional relationships from structural similarity and in identifying patches of conserved surface residues.
> References
Jisna, V., & Jayaraj, P. B. (2021). Protein Structure Prediction: Conventional and Deep Learning Perspectives. The Protein Journal, 40(4), 522-544. https://doi.org/10.1007/s10930-021-10003-y
https://www.slideshare.net/ashharnomani/computational-predi
ion-of-protein| pptx#27