BIOINFORMATICS LAB BTech 6th Sem Dr. R. K. Pradhan, OUTR Bhubaneswar

Bioinformatics Lab (2023-24)

Experiment - 7: Homology Modeling

Aim: To perform homology modeling for a given protein.

INSTRUCTIONS: In order to clearly understand each experiment and make the best use of the
content provided, we suggest that you proceed as per the following steps.

1. Start with the theory section, recall the technical knowledge behind each step, go through the
manual and define an overall design and workflow to conduct the homology modeling experiment.
2. In the protocol section, learn the fine details required to perform the experiment by going
through the standardized protocol defined for each of the steps.
3. Practice: work through each step of the experiment in the simulator section to visualize the
entire process.

BASIC THEORY

Homology modeling is a computational approach for three-dimensional protein structure
modeling and prediction. Proteins whose structures are still uncharacterized can be modeled
using homology modeling. This method builds an atomic model based on experimentally
determined structures that share more than 40% sequence identity with the target
molecule. Modeling with templates of less than 40% similarity results in less
reliable models and is hence avoided. Homology modeling is also known as comparative
modeling.

The principle governing this approach is that if two proteins share a high sequence similarity,
they are likely to have very similar three-dimensional structures. If one of the protein
sequences has a known structure, then this structure can be mapped onto the unknown
protein with a high degree of confidence. Protein sequences are more conserved than DNA
sequences and are therefore more informative about evolutionary relationships.

While homology modeling predicts the positions of alpha carbons with moderate accuracy, it
is not very reliable in predicting side chains and loops. The other approaches to structure
prediction are threading and ab initio prediction.

Types of modeling:
• Basic Modeling - Modeling using a template with very high similarity with the target sequence.

• Advanced Modeling - In this case, the target is modeled using more than one template such
that regions of the template proteins that share a high identity with portions of the target are
used individually to model these sections.

The overall homology modeling procedure consists of six steps:

Step I - Template Selection

Template selection involves searching the Protein Data Bank (PDB) for homologous proteins
with determined structures. The search can be performed using a heuristic pairwise alignment
search program like BLAST or FASTA. As a rule of thumb, a database protein should have at
least 40% sequence identity, high resolution and the most appropriate cofactors for it to be
considered as a template sequence. The protein sequence whose 3D structure is to be predicted
is called the "target sequence".
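The 40% rule of thumb can be checked directly on a pairwise alignment. The following is a minimal sketch in plain Python, assuming identity is counted only over columns where neither sequence has a gap; the function names and the short example sequences are hypothetical and are not part of BLAST, FASTA or MODELLER.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over the aligned (non-gap) columns of a pairwise alignment."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def is_candidate_template(identity, threshold=40.0):
    """Apply the rule of thumb: at least `threshold` percent identity."""
    return identity >= threshold


# Hypothetical toy alignment: 7 identical residues out of 8 aligned columns.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")
print(identity, is_candidate_template(identity))
```

Resolution and cofactors still have to be checked separately; this sketch covers only the sequence identity criterion.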

Step II – Sequence Alignment

Once the template is identified, the full-length sequences of the template and target proteins
need to be realigned using refined alignment algorithms to obtain optimal alignment. The
alignment gives specific alignment scores.
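To illustrate what a global alignment and its score look like, here is a minimal Needleman-Wunsch sketch. It is not the algorithm MODELLER uses internally; the simple match, mismatch and gap scores are assumptions chosen for readability, whereas real tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty. Returns (score, aligned_a, aligned_b)."""
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(seq_a[i - 1], seq_b[j - 1]):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

The toy call aligns two short strings and introduces one gap in the shorter sequence.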

Step III - Backbone Model Building

Once optimal alignment is achieved, the coordinates of the corresponding residues from the
template proteins can simply be copied onto the target protein. If the two aligned residues are
identical, coordinates of the side chain atoms are copied along with the main chain atoms.
If multiple templates are selected, then the average coordinate values of the templates are used.
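The averaging over multiple templates can be sketched as follows. This is a simplified illustration that assumes the templates have already been superimposed in a common frame and that each atom is given as an (x, y, z) tuple; the function names are hypothetical.

```python
def average_coordinates(coords):
    """Average the (x, y, z) positions of one atom taken from several templates."""
    if not coords:
        raise ValueError("at least one template coordinate is required")
    n = len(coords)
    return tuple(sum(point[axis] for point in coords) / n for axis in range(3))


def build_backbone(per_residue_coords):
    """Apply the averaging to every aligned residue position in turn."""
    return [average_coordinates(coords) for coords in per_residue_coords]


# Hypothetical input: residue 1 is covered by one template, residue 2 by two.
backbone = build_backbone([
    [(0.0, 0.0, 0.0)],
    [(1.0, 1.0, 1.0), (3.0, 3.0, 3.0)],
])
print(backbone)
```

With a single template the average reduces to a plain copy of the template coordinates.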

Step IV – Loop Modeling

After the sequence alignment, there are often regions created by insertions and deletions that
lead to gaps in the alignment. These gaps are modeled by loop modeling, which is less accurate
and is a major source of error. Currently, two main techniques are used to approach the problem:

• The database searching method - this involves finding loops from known protein structures and
superimposing them onto the two stem regions (main chains mostly) of the target protein. Some
specialized programs like FREAD and CODA can be used.

• The ab initio method - this generates many random loops and searches for one that has
reasonably low energy and φ and ψ angles in the allowable regions in the Ramachandran plot.
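Before either technique is applied, the gapped regions have to be located in the alignment. A minimal sketch, assuming gaps are written as '-' in an aligned sequence string (the function name and example string are hypothetical):

```python
def gap_regions(aligned_seq):
    """Return (start, end) column ranges (0-based, end exclusive) for each run of '-'."""
    regions = []
    start = None
    for i, ch in enumerate(aligned_seq):
        if ch == "-":
            if start is None:
                start = i  # a new gap run begins here
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(aligned_seq)))
    return regions


# Gaps in the template row mark target residues with no template coordinates,
# which are the segments that need loop modeling.
print(gap_regions("MKT--AYI-K"))
```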

Step V - Side Chain Refinement

After the main chain atoms are built, the positions of side chains must be determined. This is
important in evaluating protein–ligand interactions at active sites and protein–protein
interactions at the contact interface.
A side chain can be built by searching every possible conformation for every torsion angle of
the side chain and selecting the one that has the lowest interaction energy with neighboring
atoms. A rotamer library, which contains the favorable side chain torsion angles extracted from
known protein crystal structures, can also be used for this purpose.
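The rotamer search can be pictured as picking, from a small list of candidate torsion angles, the one with the lowest energy. The sketch below is a toy: the three χ1 values stand in for a rotamer library, and `toy_energy` is a made-up penalty that only depends on the angular distance to one hypothetical neighbor, not a real force field.

```python
def angular_separation(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def pick_rotamer(rotamers, energy):
    """Return the candidate torsion angle with the lowest energy."""
    return min(rotamers, key=energy)


# Stand-in rotamer library: three staggered chi1 angles (assumed for illustration).
CHI1_ROTAMERS = (-60.0, 60.0, 180.0)


def toy_energy(chi1, neighbor_angle=50.0):
    """Hypothetical clash penalty: larger when the side chain points at the neighbor."""
    return 1.0 / (1.0 + angular_separation(chi1, neighbor_angle))


best = pick_rotamer(CHI1_ROTAMERS, toy_energy)
print(best)
```

In a real program the energy function would sum interactions with all neighboring atoms, and the library would list combinations of several torsion angles per residue type.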

Step VI - Model Refinement and Model Evaluation

This step carries out energy minimization on the entire model, which adjusts the
relative positions of the atoms so that the overall conformation of the molecule has the lowest
possible potential energy. The goal of energy minimization is to relieve steric collisions without
altering the overall structure. In the loop and side chain modeling steps, potential energy
calculations are likewise applied to improve the model. Model refinement can also be done by
molecular dynamics simulation, which moves the atoms toward a global minimum by applying
various simulation conditions (heating, cooling, considering water molecules), thus having a
better chance of finding the true structure.
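How energy minimization relieves a steric clash can be seen on the smallest possible system: two atoms interacting through a Lennard-Jones potential. This is a sketch under stated assumptions (reduced units with ε = σ = 1, plain steepest descent, a fixed step size); real refinement programs minimize over all atoms with a full force field.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_distance(r0, step=0.001, max_iter=5000, tol=1e-8):
    """Steepest descent: move against the gradient until it is nearly zero."""
    r = r0
    for _ in range(max_iter):
        g = lj_gradient(r)
        if abs(g) < tol:
            break
        r -= step * g
    return r


# Start from a clashing distance (r = 1.0) and relax toward the energy minimum,
# which for this potential lies at r = 2**(1/6) * sigma.
r_min = minimize_distance(1.0)
print(r_min, lj_energy(r_min))
```

The atoms move apart only as far as needed to reach the nearest minimum, which mirrors the goal of relieving collisions without altering the overall structure.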

The final model has to be evaluated by checking the φ–ψ angles, chirality, bond lengths, close
contacts and other stereochemical properties. Various online protein validation software
packages are available, such as Procheck, WHATIF, ANOLEA, Verify3D and PROSA.

• Various Comprehensive Modeling Programs are available like Modeller, SWISS MODEL,
Schrodinger, 3D- JIGSAW.

• A successful model depends on template selection, algorithm used and the validation of the
model.

PROCEDURE
Materials Required:
• System and Internet access
• Software: MODELLER9v8

The protein sequence whose 3D structure needs to be built is obtained from Uniprot in .fasta format.
URL: http://www.uniprot.org/

Protein sequence retrieved from Uniprot in FASTA format

For this experiment, we have used MODELLER9v8, which is a homology modeling software package.
All the input files for running the software are available here. The software accepts
sequences in PIR format.
1) The above FASTA sequence is converted into PIR format to make it readable by the
software. This file must have the .ali extension.
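The conversion in step 1 can be done by hand or with a short script. The sketch below assumes a single-record FASTA input and writes the PIR layout that MODELLER reads for a sequence without a structure: a ">P1;code" line, a ten-field description line beginning with "sequence", and the residues terminated by "*". The function name and the example record are hypothetical.

```python
def fasta_to_pir(fasta_text, code, line_width=75):
    """Convert one FASTA record into a MODELLER-style PIR (.ali) entry."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like a FASTA record")
    # Join the sequence lines and add the '*' terminator required by PIR.
    sequence = "".join(ln for ln in lines[1:] if ln) + "*"
    body = [sequence[i:i + line_width] for i in range(0, len(sequence), line_width)]
    header = ">P1;" + code
    description = "sequence:" + code + ":::::::0.00: 0.00"
    return "\n".join([header, description] + body) + "\n"


# Hypothetical toy record, not a real UniProt entry.
pir_text = fasta_to_pir(">example\nMKTAY\nIAK\n", "target")
print(pir_text)
```

The returned text would then be saved to a file such as target.ali; the code given after "P1;" is the name the later scripts use to refer to the target sequence.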

2) After the .ali file is created, “build_profile.py” is searched for in the available input files.
The “build_profile.py” file is obtained and the necessary changes are made.

3) The script file is now run using the command interpreter as shown below:

On running the betalactoglobulin_build_profile.py Python file, a .prf file is obtained, which is
the profile file of the query sequence and its homologs.

The most important columns in the profile.build() output are the second, eleventh and twelfth
columns. The second column reports the code of the PDB sequence that was compared with the
target sequence. The eleventh column reports the percentage sequence identity between
betalactoglobulin and a PDB sequence, normalized by the length of the alignment. In general, a
sequence identity value above approximately 35% indicates a potential template. A better
measure of the significance of the alignment is given in the twelfth column by the e-value of the
alignment. To select the most appropriate template for our query sequence from the six similar
structures, we will use the alignment.compare_structures() command to assess the structural and
sequence similarity between the possible templates (file "betalactoglobulin_compare.py").
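Reading the three columns by eye works for a handful of hits; for longer profiles a small parser helps. The sketch below assumes that the data lines of the .prf file are whitespace-separated, that comment lines start with '#', and that the code, percent identity and e-value sit in fields 2, 11 and 12. The sample lines and the codes "query", "tmplA" and "tmplB" are invented for illustration and are not real output.

```python
def parse_profile_hits(prf_text):
    """Extract code, percent identity and e-value from profile text."""
    hits = []
    for line in prf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank and comment lines
        fields = stripped.split()
        if len(fields) < 12:
            continue  # not a data line in the assumed layout
        hits.append({
            "code": fields[1],
            "identity": float(fields[10]),
            "evalue": float(fields[11]),
        })
    return hits


def candidate_templates(hits, query_code, min_identity=35.0):
    """Keep hits other than the query itself whose identity exceeds the cutoff."""
    return [h for h in hits if h["code"] != query_code and h["identity"] > min_identity]


# Made-up sample in the assumed layout.
SAMPLE = """# Number of sequences: 3
    1 query  S 0 162  1 162  0   0   0  0. 0.0
    2 tmplA  X 1 160  3 160  2 158 155 62. 0.0
    3 tmplB  X 1 170 10 120 12 125 105 28. 0.41E-05
"""

templates = candidate_templates(parse_profile_hits(SAMPLE), "query")
print([t["code"] for t in templates])
```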

4). All the PDB structures whose E-value is 0.0 in the .prf file obtained above are listed in the
compare.py file to identify the best template. The input file “alignment.compare()” is selected
and all the templates are added.

The selected PDB files are listed in the compare.py file because any one of them could be the
template.
5). Now, the script file is run using the command interpreter as shown below.

6). After running the compare file, the following dendrogram is displayed. The templates are
compared with each other, both for sequence and for structural similarity. The output also gives
information about the crystallographic resolution of each PDB structure. The structure with
better crystallographic resolution and higher overall sequence identity to the query sequence is
selected to be aligned with the target sequence. The crystallographic resolution of the template
should be in the range 1.5 to 2.5 Å, and the sequence similarity should be as high as possible.

For example, in the situation shown below, 1bdmA has a resolution of 1.8 Å and a sequence
similarity of >40%; therefore it can be considered a good template. When two templates both meet
the required parameters, both should be selected. The compare.log file is used to choose the
template for building the structure of the unknown protein, after which the align2d.py file is
prepared.

7). The template selected above is the one on which the model of the query will be built. The
align2d.py file is modified as shown below. It appends the original .ali file and the selected
template structure, and writes the alignment in two formats, .PIR and .PAP. After running the
align2d.py script, an alignment file is created which is used for modeling.

8). Now the script file is run using the command interpreter as shown below:

The following file is the alignment file produced by the software. In this file, the query
sequence is globally aligned with the template, and a '*' marks each position where the two
sequences match exactly. If the alignment results are not satisfactory, we can go back to the
template selection step and replace the template with another protein of comparable similarity.

9). Once a target-template alignment is constructed, the software calculates a 3D model of the
target completely automatically, using its automodel class. The following script will generate
five similar models of betalactoglobulin based on the selected template structure and on the
alignment file (file "betalactoglobulin-model-single.py").

The number of models to be generated must be specified here if multiple models are
desired at the end of the session.
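A model-building script of this kind typically looks like the sketch below, which follows the pattern of the basic example in the MODELLER tutorial using the MODELLER 9 class names (environ, automodel). It is not the exact script distributed with this lab. The file name and the alignment codes in the usage comment are placeholders that must match the names used in your own alignment file. The imports are placed inside the function so that the file can be loaded on a machine where MODELLER is not installed.

```python
def build_models(alnfile, template_code, target_code, n_models=5):
    """Build n_models comparative models and request DOPE and GA341 assessment."""
    # Imported here so that loading this file does not require MODELLER.
    from modeller import environ
    from modeller.automodel import automodel, assess

    env = environ()
    a = automodel(env,
                  alnfile=alnfile,        # target-template alignment (PIR format)
                  knowns=template_code,   # code of the template in the alignment
                  sequence=target_code,   # code of the target in the alignment
                  assess_methods=(assess.DOPE, assess.GA341))
    a.starting_model = 1         # index of the first model
    a.ending_model = n_models    # index of the last model
    a.make()                     # run the comparative modeling
    return a.outputs             # one summary entry per generated model


# Hypothetical usage (names are placeholders):
# build_models("target-template.ali", "1bdmA", "target", n_models=5)
```

Setting starting_model and ending_model is how the number of models is specified; the template PDB file must be in the working directory when the function is called.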

The Script file is run using the command interpreter as shown below:

10). The output file reports two statistical scores for each model: the DOPE (Discrete Optimized
Protein Energy) score and the GA341 score. After the models are created, their DOPE and GA341
scores are used to rank the models on the basis of their energy levels. Models with the lowest
DOPE score, or with the highest GA341 score, have the most stable minimized energy and are
therefore picked for further protein structure analysis. GA341 scores range from 0.0 (worst) to
1.0 (native-like), but DOPE scores are more reliable in distinguishing good models from bad
ones. In the example shown, B99990001.pdb and B99990002.pdb were selected
as the optimal models with the most stable structure and the least energy.
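The ranking rule (lowest DOPE first) is easy to apply programmatically once the scores have been copied from the log. In the sketch below the model names and score values are invented placeholders, not results from this experiment.

```python
def rank_by_dope(scores):
    """Return model names sorted from lowest (best) to highest DOPE score."""
    return sorted(scores, key=lambda name: scores[name]["dope"])


# Placeholder values for illustration only.
example_scores = {
    "model_1.pdb": {"dope": -38000.0, "ga341": 1.0},
    "model_2.pdb": {"dope": -38500.5, "ga341": 1.0},
    "model_3.pdb": {"dope": -37000.0, "ga341": 0.9},
}

ranking = rank_by_dope(example_scores)
print(ranking[:2])  # the two lowest-energy models
```

GA341 is kept in the table as a secondary check, consistent with the note above that DOPE is the more reliable discriminator.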

Two structures were built whose snapshots are provided below.


The models created contain the co-ordinates of the model in .pdb format. These models can be
viewed by any molecular visualization software able to read .pdb files such as Rasmol,
Chimera, Pymol, etc.

Evaluation/Validation of the model generated:

11). Once the models are generated, they must be evaluated for missing residues and side chains,
and checked for steric clashes. The model evaluation can be carried out by the software
itself or with web-based tools like ProCheck, ProSA and Verify3D.
For this example, we have validated the model using the MODELLER software. We need to
obtain the script "evaluate_model.py" and specify the name of the pdb file to be evaluated.
We have specified the file name "betalactoglobulin.B99990002.pdb".

12). Now, the script file is run using the command interpreter as shown below:

13). This profile is written to a file "betalactoglobulin.profile", which can be used as input to
a graphing program. For example, the plot shown below was made with GNUPLOT.
Alternatively, the plot_profiles.py script, which is included in the tutorial zip file, can be used
to plot profiles with the Python matplotlib package.

The above graph displays the difference in the DOPE score per residue between the template
and the generated model. By analyzing the graph, we can see that the template residues in the
ranges 50 - 80 and 240 - 270 are the most conserved in the newly generated
model. Further, literature studies on the protein also revealed one of these regions to be the
active site of the protein. The variable regions of the model can be evaluated by more
comprehensive programs like Advance Modeling, Schrodinger, Robetta etc.

Note: For more details, refer to the MODELLER manual.

----- ALL THE BEST----
*********
