
Identification of Modelling Templates

The document describes the process of comparative protein modelling which uses known protein structures (templates) to generate 3D models of similar protein sequences (targets). It involves identifying suitable templates, aligning the target sequence to the templates, building a structural framework based on the templates, modelling any non-conserved loops and side chains, and optionally refining the model. The process is then demonstrated by modelling the 3D structure of the tumor protein p53 using two templates that share 62.6% sequence identity.

Uploaded by

Lokesh Adhikari


INTRODUCTION

Insights into the three-dimensional (3D) structure of a protein are of great assistance when
planning experiments aimed at the understanding of protein function and during the drug
design process. The experimental elucidation of the 3D-structure of proteins is however
often hampered by difficulties in obtaining sufficient protein, diffracting crystals and many
other technical aspects. Therefore the number of solved 3D-structures increases only
slowly compared to the rate of sequencing of novel cDNAs, and no structural information
is available for the vast majority of the protein sequences registered in the SWISS-PROT
database (nearly 60'000 entries in release 34). In this context it is not surprising that
predictive methods have gained much interest.

Proteins from different sources, sometimes with diverse biological functions, can
have similar sequences, and it is generally accepted that high sequence similarity is
reflected in distinct structural similarity. Indeed, the root mean square deviation (rmsd)
of the alpha-carbon co-ordinates for protein cores sharing 50% residue identity is expected
to be around 1 Å. This fact served as the premise for the development of comparative
protein modelling (also often called modelling by homology or knowledge-based
modelling), which is presently the most reliable predictive method. Comparative model
building consists of the extrapolation of the structure for a new (target) sequence from the
known 3D-structure of related family members (templates).

While the high-precision structures required for detailed studies of protein-ligand
interaction can only be obtained experimentally, theoretical protein modelling provides
molecular biologists with "low-resolution" models which hold enough essential
information about the spatial arrangement of important residues to guide the design of
experiments. The rational design of many site-directed mutagenesis experiments could
therefore be improved if more of these "low-resolution" theoretical model structures were
available.

Identification of modelling templates

Comparative protein modelling requires at least one sequence of known 3D-structure with
significant similarity to the target sequence. To determine whether a modelling request
can be carried out, one compares the target sequence with a database of sequences derived
from the Brookhaven Protein Data Bank (PDB), using programs such as FastA and
BLAST. Sequences with a FastA score 10.0 standard deviations above the mean of the
random scores, or a Poisson unlikelihood probability P(N) lower than 10⁻⁵ (BLAST), can
be considered for the model building procedure. The choice of template structures can be
further restricted to those which share at least 30% residue identity as determined by SIM.
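
The thresholds above amount to a simple filter over the search hits. The following is a minimal sketch in Python, not the procedure used by any particular server: the `Hit` record and its field names are hypothetical and do not correspond to the actual output formats of FastA, BLAST or SIM, and ranking by residue identity is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """Hypothetical summary of one database hit (not a real FastA/BLAST record)."""
    template_id: str
    fasta_sd: Optional[float]   # FastA score, in standard deviations above the random mean
    blast_p: Optional[float]    # BLAST Poisson probability P(N)
    identity_pct: float         # residue identity to the target, as reported by SIM


def is_candidate(hit: Hit) -> bool:
    """Thresholds quoted in the text: FastA >= 10.0 SD or P(N) < 1e-5, plus >= 30% identity."""
    significant = (
        (hit.fasta_sd is not None and hit.fasta_sd >= 10.0)
        or (hit.blast_p is not None and hit.blast_p < 1e-5)
    )
    return significant and hit.identity_pct >= 30.0


def select_templates(hits: List[Hit]) -> List[Hit]:
    """Return the qualifying hits, highest residue identity first (assumed ranking)."""
    candidates = [h for h in hits if is_candidate(h)]
    return sorted(candidates, key=lambda h: h.identity_pct, reverse=True)
```

A hit that is statistically significant but falls below 30% identity is rejected by this filter, as is a hit with high identity but an insignificant score.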

The above procedure might allow the selection of several suitable templates for a
given target sequence, and up to ten templates are used in the modelling process. The best
template structure - the one with the highest sequence similarity to the target - will serve
as the reference. All the other selected templates will be superimposed onto it in 3D. The
3D match is carried out by superimposing corresponding Cα atom pairs selected
automatically from the highest scoring local sequence alignment determined by SIM. This
superposition can then be optimised by maximising the number of Cα pairs in the common
core while minimising their root mean square deviation. Each residue of the reference
structure is then aligned with a residue from every other available template structure if their
Cα atoms are located within 3.0 Å. This generates a structurally corrected multiple
sequence alignment.
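
To illustrate the 3.0 Å pairing rule, the sketch below assumes that a template has already been superimposed onto the reference (the superposition step itself is not shown) and that each structure is reduced to a list of Cα co-ordinates. The function names are hypothetical, and the nearest-neighbour pairing is a simplification: it does not prevent one template residue from being paired with two reference residues.

```python
import math
from typing import List, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root mean square deviation between two equally long lists of paired atoms."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty co-ordinate lists of equal length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def pair_with_reference(
    reference: Sequence[Coord],
    template: Sequence[Coord],
    cutoff: float = 3.0,
) -> List[Optional[int]]:
    """For each reference C-alpha, index of the nearest template C-alpha within cutoff, else None."""
    pairs: List[Optional[int]] = []
    for ref_atom in reference:
        best_index, best_dist = None, cutoff
        for j, tmpl_atom in enumerate(template):
            d = math.dist(ref_atom, tmpl_atom)
            if d <= best_dist:
                best_index, best_dist = j, d
        pairs.append(best_index)
    return pairs
```

Reference residues that receive `None` have no structural equivalent in that template and would appear as gaps in the structurally corrected alignment.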

Aligning the target sequence with the template sequence

The target sequence now needs to be aligned with the template sequence or, if
several templates were selected, with the structurally corrected multiple sequence
alignment. This can be achieved by using the best-scoring diagonals obtained by SIM.
Residues which should not be used for model building, for example those located in non-
conserved loops, will be ignored during the modelling process. Thus, the common core of
the target protein and the loops completely defined by at least one supplied template
structure will be built.

BUILDING THE MODEL

Framework construction

The next step is the construction of a framework, which is computed by averaging
the position of each atom in the target sequence, based on the location of the corresponding
atoms in the template. When more than one template is available, the relative contribution,
or weight, of each structure is determined by its local degree of sequence identity with the
target sequence.
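
The weighted averaging can be sketched as follows. This is an illustration only: the function name is hypothetical, and the way the local sequence identity is turned into a numeric weight (here, used directly as a fraction) is an assumption rather than something the text specifies.

```python
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def weighted_position(positions: Sequence[Coord], weights: Sequence[float]) -> Coord:
    """Average the position of one atom over several superimposed templates.

    positions: co-ordinates of the corresponding atom in each template
    weights:   local sequence identity of each template with the target (assumed weighting)
    """
    if len(positions) != len(weights) or not positions:
        raise ValueError("need one weight per template position")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x = sum(w * p[0] for p, w in zip(positions, weights)) / total
    y = sum(w * p[1] for p, w in zip(positions, weights)) / total
    z = sum(w * p[2] for p, w in zip(positions, weights)) / total
    return (x, y, z)
```

With two templates weighted 0.75 and 0.25, the framework atom lies three quarters of the way towards the position found in the more similar template.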

Building non-conserved loops

Following framework generation, loops for which no structural information was
available in the template structures are not defined and therefore must be constructed.
Although most of the known 3D-structures available share no overall similarity with the
template, there may be similarities in the loop regions, and these can be inserted as loop
structure in the new protein model. Using a "spare part" algorithm, one searches for
fragments which could be accommodated onto the framework among the Brookhaven
Protein Data Bank (PDB) entries determined with a resolution better than 2.5 Å. Each loop
is defined by its length and its "stems", namely the alpha carbon (Cα) atom co-ordinates of
the four residues preceding and following the loop. The fragments which correspond to the
loop definition are extracted from the PDB entries and rejected if the root mean square
deviation (rmsd) computed for their "stems" is greater than a specified cut-off value.
Furthermore, only fragments which do not overlap with neighbouring segments should be
retained. The accepted "spare parts" are sorted according to their rmsd, and a Cα framework
based on the five best fragments can be added to the model. In order to ensure that the best
possible fragments are used for loop rebuilding, the rmsd cut-off can be incremented from
0.2 Å onwards until all loops are rebuilt.
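
The fragment selection with a gradually relaxed cut-off can be sketched as below. The sketch assumes that the stem rmsd and the overlap check have already been computed for each candidate fragment. The `Fragment` record, the step size of 0.1 Å and the upper limit of 2.0 Å are hypothetical choices made for illustration; the text only states the starting value of 0.2.

```python
from typing import List, NamedTuple, Sequence


class Fragment(NamedTuple):
    """Hypothetical record for one candidate loop fragment."""
    source: str        # label of the PDB entry the fragment was taken from
    stem_rmsd: float   # rmsd of the fragment stems fitted onto the framework stems (Å)
    overlaps: bool     # True if the fragment overlaps with neighbouring segments


def pick_fragments(
    fragments: Sequence[Fragment],
    start_cutoff: float = 0.2,
    step: float = 0.1,        # assumed increment, not given in the text
    max_cutoff: float = 2.0,  # assumed upper limit, not given in the text
    keep: int = 5,
) -> List[Fragment]:
    """Relax the rmsd cut-off stepwise until at least one fragment is accepted."""
    n_steps = int(round((max_cutoff - start_cutoff) / step))
    for i in range(n_steps + 1):
        cutoff = start_cutoff + i * step
        accepted = [
            f for f in fragments
            if not f.overlaps and f.stem_rmsd <= cutoff
        ]
        if accepted:
            # Sort by stem rmsd and keep the best few, as described in the text.
            return sorted(accepted, key=lambda f: f.stem_rmsd)[:keep]
    return []  # no acceptable fragment even at the loosest cut-off
```

Starting with a tight cut-off means that a loosely fitting fragment is only used when no better fitting one exists for that loop.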

Completing the backbone

Since the loop building only adds Cα atoms, the backbone carbonyl groups and nitrogen
atoms must be completed in these regions. This step can be performed by using a library of
pentapeptide backbone fragments derived from the PDB entries determined with a
resolution better than 2.0 Å. These fragments are then fitted to overlapping runs of five Cα
atoms of the target model. The co-ordinates of each central tripeptide are then averaged for
each target backbone atom (N, C, O) and added to the model. This process yields modelled
backbones that differ from experimental co-ordinates by approx. 0.2 Å rms.

Adding side chains

For many of the protein side chains there is no structural information available in
the templates. These cannot therefore be built during the framework generation and must
be added later. The number of side chains that need to be built is dictated by the degree of
sequence identity between target and template sequences. To this end one uses a table of
the most probable rotamers for each amino acid side chain depending on their backbone
conformation. All the allowed rotamers of the residues missing from the structure are
analysed to see if they are acceptable by a van der Waals exclusion test. The most favoured
rotamer is added to the model. The atoms defining the χ1 and χ2 angles of incomplete side
chains can be used to restrict the choice of rotamers to those fitting these angles. If some
side chains cannot be rebuilt in a first attempt, they will be assigned first in a second
pass. This allows some side chains to be rebuilt even if the most probable allowed rotamer
of a neighbouring residue already occupies some of this portion of space. The latter may
then switch to a less probable but allowed rotamer. If not all of the side chains
can be added, an additional tolerance of 0.15 Å can be introduced in the van der Waals
exclusion test and the procedure repeated.
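
A much simplified version of the rotamer choice is sketched below. It is an assumption-laden illustration: the `Rotamer` record is hypothetical, and the exclusion test uses a single minimum contact distance of 3.0 Å for all atom pairs instead of proper per-atom van der Waals radii. Only the 0.15 Å tolerance is taken from the text.

```python
import math
from typing import List, NamedTuple, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


class Rotamer(NamedTuple):
    """Hypothetical rotamer record: library probability and side-chain atom positions."""
    probability: float
    atoms: List[Coord]


def clashes(
    atoms: Sequence[Coord],
    environment: Sequence[Coord],
    min_distance: float = 3.0,  # assumed uniform contact distance, not from the text
    tolerance: float = 0.0,
) -> bool:
    """Simplified exclusion test: any atom pair closer than min_distance - tolerance."""
    limit = min_distance - tolerance
    return any(math.dist(a, e) < limit for a in atoms for e in environment)


def choose_rotamer(
    rotamers: Sequence[Rotamer],
    environment: Sequence[Coord],
    tolerance: float = 0.0,
) -> Optional[Rotamer]:
    """Return the most probable rotamer that passes the exclusion test, or None."""
    for rot in sorted(rotamers, key=lambda r: r.probability, reverse=True):
        if not clashes(rot.atoms, environment, tolerance=tolerance):
            return rot
    return None
```

If `choose_rotamer` returns `None` for a residue, calling it again with `tolerance=0.15` mirrors the relaxed second attempt described above.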

Model refinement

Idealisation of bond geometry and removal of unfavourable non-bonded contacts can be
performed by energy minimisation with force fields such as CHARMM, AMBER or
GROMOS. The refinement of a primary model should be performed by no more than 100
steps of steepest descent, followed by 200-300 steps of conjugate gradient energy
minimisation. Experience has shown that models optimised by energy minimisation (or
molecular dynamics) usually move away from a control structure. It is thus necessary to
keep the number of minimisation steps to a minimum. Constraining the positions of
selected atoms (such as Cα, or using a B-factor based function) in each residue generally
helps to avoid excessive structural drift during force field computations.
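
The idea of a short, position-restrained minimisation can be illustrated with a toy example. The sketch below is not a force field: it uses a hypothetical one-dimensional chain with harmonic "bonds" plus a harmonic restraint that pulls each point back to its starting position, and it performs plain steepest descent only (no conjugate gradient). All parameter values except the cap of 100 steps are arbitrary assumptions.

```python
from typing import List, Sequence, Tuple


def energy_and_gradient(
    x: Sequence[float],
    reference: Sequence[float],
    bond_length: float,
    k_bond: float,
    k_restraint: float,
) -> Tuple[float, List[float]]:
    """Toy energy: harmonic bonds between neighbours plus harmonic position restraints."""
    energy = 0.0
    grad = [0.0] * len(x)
    for i in range(len(x) - 1):
        d = x[i + 1] - x[i] - bond_length
        energy += 0.5 * k_bond * d * d
        grad[i] -= k_bond * d
        grad[i + 1] += k_bond * d
    for i in range(len(x)):
        d = x[i] - reference[i]
        energy += 0.5 * k_restraint * d * d
        grad[i] += k_restraint * d
    return energy, grad


def steepest_descent(
    start: Sequence[float],
    max_steps: int = 100,      # cap quoted in the text for steepest descent
    step_size: float = 0.02,   # assumed fixed step, small enough to be stable here
    bond_length: float = 1.5,  # assumed ideal spacing of the toy chain
    k_bond: float = 10.0,
    k_restraint: float = 1.0,
) -> List[float]:
    """Run a fixed number of steepest descent steps, restrained to the start positions."""
    x = list(start)
    for _ in range(max_steps):
        _, grad = energy_and_gradient(x, start, bond_length, k_bond, k_restraint)
        x = [xi - step_size * gi for xi, gi in zip(x, grad)]
    return x
```

Increasing `k_restraint` keeps the points closer to their starting positions at the cost of less ideal spacing, which is the trade-off the restraints are meant to control.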

MOLECULAR MODELLING

PROTEIN - Cellular tumor antigen p53

Organism - Homo sapiens (Human)

Methods.

The amino acid sequence of this protein was retrieved from www.pubmed.com and then
submitted to Geno3D. Geno3D returned seven templates, of which two templates with
62.6% identity were selected for modelling. The modelling results were received by
e-mail and analysed further; details are given below.

Amino acid Sequence of protein

MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

This chain was modeled using two templates

Template      Alignment                     Secondary information (Sov)   Identity
Pdb3q05a_0    ali_antheprot ali_clustalw    --                            92.7%
Pdb3ts8A_0    ali_antheprot ali_clustalw    --                            92.7%

Of these two templates, the last one, “pdb3ts8A_0”, was selected.

Model building using software.

Model 1 was selected because it had the lowest energy and the highest number of amino
acids in the core region. A wire-frame model (Fig 2) and a ribbon model showing the
secondary structure with residues (Fig 3) were obtained using Rasmol software.

Fig 2: Wire frame structure of model-1, generated by Rasmol Software

Fig 3: Ribbon model of model-1, generated by Rasmol Software

