Structural bioinformatics

The document provides an overview of structural bioinformatics, focusing on protein structures, their determination methods, and the significance of their shapes in biological functions. It discusses various aspects such as the Protein Data Bank (PDB), visualization techniques, and the interactions that stabilize protein structures. Additionally, it highlights the importance of understanding protein structures for drug design and disease mechanisms.

Structural bioinformatics

Dr. Zsuzsanna Dosztányi

dosztanyi@caesar.elte.hu

7 October 2019
Basic features of protein structures
Structure determination methods
PDB database
Visualization of structures
Analysis of protein structures
Practical
PDB
Visualization of protein structures
• “It has not escaped our notice that the specific pairing we
have postulated immediately suggests a possible copying
mechanism for the genetic material.”
Wide range of functions of proteins
Amino acid and peptide bond
Each amino acid has an amino group, a carboxyl group and a side chain.
Amino acid 1 + amino acid 2 → dipeptide: the peptide bond forms by condensation (release of a water molecule).
Many amino acids joined in this way form a polypeptide chain.
20 Amino acids
How many different sequences can a
100 amino acid long protein have?

20^100 ≈ 10^130

The number of protons in the observable universe is around 10^80

Proteins are usually longer

The longest one is around 30 000 AA
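
The size of this sequence space can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import math

ALPHABET_SIZE = 20   # standard amino acids
LENGTH = 100         # residues in the example protein

# Python integers are arbitrary precision, so 20^100 is computed exactly.
n_sequences = ALPHABET_SIZE ** LENGTH

# log10(20^100) = 100 * log10(20), the exponent of the power of ten.
exponent = LENGTH * math.log10(ALPHABET_SIZE)

print(f"20^100 has {len(str(n_sequences))} decimal digits")
print(f"20^100 is roughly 10^{exponent:.1f}")
```

The exponent comes out slightly above 130, which is why the count is quoted as about 10^130.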
Globular proteins

They adopt a well-defined compact structure
Why are protein structures
interesting?
Function is heavily dependent on the shape
of the protein

Atomic-level understanding of biological processes

(DNA, RNA, enzymes, hormones, receptors)
Understanding the molecular basis of diseases
Drug design, protein-drug interactions
“Protein structure is the three-dimensional
arrangement of atoms in a protein molecule”
Quaternary structure
Myoglobin: monomer
Hemoglobin: multi-subunit protein complex
- Homo- and hetero-oligomers
Interactions stabilizing proteins

Hydrogen bond between side chains and the main chain
Hydrogen bond between side chains
Ionic bond
Hydrophobic effect
Van der Waals interaction
Disulphide bridge
Van der Waals interaction

Acts between every pair of atoms

Has an attractive and a repulsive term

VdW radius
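
One common way to model an attractive plus a repulsive term is the 12-6 Lennard-Jones potential. The slide does not name a functional form, so this choice is an assumption, and the well depth `epsilon` and optimal distance `r_min` below are placeholder values rather than parameters from any force field.

```python
def lennard_jones(r, epsilon=0.2, r_min=3.8):
    """12-6 Lennard-Jones energy at distance r (in Å).

    epsilon (well depth, kcal/mol) and r_min (distance of the energy
    minimum, Å) are illustrative placeholders, not fitted parameters.
    """
    ratio = r_min / r
    # ratio**12 is the short-range repulsion, -2*ratio**6 the attraction.
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


for r in (3.0, 3.8, 5.0, 8.0):
    print(f"r = {r:4.1f} Å   E = {lennard_jones(r):8.4f} kcal/mol")
```

Below `r_min` the repulsive term dominates and the energy rises steeply; beyond it the energy is weakly negative and decays towards zero.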
Water molecules
Hydrophobic effect

Hydrophobic effect: dominated by entropic factors

Hydrophobicity
Hydrophobic
residues

Hydrophilic residues

Protein in isolation vs. protein in an aqueous environment
H-bond

A hydrogen bond forms when a hydrogen atom that is covalently bound to a highly
electronegative atom (F, N, O) is attracted to an electronegative atom of a different
functional group, i.e. the hydrogen atom forms a bridge between two other atoms.
H-bonds in biological systems
Electrostatic interactions

- ε is the dielectric constant
- it characterizes how much the charges are shielded by the environment
- ε is 1 in vacuum, around 4 inside a protein, and about 80 in water
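
The effect of ε can be illustrated with Coulomb's law. The sketch below assumes point charges in units of the elementary charge and distances in Å; the conversion constant 332.06 kcal·Å/(mol·e²) is the commonly used value, and the function name is illustrative.

```python
COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2), converts e^2/Å to kcal/mol


def coulomb_energy(q1, q2, r, epsilon):
    """Coulomb energy (kcal/mol) of charges q1, q2 (in e) at distance r (Å)
    in a medium with dielectric constant epsilon."""
    return COULOMB_CONSTANT * q1 * q2 / (epsilon * r)


# An ion pair (+1 / -1) at 4 Å in the three environments named on the slide.
for name, eps in (("vacuum", 1.0), ("protein interior", 4.0), ("water", 80.0)):
    energy = coulomb_energy(+1.0, -1.0, 4.0, eps)
    print(f"{name:17s} eps = {eps:4.0f}   E = {energy:8.2f} kcal/mol")
```

The same ion pair is 80 times weaker in water than in vacuum, which is why buried salt bridges contribute more than solvent-exposed ones.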
Strength of interactions
Main chain conformation
N-terminal, C-terminal

Torsion angle: clockwise (+), counter-clockwise (-)
The main chain φ and ψ torsion angles of a protein cannot take arbitrary values,
there are preferred conformations.
Rotation around the N-Cα bond: phi (φ)
Rotation around the Cα-C' bond: psi (ψ)
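
A torsion angle is defined by four consecutive atoms. The sketch below computes the signed angle from four 3D points in plain Python; the helper names are illustrative. For φ the four atoms are C' of the previous residue, N, Cα and C'; for ψ they are N, Cα, C' and N of the next residue.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometry: central bond along z, outer atoms 90 degrees apart.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

With this convention, a clockwise rotation of the front bond onto the rear bond, viewed along the central bond, gives a positive angle.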
Ramachandran plot

● Using the φ, ψ angles we can evaluate a structure
● Each secondary structure has a distinct region
● Glycines and prolines are not represented, they have special conformational preferences
The (right-handed) α helix
Approx. 30% of globular proteins

5-40 residues in length (10 on average)

Individual H-bonds are relatively weak, but together they contribute significantly to helix stability
β sheet conformation
Approx. 30% of globular proteins

Strands of 5-10 residues run alongside each other (parallel or antiparallel)

Strands are held together by H-bonds
Loops and turns
Typically hydrophilic in character. They occur in the outer regions of the protein
and form H-bonds with water and other molecules

Often form binding regions and active sites in enzymes and receptors
Domains: many proteins feature distinct compact
structural units

Src protein kinase

Domains
● Compact units with globular-like structures
● Domains are basic building blocks of proteins
● Typically fulfill a well-specified function
● Can appear in various biological contexts
SH2 domain
How to find a specific structure?
• Database
• Worldwide Protein Data Bank (wwPDB)
• 3 entry points:
  • PDB Europe
  • PDB Japan
  • RCSB PDB
• Easy to use
• Search by name, ID
• Direct links
• Well maintained
Where does a structure come from?

X-ray crystallography

NMR

Electron microscopy
Where do these structures come from?

Most structures are solved by X-ray crystallography (86%)
X-ray crystallography
X-ray:

- X-rays have short wavelengths (approx. 1.5 Å), which are needed to measure typical atom-atom distances
- gives information about electron density; the model has to be fitted into it
- crystallization artefacts
- non-physiological environment
- no information on hydrogens
NMR

- in solution
- usually yields a structural ensemble that fulfills the distance constraints
- only small proteins
- less precise model
- usable for flexible proteins as well
Cryo-EM (atomic resolution)
Nobel Prize in Chemistry 2017
PDB statistics
PDB ID: unique identifier

Each atomic coordinate file in the Protein Data Bank has a unique identifier
composed of exactly 4 characters. The first one is always a number, the rest
can be either a number or a letter.

There are over 400,000 possible 4-character PDB IDs (419,904, or 466,560 if "0"
can also be the first character).

Examples:
• 1mbn - 1973, the first protein structure model, myoglobin
• 1tna - 1975, the first RNA structure, yeast phenylalanine transfer RNA
• 1bna - 1980, the first B-DNA double helix structure (determined by X-ray
crystallography, 27 years after the theoretical structural model proposed by
Watson & Crick in 1953)
• 2hhd - human hemoglobin, (deoxy form)
• 9ins - insulin
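
The two counts quoted above, and the ID rule itself, can be reproduced in a few lines. The regular expression and the function name are illustrative assumptions that encode only the rule stated here (four characters, the first a digit); they are not an official validator.

```python
import re
import string

ALPHANUMERIC = string.digits + string.ascii_uppercase   # 36 possible characters

# First character 1-9 or 0-9, then three free alphanumeric characters.
ids_first_1_to_9 = 9 * len(ALPHANUMERIC) ** 3
ids_first_0_to_9 = 10 * len(ALPHANUMERIC) ** 3
print(ids_first_1_to_9, ids_first_0_to_9)

# Hypothetical pattern for the classic 4-character ID format.
PDB_ID_PATTERN = re.compile(r"[0-9][0-9A-Za-z]{3}")


def looks_like_pdb_id(text):
    """True if text has the shape of a 4-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(text) is not None


for candidate in ("1mbn", "1tna", "2hhd", "9ins", "mbn1", "1mbn2"):
    print(candidate, looks_like_pdb_id(candidate))
```

A matching shape does not guarantee that an entry with that ID exists in the database.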
The .pdb file format
HEADER EXTRACELLULAR MATRIX 22-JAN-98 1A3I
TITLE X-RAY CRYSTALLOGRAPHIC DETERMINATION OF A COLLAGEN-LIKE
TITLE 2 PEPTIDE WITH THE REPEATING SEQUENCE (PRO-PRO-GLY)
...
EXPDTA X-RAY DIFFRACTION
AUTHOR R.Z.KRAMER,L.VITAGLIANO,J.BELLA,R.BERISIO,L.MAZZARELLA,
AUTHOR 2 B.BRODSKY,A.ZAGARI,H.M.BERMAN
...
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C
REMARK 350 BIOMT1 1 1.000000 0.000000 0.000000 0.00000
REMARK 350 BIOMT2 1 0.000000 1.000000 0.000000 0.00000
...
SEQRES 1 A 9 PRO PRO GLY PRO PRO GLY PRO PRO GLY
SEQRES 1 B 6 PRO PRO GLY PRO PRO GLY
SEQRES 1 C 6 PRO PRO GLY PRO PRO GLY
...
ATOM 1 N PRO A 1 8.316 21.206 21.530 1.00 17.44 N
ATOM 2 CA PRO A 1 7.608 20.729 20.336 1.00 17.44 C
ATOM 3 C PRO A 1 8.487 20.707 19.092 1.00 17.44 C
ATOM 4 O PRO A 1 9.466 21.457 19.005 1.00 17.44 O
ATOM 5 CB PRO A 1 6.460 21.723 20.211 1.00 22.26 C
...
HETATM 130 C ACY 401 3.682 22.541 11.236 1.00 21.19 C
HETATM 131 O ACY 401 2.807 23.097 10.553 1.00 21.19 O
HETATM 132 OXT ACY 401 4.306 23.101 12.291 1.00 21.19 O
...
The .pdb file format (fields of an ATOM record)

atom number and name
residue type, chain ID, residue number
3D coordinates
occupancy
B-factor
atom type
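
In a real .pdb file these fields sit in fixed character columns, so records are read by slicing rather than by splitting on whitespace. The sketch below follows the column layout of the PDB format for ATOM/HETATM records. The function name is illustrative, and the example line is rebuilt with its column padding from the first ATOM record of the 1A3I excerpt, because the spacing in the excerpt itself was collapsed.

```python
def parse_atom_record(line):
    """Slice one ATOM/HETATM line using the fixed PDB columns."""
    return {
        "record": line[0:6].strip(),       # columns 1-6
        "serial": int(line[6:11]),         # columns 7-11, atom number
        "atom_name": line[12:16].strip(),  # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20, residue type
        "chain_id": line[21].strip(),      # column 22
        "res_seq": int(line[22:26]),       # columns 23-26, residue number
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
        "occupancy": float(line[54:60]),   # columns 55-60
        "b_factor": float(line[60:66]),    # columns 61-66
        "element": line[76:78].strip(),    # columns 77-78, atom type
    }


# Rebuild the padded form of: ATOM 1 N PRO A 1 8.316 21.206 21.530 1.00 17.44 N
example = (
    f"{'ATOM':<6}{1:>5} {'N':^4} {'PRO':>3} {'A'}{1:>4}    "
    f"{8.316:8.3f}{21.206:8.3f}{21.530:8.3f}{1.00:6.2f}{17.44:6.2f}"
    f"{'':10}{'N':>2}"
)

record = parse_atom_record(example)
print(record)
```

Slicing by column keeps working when a field is empty (for example a missing chain ID) or when two numbers touch without a separating space, both of which break whitespace splitting.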
Model
All protein structures are models! Structures are not directly measured, but are
generated as models that best fit the collected experimental data.
How good is a structure?
What is the resolution of the collected data? (X-ray only)

R-factor/Free R-factor (X-ray)
How well does the model fit the experimental measurements?

There is no standard for NMR structures

Ramachandran plot
Geometry and stereochemistry

How similar are these to known high-quality structures?
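
The R-factor compares observed structure-factor amplitudes with those calculated from the model: R = Σ| |F_obs| - |F_calc| | / Σ|F_obs|. The sketch below implements this definition directly; it assumes the two lists are already on a common scale, and the amplitudes are made-up numbers for illustration only.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|) over all reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, not taken from any real data set.
observed = [100.0, 50.0, 25.0, 10.0]
calculated = [90.0, 55.0, 20.0, 12.0]
print(f"R = {r_factor(observed, calculated):.3f}")
```

The free R-factor uses the same formula, but only over a small set of reflections that were kept out of refinement, which makes it a check against overfitting.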

Resolution (X-ray)

• Describes the reliability of the determined atomic coordinates

Very low: >4.0 Å
Individual coordinates cannot be interpreted
Low: 3.0-4.0 Å
The fold is recognizable
Average: 1.8-3.0 Å
The majority of the structure is correct, with incorrect rotamers and unreliable
surface loop conformations
Good: 1.0-1.8 Å
Atomic level: <1.0 Å

Resolution can change for each position!
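
The resolution classes listed above can be turned into a small lookup. The function name is illustrative, and because the ranges on the slide share their end points (3.0 Å and 1.8 Å), the sketch assumes that a boundary value falls into the poorer class.

```python
def resolution_category(resolution):
    """Map an X-ray resolution in Å to the classes used on the slide.

    Assumption: shared boundary values (3.0, 1.8, 1.0) go to the poorer class.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    if resolution > 4.0:
        return "very low"
    if resolution >= 3.0:
        return "low"
    if resolution >= 1.8:
        return "average"
    if resolution >= 1.0:
        return "good"
    return "atomic level"


for value in (4.5, 3.2, 2.1, 1.5, 0.9):
    print(f"{value:.1f} Å -> {resolution_category(value)}")
```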
Resolution
Cross-references with other
databases
Visualizing protein structures
1. PDB coordinate file
2. Visualization program
E.g.: RasMol, PyMOL, Chimera, VMD, Jmol, Swiss-PdbViewer
3. Computer
4. Molecule image
The inside of the protein is tightly
packed
Hydrophobic core
Hydrophobic side chains go into the core of the
molecule – but the main chain is highly polar.
The polar groups (C=O and NH) are neutralized
through formation of H-bonds.

Myoglobin

surface buried
Secondary structures are stabilized
by H-bonds
Secondary structure determination
Can be based on:
H-bond patterns
Dihedral angles

Automatic determination using algorithms:

DSSP
STRIDE

Output: 3 categories (alpha, beta, coil) or more (e.g. turn, other helix types)

The methods do not agree 100%

Globin evolution
Similarity between two
structures

Superposition: minimizing distances between positions

RMSD

The most commonly used function for measuring structural similarity

RMSD is the root-mean-square distance between equivalent atoms of
superimposed structures
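
A minimal sketch of the RMSD calculation, assuming the two structures are already superimposed and their atoms are listed in the same order. Finding the optimal superposition is a separate step that is not shown here, and the function name and coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between equivalent atoms.

    coords_a and coords_b are equally long lists of (x, y, z) tuples that
    are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        total += sum((x - y) ** 2 for x, y in zip(a, b))   # squared distance
    return math.sqrt(total / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]   # every atom moved by 1 Å
print(rmsd(reference, shifted))
```

Because the distances are squared before averaging, a few badly fitting atoms raise the RMSD more than they would raise a plain average distance.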
Structures of evolutionarily related
proteins are usually similar
1ebhA: enolase
1mns: mandelate racemase

Sequence identity: 25%

Active center is very similar
Similar chemical reactions
Different substrates

The structure is usually more conserved than the sequence

Structural classification
We can group similar and evolutionarily related
protein structures using classification
Example
CATH
http://www.cathdb.info/
SCOP
http://scop2.mrc-lmb.cam.ac.uk/
Structural classification

CATH
Fold - topology

Proteins belonging to the same fold contain roughly the same
secondary structure elements in the same order and in a similar
spatial configuration.
Homologous and analogous structures
Homologous proteins evolved from a common
ancestor via divergence, and share the same
fold
Analogous proteins share the same fold but do
not have an evolutionary relationship (or it is
undetectable)
Some folds are more common (due to physical
effects)
The number of folds is limited
Membrane proteins

Important for:
Energy production
Transport
Cell-cell junction
Signaling

Drug targets
Hydrophobicity of membrane proteins

Aquaporin (cross-section)
Structure determination of
transmembrane proteins

Approx. 2% of
PDB structures
Conformational changes
Proteins are dynamic molecules

X-ray: B-factor, missing structure parts
NMR: structural variability

tegniddsliggnasaegpegegtestv

Missing regions in the protein structure (X-ray)
NMR structures with high structural variability
P53 protein

For these regions, structures are available only in complexes. Why?
Intrinsically disordered protein
● Proteins that do not form a well-defined three-dimensional structure
● More than 50% of human proteins are either fully or partially disordered
● Typical functions: regulation, apoptosis, signal transduction, stress response
● Malfunctions can lead to diseases
- p53 → cancer
- τ protein → Alzheimer's disease
- α-Synuclein → Parkinson's disease
Ensemble characterization for IDPs
Experimental methods cannot detect a single conformation, only
time or ensemble averages

A combination of methods is needed (NMR, SAXS) + molecular dynamics/modelling

These methods are used to characterize:

Radius of gyration
Transient secondary structure elements
Transient long-range contacts
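
The radius of gyration measures how compact a conformation is: it is the root-mean-square distance of the atoms from their centroid. The sketch below assumes equal weights for all atoms (no mass weighting) and uses made-up coordinates; for an ensemble, the value would be computed per conformation and then averaged.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    if not coords:
        raise ValueError("need at least one coordinate")
    n = len(coords)
    centroid = tuple(sum(point[i] for point in coords) / n for i in range(3))
    mean_square = sum(
        sum((point[i] - centroid[i]) ** 2 for i in range(3)) for point in coords
    ) / n
    return math.sqrt(mean_square)


# Hypothetical coordinates: the same four points, compact and stretched out.
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(radius_of_gyration(compact), radius_of_gyration(extended))
```

An extended, disordered chain gives a larger value than a compact globule of the same length.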
Extended structure-function paradigm
