Bio 1 25
(1)
INTRODUCTION
DR. IBRAHIM ZAGHLOUL
IMPORTANCE OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
(Molecular) Bio-informatics
Bioinformatics is conceptualizing biology in terms of
molecules (in the sense of physical chemistry)
and then applying "informatics" techniques (derived from
disciplines such as applied mathematics, computer science, and statistics)
to understand and organize the information associated with
these molecules on a large scale.
Bioinformatics is a practical discipline with many applications.
SCALES OF LIFE
ANIMAL CELL
TWO KINDS OF CELLS
• Differences between organisms come from differences in their DNA.
DNA: THE SEQUENCE OF LIFE
GENES AND PROTEINS
https://www.youtube.com/watch?v=gG7uCskUOrA&t=161s
DNA TO PROTEIN
DNA TRANSCRIPTION
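As a minimal sketch of transcription, assuming a gene is given as a plain string over A, C, G, T: the mRNA carries the same letters as the DNA coding strand with T replaced by U, and it is the reverse complement of the template strand. The function names and the example sequence below are illustrative choices, not part of the lecture.

```python
# Complement map for DNA bases (A<->T, C<->G).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcribe(coding_strand: str) -> str:
    """mRNA for a DNA coding strand written 5'->3': same letters, T becomes U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand: str) -> str:
    """mRNA (5'->3') for a DNA template strand that is also written 5'->3'.

    The mRNA is complementary and antiparallel to the template, so we take
    the reverse complement and then swap T for U.
    """
    reverse_complement = template_strand.upper().translate(DNA_COMPLEMENT)[::-1]
    return reverse_complement.replace("T", "U")


coding = "GACTTCGATCAG"    # toy coding strand (assumed example)
template = "CTGATCGAAGTC"  # its reverse complement, i.e. the template strand

print(transcribe(coding))
print(transcribe_template(template))
```

Both calls print the same mRNA, since the two strands describe the same gene.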
MRNA TRANSLATION
CODON TO AMINO ACID TABLE
CODON TO AMINO ACID
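The codon-to-amino-acid lookup can be sketched in a few lines. This is a minimal illustration, assuming the standard genetic code and an mRNA read from its first base with no start-codon search; `CODON_TABLE` and `translate` are names chosen here for the example.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    mrna = mrna.upper().replace("T", "U")  # also accept DNA letters
    protein = []
    for i in range(0, len(mrna) - 2, 3):   # ignore an incomplete trailing codon
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GACUUCGAUCAG"))  # GAC UUC GAU CAG
print(translate("AUGGCCUAAGGG"))  # translation ends at the stop codon UAA
```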
AMINO ACIDS
FROM DNA TO PROTEINS
PROTEIN FOLDING AND STRUCTURE
https://www.rcsb.org/structure/3ERT
GENETIC VARIANT EFFECT PREDICTION
GENETIC VARIATIONS
Causes of variations
1. Mistakes in DNA replication
2. Environmental agents (radiation, chemical agents)
3. Transposable elements (transposons)
• A segment of DNA is moved or copied to another location in the genome
4. Horizontal transfer of DNA
• An organism obtains genetic material from another organism that is not its parent
• Utilized in genetic engineering
GENETIC VARIATIONS CONT.
Types of variations:
1. SNPs (Single Nucleotide Polymorphisms)
2. Indels (Insertions and Deletions)
3. Inversion
4. Duplication
GENETIC VARIATIONS CONT.
SNP (Single Nucleotide Polymorphism)
Ref:    G A C T T C G A T C A
Sample: G A C G T C G A T C A
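A minimal sketch of how the SNP in this example could be found by position-wise comparison, assuming the reference and sample are already aligned and of equal length (no indels). The function name `find_snps` is an illustrative choice.

```python
def find_snps(ref: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(ref) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(ref, sample))
        if r != s
    ]


ref = "GACTTCGATCA"
sample = "GACGTCGATCA"
print(find_snps(ref, sample))  # one T-to-G substitution at position 4
```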
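GENETIC VARIATIONS
Non-synonymous SNP (nsSNP) and synonymous SNP (sSNP)
        DNA                        Protein
Ref:    G A C T T C G A T C A G    D F D Q
Sample: G A C G T C G A T C A A    D V D Q
The T to G change at position 4 alters the amino acid (F to V), so it is non-synonymous.
The G to A change at position 12 leaves the amino acid unchanged (Q), so it is synonymous.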
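The distinction can be sketched in code: translate the affected codon before and after the change and compare the amino acids. This is a minimal illustration, assuming the standard genetic code, equal-length sequences, and a reading frame that starts at the first base; `classify_snps` is a name chosen for the example.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snps(ref: str, sample: str):
    """Label each substitution as synonymous or non-synonymous.

    Returns tuples of (1-based position, ref codon, sample codon,
    ref amino acid, sample amino acid, label).
    """
    results = []
    for i, (r, s) in enumerate(zip(ref, sample)):
        if r == s:
            continue
        start = i - i % 3                      # first base of the affected codon
        ref_codon = ref[start:start + 3]
        alt_codon = sample[start:start + 3]
        if len(ref_codon) < 3:
            continue                           # incomplete trailing codon
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
        results.append((i + 1, ref_codon, alt_codon, ref_aa, alt_aa, label))
    return results


for row in classify_snps("GACTTCGATCAG", "GACGTCGATCAA"):
    print(row)
```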
GENETIC VARIATIONS
Indels (Insertion-Deletion)
Insertion
Ref:    G A C T - - - - T C G
Sample: G A C T C G A T T C G
GENETIC VARIATIONS CONT.
Deletion
Ref:    G A C T T C G A T C A
Sample: G A C - - C G A T C A
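Both indel examples can be read off a gapped alignment: a run of "-" in the reference row is an insertion in the sample, and a run of "-" in the sample row is a deletion. The sketch below assumes the two rows are already aligned to the same length and that no column has a gap in both rows; `call_indels` is an illustrative name.

```python
def call_indels(ref_aln: str, sample_aln: str):
    """Report indels from two gapped, equal-length alignment rows.

    Returns tuples of (kind, reference position, bases):
      - insertion: bases inserted after the given 1-based reference position
      - deletion: reference bases removed, starting at the given 1-based position
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("alignment rows must have the same length")
    events = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        if ref_aln[i] == "-":
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            events.append(("insertion", ref_pos, sample_aln[i:j]))
            i = j
        elif sample_aln[i] == "-":
            j = i
            while j < n and sample_aln[j] == "-":
                j += 1
            events.append(("deletion", ref_pos + 1, ref_aln[i:j]))
            ref_pos += j - i
            i = j
        else:
            ref_pos += 1
            i += 1
    return events


print(call_indels("GACT----TCG", "GACTCGATTCG"))  # insertion example
print(call_indels("GACTTCGATCA", "GAC--CGATCA"))  # deletion example
```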
GENETIC VARIATIONS CONT.
Inversion
Ref:    G A C T T C G A T C A
Sample: G A C G C T T A T C A
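A minimal sketch of checking for the kind of inversion shown in this example, where a stretch of the reference appears reversed in the sample. It assumes equal-length sequences with a single inverted segment and uses plain reversal as in the example; in real genomes an inverted segment usually appears as the reverse complement on the same strand, which this sketch does not handle. `find_inversion` is an illustrative name.

```python
def find_inversion(ref: str, sample: str):
    """Return (start, end, reference segment) if the mismatching span of the
    sample is the reversed reference segment, else None. Positions are 1-based
    and inclusive."""
    if len(ref) != len(sample):
        raise ValueError("sequences must have the same length")
    mismatches = [i for i, (r, s) in enumerate(zip(ref, sample)) if r != s]
    if not mismatches:
        return None
    start, end = mismatches[0], mismatches[-1] + 1
    segment = ref[start:end]
    if segment[::-1] == sample[start:end]:
        return (start + 1, end, segment)
    return None


print(find_inversion("GACTTCGATCA", "GACGCTTATCA"))  # TTCG reversed to GCTT
```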
GENETIC VARIATIONS CONT.
Duplication
Ref:   G A C G T C G A T C A
ReSeq: G A C G T C G T C C A
DNA: THE SEQUENCE OF LIFE
PROTEINS
• Proteins are strings of amino acids, typically a few hundred letters long, where each
letter is one of the 20 standard amino acids.
• Bacteria make from about 500 to several thousand proteins, while the human genome has
roughly 20,000 protein-coding genes, which are commonly estimated to give rise to around
100,000 distinct proteins.
• To a first approximation, each protein is encoded by a gene. A gene is a segment of DNA.
• Every three adjacent nucleotides in the coding sequence of a gene specify one amino acid
of the corresponding protein.
• Three adjacent nucleotides are called a codon. There are 4^3 = 64 possible codons.
• Since 64 > 20, several different codons may encode the same amino acid.
• There are also special codons called stop codons, which mark the end of the
protein.
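The codon arithmetic above can be checked directly. The sketch below, assuming the standard genetic code, enumerates all 4^3 codons, lists the stop codons, and counts how many codons encode each amino acid; the helper `split_codons` and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(c) for c in product(BASES, repeat=3)]
table = dict(zip(codons, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in table.items() if aa == "*")
degeneracy = Counter(aa for aa in table.values() if aa != "*")


def split_codons(seq: str) -> list[str]:
    """Cut a coding sequence into consecutive codons, dropping a partial tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


print(len(codons))               # number of possible codons
print(stop_codons)               # codons that end the protein
print(len(degeneracy))           # number of distinct amino acids encoded
print(degeneracy.most_common(3)) # amino acids with the most codons
print(split_codons("GACTTCGATCAG"))
```

The counts show the redundancy directly: 61 sense codons are shared among 20 amino acids, so most amino acids have more than one codon.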
BIOLOGICAL SEQUENCES
OBTAINING SEQUENCES
Gene Sequences
www.ensembl.org
https://www.ncbi.nlm.nih.gov/
OBTAINING SEQUENCES
Protein Sequences
https://www.uniprot.org/
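Sequences downloaded from these databases are commonly provided in FASTA format: a header line starting with ">" followed by one or more lines of sequence. The sketch below is a minimal parser for such text; the record names and sequences in `example` are made-up toy data, not real database entries, and `parse_fasta` is an illustrative name.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header (without '>'): sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)   # sequence may span several lines
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical toy records, for illustration only.
example = """>toy_protein_1 hypothetical example record
DFDQ
DVDQ
>toy_gene_1 hypothetical example record
GACTTCGATCAG
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq), seq)
```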