
Bio 1 25

The document provides an overview of bioinformatics and its significance in analyzing biological data through computational methods. It discusses the structure and function of DNA, genes, and proteins, as well as the processes of transcription and translation. Additionally, it highlights genetic variations and the importance of understanding biological sequences for research and applications in molecular biology.

Uploaded by

mahmoudweso2003

BIOINFORMATICS (BIOCOMPUTING)

(1)
INTRODUCTION
DR. IBRAHIM ZAGHLOUL
IMPORTANCE OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY

Computational methods are used to study biological data in order to:

- Handle the massive amount of biological data.
- Model the rules of biological processes.
- Identify important patterns.
- Visualize operations and data.
- Create simulations of biological processes.
- Predict effects and behaviors.
- Suggest mechanisms and actions.
- Work efficiently, reducing experimental time and resources.
- Identify higher-order relationships.

• A general introduction
 – What problems are people working on?
 – How do people solve these problems?
 – What key computational techniques are needed?
 – How much help has computing provided to biological research?

• A way of thinking: tackling “biological problems” computationally
 – How to look at a “biological problem” from a computational point of view?
 – How to formulate a computational problem to address a biological issue?
 – How to collect statistics from biological data?
 – How to build a “computational” model?
 – How to solve a computational modeling problem?
 – How to test and evaluate a computational algorithm?
WHAT IS BIOINFORMATICS?

• (Molecular) bioinformatics
• Bioinformatics is conceptualizing biology in terms of molecules (in the sense of physical chemistry) and then applying “informatics” techniques (derived from disciplines such as applied math, CS, and statistics) to understand and organize the information associated with these molecules on a large scale.
• Bioinformatics is a practical discipline with many applications.
SCALES OF LIFE ANIMAL CELL

TWO KINDS OF CELLS

• Prokaryotes – no nucleus (bacteria)
 – Their genomes are circular.
• Eukaryotes – have a nucleus (animals, plants)
 – Linear genomes with multiple chromosomes in pairs. When pairing up, they look like [figure not preserved].

• Differences between organisms come from differences in their DNA.
DNA: THE SEQUENCE OF LIFE

• What is DNA? It is a long molecule that contains our unique genetic code.
• It holds the instructions for making all the proteins in our bodies.
• It is made of a chemical called deoxyribonucleic acid, or DNA for short.
• DNA contains four basic building blocks or ‘bases’: adenine (A), cytosine (C), guanine (G) and thymine (T).
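
As a minimal sketch of what the four-letter alphabet means computationally, a DNA molecule can be held as a plain string and its base composition counted. The helper name `base_composition` is hypothetical, and the example string is the short reference sequence used in the genetic variation slides of this deck.

```python
from collections import Counter

DNA_BASES = set("ACGT")

def base_composition(dna: str) -> dict:
    """Count how often each of the four bases occurs in a DNA string."""
    dna = dna.upper()
    unknown = set(dna) - DNA_BASES
    if unknown:
        raise ValueError(f"not DNA bases: {sorted(unknown)}")
    counts = Counter(dna)
    # Counter returns 0 for bases that never occur.
    return {base: counts[base] for base in "ACGT"}

composition = base_composition("GACTTCGATCA")
print(composition)
```

Rejecting letters outside {A, C, G, T} is a design choice for this sketch; real sequence files often contain additional symbols for uncertain bases.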
MOLECULAR BIOLOGY INFORMATION - DNA

GENES AND PROTEINS

• Transcription: The enzyme RNA polymerase uses DNA as a template to produce a pre-mRNA transcript, which is then processed into a mature mRNA.
• Translation: The mRNA is translated to build the protein molecule (polypeptide) encoded by the original gene.
• The ribosome reads the mRNA to produce a chain of amino acids.

https://www.youtube.com/watch?v=gG7uCskUOrA&t=161s
DNA TO PROTEIN

DNA TRANSCRIPTION

• In eukaryotes, transcription takes place in the nucleus.
• It uses DNA as a template to make a messenger RNA (mRNA) molecule.
• During transcription, a strand of mRNA is made that is complementary to one strand of the DNA (the template strand).
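
The complementarity described above can be sketched in a few lines. This is a simplified illustration, not a model of the real process: it pairs bases position by position, ignores strand direction and mRNA processing, and the function names `transcribe_template` and `coding_to_mrna` are hypothetical. The input is the bottom strand of the two-strand example shown later in this deck.

```python
# DNA template base -> RNA base paired with it (RNA uses U where DNA uses T).
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe_template(template: str) -> str:
    """Build the mRNA that pairs base by base with a DNA template strand."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template.upper())

def coding_to_mrna(coding: str) -> str:
    """Shortcut: the mRNA matches the non-template strand with T replaced by U."""
    return coding.upper().replace("T", "U")

mrna = transcribe_template("TACAGTCCG")
print(mrna)
```

Both routes give the same string here, which is why sequence databases usually report the non-template strand: it reads like the mRNA.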

MRNA TRANSLATION

• Translation is the process by which a protein is synthesized from the information contained in a messenger RNA (mRNA).
• During translation, an mRNA sequence is read using the genetic code, a set of rules that defines how an mRNA sequence is translated into the 20-letter code of amino acids.
• Amino acids are the building blocks of proteins.
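
A minimal sketch of reading an mRNA with the genetic code follows. It assumes the standard genetic code, written compactly as a 64-letter string in U, C, A, G order, and assumes the input already starts at the first codon; the names `CODON_TABLE` and `translate` are illustrative, not from the slides.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna: str) -> str:
    """Read an mRNA three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

protein = translate("AUGUCAGGCUGA")
print(protein)
```

The result uses one-letter amino acid codes, so a protein is again just a string, this time over a 20-letter alphabet.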
GENES AND PROTEINS

• In the simplest picture, one gene encodes one protein.
• Like a program, a gene starts with a start codon (e.g. ATG); each following group of three nucleotides (a codon) then codes for one amino acid, and a stop codon (e.g. TGA) marks the end of the gene.
• Sometimes, in the middle of a (eukaryotic) gene, there are introns, which are spliced out of the pre-mRNA transcript (as "junk"). The parts that are kept are called exons. Locating genes and their exons in a genome is the task of gene finding.
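
The "program" analogy can be sketched as a toy gene finder: join the exons, then look for a start codon followed in steps of three by a stop codon. This is a deliberately naive illustration; the function names, the example sequence, and the exon coordinates are made up, and real gene finding relies on statistical models rather than a plain scan.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def splice(gene: str, exons: list) -> str:
    """Join the exon intervals (0-based, end-exclusive) and drop the introns."""
    return "".join(gene[start:end] for start, end in exons)

def find_orf(dna: str):
    """Return the first ATG...stop stretch read in steps of three, or None."""
    start = dna.find("ATG")
    while start != -1:
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None

gene = "ATGGTAAGTCAGTGA"                    # hypothetical gene with one intron
mature = splice(gene, [(0, 3), (9, 15)])    # keep exons, drop gene[3:9]
orf = find_orf("CC" + mature + "TT")        # flanking bases are ignored
print(mature, orf)
```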

CODON TO AMINO ACID TABLE

CODON TO AMINO ACID

GENES AND PROTEINS

• Amino acids: the building blocks of proteins. There are 20 different amino acids.
• A protein is composed of one or more long chains of amino acids whose sequence corresponds to the DNA sequence of the encoding gene.
• Proteins are molecules found in all living things.
• Proteins play a variety of important roles in the body.

https://www.youtube.com/watch?v=gG7uCskUOrA&t=161s
AMINO ACIDS

FROM DNA TO PROTEINS

PROTEIN FOLDING AND STRUCTURE

• A protein's amino acid sequence folds into a 3D structure.
• Folding: the physical process in which the 3D structure of a protein is formed.
• The folded conformation is usually the biologically functional one.
• Function is related to structure: different structures imply different functions.
[Wikipedia]
PROTEIN STRUCTURE

https://www.rcsb.org/structure/3ERT

GENETIC VARIANT EFFECT PREDICTION

GENETIC VARIATIONS
Causes of variations
1. Mistakes in DNA replication
2. Environmental agents (radiation, chemical agents)
3. Transposable elements (transposons)
   • A part of the DNA is moved or copied to another location in the genome
4. Horizontal transfer of DNA
   • An organism obtains genetic material from another organism that is not its parent
   • Utilized in genetic engineering
GENETIC VARIATIONS CONT.
Types of variations:
1. SNPs (Single Nucleotide Polymorphisms)
2. Indels (Insertion-Deletion)
3. Inversion
4. Duplication

GENETIC VARIATIONS CONT.
SNP (Single Nucleotide Polymorphism)

Ref     G A C T T C G A T C A
Sample  G A C G T C G A T C A
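
A short sketch of how such single-letter differences can be located, assuming the reference and the sample have the same length and are already aligned. The name `find_snps` is hypothetical; positions are reported 1-based.

```python
def find_snps(ref: str, sample: str) -> list:
    """List (position, ref_base, sample_base) for every mismatching column."""
    if len(ref) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(ref, sample))
        if r != s
    ]

snps = find_snps("GACTTCGATCA", "GACGTCGATCA")
print(snps)
```

For the pair above this reports a single T to G change at position 4.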

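GENETIC VARIATIONS CONT.
Non-synonymous SNP (nsSNP) and synonymous SNP (sSNP)

        DNA (codons)     Protein
Ref     GAC TTC GAT CAG  D F D Q
Sample  GAC GTC GAT CAA  D V D Q

The T to G change in the second codon (TTC to GTC) replaces F with V, so it is non-synonymous. The G to A change in the fourth codon (CAG to CAA) leaves Q unchanged, so it is synonymous.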
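
The distinction between the two SNP types can be checked by translating each changed codon. The sketch below assumes the standard genetic code and that both sequences are in frame from their first letter; `CODON_TABLE` and `classify_codon_changes` are illustrative names.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_codon_changes(ref: str, sample: str) -> list:
    """Compare two in-frame sequences codon by codon and label each change."""
    if len(ref) != len(sample):
        raise ValueError("sequences must have the same length")
    changes = []
    for i in range(0, len(ref) - 2, 3):
        ref_codon, sample_codon = ref[i:i + 3], sample[i:i + 3]
        if ref_codon == sample_codon:
            continue
        ref_aa, sample_aa = CODON_TABLE[ref_codon], CODON_TABLE[sample_codon]
        kind = "synonymous" if ref_aa == sample_aa else "non-synonymous"
        changes.append((ref_codon, sample_codon, ref_aa, sample_aa, kind))
    return changes

changes = classify_codon_changes("GACTTCGATCAG", "GACGTCGATCAA")
for change in changes:
    print(change)
```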

GENETIC VARIATIONS
Indels (Insertion-Deletion)
Insertion

Ref     G A C T - - - - T C G
Sample  G A C T C G A T T C G
GENETIC VARIATIONS CONT.
Deletion

Ref     G A C T T C G A T C A
Sample  G A C - - C G A T C A
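
Given two rows of a gapped alignment like the insertion and deletion examples, indels can be read off from the runs of `-`. This is a sketch under the assumption that the alignment has already been computed; `find_indels` is a hypothetical name, and positions are 0-based alignment columns.

```python
def find_indels(ref_row: str, sample_row: str) -> list:
    """Report (kind, start_column, bases) for each run of gaps in an alignment."""
    if len(ref_row) != len(sample_row):
        raise ValueError("alignment rows must have the same length")
    events = []
    i, n = 0, len(ref_row)
    while i < n:
        if ref_row[i] == "-":
            gap_row, other_row, kind = ref_row, sample_row, "insertion"
        elif sample_row[i] == "-":
            gap_row, other_row, kind = sample_row, ref_row, "deletion"
        else:
            i += 1
            continue
        j = i
        while j < n and gap_row[j] == "-":
            j += 1
        # The bases opposite the gap are the inserted or deleted letters.
        events.append((kind, i, other_row[i:j]))
        i = j
    return events

insertion = find_indels("GACT----TCG", "GACTCGATTCG")
deletion = find_indels("GACTTCGATCA", "GAC--CGATCA")
print(insertion, deletion)
```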
GENETIC VARIATIONS CONT.
Inversion

Ref     G A C T T C G A T C A
Sample  G A C G C T T A T C A

GENETIC VARIATIONS CONT.
Duplication

Ref     G A C G T C G A T C A
Sample  G A C G T C G T C C A

DNA: THE SEQUENCE OF LIFE

• Nucleic acids: chemical molecules that carry information about the fundamental determinants of life.
• DNA is a string in which each letter is a nucleotide from {A, C, T, G}.
• The genome of a living organism is its entire DNA.
• Living organisms have up to trillions of cells (depending on the organism type). Each cell of the same organism contains the same genome.
• Genomes vary in length from a few million nucleotides (bacteria) to a few billion nucleotides (mammals).
DNA SEQUENCE

• DNA is usually double-stranded, with one strand being the complement of the other: the complement of A is T, and the complement of C is G.
• Thus, a letter of DNA is usually called a base pair rather than a nucleotide, since it actually represents two nucleotides: one in a DNA strand and one in its reverse complement strand.
• A DNA strand has a start and an end, and the direction from start to end is important. For example, consider the following DNA:
ATGTCAGGC
TACAGTCCG
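
The pairing rule in this example can be written as a lookup table. A minimal sketch, with `complement` as an illustrative name, that reproduces the bottom strand from the top one:

```python
# Each base pairs with exactly one partner: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base paired with each letter, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())

bottom = complement("ATGTCAGGC")
print(bottom)
```

Applying `complement` twice returns the original strand, since the pairing is symmetric.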
DNA SEQUENCE
• Each nucleotide is physically bound to the complementary nucleotide below it, forming one base pair.
• Assuming that the start-to-end direction of the top strand is from left to right, the start-to-end direction of the bottom strand is from right to left.
• Normally, we write DNA strings or substrings so that they start at the left and end at the right. Thus, the DNA consists of these two strands: ATGTC and its reverse complement, GACAT.
• When working with DNA sequences, both strands are equally important, and they are not equivalent.
• DNA acts as the blueprint from which proteins are produced.
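
Because the second strand is read in the opposite direction, it is written as the reverse complement of the first. A minimal sketch using Python's built-in `str.maketrans` and `str.translate`; the function name `reverse_complement` is illustrative.

```python
# Translation table mapping each base to its partner.
PAIRING = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Complement every base, then reverse so the result reads start to end."""
    return strand.upper().translate(PAIRING)[::-1]

other_strand = reverse_complement("ATGTC")
print(other_strand)
```

Taking the reverse complement twice gives back the original strand, yet the two strands are different strings, which is why tools that search DNA usually examine both.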
RNA

• RNA is usually single-stranded, and RNA sequences consist of letters from the 4-letter alphabet of nucleotides {A, C, U, G}.
• It exists to initiate and regulate protein production from genes scattered across the genome.
• Also, the genomes of many viruses are made of RNA.

PROTEINS
• Proteins are relatively short strings (typically a few hundred letters) in which each letter is one of the 20 amino acids.
• Bacteria make roughly 500 to several thousand proteins, while the human genome contains about 20,000 protein-coding genes, which can give rise to a considerably larger number of distinct proteins (figures around 100,000 are often quoted).
• Each protein is produced by a gene. A gene is a fragment of the DNA.
• Every three adjacent nucleotides of a gene produce one amino acid letter of the corresponding protein.
• Three adjacent nucleotides are called a codon. There are 4^3 = 64 possible codons.
• Since 64 > 20, several different codons may produce the same amino acid.
• Also, there is a special type of codon called a stop codon, which indicates the end of the protein.
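
The codon arithmetic can be verified by enumeration. The sketch below lists every three-letter word over the DNA alphabet and sets aside the three stop codons of the standard genetic code (TAA, TAG, TGA); the variable names are illustrative.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

# Every ordered triple of nucleotides is one possible codon.
all_codons = ["".join(triple) for triple in product(NUCLEOTIDES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons))
```

With 61 amino-acid-coding codons shared among 20 amino acids, an amino acid has about three codons on average, which is the redundancy the slide refers to.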
BIOLOGICAL SEQUENCES

• So far we have discussed three types of sequences, or strings, in which we are interested:
 – DNA sequences, which consist of letters from the 4-letter alphabet of nucleotides: {A, C, T, G}.
 – Protein sequences, which consist of letters from the 20-letter alphabet of amino acids: {A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V}.
 – RNA sequences, which consist of letters from the 4-letter alphabet of nucleotides: {A, C, U, G}.
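
The three alphabets can be used to check which kinds of sequence a string could be. This is a rough sketch with hypothetical names (`ALPHABETS`, `possible_types`); note that the alphabets overlap, so a short string may be valid in more than one of them.

```python
ALPHABETS = {
    "DNA": set("ACTG"),
    "RNA": set("ACUG"),
    "protein": set("ARNDCEQGHILKMFPSTWYV"),
}

def possible_types(sequence: str) -> set:
    """Return the names of all alphabets that contain every letter of the sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    letters = set(sequence.upper())
    return {name for name, alphabet in ALPHABETS.items() if letters <= alphabet}

print(possible_types("ATGTC"))
print(possible_types("AUGUCAGGC"))
print(possible_types("MSG"))
```

Because A, C, G and T are also amino acid letters, a DNA string is always a formally valid protein string; in practice the sequence type is recorded alongside the data rather than guessed.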

OBTAINING SEQUENCES

Gene Sequences

www.ensembl.org

https://www.ncbi.nlm.nih.gov/
OBTAINING SEQUENCES

Protein Sequences
https://www.uniprot.org/
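
Sequences obtained from these databases are commonly distributed as FASTA text: a header line starting with `>` followed by one or more lines of sequence. The sketch below parses such text into a dictionary; the function name `parse_fasta` and the two example records are made up for illustration and are not real database entries.

```python
def parse_fasta(text: str) -> dict:
    """Map each header (without '>') to its sequence joined across lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}

example = """>example_gene hypothetical record
ATGTCA
GGCTGA
>example_protein hypothetical record
MSG
"""
sequences = parse_fasta(example)
print(sequences)
```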
