
Computing With DNA (Nov 28th, 2009)

DNA was used to solve an NP-complete problem called the Hamiltonian Path Problem. Leonard Adleman encoded the vertices and edges of the problem graph as single-stranded DNA molecules. He then connected the edges by enzymatic ligation to generate all possible paths. After several filtering steps like PCR amplification and separation, DNA strands representing valid paths from the starting to ending vertex would remain, allowing him to determine if a solution existed. This was a landmark proof-of-concept demonstration that DNA computing was possible.

Copyright
© Attribution Non-Commercial (BY-NC)

Ashwin Balu & Sunny Gupta

CISC879: Natural Computing


Queen’s University
Outline
Biology Overview
DNA Basics
Gene Expression and Systems Biology
DNA Reactions
DNA Computations
DNA Computing Applications
Current Trends
Open Problems and Limitations

Background
 “The use of biochemicals and biomolecular
operations to solve problems and to perform
computation”

 Questions:
◦ Can any algorithm be simulated by means of DNA
computing?
◦ Is it possible to design a programmable molecular
computer?
Motivation
 Unique features
◦ DNA used as data and code structures!
◦ Many ways of creating DNA computers

 Usefulness
◦ Massive parallelism
◦ Smaller “hardware” size
◦ High energy efficiency
◦ Smaller information storage
◦ Can solve problems standard computers can’t
Need of DNA computer?
Moore’s Law states that silicon
microprocessors double in
complexity roughly every two
years.
One day this will no longer hold
true when miniaturisation limits
are reached. Intel scientists say
it will happen in about the year
2018.
Require a successor to silicon.

The Beginning
 Francis Crick &  James D. Watson.
co-discoverers of the structure of
the DNA molecule in 1953

Molecular Biology of the
Cell
 Cellular structures and processes result from a complex
interaction network of biological molecules
Carbohydrates
Lipids
Proteins

Mammals use only 20 different amino acids to make the immense variety of
proteins they need.
Molecular Forces &
Bonding
What is DNA?
Source code to life
Instructions for building and
regulating cells
Data store for genetic inheritance
Think of enzymes as hardware,
DNA as software

DNA Molecule

DNA is a medium of information storage using 4 bases (A, T, G, C)
to store information for all living cells.

It has contained and transmitted the data of life for
billions of years

DNA is organized into long structures called chromosomes


DNA






 DNA makes the building blocks for life




DNA - Data and Code






 DNA doesn't just make proteins, it has
instructions on how the system should behave
 Human and chimpanzee DNA is 98.5 percent identical
 DNA of humans and mice is only around 60 percent
similar



Dense Information Storage

This image shows 1 gram of DNA on a CD. The
CD can hold 800 MB of data.

The 1 gram of DNA can hold about 1 x 10^14 MB of
data.
Types of DNA
Mitochondrial
Nuclear

 Nuclear and mitochondrial DNA are thought
to be of separate evolutionary origin

 mtDNA being derived from the 
circular genomes of the bacteria



Mitochondrial DNA
Nuclear chromosomes encode around
30,000 genes in 3 billion bases

The mitochondrial DNA genome is tiny, with
only around 16,500 bases. However, this
16k of data is enough to encode
several proteins and RNA molecules,
containing exactly 37 genes.
From Large to Small

10 μm 0.84 μm

2 nm 11 nm
DNA Molecules
 Important features
◦ 3 basic parts for each
◦ Numbering carbons
◦ 5’: unattached phosphate group, 3’: unattached
hydroxyl group

A T

G C
DNA Bases
 Important features
◦ Complementarity
◦ Purines vs. pyrimidines
◦ Hydrogen bonds
◦ Phosphodiester bonds
◦ Antiparallelism
◦ Natural direction

DNA Molecules
 Simplest representation
◦ 5’ – CGTGTTCGAAGCCC – 3’
◦ 3’ – GCACAAGCTTCGGG – 5’

 Important features
◦ Representation
◦ Complementarity
◦ Directionality

 Sticky ends
◦ 5’ – CGTGTTCGA – 3’
◦ 3’ – GCACA – 5’
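
The complementarity and directionality rules above can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` are chosen here and do not come from the slides.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """The complementary strand read in its own 5' -> 3' direction."""
    return complement(strand)[::-1]


top = "CGTGTTCGAAGCCC"                # 5' -> 3', from the slide
print(complement(top))                # bottom strand as drawn, 3' -> 5'
print(reverse_complement(top))        # same bottom strand read 5' -> 3'
```

Because the two strands are antiparallel, the bottom strand drawn under the top one is the plain complement, while reading it in its natural 5’ to 3’ direction gives the reverse complement.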
Manipulating DNA
 1) Denaturation (melting)









 2) Annealing (renaturation)

Manipulating DNA
 3) Polymerase extension

 5’ – TCGATT – 3’ (primer)
 3’ – AGCTAACTT – 5’ (template)

 5’ – TCGATTG – 3’
 3’ – AGCTAACTT – 5’

 5’ – TCGATTGA – 3’
 3’ – AGCTAACTT – 5’

 5’ – TCGATTGAA – 3’
 3’ – AGCTAACTT – 5’
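
The extension steps above can be reproduced with a small sketch. It assumes the primer is annealed at the left end of the template as drawn; the function name `extend` is illustrative only.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend(primer: str, template_3_to_5: str):
    """Yield the growing strand (5' -> 3') after each added nucleotide.

    Assumption: the primer is already annealed to the left end of the
    template, which is written 3' -> 5' as on the slide.
    """
    # The primer must pair with the template region it sits on
    for p, t in zip(primer, template_3_to_5):
        if PAIR[t] != p:
            raise ValueError("primer does not anneal to the template")
    strand = primer
    # Polymerase adds one complementary base at a time to the 3' end
    for base in template_3_to_5[len(primer):]:
        strand += PAIR[base]
        yield strand


for step in extend("TCGATT", "AGCTAACTT"):
    print("5' -", step, "- 3'")
```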


Manipulating DNA
 4) Nuclease degradation

 Exonuclease (removes nucleotides from the strand ends):
 5’ – TCGATTGAA – 3’
 3’ – AGCTAACTT – 5’

 5’ – TCGATTGA – 3’
 3’ – GCTAACTT – 5’

 5’ – TCGATTG – 3’
 3’ – CTAACTT – 5’

 5’ – TCGATT – 3’
 3’ – TAACTT – 5’

 Endonuclease (cuts inside the double strand), staggered cut leaving sticky ends:
 5’ – TGAATTCCG – 3’
 3’ – ACTTAAGGC – 5’

 5’ – TG – 3’ 5’ – AATTCCG – 3’
 3’ – ACTTAA – 5’ 3’ – GGC – 5’

 Endonuclease, blunt cut:
 5’ – TGCCCGGGA – 3’
 3’ – ACGGGCCCT – 5’

 5’ – TGCCC – 3’ 5’ – GGGA – 3’
 3’ – ACGGG – 5’ 3’ – CCCT – 5’
Manipulating DNA
 5) Ligation

 Before: each strand has a nick between an OH end and a P end
 5’ – TC-OH P-GATTGAA – 3’
 3’ – AGCTAA-P HO-CTT – 5’

 After: ligase seals the nicks
 5’ – TCGATTGAA – 3’
 3’ – AGCTAACTT – 5’


Manipulating DNA
 6) Amplification

1. Denaturation

2. Add primers

3. Annealing
Manipulating DNA
 6) Amplification (cont’d)

4. Polymerase extension
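
As a rough worked example of why repeated cycles amplify so strongly: if every cycle copied every strand, the count would double each time. The sketch below assumes that idealized model; the `efficiency` parameter is an assumption added here for illustration and is not from the slides.

```python
def pcr_copies(initial: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy count after a number of PCR cycles.

    efficiency = 1.0 means every strand is copied in every cycle
    (perfect doubling); real reactions fall short of this.
    """
    return initial * (1.0 + efficiency) ** cycles


# One starting molecule, 20 ideal cycles: about a million copies
print(pcr_copies(1, 20))
```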
Manipulating DNA
 7) Gel electrophoresis
Manipulating DNA
 8) Modify nucleotides – insert, delete, substitute

 9) Filtering – magnetic bead separation

 10) Synthesis of a single strand

 11) Sequencing
DNA manipulations:
If we want to use DNA as an
information store, we must be able to
manipulate it.
However, we are talking about handling
molecules…
ENZYMES = Natural CATALYSTS.
So instead of using physical processes,
we would have to use natural ones,
which are more effective:
◦ for lengthening: polymerases…
◦ for cutting: nucleases (exo/endo-nucleases)…
◦ for linking: ligases…
Mass replication: 1985: Kary Mullis → PCR
 Thanks to this reaction we get millions of identical copies
Video
DNA Machine
Introduction to DNA
Computing
What is DNA computing ?
◦ Around 1959: first idea (precursor:
Feynman)
Molecular level (just greater than
10^-9 meter)
Massive parallelism.
◦ In a liter of water, with only 5
grams of DNA we get around 10^21
bases!
◦ Each DNA strand represents a
processor!
DNA Computing Begins...
 1970s
◦ Much speculation

 1994
◦ Leonard Max Adleman
◦ “Molecular Computation of Solutions to
 Combinatorial Problems”
◦ Used DNA computing to solve an NP-complete
problem: Hamiltonian Path Problem

“Biology and computer science - life and computation –


are related.”
Hamiltonian Path Problem
 Solution:
◦ 1. Generate random paths through the graph
◦ 2. Keep only those paths beginning with v_in and
ending with v_out
◦ 3. If graph has n vertices , keep only those paths
with exactly n vertices
◦ 4. Keep only those paths that enter all vertices
of the graph at least once
◦ 5. If any path remains, say YES; else, say NO
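
The five steps above amount to a generate-and-filter search, which can be sketched in software. This is a minimal sketch under stated assumptions, not Adleman's procedure itself: the random ligation of step 1 is replaced by an exhaustive listing of walks, vertices are assumed to be numbered 0 to n-1, and the graph below is a hypothetical 4-vertex example rather than the instance Adleman solved.

```python
def all_walks(edges, max_vertices):
    """Enumerate every walk with 1..max_vertices vertices.

    Stand-in for step 1: in the test tube, ligation builds these walks
    in parallel; here they are simply listed exhaustively.
    """
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    frontier = [(v,) for v in sorted({x for e in edges for x in e})]
    tube = list(frontier)
    for _ in range(max_vertices - 1):
        frontier = [w + (v,) for w in frontier for v in successors.get(w[-1], [])]
        tube.extend(frontier)
    return tube


def hamiltonian_path(edges, n, v_in, v_out):
    """Return the surviving walks; a non-empty result means YES."""
    tube = all_walks(edges, n)                                    # step 1
    tube = [w for w in tube if w[0] == v_in and w[-1] == v_out]   # step 2
    tube = [w for w in tube if len(w) == n]                       # step 3
    for vertex in range(n):                                       # step 4, one pass per vertex
        tube = [w for w in tube if vertex in w]
    return tube                                                   # step 5: anything left?


# Hypothetical directed graph (an assumption, not Adleman's instance)
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (3, 1)]
survivors = hamiltonian_path(edges, n=4, v_in=0, v_out=3)
print("YES" if survivors else "NO", survivors)
```

Each list comprehension plays the role of one laboratory filtering operation applied to the whole tube at once.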

[Figure: example graph with vertices 0, 1, 2, 3, 4, 5]
Hamiltonian Path Problem
 Instance of the HPP solved by Adleman

[Figure: graph with vertices 0, 1, 2, 3, 4, 5]
Adleman’s HPP Solution
 Adleman translated this solution step-by-step
into molecular biology
 Encoded each vertex as a single-stranded
oligonucleotide of length 20 – randomized codes
 Each possible edge synthesized
 Connect edges by enzymatic ligation

 v1: TGAATCCGACGTCCAGTGA
 v2: ATGAACTATGGCACGCTATC
 e12: GCAGGTCACT TACTTGATAC
 (e12 is the complement of the last 10 bases of v1
followed by the complement of the first 10 bases of v2)
Adleman’s HPP Solution

The basic idea is to have a set


of molecules with unique
sequences representing the
vertices and edges of the graph
Adleman’s HPP Solution
 Let Ei be the oligonucleotide of vertex i
 Let Ēi be the complement of Ei

 Using E0 and Ē6 as primers, amplify the product of Step 1
by PCR
 Only paths that begin with vertex 0 and end with vertex 6
are amplified
 Use filtering operation to separate out
strands starting at vertex 0 and ending with
vertex 6
Adleman’s HPP Solution
 Separate product of Step 2 by gel
electrophoresis
 Identify DNA molecules with 7 vertices
◦ 7 vertices * 20 bases each = 140 bp
 Repeat cycles of PCR and gel electrophoresis
to purify the product further

 Result: 7-vertex molecules that start with 0,
end with 6
 Examples:
◦ 0, 1, 2, 3, 4, 5, 6
◦ 0, 3, 2, 3, 4, 5, 6
◦ 0, 1, 1, 1, 1, 1, 6
Adleman’s HPP Solution
 Probe single stranded DNA with complementary
oligonucleotides attached to magnetic beads
 Can pull sequences with specific vertices out
of the solution
 Use one step for each vertex
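
The per-vertex extraction can be imitated in software as one filter pass per vertex. In this sketch a strand is represented as a tuple of vertex numbers, and `extract` is a hypothetical name for the bead-separation operation; the tube holds the three 7-vertex example paths listed earlier.

```python
def extract(tube, vertex):
    """Keep only the strands that contain `vertex`.

    Models strands that bind to the bead-attached probe for this vertex;
    everything else is washed away.
    """
    return [path for path in tube if vertex in path]


tube = [
    (0, 1, 2, 3, 4, 5, 6),
    (0, 3, 2, 3, 4, 5, 6),
    (0, 1, 1, 1, 1, 1, 6),
]
for vertex in range(7):          # one extraction step per vertex
    tube = extract(tube, vertex)
print(tube)
```

Only the path that visits every vertex survives all seven passes; the other two are lost at the first vertex they are missing.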



Adleman’s HPP Solution
 PCR the remnants after Step 4
 Analyze it by gel electrophoresis
 If anything exists, obtain YES – Hamiltonian
path found
 Else, obtain NO – no Hamiltonian path
available
Adleman’s HPP Solution
◦ WE JUST COMPUTED WITH DNA!!!
Thoughts About Adleman’s
Solution
 Practical details of experiment are not
relevant
 Experiment took 7 days of lab work
 However, distinctive advantage
◦ # of oligonucleotides needed will increase
linearly in relation to the number of vertices
involved
◦ NP-complete in classical computing; O(n) laboratory
steps in DNA computing

Thoughts About Adleman’s
Solution
 Adleman’s solution has weaknesses
◦ # of single strands necessary to encode vertices
and edges of generic HPP is of the order n!
◦ Drastic limitations on the size of problems that
can be solved by this procedure
◦ General strategy to work around this is to
diminish set of candidate solutions to be
generated

 Since HPP is NP-complete, Adleman’s DNA
technique can solve any NP problem
◦ But not necessarily in a feasible way

Thoughts About Adleman’s
Solution
 Brute force was used

 Speed of any computer determined by


◦ Parallelism
◦ Number of steps per unit time

Feature                                      DNA              Classical
Operations (per second)                      10^14 - 10^20    10^6 - 10^12
Energy used (operations per joule)           2*10^19          10^9
  (theoretical maximum: 34*10^19)
Storage size of one bit (cubic nanometers)   1                10^12
SAT Problem
 Satisfiability problem for propositional
formulae
 Logical variables E = {e1, e2, …, en}
 Clauses Cj = {e1j, e2j, …, enj} joined by AND,
OR, NOT

 Problem:
◦ Given C1 ^ C2 ^ … ^ Cm assign a Boolean value to
each variable such that the entire statement is
TRUE
◦ NP-complete!

Lipton’s SAT Solution
 Possibly represent this as a graph search
problem
 Two phases:
◦ 1) Generate all paths in the graph
◦ 2) Search (filter) for truth assignment set that
satisfies formula
◦ Basically, same principles as Adleman

 Assume formula with n variables


 [Figure: graph with vertices v0, v1, v2, …, vn-1, vn, with a FALSE row
and a TRUE row of edges e1, e2, e3, …, en-1, en between consecutive vertices]
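
The two phases can be sketched in software as follows. This is an illustration under assumptions, not Lipton's laboratory protocol: every truth assignment stands for one path through the graph, a clause is represented as a list of `(variable_index, wanted_value)` literals, and the example formula is chosen here for demonstration.

```python
from itertools import product


def lipton_sat(n_vars, clauses):
    """Return every assignment that satisfies all clauses.

    Each clause is a list of literals (variable_index, wanted_value);
    a clause is satisfied when at least one literal matches.
    """
    # Phase 1: generate all 2^n assignments (all paths in the graph)
    tube = list(product([0, 1], repeat=n_vars))
    # Phase 2: one filtering step per clause
    for clause in clauses:
        tube = [a for a in tube if any(a[i] == value for i, value in clause)]
    return tube


# Example formula (an assumption): (x OR y) AND (NOT x OR NOT y)
clauses = [[(0, 1), (1, 1)], [(0, 0), (1, 0)]]
print(lipton_sat(2, clauses))
```

The number of filtering steps grows with the number of clauses, while the size of the initial tube grows as 2^n, which is where the volume limits of DNA computing come from.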


Gene Expression
Used by all known life, eukaryotes, prokaryotes and viruses
Modulates the macromolecular machinery for life

Transcription, RNA splicing, translation, post-translational modifications of proteins

Gene regulation gives the cell control over structure and function
Metaprogramming
Gene Expression
Transcription

Translation

Turing machines, invented by
Alan Turing in 1936, are
extremely simple computers
that consist of a finite-state
compute head that can
move back and forth on an
infinite one-dimensional
memory tape.
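
A minimal sketch of such a machine, assuming a rule table that maps `(state, symbol)` to `(symbol_to_write, move, next_state)`. The names `run_tm` and `FLIP` and the bit-flipping example are illustrative choices, not part of the slides.

```python
from collections import defaultdict


def run_tm(rules, tape, state="q0", halt="halt", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents."""
    # The tape is unbounded in both directions; unseen cells read as blank
    cells = defaultdict(lambda: blank, enumerate(tape))
    pos = 0
    for _ in range(max_steps):
        if state == halt:
            break
        write, move, state = rules[(state, cells[pos])]
        cells[pos] = write
        pos += 1 if move == "R" else -1
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)


# Example machine: flip every bit, halt at the first blank
FLIP = {
    ("q0", "0"): ("1", "R", "q0"),
    ("q0", "1"): ("0", "R", "q0"),
    ("q0", "_"): ("_", "R", "halt"),
}
print(run_tm(FLIP, "0110"))
```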
Systems Biology
Gene Regulation Network
Cytoscape Demo
Internet connection map

Asia Pacific - Red


Europe/Middle
East/Central
Asia/Africa - Green

North America -
Blue
Latin American and
Caribbean - Yellow
RFC1918 IP
Addresses - Cyan
Unknown - White
DNA Computer
 Given enough strands of DNA and
certain biological operations
 DNA can model 1-tape
nondeterministic Turing machine
 DNA compare to formulas
 DNA can work like a state
machine

DNA Logic Gates
 DNA can work like a state
machine
 Catalytic DNA or  DNAzyme
 DNAzymes are used to build logic
gates
 DNAzymes are limited to 1-, 2-,
and 3-input gates

DNA Multiplication

Restriction Enzyme Digests


DNA Code Breaking
DNA Associative Memory
Content addressable memories
are useful in a number of
computer contexts and are
widely thought to be an
important component of human
intelligence

Content addressable memory is
one where a stored word may be
retrieved from sufficient, partial,
knowledge of its content.
DNA Associative Memory
Vessel containing DNA
Each word is encoded as an appropriate
single-stranded DNA molecule

Stickers model:
Memory complex = Strand of DNA
(single or semi-double).
Stickers are segments of DNA,
that are composed of a certain
number of DNA bases.
To use the stickers model
correctly, each sticker must be
able to anneal only at a specific
place in the memory complex.

To visualize:

[Figure: a semi-double memory complex encoding 0 0 0 1 0 1 0 0 1 0,
and a soup of stickers; zoom on one sticker: A G C A T G A T]
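
The stickers model can be pictured as a row of bits, where a region with an annealed sticker reads 1 and a bare region reads 0. The class below is a minimal sketch under that assumption; `MemoryComplex`, `set`, `clear` and `bits` are names chosen here for illustration.

```python
class MemoryComplex:
    """A memory strand divided into fixed regions; an annealed sticker marks a 1."""

    def __init__(self, n_regions: int):
        self.annealed = [False] * n_regions

    def set(self, region: int) -> None:
        """Anneal the sticker that is complementary to this region only."""
        self.annealed[region] = True

    def clear(self, region: int) -> None:
        """Remove the sticker from this region."""
        self.annealed[region] = False

    def bits(self) -> str:
        """Read the strand as a bit string, one bit per region."""
        return " ".join("1" if a else "0" for a in self.annealed)


memory = MemoryComplex(10)
for region in (3, 5, 8):          # stickers on regions 3, 5 and 8 (0-based)
    memory.set(region)
print(memory.bits())
```

The requirement that each sticker anneals at one place only corresponds to `set` touching exactly one index.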
DNA Associative Memory
About a stickers machine?
Simple operations: merge, select,
detect, clean.
 Tubes are considered (cylinders
with two entries)
However for a mere computation
(DES):
◦ Great number of tubes is needed
(1000).
◦ Huge amount of DNA needed as
well.
Practically no such machine has
Technological Developments

A US team showed that DNA
computing can be simplified
by attaching the molecules
to a surface.

DNA molecules were applied
to a small glass plate
overlaid with gold.

Exposure to certain enzymes
destroyed the molecules with
wrong answers, leaving only
the DNA with the right
answers.
DNA Limitations
DNA denaturing (temperature,
time, pH)
Length of the DNA strand = size of
the problem
While the number of strands
could be exponential, 10^21 is the
upper bound (volume issues)
DNA algorithms need to be more
noise tolerant

Making DNA Computers Error
Resistant
DNA computing not error free

DNA calculations fall into 3 basic
classes
1. Decreasing Volume (# strands
are reduced with each step)
2. Constant Volume (# strands
constant throughout all steps)
3. Mixed Algorithms
Making DNA Computers Error
Resistant

Adleman’s and Lipton’s algorithms are even
more special. Each strand is
“good” or “bad”
•Good strands encode a solution
•Bad strands do not
•If a good strand is damaged or
lost the algorithm fails
•If a bad strand is not removed
and many are left at end then
Making DNA Computers Error
Resistant

Two sources of errors
1. Every operation can cause an error
(e.g. extraction)
◦ Extraction is not perfect: usually about 95% of the
strands that match the desired pattern are extracted
◦ In addition, strands that do not
match will sometimes be removed
anyway.
◦ Rates typically 1 part in 10^6
2. DNA has a half-life, and decays at a finite
rate. If an algorithm takes months
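
A short worked calculation shows why a 95% extraction yield matters. The sketch assumes each extraction step keeps a good strand independently with the same probability; the function name `survival` is illustrative.

```python
def survival(yield_per_step: float, steps: int) -> float:
    """Probability that one good strand survives every extraction step.

    Assumption: each step keeps the strand independently with the
    same probability.
    """
    return yield_per_step ** steps


for steps in (1, 10, 50):
    print(steps, round(survival(0.95, steps), 4))
```

With a 95% yield, roughly 60% of the good strands remain after 10 steps, so algorithms with many extraction steps need many copies of every candidate strand.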
Some hurdles:
Operations done manually in the
lab.
Natural tools are what they are…
→ Formation of a library (statistical
way)
→ Operations problems

Molecular Computing
 Wayne State University’s Michael Conrad has defined
his vision of a molecular computer in which proteins
integrate multiple input modes to perform a
functional output (Conrad, 1986). In addition to
smaller size scale, protein based molecular
computing offers different architectures and
computing dimensions. Conrad suggests that “non-
von Neumann, nonserial and non-silicon” computers
will be “context dependent,” with input processed as
dynamical physical structures, patterns, or analog
symbols. Multidimensional conditions determine the
conformational state of any one protein:
temperature, pH, ionic concentrations, voltage,
dipole moment, electroacoustical vibration,
phosphorylation or hydrolysis state, conformational
state of bound neighbor proteins, etc. Proteins
integrate all this information to determine output.
Thus each protein is a rudimentary computer and
Molecular Computing
Cells and organisms are natural molecular computers

Allowing proteins to fold producing computation






Protein Folding
 Mainly guided by:
Hydrophobic interactions
Intramolecular hydrogen bonds
Van der Waals forces

Protein Folding
 Folding is a free energy minimization process
that depends on the interactions among
amino acids
 Proteins change conformation as fast as femtoseconds (10^-15
sec)

Folding Proteins
 All proteins begin to fold into three–
dimensional structures after synthesis
 These structures give proteins their
functionality (lock and key receptors)
 Folding is a free energy minimization process

Protein Folding Problem
 Considered to be an NP-complete problem

 Massively parallel computers to derive
solutions by brute force have failed

 Molecular pathway too complex

 Genetic Algorithms do better but cannot
guarantee polynomial time, fitness
relies on structure, and since the
structure is not known you have the
“termination problem” in GA
Protein Folding Problem
 A protein with a mere 25 amino acids requires 94,502
years to solve
 No way of knowing if a GA terminated with an optimal
solution
Protein based computing
Different architectures and computing

dimensions

 Non-von Neumann, non-serial and non-silicon
 Context dependent
 Input processed as dynamical physical
structures, patterns, or analog symbols
 Multidimensional conditions
 Temperature, pH, ionic concentrations, voltage,
dipole moment, electroacoustical vibration,
phosphorylation or hydrolysis state,
conformational state of bound neighbor
proteins, etc.
 Proteins integrate all this
Protein Folding Matrix
Computer
 Use Rose scale matrix
 Let the protein folding solve large matrix
problems
Folding Proteins
Applications
 Possible to generate a vast combinatorial variety of
different protein shapes just by changing
the DNA base sequence

 Encrypting data (lock and key)
 Decrypting data
 Encryption breaking
 Pattern Recognition

DNA Computer vs. Silicon Computer

Feature          DNA Computer   Silicon Computer
Miniaturization  Unlimited      Limited
Processing       Parallel       Sequential
Speed            Very fast      Slower
Cost             Cheaper        Costly
Materials used   Non-toxic      Toxic
Size             Very small     Large
Data capacity    Very large     Smaller


Advantages
 Perform millions of operations
simultaneously;
 Conduct large parallel processing
 Massive amounts of working memory;
 Generate & use own energy source via the
input.
 Four storage symbols (bases): A, T, G, C.
 Miniaturization of data storage
Limitations
DNA computing involves a relatively
large amount of error

Requires human assistance!

Time consuming laboratory
procedures.

No universal method of data
representation.
Slides to go
I'm putting together a few slides
on associative memory,
cryptographic problems, DNA
based addition and matrix
multiplication, parallel machines
and DNA computer limitations.
 
