
Next Generation Sequencing: An Overview

This document provides an overview of next generation sequencing technologies. It discusses the history of genomic research and sequencing technologies. Several major NGS platforms are described including their read lengths, accuracy levels, and common applications. While each platform has strengths for different types of projects, the document emphasizes that researchers should consult sequencing facilities early in project planning to discuss their needs and ensure high quality, analyzable data. Pitfalls around low quality samples and lack of bioinformatics support are also highlighted.

Uploaded by

Shuaib Ahmad

Next Generation Sequencing – An Overview

Olga Vinnere Pettersson, PhD

National Genomics Infrastructure hosted by SciLifeLab, Uppsala Node (UGC)

Version 5.2.1.b
Today we will talk about:

• National Genomics Infrastructure – Sweden

• History and current state of genomic research

• Sequencing technologies:
– Types
– Principles
– Their “+” and “-”
– A couple of pieces of advice
DNA sequencing revolution

Massively parallel sequencing (454, Illumina, Life Tech)

Human genome

James Watson's genome

Center for Metagenomic Sequence Analysis (KAW)

Swedish National Infrastructure for Large-Scale Sequencing (SNISS)

Science for Life Laboratory (SciLifeLab)
What is sequencing?
DEFINITION

• “In genetics and biochemistry, sequencing means to determine the primary structure (or primary sequence) of an unbranched biopolymer.”
(http://en.wikipedia.org/wiki/Sequencing)
Once upon a time…
• Frederick Sanger and Alan Coulson
Chain Termination Sequencing (1977)
Nobel prize 1980

Principle:

SYNTHESIS of DNA is randomly TERMINATED at different points

Separation of fragments that are 1 nucleotide different in size
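
The principle above can be illustrated with a small sketch. It is a toy model, not laboratory software: it assumes the template is given 3'->5', that every possible termination point is represented among the fragments, and that separation resolves fragments differing by one nucleotide. The function name `sanger_read` is hypothetical.

```python
def sanger_read(template_3to5: str) -> str:
    """Toy model of chain-termination sequencing.

    Assumption: the template is written 3'->5', so the newly
    synthesized strand (its complement) reads 5'->3' left to right.
    """
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    synthesized = "".join(complement[b] for b in template_3to5)

    # One reaction per ddNTP: synthesis stops wherever that base is
    # incorporated, so each reaction yields one fragment length per
    # occurrence of the base.
    fragments = []
    for dd_base in "ACGT":
        for position, base in enumerate(synthesized, start=1):
            if base == dd_base:
                fragments.append((position, dd_base))  # (length, last base)

    # Size separation: shortest fragment first, 1-nt resolution.
    fragments.sort()

    # Reading the terminating base of each fragment, in order of size,
    # gives the sequence of the synthesized strand.
    return "".join(base for _, base in fragments)


read = sanger_read("TACGGT")  # "ATGCCA"
```

The read is the complement of the template, because the fragments are copies of the newly synthesized strand.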


Sanger’s sequencing

P32 labelled ddNTPs

Lack of OH-group at 3’ position of deoxyribose

Fluorescent dye terminators

Max fragment length – 750 bp


Maxam & Gilbert Sequencing
Sequencing genomes
using Sanger’s method

• Extract & purify genomic DNA


• Fragmentation
• Make a clone library
• Sequence clones
• Align sequences ( -> contigs -> scaffolds)
• Close the gaps

• Cost/Mb=1000 $, and it takes TIME
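
The "align sequences -> contigs" step in the list above can be sketched as a greedy overlap merge. This is a deliberately simplified illustration, assuming error-free reads from one strand and a hypothetical minimum overlap of 3 bases; real assemblers handle sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            # No overlaps left: whatever remains are separate contigs,
            # and the space between them is a gap still to be closed.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


contigs = greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"])
# contigs == ["ATGCGTACCGGA"]
```

Reads without any overlap stay as separate contigs, which corresponds to the gaps that have to be closed in the last step.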


At the very beginning of genome sequencing
era…

• First genome: virus φX174 - 5 386 bp (1977)

• First organism: Haemophilus influenzae - 1.8 Mb (1995)

• First eukaryote: Saccharomyces cerevisiae - 12.4 Mb (1996)

• First multicellular organism: Caenorhabditis elegans - 100 Mb (1998-2002)

• First plant: Arabidopsis thaliana - 157 Mb (2000)


Just an interesting comparison:

• Human genome project, 2007

– Genome of Craig Venter cost 70 mln $ (Sanger sequencing)

– Genome of James Watson cost 2 mln $ (454 pyrosequencing)

– Ultimate goal: 1000 $ / individual

Almost there!
Paradigm change

• From single genes to complete genomes


• From single transcripts to whole transcriptomes
• From single organisms to complex metagenomic pools
• From model organisms to the species you are studying
IF 31.6

IF 2.9
Main hazard - DATA ANALYSIS

[Figure: cost ($) of sequencing versus data analysis. Source: http://finchtalk.geospiza.com]

=> More bioinformaticians to people!


Major NGS technologies
NGS technologies

Company              Platform                  Amplification      Sequencing method
Roche                454**                     emPCR              Pyrosequencing
Illumina             HiSeq, MiSeq              Bridge PCR         Synthesis
LifeTech             SOLiD**                   emPCR / Wildfire   Ligation
LifeTech             Ion Torrent, Ion Proton   emPCR              Synthesis (pH)
Pacific Biosciences  RSII                      None               Synthesis
Complete Genomics    Nanoballs                 None               Ligation
Oxford Nanopore*     GridION                   None               Flow

RIP technologies: Helicos, Polonator, etc.


In development: Tunneling currents, nanopores, etc.
Differences between platforms

• Technology: chemistry + signal detection


• Run times vary from hours to days
• Yield per run ranges from Mb to Gb
• Read length from <100 bp to >20 kbp
• Error rate per base from 0.1% to 15%
• Cost per base varies
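
Per-base error rates such as those above are usually reported as Phred quality scores, Q = -10 · log10(p). The following sketch converts between the two; the function names are hypothetical.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Phred score Q for a per-base error probability p: Q = -10 * log10(p)."""
    return -10.0 * math.log10(error_rate)


def error_probability(q: float) -> float:
    """Inverse conversion: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


q_low_error = phred_quality(0.001)   # 0.1% error -> about Q30
q_high_error = phred_quality(0.15)   # 15% error  -> about Q8.2
```

On this scale the two ends of the range quoted above are roughly Q30 and Q8.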
Making an NGS library

DNA QC – paramount importance

Shearing & size selection

Amplification

Ligation of sequencing adaptors, technology specific


Roche
Instrument         Yield and run time   Read length   Error rate   Error type
454 FLX+           0.9 Gb, 20 hrs       700           1%           Indels
454 FLX Titanium   0.5 Gb, 10 hrs       450           1%           Indels
454 FLX Jr         0.050 Gb, 10 hrs     400           1%           Indels

Main applications:
• Microbial genomics and metagenomics
• Targeted resequencing
454 Titanium GS FLX
Illumina
Instrument            Yield and run time                 Read length     Error rate   Error type
HiSeq2500 (upgrade)   120 Gb in 27 h, or standard run    100x100         0.1%         Subst
MiSeq                 540 Mb – 15 Gb (4 – 48 hours)      Up to 350x350   0.1%         Subst

Main applications
• Whole genome, exome and targeted reseq
• Transcriptome analyses
• Methylome and ChIP-seq
• Rapid targeted resequencing (MiSeq)
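
Run yields like those quoted for these instruments can be turned into an expected sequencing depth with the usual relation coverage = total bases / genome size. The sketch below is only a back-of-the-envelope helper. The human genome size of about 3.1 Gb is an assumption not stated in the slides, and the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3.1e9  # assumed approximate haploid human genome size


def mean_coverage(yield_bases: float, genome_size: float) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return yield_bases / genome_size


def reads_needed(target_coverage: float, genome_size: float,
                 read_length: int) -> int:
    """Number of reads of a given length needed to reach a target depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# A 120 Gb run on a human sample (ignoring duplicates and unmapped reads):
depth = mean_coverage(120e9, HUMAN_GENOME_BP)   # about 38.7x
# Reads of 100 bp needed for 30x human coverage:
n_reads = reads_needed(30, HUMAN_GENOME_BP, 100)  # 930 000 000
```

Real projects get less usable depth than this estimate, since it ignores duplicates, low-quality reads, and uneven coverage.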
Illumina
Life Technologies SOLiD
Instrument            Yield and run time   Read length            Error rate   Error type
SOLiD 5500 wildfire   600 Gb, 8 days       75x35 PE, 60x60 MP     0.01%        A-T bias

Features
• High accuracy due to two-base encoding
• True paired-end chemistry - ligation from either end
• Mate-pair libraries
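
Two-base encoding, mentioned in the features above, can be sketched as follows. The sketch assumes the commonly described SOLiD colour code, in which each colour (0-3) is the XOR of 2-bit codes for two adjacent bases (A=0, C=1, G=2, T=3). The function names are hypothetical.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def to_colors(seq: str) -> list:
    """Encode a base sequence as colours, one per adjacent base pair."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def from_colors(first_base: str, colors: list) -> str:
    """Decode colours back to bases, given the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


def color_differences(reference: str, read: str) -> list:
    """Indices at which the colour strings of two sequences differ."""
    ref_colors, read_colors = to_colors(reference), to_colors(read)
    return [i for i, (x, y) in enumerate(zip(ref_colors, read_colors))
            if x != y]


colors = to_colors("ATGGC")                     # [3, 1, 0, 3]
decoded = from_colors("A", colors)              # "ATGGC"
snp_signal = color_differences("ATGGC", "ATCGC")  # [1, 2]
```

Because every base contributes to two colours, a real single-base change alters two adjacent colours, while an isolated colour mismatch is more likely a measurement error. That property is what the high accuracy relies on.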

Main applications (currently)
• ChIP-seq
SOLiD - ligation
Life Technologies - Ion Torrent & Ion Proton

Chip      Yield and run time   Read length
PGM 314   0.1 Gb, 3 hrs        200 – 400
PGM 316   0.5 Gb, 3 hrs        200 – 400
PGM 318   1 Gb, 3 hrs          200 – 400
P-I       10 Gb                200

Main applications
• Microbial and metagenomic sequencing
• Targeted resequencing
• Clinical sequencing
Ion Torrent - H+ ion-sensitive field effect transistors
Pacific Biosciences
Instrument   Yield and run time           Read length                      Error rate                     Error type
RS II        500 Mb / 180 min SMRTCell    250 bp – 20 000 bp (35 000 bp)   15% (on a single passage!)     Insertions, random

Single-Molecule, Real-Time DNA sequencing
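
Because the single-pass errors are random, reading the same molecule several times and taking a consensus lowers the error rate. The sketch below is a simplified model, not PacBio's actual consensus algorithm: it assumes independent errors with probability p per pass, an odd number of passes, and a majority vote in which all wrong calls agree (a pessimistic upper bound). The function name is hypothetical.

```python
from math import comb


def consensus_error(p: float, passes: int) -> float:
    """Probability that a majority of `passes` independent calls are wrong.

    Simplified binomial model; `passes` is assumed to be odd so that
    a majority always exists.
    """
    majority = passes // 2 + 1
    return sum(comb(passes, k) * p**k * (1 - p)**(passes - k)
               for k in range(majority, passes + 1))


single = consensus_error(0.15, 1)   # 0.15
three = consensus_error(0.15, 3)    # about 0.061
five = consensus_error(0.15, 5)     # about 0.027
```

Under this model the error rate keeps falling as passes are added, so a platform with low single-pass accuracy can still reach high consensus accuracy.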


Is there Life on Mars?
NGS technologies - SUMMARY

Platform      Read length     Accuracy             Projects / applications
454           Medium          Homopolymer runs     Microbial + targeted reseq
HiSeq         Short           High                 Whole genome + transcriptome seq, exome
MiSeq         Medium          High                 Whole genome + transcriptome seq, exome
SOLiD         Short           High                 Whole genome + transcriptome seq, exome
Ion Torrent   Medium          High                 Microbial + targeted reseq
Ion Proton    Short/Medium    High                 Exome, transcriptome, genome
PacBio        Long            Low – ultra high*    Microbial + targeted reseq; gap closure & scaffolding
What is The BEST?
              Illumina HiSeq   Illumina MiSeq   SOLiD Wildfire   Ion Torrent      Ion Proton       PacBio
Read length   100+100 bp       250+250 bp       75 bp            200 bp, 400 bp   150 bp, 200 bp   1 – 20 kbp
              (150+150 bp)     (350+350 bp)                      (500 bp)

WGS:
- human ++++ (+) + (+)
- small +++ +++ (+) ++++ +++ +++++

De novo +++ ++ +++ ++ +++++


RNA-seq +++ +++ +++ +++*
miRNA +++ +++

ChIP +++ ++++

Amplicon ++ +++ +++ +++ +++

Methylation +++ ++++*

Target re-seq ++ +++ (+) +++ +++

Exome +++ (+) ++++ (+)


Check list:
- Have others done similar work?
- Is your methodology sound? Sample size? Repetitions?
- Are there people to analyze the data?
- Is there computer capacity to analyze the data?
- Will you be able to publish NGS data by yourself?
- PLEASE consult the sequencing facility PRIOR to the onset of your project!
Common pitfalls and a piece of advice:
• If you give us low quality DNA/RNA - expect low quality data
• If you give us too little DNA/RNA – expect biased data

• Do not try to do everything by yourself


• Make sure there is a dedicated bioinformatician available
• Never underestimate time and money needed for data analysis

• Google often!
• Use online forums, e.g. SeqAnswers.com
Summary

• Progress is FAST - keep yourselves updated!

• Choose technology based on:


– What is most feasible
– What is most accessible
– What is most cost-effective

SciLifeLab Genomics & Bioinformatics are here for you!


National Genomics Infrastructure

SciLifeLab, Stockholm SciLifeLab, Uppsala

Mid 2010

Uppmax, Uppsala
Projects
3. Access to at CMS platform
genomics
Portal project flow

NGI Project coordinators meet every second day via Skype

Ulrika Liljedahl: SNP&SEQ, Uppsala node
Mattias Ormestad: Stockholm Node
Olga Vinnere Pettersson: UGC, Uppsala Node

Project distribution is based on:

1. Wish of PI
2. Type of sequencing technology
3. Type of application
4. Queue at technology platforms

Project is then assigned to a certain node and a coordinator contacts the PI


NGI Equipment
Illumina HiSeq 2000/2500 12
Illumina MiSeq 3
Life Technologies SOLiD 5500 wildfire 1
Life Technologies Ion Torrent 2
Life Technologies Ion Proton 6
Life Technologies Sanger ABI3730 2
Pacific Biosciences RSII 2
Argus Whole Genome Mapping System 1

One of 5 best-equipped NGS sites in Europe


Project meeting

What we can help you with:


• Design your experiment based on the scientific question.
• Choose the best-suited application for your project.
• Find the optimal sequencing setup.
• Answer all questions about our technologies and applications, as well as bioinformatics.

• Get an UPPNEX account if you do not have one.

• In special cases, we can give extra support with bioinformatics analysis – development of novel methods and applications
Downstream Data Analysis

Bioinformatics competence IS present in research group
Bioinformatics competence IS NOT present in research group

Cooperation with platform personnel: R&D
BILS: Bioinformatics Infrastructure for Life Sciences
WABI: Wallenberg Advanced Bioinformatics Initiative

Co-authorship / Short-term commitment / Long-term commitment
