Interpreting DNA SequenceREV

Uploaded by

Truong Dang

Biol 312 L Molecular Biology Lab

DNA Sequencing Overview

DNA sequencing determines the order of nucleotides in a sample of DNA. It is presently conducted using a modified PCR reaction in which both normal nucleotides and labeled dideoxynucleotides are included in the reaction mix. The dideoxynucleotides lack the OH group at their 3' carbon atom and are labeled with fluorescent dyes (each of the four nucleotides has a different color). During the reaction, mostly normal nucleotides are added to the growing strand that is copied from the DNA template. However, when a labeled dideoxynucleotide happens to be added to the growing strand, chain elongation stops because there is no 3' OH to which the next nucleotide can be attached. Hence, the dideoxy method is also called the chain termination method.

At the end of the sequencing reaction, the fragments with fluorescent termini are separated by length on a polyacrylamide gel (either a large, thin slab gel or a narrow capillary tube filled with gel solution) that is scanned with a laser detection device. The shortest fragments migrate fastest, so they reach the detector first. As each band moves past the detector, the laser excites the dye, and the color of fluorescence is read by a photocell and recorded on a computer. A computer program evaluates the resulting trace of fluorescence peaks (the chromatogram) and calls the nucleotides in the order they passed the detector.

Interpreting Sequencing Results

The computer output for a sequencing run consists of a chromatogram (typically a sequencing facility will email you a file) that can be opened in one of the many free DNA sequence viewers available on the web. We will be using a program called FinchTV. When viewing chromatograms, you will find some ambiguity at various sites along the DNA sequence, because the software that calls the bases cannot always determine which nucleotide is represented by a broad peak or by a set of overlapping peaks. In such cases, a letter other than A, C, G, or T is recorded, most commonly N.

When you obtain a sequence, you should proofread it to ensure that all ambiguous sites are correctly called and to determine the overall quality of your data. At the very minimum, you should proofread and evaluate the quality of your sequence before uploading it to NCBI. If your sequence is suboptimal, you may want to redo the sequencing, or have both a forward and a reverse sequencing reaction done and compare the two results to resolve any possible miscalls.
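The base-calling idea described above (report the dye with the dominant signal, and record N when the peak is low or two peaks overlap) can be sketched in a few lines of Python. This is only an illustration and not how FinchTV or any sequencing instrument actually calls bases; the function names, the example intensities, and the two thresholds (min_signal, min_ratio) are assumptions made up for this example.

```python
# Hypothetical sketch: call one base per peak from four dye intensities.
BASES = "ACGT"

def call_base(intensities, min_signal=50.0, min_ratio=2.0):
    """Return A, C, G, or T for one peak, or N when the call is ambiguous.

    intensities: dict mapping each base to its fluorescence signal at the peak.
    min_signal and min_ratio are made-up thresholds for illustration only.
    """
    ranked = sorted(BASES, key=lambda b: intensities[b], reverse=True)
    top = intensities[ranked[0]]
    second = intensities[ranked[1]]
    if top < min_signal:
        return "N"  # low peak: not enough signal to trust the call
    if second > 0 and top / second < min_ratio:
        return "N"  # overlapping peaks: two dyes are similarly strong
    return ranked[0]

def call_sequence(peaks):
    """Call every peak in order and join the calls into one string."""
    return "".join(call_base(p) for p in peaks)

# Made-up peak data: one dict of dye intensities per peak position.
peaks = [
    {"A": 900, "C": 20, "G": 15, "T": 30},   # tall, distinct A peak
    {"A": 10, "C": 25, "G": 12, "T": 18},    # low signal everywhere
    {"A": 40, "C": 600, "G": 550, "T": 20},  # C and G peaks overlap
    {"A": 5, "C": 8, "G": 3, "T": 700},      # tall, distinct T peak
]
print(call_sequence(peaks))
```

The two N calls in this toy example are exactly the kind of site you should inspect by eye when proofreading, since a person looking at the trace can often pick the correct base.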

Page 1

Assignment: Viewing your Sequence and Analyses

Open the FinchTV program. (The FinchTV folder is on the desktop; open it and click the icon that resembles a TV.) You will be directed to drag a sequencing file into the window. Drag in your sequencing file.

The first step is to examine your sequence from beginning to end using the horizontal and vertical scroll bars. Good sequence generally begins around base 20 and is represented by tall, distinct peaks with little overlap. Poor reactions result in low or multiple peaks, as illustrated below (very typical of the beginning or end of a reaction).

Beginning of Sequence

End of sequence

At what base (indicated by the numbers at the top of the sequence) does your sequence become high quality? (Note: some sequences will not be of high quality anywhere. If this is the case, get another sequence or use a labmate's.)

How long is your sequence? Does the quality deteriorate towards the end? Do you find a stretch of AAAs at the end? (These AAAs are added by the Taq polymerase and are non-template derived.)

The second step is to check your sequence for ambiguity by searching for places where N is the base call instead of an A, C, G, or T. For example, in the case shown below, the correct nucleotide was likely a T. Do you find any instances where the N designation might not be appropriate? Also examine areas where two bases are superimposed on each other. Can you pick only one?
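If you have the base calls as text, the search for ambiguous sites described above can also be done with a short script. The sketch below is a minimal example in Python; the function name and the example read are made up, and it only lists the positions for you to inspect in the chromatogram. It does not decide what the correct base is.

```python
def ambiguous_positions(sequence):
    """Return (position, symbol) pairs for every call that is not A, C, G, or T.

    Positions are 1-based so they line up with the base numbers shown
    at the top of the chromatogram.
    """
    return [
        (i, base)
        for i, base in enumerate(sequence.upper(), start=1)
        if base not in "ACGT"
    ]

seq = "ACGTNACGTTNNAC"  # made-up example read
for pos, symbol in ambiguous_positions(seq):
    print(f"base {pos}: {symbol}")
```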

Page 2

Look at the overall quality of the sequence. Do you find that you have nice peaks that are easy to score? Does it appear that more than one product was sequenced?

Third step: determining homology. In other words, is your sequence similar to any other published sequences and, if so, to what degree? This can be determined using BLAST, a program supported by the National Center for Biotechnology Information (NCBI).

In FinchTV, select the export FASTA sequence command from the File drop-down menu and save the exported file to the desktop. Open the exported file (you may need to open it in Notepad). You should see your sequence as a string of letters. Copy the sequence and do a BLAST search at: http://www.ncbi.nlm.nih.gov/BLAST/cgi

In the Basic BLAST (left) section of this page, select nucleotide BLAST. Paste your sequence into the Enter Query Sequence box. Under Choose Search Set, select "other" as the database, which includes all bacterial sequences. The drop-down menu should now say Nucleotide collection (nr/nt). You do not have to change any other parameters.
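A FASTA file is plain text: a header line that starts with ">" followed by one or more lines of sequence. As a rough illustration of what the exported file contains, the Python sketch below parses FASTA text into header and sequence pairs. The function name and the example record are hypothetical, and the exact header that FinchTV writes may differ.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Made-up example of an exported file's contents.
example = """>my_16S_read (hypothetical name)
ACGTACGTAC
GGTTAACC
"""
records = read_fasta(example)
for name, sequence in records:
    print(name, len(sequence), sequence)
```

The joined sequence string is what you would paste into the query box.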

1) sequence in FASTA format

2) alternatively, you can upload your .seq file from the desktop

3) choose these

Click the Blast! button at the bottom to submit your sequence data.
Page 3

This screen will come up next. Finally (sometimes after a lengthy wait), a new window will appear showing any hits your sequence made. The results will be color coded and annotated so that you can either click on the horizontal homology bar(s) or you can simply scroll down the page.

many seq. with high homology

The bars show which regions of your sequence are similar to other published sequences; the color of each bar indicates the alignment score range for that hit. The scores indicate how strong the similarity is for particular hits. Clicking on a score or bar in the graphic representation will take you to the section of the results page that describes a sequence that is similar to yours.

Page 4

Clicking on a gi link at the beginning of any line will take you to the GenBank accession page for a sequence showing similarity to yours. There you can find a wealth of information about the published sequence to which yours showed some homology.

Fourth step: examine the proposed identities of your sequence. Ignore any hit labeled uncultured bacterium because it does not designate an identity. Look for high query coverage, a low E value (the number of matches this good that would be expected by chance) and a high max identity score. In this case, many instances of Enterococcus faecalis came up. Selecting the first Enterococcus sequence outlined in the box in the above figure (GenBank accession EU285587.1) will take you to the comparison between your query and the match. The majority of discrepancies are sequence gaps, with a few misalignments (mismatched bases).

one misalignment several sequence gaps
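To make the terms concrete, percent identity can be computed by walking down the two aligned sequences column by column and counting matches, mismatches, and gaps. The Python sketch below assumes the alignment has already been made (for example, copied from the BLAST comparison) and that "-" marks a gap; the function name and the two short aligned strings are made up for illustration. Identity is taken here as matches divided by alignment length.

```python
def alignment_stats(query, subject):
    """Count matches, mismatches, and gap columns in two aligned strings.

    Both strings must be the same non-zero length, with '-' marking a gap.
    """
    if len(query) != len(subject) or not query:
        raise ValueError("aligned sequences must have equal, non-zero length")
    matches = mismatches = gaps = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            gaps += 1          # a base in one sequence has no partner
        elif q == s:
            matches += 1
        else:
            mismatches += 1    # both have a base, but they differ
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        "percent_identity": 100.0 * matches / len(query),
    }

# Made-up aligned pair: one mismatch and two gap columns.
query = "ACGTTGCA-ACGT"
subject = "ACGTAGCATAC-T"
stats = alignment_stats(query, subject)
print(stats)
```

Correcting a miscalled base in your chromatogram turns a mismatch or gap column into a match, which is why the identity value can change when you re-BLAST.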

Page 5

Fifth step: now go back to YOUR sequence, check these areas, and decide if you would change anything. If you do change anything on your chromatogram, then go back and re-BLAST your sequence. Did any of the identity values change because of your corrections? What are the old and new values?

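How does your sequence compare to its best match(es)? Fill out the BLAST results in the Table, then access the Ribosomal Database.

BLAST RESULTS

NCBI             Best match (Genus species)     % Identity     Organization where sequence was uploaded
Your Sequence    __________________________     __________     ________________________________________
Alternate #1     __________________________     __________     ________________________________________
Alternate #2     __________________________     __________     ________________________________________

RDB RESULTS

RDB              Best match (Genus species)     % Identity     Organization where sequence was uploaded
Your Sequence    __________________________     __________     ________________________________________
Alternate #1     __________________________     __________     ________________________________________
Alternate #2     __________________________     __________     ________________________________________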

To access the 16S rRNA database: http://rdp.cme.msu.edu/

Choose Sequence Match

Page 6

Here is the start page:

you can upload your .seq file from the desktop

here is your sequence in FASTA format

Paste in sequence. Click Submit

Next, the status page comes up:

Page 7

Finally, the results appear:

select this

This is the bottom half of the screen. You can try different parameters if you want.

Page 8

:: Result Format
Each match result line contains six elements, from left to right:

1) A short ID used to uniquely identify the RDP sequence. A click will return the simple entry, including the sequence.

2) The orientation of the query sequence when the match is performed. "-" means the query sequence has been reverse-complemented. A top match hit with "-" orientation usually indicates the query sequence is a minus strand.

3) A similarity score. SeqMatch reports the percent sequence identity over all pairwise comparable positions when run with aligned myRDP sequences. (Comparable positions are aligned positions containing a base in both sequences.) Note that the rank order may differ between S_ab and pairwise identity scores, but the top 20 S_ab scores will contain the closest sequence by pairwise identity about 95% of the time (Cole et al.). If two sequences do not overlap, the similarity between these two sequences will be displayed as "?".

4) A seqmatch score (S_ab). This is the number of (unique) 7-base oligomers shared between your sequence and a given RDP sequence, divided by the lowest number of unique oligos in either of the two sequences.

5) The number of uniquely occurring oligomers within a given sequence (Olis). If the same oligomer occurs more than once, it is counted only once; thus this number only approximately reflects the sequence length. Counting only unique oligos compensates somewhat for composition bias (for example, inserts tend to be GC-rich and it becomes very likely that the same GC-rich oligos occur several times; by counting these only once, this artifact becomes less severe).

6) Full name. The definition line from the RDP distribution, often the same as Genus/species/strain name and accno.
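The S_ab and Olis definitions above translate directly into set operations. The Python sketch below follows the wording of the definition (unique 7-base oligomers shared, divided by the smaller unique-oligomer count); it is an illustration under that reading, not RDP's own code, and the function names and example sequences are made up.

```python
def unique_oligos(sequence, k=7):
    """Return the set of distinct k-base oligomers in a sequence (the Olis count is its size)."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def s_ab(seq_a, seq_b, k=7):
    """Shared unique k-mers divided by the smaller unique k-mer count."""
    oligos_a = unique_oligos(seq_a, k)
    oligos_b = unique_oligos(seq_b, k)
    smaller = min(len(oligos_a), len(oligos_b))
    if smaller == 0:
        return None  # a sequence shorter than k has no oligomers to compare
    return len(oligos_a & oligos_b) / smaller

# Made-up 10-base sequences that differ only near the 3' end.
a = "ACGTACGTAC"
b = "ACGTACGTTT"
print(len(unique_oligos(a)), len(unique_oligos(b)), s_ab(a, b))

# A repetitive sequence has many 7-mers but only one unique oligomer.
print(len(unique_oligos("AAAAAAAAAA")))
```

Because a set keeps each oligomer once, repeated stretches do not inflate the count, which is the composition-bias correction described in element 5.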

Page 9

Did the results from NCBI and the Ribosomal Database come out the same at the Genus species level? Fill out the Tables on page 6.

If there was a discrepancy between NCBI and the Ribosomal Database Project, which would you suspect is generally more accurate? Explain why.

NOTE: ANSWER THE QUESTIONS ON THE HANDOUT AND ON THE LAST PAGES (PAGES 11-12). KEEP THE HANDOUT AND TURN IN THE LAST PAGE.

REFERENCES
FinchTV; http://www.geospiza.com/finchtv.html
National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov/
Ribosomal Database Project; http://rdp.cme.msu.edu/

Page 10

Your Name: _______________________

Date: ___________________

Sequence: _______ Bacteria Identity: _____________________ % Identity: ________

Answer the questions in your handout and on this sheet.

1a. At what base (indicated by the numbers at the top of the sequence) does your sequence become high quality?

1b. How long is your sequence? Does the quality deteriorate towards the end? Do you find a stretch of AAAs at the end?

2a. Do you find any instances where the N designation might not be appropriate? If so, give the nucleotide number of the N base and the base that you designated to replace N.

2b. Look at the overall quality of the sequence. Do you find that you have nice peaks that are easy to score? Does it appear that more than one product was sequenced?

5a. Check the areas that contain mismatches or deletions, and decide if you would change anything. If you do change anything on your chromatogram, then go back and re-BLAST your sequence. Did any of the identity values change because of your corrections? What are the old and new values?

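5b. How does your sequence compare to its best match(es)? Fill out the BLAST results in the Table, then access the Ribosomal Database.

BLAST RESULTS

NCBI             Best match (Genus species)     % Identity     Organization where sequence was uploaded
Your Sequence    __________________________     __________     ________________________________________
Alternate #1     __________________________     __________     ________________________________________
Alternate #2     __________________________     __________     ________________________________________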

Page 11

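RDB RESULTS

RDB              Best match (Genus species)     % Identity     Organization where sequence was uploaded
Your Sequence    __________________________     __________     ________________________________________
Alternate #1     __________________________     __________     ________________________________________
Alternate #2     __________________________     __________     ________________________________________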

5c. Did the results from NCBI and the Ribosomal Database come out the same at the Genus species level? Fill out the Tables on page 6.

5d. If there was a discrepancy between NCBI and the Ribosomal Database Project, which would you suspect is generally more accurate? Explain why.

Page 12
