IBB - MB.501 RNA-seq + Introduction To Galaxy

Genome Science

Uploaded by

Muhammad Shahzad

RNA-seq Analysis

1/31/2024 IBB.MB.501 1
RNA

– Transcribed form of the DNA

– Active state of the DNA

RNA sequencing

– RNA quantification at single base resolution

– Cost efficient analysis of the whole transcriptome in a high-throughput manner

Where does data come from?

Challenges of RNA sequencing

– Presence of incompletely processed RNAs or transcriptional background noise

– Different origin for the sample RNA and the reference genome

– Sequencing biases (e.g. PCR library preparation)

Benefits of RNA sequencing

2 main research applications for RNA-Seq
– Transcript discovery
  – Which RNA molecules are in my sample?
  – Novel isoforms and alternative splicing, non-coding RNAs, single nucleotide variations, fusion genes
– RNA quantification
  – What is the concentration of RNAs?
  – Absolute gene expression (within sample), differential expression (between biological samples)

How to analyze RNA-seq data for RNA quantification?

Overview of the Data Processing

• No available standardized workflow


• Multiple possible best practices for every dataset

Data Pre-processing

– Adapter clipping to trim the sequencing adapters


– Quality trimming to remove wrongly called and low quality bases
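The two pre-processing steps above can be sketched in a few lines of Python. This is a minimal illustration, not how dedicated trimming tools work internally: it assumes an exact adapter match, Phred+33 quality encoding, and trimming only from the 3' end. The function names, the example read, and the thresholds are hypothetical.

```python
def clip_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter sequence."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual  # adapter not found: leave the read unchanged
    return seq[:pos], qual[:pos]


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical read: 10 bases of insert followed by an adapter sequence.
ADAPTER = "AGATCGGAAGAGC"
read_seq = "ACGTACGTGG" + ADAPTER
read_qual = "IIIIIIII##" + "I" * len(ADAPTER)  # 'I' = Phred 40, '#' = Phred 2

clipped_seq, clipped_qual = clip_adapter(read_seq, read_qual, ADAPTER)
trimmed_seq, trimmed_qual = trim_low_quality_tail(clipped_seq, clipped_qual)
print(trimmed_seq, trimmed_qual)
```

Adapter clipping runs first here so that the quality trimmer only sees bases that belong to the sequenced fragment.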

Annotation of RNA-Seq reads

– Simple mapping on a reference genome?

– More challenging

Annotation of RNA-Seq reads

– 3 main strategies for annotations

– Transcriptome mapping

– Genome mapping

– De novo transcriptome assembly and annotation

Transcriptome mapping

– Need reliable gene models


– No detection of novel genes

Genome mapping

Splice-aware read alignment

Detection of novel genes and isoforms
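To make "splice-aware" concrete: in SAM/BAM output, a read that spans an intron is typically reported with an `N` operation in its CIGAR string, which marks a skipped region of the reference. The sketch below is an illustration only (the function name and the example alignment are hypothetical); it splits one aligned read into the exon blocks it covers.

```python
import re


def exon_blocks(start, cigar):
    """Return the reference intervals (0-based, half-open) covered by a read.

    The alignment is split wherever the CIGAR contains an N operation,
    i.e. a skipped reference region such as an intron.
    """
    blocks = []
    pos = start
    block_start = start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=XD":
            pos += length  # these operations consume reference bases
        elif op == "N":
            if pos > block_start:
                blocks.append((block_start, pos))
            pos += length  # jump over the intron
            block_start = pos
        # I, S, H and P do not consume reference bases
    if pos > block_start:
        blocks.append((block_start, pos))
    return blocks


# Hypothetical read: 50 aligned bases, a 1000-base intron, 50 aligned bases.
print(exon_blocks(100, "50M1000N50M"))
```

A read with no `N` operation yields a single block, which is what an aligner that is not splice-aware would have to produce for every read.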

Transcriptome and Genome mapping

– Needed

– Reference genome/transcriptome in FASTA

– Annotations of known genes, ... in GTF

– Where to find?

– Joint projects to produce and maintain annotations on selected organisms: EMBL-EBI, UCSC, RefSeq, Ensembl, ...
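As an illustration of what a GTF annotation looks like, the sketch below parses a single GTF record into a dictionary. GTF is a tab-separated format with nine columns, the last of which holds `key "value";` attribute pairs. The example line and the function name are hypothetical, and the attribute parser is deliberately simple: it assumes no semicolons inside quoted values.

```python
def parse_gtf_line(line):
    """Parse one GTF record (9 tab-separated columns) into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    seqname, source, feature, start, end, score, strand, frame, attrs = fields

    attributes = {}
    for item in attrs.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Hypothetical exon record.
example = 'chr1\texample\texon\t1000\t1200\t.\t+\t.\tgene_id "GENE1"; transcript_id "GENE1.1";'
record = parse_gtf_line(example)
print(record["feature"], record["start"], record["end"], record["attributes"]["gene_id"])
```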

De novo transcriptome assembly

– No need for a reference genome ...

1. Assembly into transcripts

2. Map reads back

Quantification
– What is the expression level of the genomic features?

– Counting the number of reads per features: Easy!!

– Challenges

– How to handle multi-mapped reads (i.e. reads with multiple alignments)?

– How to distinguish between different isoforms?

– At gene level?

– At transcript level?

– At exon level?
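The multi-mapping question can be illustrated with a toy counter at gene level. The sketch below assumes the alignments have already been assigned to genes and compares two common policies: discarding multi-mapped reads, or splitting each one equally among its candidate genes. The data, the function name, and the policy names are hypothetical and do not reproduce any particular counting tool.

```python
from collections import defaultdict


def count_reads(alignments, multi="discard"):
    """Count reads per gene.

    alignments: dict mapping read id -> list of genes the read aligns to.
    multi: "discard" ignores multi-mapped reads,
           "fractional" gives each candidate gene an equal share of the read.
    """
    counts = defaultdict(float)
    for read_id, genes in alignments.items():
        if len(genes) == 1:
            counts[genes[0]] += 1.0
        elif multi == "fractional" and genes:
            share = 1.0 / len(genes)
            for gene in genes:
                counts[gene] += share
    return dict(counts)


# Hypothetical alignments: read r3 maps equally well to two genes.
alignments = {
    "r1": ["geneA"],
    "r2": ["geneA"],
    "r3": ["geneA", "geneB"],
    "r4": ["geneB"],
}
print(count_reads(alignments, multi="discard"))
print(count_reads(alignments, multi="fractional"))
```

The two policies give different totals for the same data, which is why the counting strategy needs to be stated alongside the results.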

Differential Expression Analysis

– Account for variability of expression across biological replicates with the help of counts
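A small sketch of why replicates matter: for one gene, the counts from each condition give both an average expression level and an estimate of how much that level varies between replicates. The example below computes the mean, the sample variance, and a log2 fold change for hypothetical, already normalized counts. It is an illustration only; tools such as DESeq2 and edgeR fit negative binomial models rather than using these raw summaries, and the pseudocount is an assumption made here to avoid dividing by zero.

```python
import math
from statistics import mean, variance


def summarize_gene(control, treated, pseudocount=1.0):
    """Summarize replicate counts of one gene in two conditions."""
    log2_fc = math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))
    return {
        "mean_control": mean(control),
        "var_control": variance(control),  # sample variance across replicates
        "mean_treated": mean(treated),
        "var_treated": variance(treated),
        "log2_fold_change": log2_fc,
    }


# Hypothetical normalized counts, three biological replicates per condition.
summary = summarize_gene(control=[2, 3, 4], treated=[14, 15, 16])
print(summary)
```

A large fold change is only convincing when it is large relative to the variability seen within each condition, which is what the replicate variances capture.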

Differential Expression Analysis: Normalization

– Make the expression levels comparable across
  – Features: genes, isoforms
  – Samples
– Methods
  – FPKM/RPKM (Cufflinks/Cuffdiff)
  – TMM (edgeR)
  – Median of ratios (DESeq2)
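Two of these ideas can be sketched directly. RPKM scales a read count by gene length (in kilobases) and by sequencing depth (in millions of mapped reads), which makes features comparable. A median-of-ratios size factor, the idea behind DESeq2's normalization, rescales whole samples so they are comparable with each other. The code below is a simplified illustration with hypothetical function names and data, not the implementation used by Cufflinks or DESeq2.

```python
import math
from statistics import median


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts: dict mapping sample name -> list of counts, same gene order.
    Genes with a zero count in any sample are skipped.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log of the geometric mean of each gene across samples.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lgm
            for g, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical data: sample s2 was sequenced twice as deeply as s1.
counts = {"s1": [100, 200, 300], "s2": [200, 400, 600]}
factors = size_factors(counts)
print(rpkm(500, 2000, 10_000_000))
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale; in this toy example the two samples become identical after scaling.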

Visualization
– Integrative Genomics Viewer (IGV) or Trackster
– Visualization of the aligned BAM files

– Sashimi plots
– Quantitative visualization of read coverage along exons and splice junctions

– CummeRbund
– Visualization package for Cufflinks high-throughput sequencing data

Using Galaxy
IBB.MB.501

Galaxy

– Galaxy is a web-based data analysis platform.

– It is easy to use, and completely free.

– Galaxy offers over eight thousand analysis tools.

– Galaxy is widely used. It has been used or cited in over ten thousand publications.

Why use Galaxy?

– It’s easy!

– No installation, all you need is a browser.

– No complex commands, just point and click!

– Makes your research reproducible

– Galaxy keeps track of all analysis details

– Cross-domain: bioinformatics, chemistry, ecology, climate science, ..

How to use Galaxy?

– Find a Galaxy server
  – UseGalaxy.* servers
  – Many other smaller, often domain-specific Galaxies available
  – List of all public Galaxies (135+): galaxyproject.org/use

Main Galaxy servers
– There are three main Galaxy servers: Galaxy Main, Galaxy Europe, and Galaxy Australia.

– These three Galaxies have the biggest teams behind them and offer the most tools and resources.

– You can register an account for free on any of these servers.

– In addition, there are many smaller Galaxy servers to choose from.

– Many of these are domain specific. For example, Galaxy Proteomics focuses on proteomics tools and workflows.

– A lot of universities and other institutions have local private servers.


The Galaxy Interface

– Three main panels

– Left: Available Tools

– Middle: View your data and run tools

– Right: Full record of your analysis history

Uploading data

– Upload from your computer

– Import files from URL

– Import from public data stores

– UCSC, NCBI, ENA, many more..

Finding a tool

– After your data is uploaded, you are ready to run tools.

– You can find tools by exploring the tool list on the left.

– If you already know the name of the tool you want to use, you can enter this in the search box at the top of the tool panel.

– If you have found a tool you like, you can add it to your favorites by clicking the star at the top of the tool.

Running a tool

– When you click on a tool, it will show up in the middle panel.

– Here you can select your input files and set the parameters for the tool.

– Then you can hit the Execute button to start the tool.

– Older versions of a tool are usually kept available to ensure reproducibility.

Analysis Results

– Tool outputs are added to the history
– Different dataset states
  – waiting, running, success, failed
– Expand for more options
  – Download dataset
  – Information about tool run
  – Reload tool with the same parameters
  – Visualize dataset
– Red dataset?
  – Click the bug icon
    – View error message
    – Submit error report

Analysis Results
– The results of the analysis will be added to your history.
– You will see these output files go through various states.
– When a dataset is grey, it means it is waiting to run.
– When it turns orange, it means the tool is running.
– When the tool is finished, the outputs will turn green if the tool ran successfully.
– You can then click on the history item to get more information and options.
– For example, you can download the file to your computer.
– Or you can reload the tool with the same parameters.
– There is also an option to visualize your data.
– If there was a problem with the tool, it will turn red.
– You can click on the bug icon to view the error message or submit an error report to the Galaxy administrators.