IBB - MB.501 RNA-seq + Introduction To Galaxy
1/31/2024 IBB.MB.501 1
RNA
RNA sequencing
– RNA quantification at single-base resolution
– Cost-efficient analysis of the whole transcriptome in a high-throughput manner
Where does data come from?
Challenges of RNA sequencing
– Different origin for the sample RNA and the reference genome
– Presence of incompletely processed RNAs or transcriptional background noise
– Sequencing biases (e.g. PCR library preparation)
Benefits of RNA sequencing
2 main research applications for RNA-Seq
– Transcript discovery
– RNA quantification
How to analyze RNA-Seq data for RNA quantification?
Overview of the Data Processing
Data Pre-processing
Annotation of RNA-Seq reads
– Simple mapping on a reference genome?
– More challenging
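
A toy example shows why this is more challenging than plain genome mapping. The sketch below is an illustration only: the sequences are made up, and exact string matching stands in for a real aligner. A read that spans an exon-exon junction matches the spliced transcript as one contiguous stretch, but has no contiguous match on the genome because the intron separates the two exons.

```python
# Toy sequences (made up for illustration, not real genes).
exon1 = "ATGGCCAAGT"
intron = "GTAAGTTTTTCCCTTTCAG"
exon2 = "CTGGAGACCTGA"

genome = exon1 + intron + exon2   # genomic DNA still contains the intron
transcript = exon1 + exon2        # mature mRNA: the intron is spliced out

# A read covering the last 5 bases of exon1 and the first 5 bases of exon2.
read = exon1[-5:] + exon2[:5]

def exact_match(query, reference):
    """Return the 0-based position of the first exact match, or -1 if none."""
    return reference.find(query)

# The whole read matches the transcript but not the genome.
pos_on_transcript = exact_match(read, transcript)
pos_on_genome = exact_match(read, genome)

# Splitting the read at the junction lets each half match the genome,
# which is the idea behind splice-aware genome mapping.
left_on_genome = exact_match(read[:5], genome)
right_on_genome = exact_match(read[5:], genome)

print(pos_on_transcript, pos_on_genome, left_on_genome, right_on_genome)
```

The two halves land on the genome separated by the length of the intron, so a genome mapper has to allow for such gaps.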
Annotation of RNA-Seq reads
– Transcriptome mapping
– Genome mapping
Transcriptome mapping
Genome mapping
Transcriptome and Genome mapping
– Needed
– Where to find?
De novo transcriptome assembly
Quantification
– What is the expression level of the genomic features?
– Challenges
– At gene level?
– At transcript level?
– At exon level?
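
The difference between counting at gene level and at exon level can be sketched with a small Python example. Everything here is hypothetical: the annotation, the read intervals, and the rule that a read overlapping more than one gene is set aside as ambiguous. Real counting tools offer several such rules, and this is only one possible convention.

```python
from collections import Counter

# Hypothetical annotation: (gene, exon, start, end), 1-based inclusive.
annotation = [
    ("geneA", "geneA_exon1", 100, 200),
    ("geneA", "geneA_exon2", 300, 400),
    ("geneB", "geneB_exon1", 350, 500),  # overlaps geneA_exon2
]

# Hypothetical aligned reads as (start, end) on the same chromosome.
reads = [(110, 160), (190, 240), (310, 340), (360, 410), (450, 500), (600, 650)]

def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end

exon_counts = Counter()
gene_counts = Counter()
ambiguous = 0   # reads overlapping more than one gene
unassigned = 0  # reads overlapping no annotated feature

for start, end in reads:
    hits = [(g, e) for g, e, s, t in annotation if overlaps(start, end, s, t)]
    genes_hit = {g for g, _ in hits}
    # Exon level: every overlapped exon gets the read.
    for _, exon in hits:
        exon_counts[exon] += 1
    # Gene level: count only reads that point to exactly one gene.
    if len(genes_hit) == 1:
        gene_counts[genes_hit.pop()] += 1
    elif len(genes_hit) > 1:
        ambiguous += 1
    else:
        unassigned += 1

print(dict(gene_counts), dict(exon_counts), ambiguous, unassigned)
```

The read at 360-410 illustrates one of the challenges: it is counted for two exons at exon level, but cannot be assigned to a single gene at gene level.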
Differential Expression Analysis
Differential Expression Analysis: Normalization
Visualization
– Integrative Genomics Viewer (IGV) or Trackster
  – Visualization of the aligned BAM files
– Sashimi plots
  – Quantitative visualization of read coverage along exons and splice junctions
– CummeRbund
  – Visualization package for Cufflinks high-throughput sequencing data
Using Galaxy
Galaxy
Why use Galaxy?
– It’s easy!
How to use Galaxy?
Main Galaxy servers
– There are three main Galaxy servers: Galaxy Main, Galaxy Europe, and Galaxy Australia.
– These three servers have the biggest teams behind them and offer the most tools and resources.
– Many of the other public Galaxy servers are domain specific. For example, Galaxy Proteomics focuses on proteomics tools and workflows.
Uploading data
Finding a tool
– You can find tools by exploring the tool list on the left.
– If you already know the name of the tool you want to use, you can enter this in the search box at the top of the tool panel.
– If you have found a tool you like, you can add it to your favorites by clicking the star at the top of the tool.
Running a tool
– Here you can select your input files and set the parameters for the tool.
– Then you can hit the Execute button to start the tool.
Analysis Results
– The results of the analysis will be added to your history.
– You will see these output files go through various states.
– When a dataset is grey, it means it is waiting to run.
– When it turns orange, it means the tool is running.
– When the tool is finished, the outputs will turn green if the tool ran successfully.
– You can then click on the history item to get more information and options.
– For example, you can download the file to your computer.
– Or you can reload the tool with the same parameters.
– There is also an option to visualize your data.
– If there was a problem with the tool, the output will turn red.
– You can click on the bug icon to view the error message or submit an error report to the Galaxy administrators.
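
The color states above can be summarized as a small lookup, sketched here in Python. The colors and meanings are taken from the bullets above; the dictionary and function names are hypothetical and are not part of Galaxy or its API.

```python
# Colors of a history item and what they mean, as listed on the slide.
# HISTORY_COLORS, is_finished and next_action are hypothetical helper names.
HISTORY_COLORS = {
    "grey": "waiting to run",
    "orange": "tool is running",
    "green": "tool finished successfully",
    "red": "tool failed",
}

def is_finished(color):
    """A dataset is finished once it is green (success) or red (failure)."""
    return color in ("green", "red")

def next_action(color):
    """Suggest what to do next for a history item of the given color."""
    if color not in HISTORY_COLORS:
        raise ValueError(f"unknown color: {color}")
    if color == "green":
        return "click the item to download, visualize, or rerun"
    if color == "red":
        return "click the bug icon to read or report the error"
    return "wait"

for color, meaning in HISTORY_COLORS.items():
    print(f"{color}: {meaning} -> {next_action(color)}")
```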