BRIEFINGS IN BIOINFORMATICS. VOL 15. NO 6. 1004–1013 doi:10.1093/bib/bbt065
Advance Access published on 5 September 2013

Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course
Jingchu Luo
Submitted: 20th May 2013; Received (in revised form) : 15th August 2013

Abstract
With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab
biologists became necessary as a part of their education. This article describes a one-semester course ‘Applied
Bioinformatics Course’ (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate
students at Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is
a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems
related to their ongoing research projects in molecular biology. After a brief introduction to the background of
the course, the teaching strategies are outlined in detail in the ‘How to teach’
section. The contents of the course are briefly described in the ‘What to teach’ section with some real examples.
The author wishes to share his teaching experiences and the online teaching materials with colleagues working in
bioinformatics education both in local and international universities.
Keywords: bioinformatics education; introductory course; hands-on course; project-based learning; on-site teaching

INTRODUCTION

The international Human Genome Project started in the early 1990s, and the rapid development of the World Wide Web at the same time marked the beginning of a new era of biological research, foreseen by Walter Gilbert in his perspective article published in Nature ‘News and Views’ in January 1991 [1]. Under the title ‘Towards a paradigm shift in biology’, he pointed out that ‘We must hook our individual computers to the worldwide network that gives us access to daily changes in the database and also makes immediate our communication with each other’.

With a massive explosion of nucleotide and protein sequence data, and of other types of information stored in primary and secondary biological databases, experimental wet-lab biologists started to use computers to obtain and analyze biological data in their daily research with online or desktop bioinformatics tools. To help these biologists access the databases effectively and use the analysis tools efficiently, bioinformatics training became a vital part of a biological education.

As we all know, nationally and internationally renowned bioinformatics centers such as the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) have been playing a leading role in the provision, not only of bioinformatics resources and services but also online training materials [2, 3]. The NCBI Education web portal (http://www.ncbi.nlm.nih.gov/education/) is an entry point for either novice or advanced users to find various information including how-to guides, handbooks, frequently asked

Corresponding author. Jingchu Luo, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences and
Center for Bioinformatics, Peking University, Beijing 100871, P.R. China. Tel.: +86 10 62756590; Fax: +86 10 6275 9001;
E-mail: luojc@pku.edu.cn
Jingchu Luo is a Professor in the College of Life Sciences at Peking University and the China Node manager of the European
Molecular Biology Network. His main areas of research are development of bioinformatics platforms; construction of bioinformatics
databases; and computational comparative genomics. He has been teaching the Applied Bioinformatics Course for graduate students of
biology for 13 years.

© The Author 2013. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

questions, help manuals and tutorials. The online training courses developed and maintained at EBI (http://www.ebi.ac.uk/training/online/) are especially helpful for users looking for eLearning opportunities. On the other hand, traditional, on-site, face-to-face teaching incorporating online teaching materials also plays a significant role in both lecture-based and hands-on bioinformatics education and training, either as semester courses for undergraduate students [4–7] or as short-term training courses [8]. Community efforts, especially in Europe, such as the Bioinformatics Training Network (BTN, http://www.biotnet.org/) have provided a forum for bioinformatics trainers to share their experiences [9]. For a broader view and detailed description of the current situation in bioinformatics education and training, readers are encouraged to refer to the two recent reviews [10, 11] and the dedicated articles published in the special issue of Briefings in Bioinformatics.

In this article, the Applied Bioinformatics Course (ABC), a face-to-face on-site semester-long course for biological graduate students, is introduced. ABC is a project-based and problem-oriented hands-on course for graduate students enrolled in molecular biology research projects. Drawing on the various resources in bioinformatics education mentioned above and tailored to the special requirements of biological graduates, the course aims to help students solve practical biological problems related to their ongoing research projects using bioinformatics resources and analysis tools. The outline of the course, the main databases, analysis tools, problem examples and the demonstration projects are narrated, and the teaching strategy and organization of the course are described. Thus, the aim of this article is to give the reader an overview of the course and offer an understanding of the online materials dedicated to this course. The author wishes to share experiences accumulated over the past 13 years and to stimulate discussion with both local and international colleagues working on bioinformatics education.

HOW TO TEACH

The name of the course ABC implies that this is a practical course to teach the ABCs of bioinformatics. A Web site dedicated to this course (http://abc.cbi.pku.edu.cn/) was created in 2004 and is maintained and updated regularly. Most of the materials mentioned in this article can be found in greater detail on the ABC Web site.

Prerequisites

ABC aims to teach biological graduate students how to access online bioinformatics resources and how to use bioinformatics tools to solve practical problems related to their research projects. The prerequisites for the students to take the course are as follows:

A good background in biological sciences.
An ability to read in English.
Availability for at least 3 h every week outside the class for group discussion.

A set of self-test exercises is provided for the students to check if they are qualified to take the course. These include basic taxonomy and molecular evolution, the genetic code, the concept of the central dogma, the eukaryotic gene structure, the principle of protein structure and conformation, as well as the sequence–structure–function relationship of biomolecules. A survey form is designed for each student to fill in before taking the course. In addition to the general information such as their educational background and ongoing research project, the students are encouraged to give their suggestions and to state any special requirements based on their work or interest. Finally, a good command of the English language is also required to take the course, as most of the materials and reference books we use in the course are in English.

Owing to the increasing demand from the students, the nature of the on-site hands-on course as well as the limited availability of computer facilities in the training room, the students are divided into several classes in each semester. For example, in the 2013 spring semester ending in June this year, 168 students were divided into three classes.

Format

ABC is essentially a hands-on practical course. It usually takes 15 weeks, with 3 h each week used for teaching and a further 3 h/week for group discussion. Except for the introductory session at the beginning of the semester and the final group report with presentations at the end, most sessions of the course are run in a computer training room. Each student has a desktop computer connected to the Internet. Special teaching software with a broadcasting function is installed on each computer so that the contents on the teacher’s screen are simultaneously displayed on the students’ screen.

This makes it much easier to present step-by-step demonstrations.

The students are divided into working groups, usually consisting of four members each, with one volunteer as the group head. The members of each individual group often come from the same laboratories and have similar research projects. They work together during the class. Organized by the group head, each group also works together outside the class at least 3 h each week to do homework assignments or to discuss and solve practical problems related to their own research projects. Reports of group discussion with achievements and questions are submitted and some of the questions are discussed in the class.

Dedicated email accounts for each class are created to communicate with students. For example, the email account pku13s@gmail.com is used to exchange messages with Peking University (PKU) students taking the ABC course of the 2013 spring semester. Students send their homework assignments, reports of group discussion as well as any comments and suggestions to this email address.

Exercises

A set of simple exercises has been created for the students to do either in or outside the class. They have been designed to familiarize the student with various bioinformatics tools. The topics of the exercises include PubMed searches, UniProt database queries, sequence alignments, dot plots, BLAST searches, DNA and protein sequence analysis, motif identification, gene structure prediction, tree construction and protein structure visualization and analysis.

Projects

As a project-based and problem-orientated course, the emphasis has been put on two main areas. First, a set of predefined projects is used in the course. These projects are mainly selected from the real examples that the author has worked on, either in his own research projects or collaboratively with wet-lab colleagues in the college. Students from individual groups are required to bring their own projects closely related to the research they are already enrolled in or plan to start.

Second, biological problems are focused on from the beginning to the end of the course. Not only are comprehensive problems embedded in the predefined projects, but general biological questions are also asked throughout the teaching of the course. For example, in the UniProt database query, students are asked to find the list of species that have an alpha hemoglobin (HBA) sequence identical to human HBA. During Ensembl database browsing, students are asked to find the difference in chromosome numbers between human and chimpanzee, and the syntenic regions between human chromosome 2 and chimpanzee chromosomes 2A and 2B.

Homework assignments

One of the most successful aspects of the course is the design of a set of homework assignments for the students to work on outside of the class. Unlike the simple exercises described above, the questions in homework assignments are much more comprehensive. The topics of the assignments include literature searches, database queries, sequence alignments, BLAST searches, as well as the comprehensive predefined projects introduced throughout the course. Currently, 10 homework assignments have been designed and are updated or revised over time. Students usually discuss the questions in the homework assignments within each group outside the class, exchange ideas and experiences first and then work on them independently. Undoubtedly, the homework assignments strengthen the students’ understanding of the tools that can be used for their own projects.

For PKU students, teaching assistants are assigned. They help with checking the homework assignments and join in discussions with certain groups every week.

Presentations

All the students are encouraged to make informal presentations and answer questions raised by other students throughout the classes. For some of the more theoretical topics, such as the algorithm of the BLAST similarity search, the principles of molecular evolution and phylogenetics, and the methods of homology modeling, special lectures are given either by the instructor or by invited senior students working on research projects of bioinformatics and molecular evolution.

At the end of the course, two workshops are organized, a small one and a large one. The small one is a class-based group report. Representatives of each group chosen by group members make presentations to summarize what they have learned and applied to their own projects. One or two good presentations are selected to make further refinement and to

present in the larger workshop with students from all classes taking the course in the same semester. The presentation slides are made available online after the workshop.

The large workshop is also open to other students who took the course previously or will take the course in the next semester. In addition to the refined representative talks from each class, invited lectures are also given by well-established scientists from our university or external institutions. The topics of these lectures are more advanced and specific, such as the methods and applications of next-generation sequencing (NGS), the principle and application of molecular evolution, the applications of comparative genomics and structure-based drug design.

Exams and credits

As a hands-on practical course, there are no conventional examinations at the end of it. Instead, the students are required to make a summary of the course and work on this during the summer or winter vacation. The topics of the summary are flexible. Students are encouraged to apply the approaches implemented in the predefined projects and the techniques they learn during the course to their own ongoing research projects.

Course credits are awarded based on several factors: (i) the performance during the classes; (ii) the homework assignments; (iii) the report of group discussions; (iv) the final group presentations; (v) the final summary.

Reference books

Several reference books including the two introductory English books ‘Essential Bioinformatics’ [12] and ‘Bioinformatics for Dummies’ [13] are recommended to all students taking the course. Students are also encouraged to read a more advanced and comprehensive one, ‘Bioinformatics: Sequence and Genome Analysis’, written by David Mount and published in China as reprints [14].

In recent years, various bioinformatics books with diverse contents and different levels, either in English or in Chinese, have become available. Students are advised to find reference books that are most suitable for them based on their background.

A Chinese textbook ‘Bioinformatics Techniques and Applications’ is being written by the author and will be published in the near future. Some of the materials retrieved from this book are adapted as PowerPoint slides and presented in the course.

WHAT TO TEACH

With the teaching strategies and focuses described above, the following section tries to give a general picture of the course contents.

Analysis tools

Unlike short-term training courses, which might focus on certain packages owing to limited time, this course does not take the tool-centric approach. Instead, different tools are introduced in several stages when they are needed to solve certain problems. Also, students are advised to find suitable tools by themselves and read the documentation and help materials carefully to use the tools correctly and efficiently. The ABC Tools page (http://abc.cbi.pku.edu.cn/tools.php) collects groups of popular tools used in the course. The most commonly used tools are described as follows.

Table 1: The main analysis tools used in the course

Name      Type                                  URL
WebLab    Online sequence analysis platform     http://weblab.cbi.pku.edu.cn/
Jemboss   Desktop sequence analysis interface   http://emboss.sourceforge.net/Jemboss/
Blast     Database similarity search            http://blast.ncbi.nlm.nih.gov/
MEGA      Phylogenetic analysis                 http://www.megasoftware.net/
SPDBV     Structure analysis                    http://www.expasy.org/spdbv/

Both web-based platforms and desktop packages are used in this course (Table 1). WebLab is the main online analysis platform used in the course. It is an open-source package developed and maintained by a team from our center over the past 10 years and contains >200 individual programs mainly for DNA and protein sequence analyses [15]. The built-in user data space can be used to store the sequence data and analysis results, and the data-sharing mechanism is well suited for students to share the data among members of a group. For desktop programs, we use

Jemboss [16], which offers a Java interface to the well-known EMBOSS package [17].

For database similarity searching, we use the NCBI BLAST web server as the main online search tool, with the emphasis on selecting individual programs, choosing different databases, setting specific parameters, changing output formats and analyzing search results. We also install the command line BLAST package locally under both Windows and Linux systems to make demonstrations.

The MEGA package [18] is chosen for initial phylogeny analysis, as it has a graphical user interface offering different methods in tree construction. Other phylogenetic packages such as Phylip (http://evolution.genetics.washington.edu/phylip.html) and PAML (http://abacus.gene.ucl.ac.uk/software/paml.html) are also introduced, though not in detail.

For protein three-dimensional structure analysis and visualization, Swiss-PDBViewer [19] is selected as the main tool because it has rich functionality with well-written help documentation and online tutorials. Other visualization tools such as the off-line package PyMOL and the online web browser plug-in Jmol are also introduced.

Bioinformatics resources

We start with the introduction to bioinformatics resources maintained by the two most popular bioinformatics centers in the world, NCBI and EBI. It is impossible to browse all the web pages embedded in these two web portals. Instead, we encourage students to use the NCBI online educational materials and the EBI online training courses to find useful resources by themselves. Advanced search of PubMed, together with the useful tool MyNCBI, is introduced at the beginning of the semester. Other literature resources are also introduced in the first session, such as the scientific stories in ‘Molecule of the Month’ and ‘Protein Spotlight’ publications.

Lists of online databases published in the Nucleic Acids Research (NAR) special issues and web-based online analysis tools collected by the Swiss Institute of Bioinformatics (SIB) are introduced in the middle of the semester with the emphasis on various bioinformatics resources that are available on the Internet (Table 2).

Table 2: Online bioinformatics resources

Resource                Developer            URL
Educational             NCBI                 http://www.ncbi.nlm.nih.gov/education/
Training                EBI                  http://www.ebi.ac.uk/training/online/
Molecule of the Month   David Goodsell       http://www.rcsb.org/pdb/101/motm_archive.do
Protein Spotlight       Vivienne Gerritsen   http://web.expasy.org/spotlight/
List of databases       NAR                  http://www.oxfordjournals.org/nar/database/c/
List of tools           SIB                  http://www.expasy.org/

Database query

After a brief introduction to the bioinformatics resources, we move on to introduce the most popular sequence databases including the UniProt protein sequence database, the NCBI RefSeq database, the Ensembl genome database, the Protein Data Bank (PDB) protein structure database, as well as other databases such as the gene structure database Gene, the gene expression database Expression Atlas, the protein–protein interaction database STRING, the metabolic pathway database KEGG, the protein domain database Pfam, the proteomic database PRIDE, the phylogenetic database OrthoDB and the disease-related database OMIM.

Rather than going through all these databases one by one using PowerPoint slides, we use a different approach to introduce these databases. First, we take the human hemoglobin alpha subunit (HBA_HUMAN) as an example to show the rich annotations of a UniProt/Swiss-Prot database entry, and further explore the specific annotations provided by these databases by clicking the links in the Cross-Reference section of the entry. After browsing the entries of these databases, a brief introduction to the commonly used databases and the differences among these databases is given, so that the students can have a general overview of various databases in bioinformatics.

The advanced search functionality is another focus in this session and uses the powerful functions in the database query systems of international bioinformatics centers such as NCBI, EBI and the SIB. For

example, a simple search in UniProt using text ‘Human hemoglobin’ returns not only different subunits of human hemoglobin but also about 100 entries from other species. Using the ‘Advanced Search’ function developed by the UniProt team, we may easily eliminate these false positives. By selecting ‘hemoglobin’ as protein name, ‘human [9606]’ as organism and ‘HB*’ as gene name, we can obtain all nine human hemoglobin subunits in UniProt/Swiss-Prot without false positives or false negatives. Several more comprehensive examples are also given as demonstrations in the class. With the rich annotation embedded in the Swiss-Prot entries, we can find all experimentally verified Arabidopsis membrane proteins with potential signal peptides using the step-by-step ‘Advanced Search’ function (Table 3).

Table 3: Example of UniProt advanced search

Query field           Query text and topic, term                                      Hits
Organism              Arabidopsis thaliana [3702]                                     53 755
Reviewed              Yes                                                             12 019
Protein existence     Evidence at protein level                                       4086
General annotation    Subcellular localization: membrane; confidence: experimental   1243
Sequence annotation   Signal peptide; confidence: any                                 234

After doing several exercises like this, the students start to search their own proteins, browse the annotations in these entries and further explore information in other databases by clicking the cross-links.

Sequence alignment

Sequence alignment is one of the most commonly used approaches to study the relationship among different molecules and organisms. It seems easy to run the global alignment program Needle implemented in the WebLab and Jemboss packages. However, it is not trivial to obtain and analyze the output results without solid biological knowledge.

We start with pairwise alignment of the amino acid sequences of the alpha subunits of human, mouse and rat hemoglobin. Surprisingly, the output of the alignment differs from what would be expected: the sequence identity between mouse and rat is less than that of mouse and human. This arouses the students’ interest and triggers enthusiastic discussion. Finally, they find the answer to this question by retrieving and comparing the nucleotide coding sequence (CDS) of the hemoglobin alpha gene of these three mammals. By using the correct scoring matrix and setting the appropriate gap penalties, they can obtain an output where results are more convincing. Indeed, the sequence identity of CDSs between mouse and rat is higher than that of mouse and human, which indicates that mouse and rat are closer relatives.

Database search

Running BLAST seems an easy job for most of the students. However, running a ‘good’ BLAST is another story. Literature searches tell us that neuroglobin is a member of the human globin family, to which nine hemoglobin subunits (alpha, beta, gamma, delta, etc.) as well as myoglobin and cytoglobin also belong.

Taking this as an example for BLAST search, we then ask, ‘can we find a human neuroglobin through BLAST search using the amino acid sequence of the alpha subunit [Swiss-Prot: HBA_HUMAN] as a query?’ The answer is ‘No’ if we use the default parameters to run BLASTP through the NCBI BLAST server. Nevertheless, we can obtain a good match by using PSI-BLAST and choosing Swiss-Prot as the preferred database, selecting Human as the organism and setting E-value to 0.001. The first run returns 11 hits including nine hemoglobin subunits as well as myoglobin and cytoglobin, but not neuroglobin. However, with the position-specific scoring matrix that is automatically constructed based on these 11 sequences obtained in the first run, the output of the second PSI-BLAST search does return the neuroglobin (NGB_HUMAN) that has only 21.9% identical residues with HBA_HUMAN.

Other BLAST programs are also introduced with real examples to show the students how to choose the different programs and how to set better parameters to obtain biologically meaningful results. At the end, we give a brief introduction to the algorithm behind BLAST so that the students have a better understanding of the parameters such as the scoring matrix, the E-value and the word size. By doing so, the students become aware of the importance of knowing the general principles and biological background behind this and other sequence analysis programs.

Analysis of a plant mRNA sequence

After several sessions of general introduction to the online resources, database query and analysis

platform, we start to work on more comprehensive predefined projects to solve real biological problems. In 1997, a Pisum sativum post-floral-specific gene (PPF-1) was identified from a cDNA library of short-day grown G2 pea tissue by a colleague in our college [20]. BLAST search hits with the query sequence of PPF-1 protein contain several bacterial inner membrane proteins, which demonstrate hydrophobic regions in sequences. This suggests that the PPF-1 protein may have transmembrane (TM) helices and could be located in the chloroplast membrane of pea leaves.

Taking the 1523 bp mRNA sequence as an example, we show the students how to display and retrieve the CDS using tools such as PlotORF, ShowORF and GetORF, and how to perform analyses such as GC content, codon usage and restriction enzyme cleavage. By translating the CDS to an amino acid sequence, we can make further analysis at the protein level, such as the statistics of amino acids, the identification of signal peptide, the prediction of potential TM regions, helical wheel display of these potential TM helices and generally gain an overview of the protein.

Analysis of a fugu genomic sequence

In the 1990s, Sydney Brenner and his colleagues initiated the fugu genome project (http://www.fugu-sg.org/). The genome size of this model organism is only 390 Mb, yet it contains most of the human homolog genes and can be used as a model system to study the function of human genes. For example, the human multidrug resistance gene (MDR) family has several members that belong to the ABC transporter superfamily. To investigate the mechanism of drug resistance, a PhD student at the Chinese Academy of Medical Science obtained a 39 kb genomic sequence from a Fugu cosmid using the human MDR gene as a probe. Analysis of the sequence data confirmed that this genomic sequence contained two MDR genes.

The approach and tools used in this project are rather different from those used in the analysis of the pea mRNA sequence as discussed in the previous section. First, a repeat region can be identified using a dot plot program to compare the genomic sequence with itself. Several gene identification programs can then be used to predict the exon/intron structure of the different fragments of the genomic sequence. Finally, confirmation that this genomic sequence does contain two multidrug resistance genes is obtained by running a BLASTX search against the UniProt/Swiss-Prot database.

Analysis of the bar-headed goose hemoglobin

More comprehensive projects are introduced during the course. One of them is the analysis of the sequence, structure, function and evolution of the bar-headed goose hemoglobin. Hemoglobin is one of the most well-studied proteins of the past century. Hundreds of hemoglobin protein sequences have been deposited into the Swiss-Prot database. Three-dimensional structures of wild type and mutants from dozens of species have been solved. This provides us with a good opportunity to study the structure–function relationship of hemoglobin molecules.

The bar-headed goose is a special species of migratory bird. It lives at Qinghai Lake during the summer, flies to India over the Tibetan plateau in autumn and returns to the lake in spring. Interestingly, the graylag goose, which is a close relative of the bar-headed goose, lives in the lowland all year round. Sequence alignment of bar-headed goose hemoglobin with that of graylag goose shows only four substitutions. The most critical one is the Pro119 to Ala119 substitution located at the surface of the alpha/beta interface. In 1983, Max Perutz proposed that this substitution reduces the contact between the alpha and beta subunits and increases the oxygen affinity, owing to the relaxation of the tension in the deoxy form [21].

During the past decade, a research group at our university has solved the crystal structure of both the deoxy (PDB ID: 1HV4) and the oxy (PDB ID: 1A4F) form of the bar-headed goose hemoglobin, as well as the oxy form (PDB ID: 1FAW) of the graylag goose hemoglobin. Using the powerful free Swiss-PDBViewer, the students create a Magic Fit to superimpose the alpha/beta heterodimer of the oxy form of the two goose hemoglobins (1A4F and 1FAW) on each other. They are excited to see the difference of the side chains between Pro119 in graylag goose hemoglobin and Ala119 in bar-headed goose hemoglobin, and make measurements of the distance between the atoms of these two side chains and the atoms in the side chain of the corresponding residue in the beta subunit.

Several other projects are also discussed throughout the course, such as analyzing and modeling of the human carcinoembryonic antigen, predicting the

three-dimensional structure of an antifungal peptide from pokeweed, engineering the metallothionein, and proposing the evolutionary mechanism of a plant-specific transcription factor family.
In addition to the predefined projects, students are also encouraged to bring their own problems to solve either in or outside the class. For example, the following list is taken from some group presentations of the 2013 spring semester course:

The sequence analysis of EIN3 in Arabidopsis thaliana.
Analysis of the cold-response receptor kinase OsRLK1 in rice.
Analysis of the PDF1 gene in Brassica napus.
Identification of odorant binding protein 7 in Apolygus lucorum.
Sequence analysis and structure prediction of SIR2 in Leishmania.
Structural and functional analysis of structural proteins of foot-and-mouth disease virus.
Identification of the antigenic sites in the hemagglutinin of avian influenza virus.
Analysis of structure and function of DNA binding protein HU in Synechococcus PCC7002.

Using the techniques on sequences they are personally interested in enhances the learning experience. The fundamentals of the programs used also become clearer to the students, as they can relate the computational findings to their expectations based on what they already know about their own projects.

DISCUSSIONS AND FUTURE DIRECTIONS
As indicated by the name of the course and the title of this article, ABC is an entry-level introductory course for biological graduate students that aims to teach them how to use the software tools available to solve their biological problems. The methods and algorithms of these tools, as well as the statistical aspects behind them, are taught in another course, Methods in Bioinformatics (MiB), by colleagues in our college. Analyses of high-throughput Omics data generated by NGS are briefly introduced in invited lectures, not as hands-on exercises in either course, owing to limited hardware resources. An Omics data analysis forum was organized in the past year together with colleagues from two other institutions, and a seminar for a small class of about 20 students working on NGS data analysis was run in the past semester.
As a biologist by training and a self-taught computer scientist, the author would like to emphasize that the strategy used in teaching the ABC course is more biological than computational. This means that a bottom-up rather than a top-down approach is used. The course evolved from an original one under the name ‘Computer Applications to Molecular Biology’ (CAMB), which he taught during the 1990s. Whereas CAMB was a lecture-based course, taught in a conventional classroom with presentations using transparencies in the early days and PowerPoint slides later on, ABC is a hands-on course, held in a computer training room with a desktop computer for each student to access the Internet and to make analyses with bioinformatics tools. The transition from CAMB to ABC was inspired by the success of several training courses that the author organized during the late 1990s. A dozen colleagues and training officers from the European Molecular Biology Network (EMBnet) and institutions around the world, including Europe and the United States, came to our university to teach several courses such as an EMBnet Bioinformatics Course, a WhatIf Protein Structure Course, a Genome Data Analysis Course and a Database Query Course with the Sequence Retrieval System. The experiences gained from these courses show that a hands-on practical course is necessary for wet-lab biologists working in molecular biology.
The ‘evolution’ has continued throughout the teaching of the course during the past 13 years, and is still underway, mainly propelled by the comments and suggestions from the students’ feedback delivered at the end of each semester. For example, the idea of dividing the students into working groups was suggested by several students based on their experience of benchwork laboratory experiments >10 years ago and has proved to be a good strategy. The small workshop for group reports at the end of the semester was started at the same time, but the large workshop with talks by representative groups to stimulate students’ further learning, and with invited lectures by senior scientists to expand bioinformatics knowledge, was initiated in recent years. On the other hand, some impractical settings were eliminated. For example, the Linux system and a command-line user environment were introduced in the early years, which took a great amount of time for most of the students to get familiar with
the commands and parameters. However, a survey shows that only a few students use a Linux system after the course. Therefore, we changed the command-line-based environment to a web-based one for the course, and organize seminars for students working on Omics data analysis.
As the application of bioinformatics techniques becomes more and more important in biological research, and owing to the success of the ABC course, the number of students enrolling in the course grows every semester, from an average of 100 each year during the early 2000s to >250 in the past year. Currently, ABC is only for biological graduate students. In recent years, many senior undergraduate students in our college have shown great interest in taking the ABC course. Some of them are already engaged in wet-lab experiments in the laboratory and working on certain projects. Clearly, it is necessary for them to learn the essential bioinformatics tools to help with their ongoing projects. A plan to run ABC as an elective course for some undergraduate students is being proposed. It is challenging for a single person to run the course in the same semester for both a growing number of graduate students and undergraduate students. A plan to design some modules of the course and make them available as a network-accessible open course is underway. At the same time, improvements to the ABC Web site with more detailed materials and step-by-step tutorials may help students with self-learning.
The main pages of the ABC Web site are in English, though some of the Word and PDF files such as exams and homework assignments are in a mixture of English and Chinese. We are currently considering whether it is necessary to create a mirror site and translate all the pages into Chinese. With new technologies in web design, dynamic pages such as discussion groups are also being planned.

CONCLUSION
ABC is an entry-level introductory course with the aim of teaching the ABCs of bioinformatics. We hope that, by taking the course, students will be in a position to efficiently access the online bioinformatics resources and to use the bioinformatics tools to solve their own practical biological problems. The goal of the ABC course is best summed up using a phrase coined by Dr Alan Bleasby (coauthor of the EMBOSS sequence analysis package): ‘half a day on the web, saves you half a month in the lab!’ (personal communication), if you do your software analysis in the correct manner!

Key Points
ABC is an on-site, practical, hands-on course for graduate students of biology.
ABC is a project-based and problem-orientated course to teach students how to access the online bioinformatics resources and how to use analysis tools to solve problems related to research projects.
Students are divided into groups to work together collaboratively in and outside the class.
Special exercises and homework assignments are designed for the course.
Discussion and presentations are encouraged throughout the course, and workshops are organized at the end of the course.

Acknowledgements
The author thanks all the students who have taken the ABC course and made suggestions for its further development. The author also thanks the anonymous reviewers for their critical comments, and colleagues and students for their helpful suggestions in the revision of the manuscript. Special thanks to Dr Lisa Mullan for her help in critical reading and language editing of the manuscript.

FUNDING
College of Life Sciences and Computing Center of PKU; Graduate School of the Chinese Academy of Agricultural Sciences; State Key Laboratory of Protein and Plant Gene Research.

References
1. Gilbert W. Towards a paradigm shift in biology. Nature 1991;349:99.
2. Cooper PS, Lipshultz D, Matten WT, et al. Education resources of the National Center for Biotechnology Information. Brief Bioinform 2010;11:563–9.
3. Wright VA, Vaughan BW, Laurent T, et al. Bioinformatics training: selecting an appropriate learning content management system – an example from the European Bioinformatics Institute. Brief Bioinform 2010;11:552–62.
4. Chapman BS, Christmann JL, Thatcher EF. Bioinformatics for undergraduates: steps toward a quantitative bioscience curriculum. Biochem Mol Biol Educ 2006;34:180–6.
5. Cummings MP, Temple GG. Broader incorporation of bioinformatics in education: opportunities and challenges. Brief Bioinform 2010;11:537–43.
6. Weisman D. Incorporating a collaborative web-based virtual laboratory in an undergraduate bioinformatics course. Biochem Mol Biol Educ 2010;38:4–9.
7. Furge LL, Stevens-Truss R, Moore DB, et al. Vertical and horizontal integration of bioinformatics education: a
modular, interdisciplinary approach. Biochem Mol Biol Educ 2009;37:26–36.
8. Fernandes PL. The GTPB training programme in Portugal. Brief Bioinform 2010;11:626–34.
9. Schneider MV, Walter P, Blatter MC, et al. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers. Brief Bioinform 2012;13:383–9.
10. Via A, Blicher T, Bongcam-Rudloff E, et al. Best practices in bioinformatics training for life scientists. Brief Bioinform 2013. (Advance Access publication 25 June 2013).
11. Shapiro C, Ayon C, Moberg-Parker J, et al. Strategies for using peer-assisted learning effectively in an undergraduate bioinformatics course. Biochem Mol Biol Educ 2013;41:24–33.
12. Xiong J. Essential Bioinformatics. Cambridge, UK: Cambridge University Press, 2006.
13. Claverie J-M, Notredame C. Bioinformatics for Dummies. 2nd edn. Hoboken, USA: Wiley Publishing Inc., 2007.
14. Mount D. Bioinformatics: Sequence and Genome Analysis. 2nd edn. New York, USA: Cold Spring Harbor Laboratory Press, 2004.
15. Liu X, Wu J, Wang J, et al. WebLab: a data-centric, knowledge-sharing bioinformatic platform. Nucleic Acids Res 2009;37:W33–9.
16. Mullan L. Jemboss reloaded. Brief Bioinform 2004;5:193–5.
17. Rice P, Longden I, Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 2000;16:276–7.
18. Kumar S, Nei M, Dudley J, et al. MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinform 2008;9:299–306.
19. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 1997;18:2714–23.
20. Zhu Y, Zhang Y, Luo J, et al. PPF-1, a post-floral-specific gene expressed in short-day-grown G2 pea, may be important for its never-senescing phenotype. Gene 1998;208:1–6.
21. Perutz MF. Species adaptation in a protein molecule. Mol Biol Evol 1983;1:1–28.
