
Exam Year Questions and Answers

This document contains an exam for a bioinformatics course covering several topics:
- Section A contains 20 multiple choice questions testing knowledge of bioinformatics databases, sequence alignment tools, protein structure prediction, and genomics.
- Section B instructs students to answer 4 of 6 longer answer questions worth 20 marks each. Sample question topics include describing NCBI databases, dynamic programming, and identifying homologous sequences.
- The questions assess understanding of key concepts and tools used for biological sequence analysis, comparative genomics, and protein structure and function prediction. Correct answers require knowledge of databases, algorithms, and applications of bioinformatics.


Exam year, questions and answers

Bioinformatics (Royal Melbourne Institute of Technology)

StuDocu is not sponsored or endorsed by any college or university


Downloaded by Ardianto Suhendar (ardianto86@gmail.com)

SECTION A. Answer all questions. 2 marks per question.


Give the best possible answer (only one per question).

1. Which of the following statements is true?


a. There are more entries in the protein sequence database than in the PDB
b. Proteins are easier to sequence than DNA
c. There are more protein sequences than nucleotide sequences in databases
d. GenBank only contains entries from eukaryotic organisms
e. GenBank entries never contain intron sequences

2. What is true of the human genome?


a. Every nucleotide is transcribed
b. There are more genes encoding ribosomal RNAs than proteins
c. Many small RNAs are encoded which are not translated into proteins
d. Different human genomes are less than 90% identical
e. All the above statements are true

3. Why are membrane protein structures under-represented in the PDB?


a. They have little interest for biologists
b. They are difficult to crystallise
c. They are too hydrophilic
d. There are very few true membrane-associated proteins
e. They are easy to model, and so experimentally-obtained structures are unnecessary

4. What is different between FASTA and BLAST?


a. FASTA is used for protein sequences, BLAST for nucleotide
b. Only BLAST uses “words” to speed up alignments
c. FASTA doesn’t allow gaps in alignments
d. BLAST is slower than FASTA
e. By default, FASTA uses shorter words than BLAST

5. Which tool would you use to predict secondary structure?


a. Hopp-Woods
b. Chou-Fasman
c. Smith-Waterman
d. PSI-BLAST
e. Doolittle

6. What is a general observation about genomes?


a. Bacterial genomes have more genes than mammalian genomes
b. In mammalian genomes, introns are generally larger than the intergenic regions
c. All transcripts derived from a genome are translated into proteins
d. The number of genes in the human genome is accurately known
e. Mammalian genomes contain much repetitive DNA, making sequence assembly difficult


7. Why are there more sequences in TrEMBL than Swiss-Prot?


a. There aren’t
b. Swiss-Prot only houses human sequences
c. Swiss-Prot only contains proteins with solved structures
d. TrEMBL is a translated nucleotide sequence database, Swiss-Prot only contains entries of proteins known to be expressed
e. TrEMBL contains many examples of identical sequences

8. Which of the following statements regarding dynamic programming is false?


a. It is guaranteed to give the optimal alignment
b. It is exhaustive
c. It uses heuristics to speed up the process
d. It is the basis of BLAST and FASTA
e. It is slower than BLAST and FASTA

9. In a family of homologous proteins, how would you best determine the most highly conserved amino acids?
a. Perform a multiple sequence alignment
b. Determine the isoelectric points of the proteins
c. Compare Ramachandran plots of each protein
d. Build a phylogenetic tree
e. All of the above

10. If a single organism contains two loci encoding homologous genes as a result of gene duplication, the encoded proteins are said to be
a. Heterologous
b. Orthologous
c. Xenologous
d. Paralogous
e. Zoologous

11. When using a tool such as WebCutter to produce a restriction map, it is observed that some restriction enzymes can digest DNA more frequently than others. Why?
a. They can digest within introns as well as exons
b. They have recognition sequences that are palindromic
c. They have higher enzymatic activity
d. They leave a blunt-ended cut
e. Their recognition sequence is shorter

12. Which of the following is UNTRUE about motifs?


a. They occur in more than one protein
b. They are typically quite short (10-50 amino acids)
c. They can have variability of amino acids
d. Two proteins containing the same motif must be homologues
e. Often a motif is a site for a ligand to bind.

13. When creating a phylogenetic tree


a. A multiple sequence alignment is required
b. Either nucleotide or protein sequence can be used
c. The tree can be rooted or unrooted
d. Evolutionarily related sequences are located in clades
e. All of the above are true


14. When would you use a Ramachandran plot?


a. When examining bond angles in a protein structure
b. When performing a PSI-BLAST
c. When performing a basic dot-plot alignment
d. When examining transmembrane regions of a protein
e. This kind of plot is not applicable to bioinformatics

15. Which statement is generally true about gaps in alignments?


a. Every pairwise alignment will contain a gap
b. The cost is higher to extend than to open a gap
c. The cost is higher to open than to extend a gap
d. Gaps greater than 5 characters mean the sequences cannot be homologous
e. In protein alignments, gaps only occur in alpha helical regions

16. The amino acid depicted in the figure can be assigned to two overlapping functional amino acid groupings. These are:

a. Hydrophobic and polar


b. Basic and polar
c. Aromatic and hydrophobic
d. Acidic and aromatic
e. Aromatic and polar

17. How would you identify transmembrane regions from a protein sequence?
a. Determine the pI of the protein
b. Perform an amino acid content analysis
c. Perform a hydropathy analysis of the protein
d. Identify regions of secondary structure using neural network tools
e. None of the above

18. Regarding protein structures, which of the following is true?


a. The number of unique folds that are found is increasing exponentially
b. No new superfamilies have been discovered recently
c. All proteins within a superfamily will have detectable sequence similarity
d. SCOP classifies protein structure based on the method of structure determination
e. The majority of protein structures in the PDB were derived by NMR

19. To perform in silico drug screening the requirements are


a. A target protein structure
b. An algorithm such as CLUSTALW
c. A library of drug-like molecules
d. a and b are both correct
e. a and c are both correct


20. The amount of nucleotide sequence data in GenBank:


a. Increases linearly
b. Doubles every 10 years
c. Increases exponentially
d. Is now decreasing
e. Is no longer relevant after the human genome sequence has been completed


SECTION B. Answer any 4 of questions 1 – 6. No extra marks are given for answering more than 4 questions,
and only the first 4 will be marked. 20 marks per question.

Note: this is a marking scheme guide; I haven’t listed all relevant points.
1 or more marks per point, depending on how well it is described. Marks can be given for any relevant point.

1. Describe the major databases that are present in the NCBI ENTREZ suite. What
information does each house? How do we find relevant data within these databases?
Marks for each database mentioned.
Extra marks if described fully
Should list how the databases are queried
Text queries (can use Boolean operators), with field descriptors, e.g. [au]
BLAST etc. is not a database
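
As an illustration of a fielded Boolean text query, the sketch below builds a search term and the matching NCBI E-utilities `esearch` URL. The author name, keyword and choice of the `pubmed` database are invented examples, and the helper names `build_term` and `build_esearch_url` are hypothetical. Nothing is actually fetched.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(author, keyword):
    """Combine an author term and a title term with Boolean AND."""
    # [au] restricts a term to the author field, [ti] to the title field.
    return f"{author}[au] AND {keyword}[ti]"


def build_esearch_url(db, term, retmax=20):
    """Return the esearch URL for a query (hypothetical helper; no request is sent)."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Invented example values, for illustration only.
term = build_term("Smith J", "insulin")
url = build_esearch_url("pubmed", term)
print(term)
print(url)
```

The same term string can be typed directly into the Entrez search box; the URL form is only one possible way of submitting it programmatically.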
2. What is dynamic programming? How is it applied to pairwise alignment? What are the
programs generally used, and how do they make alignments run faster than when
using exhaustive dynamic programming?
Breaks down the problem into states, solves each individually
Will give the optimal sequence alignment
Done using a matrix
Three stages
Gives a score
Used in BLAST and FASTA.
Make assumptions that homologous sequences will contain words
Word lengths: BLAST 3 for protein, 11 for nucleotide
FASTA: 2 for protein, 6 for nucleotide
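
The three stages of matrix-based dynamic programming (initialise, fill, trace back) can be sketched as below. This is a minimal global alignment in the style of Needleman-Wunsch, assuming a simple match/mismatch score and a linear gap penalty rather than a substitution matrix with affine gaps. The parameter values and test sequences are arbitrary.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1

    # Stage 1: initialise the first row and column with cumulative gap costs.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Stage 2: fill every cell from its diagonal, upper and left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Stage 3: trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


best, row1, row2 = needleman_wunsch("ACGT", "AGT")
print(best)
print(row1)
print(row2)
```

Every cell of the matrix is filled, which is why the method is exhaustive and slower than word-based searches that only extend alignments around shared words.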

3. Discuss the concept of homology (include the various classes of homologues and how
they arise). How can we search for homologues of a particular nucleotide or amino
acid sequence? How do we know if we have identified a homologue?
Homology is descent from a common ancestor.
There is no such thing as % homology: sequences are either homologous or not.
Have orthologues, paralogues and xenologues. Define.
Search using BLAST.
Look at E value.
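
To show why the E value is the figure to look at, the sketch below uses the standard relationship between a bit score S' and the expected number of chance hits, E = m × n × 2^(-S'), where m and n are the query and database lengths. The lengths and scores are invented, and real BLAST output uses effective lengths, so this is only an approximation.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Invented example: a 300-residue query against a database of 1e8 residues.
for bits in (30, 50, 100):
    print(bits, expect_value(bits, 300, 1e8))
```

The E value halves for every extra bit of score and grows with database size, so the same alignment is less significant in a larger database.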

4. Describe the amino acid substitution matrices that are used in calculating similarity
scores between protein sequences. Comment on the derivation of the matrices, what
information they contain and how they are used. Using the matrix below, a gap
opening penalty of -11 and an extension cost of -1, calculate a score for the match
between the following peptides. Show your alignment and the score.


Pep1 AVALEINSTEINSS
Pep2 AMALINSTEADSS

Describe the PAM and BLOSUM matrices and how they were developed.
What do they do?
What do the different numbers refer to?
How are they used in FASTA and BLAST?

Score = 33 (accept 34)

AVALEINSTEINSS
A AL INSTE +SS
AMAL-INSTEADSS
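
The score for this alignment can be checked with a short script. The matrix itself is not reproduced here, so the sketch assumes BLOSUM62 and hard-codes only the entries this alignment needs. Under that assumption, the two accepted answers appear to correspond to two conventions for a one-residue gap: opening plus one extension (-12), or opening only (-11). The function names are hypothetical.

```python
# Only the BLOSUM62 entries needed for this alignment (assumed matrix).
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("V", "M"): 1, ("L", "L"): 4, ("I", "I"): 4,
    ("N", "N"): 6, ("S", "S"): 4, ("T", "T"): 5, ("E", "E"): 5,
    ("I", "A"): -1, ("N", "D"): 1,
}


def pair_score(x, y):
    """Look up a substitution score, treating the matrix as symmetric."""
    key = (x, y) if (x, y) in BLOSUM62_SUBSET else (y, x)
    return BLOSUM62_SUBSET[key]


def alignment_score(top, bottom, gap_open=-11, gap_extend=-1,
                    charge_first_extend=True):
    """Sum substitution scores and affine gap penalties over aligned columns."""
    total, in_gap = 0, False
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            if in_gap:
                total += gap_extend  # continuing an existing gap
            else:
                # Opening a gap; optionally also charge the first gapped position.
                total += gap_open + (gap_extend if charge_first_extend else 0)
            in_gap = True
        else:
            total += pair_score(x, y)
            in_gap = False
    return total


top, bottom = "AVALEINSTEINSS", "AMAL-INSTEADSS"
print(alignment_score(top, bottom))                             # 33
print(alignment_score(top, bottom, charge_first_extend=False))  # 34
```

The thirteen aligned residue pairs sum to 45, and the single gap then costs either 12 or 11 depending on the convention.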


5. Describe multiple sequence alignment. How is it performed, and what is the
information we can obtain from these alignments? Give examples.
Often done using CLUSTALW.
Takes each sequence, does a pairwise alignment with each and calculates similarity
Takes closest two, aligns and forms consensus
Then next most similar etc
So on to the end
Forms a guide tree
From MSA can detect conserved regions in the family. In proteins, may correspond to
active site or conserved motifs.
Can use an MSA to form a phylogenetic tree. Can use UPGMA, Neighbour-joining
methods. Tree can tell us the evolutionary history of a family of homologous genes or
proteins
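
Detecting conserved positions from a finished multiple sequence alignment can be sketched as below. The three aligned sequences are invented, and the function name `conserved_columns` is hypothetical. A real analysis would usually score partial conservation as well, rather than requiring identity in every sequence.

```python
def conserved_columns(alignment):
    """Return (index, residue) for columns identical in every sequence."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    # zip(*alignment) walks the alignment column by column.
    for i, column in enumerate(zip(*alignment)):
        residues = set(column)
        # A column is conserved if it holds one residue type and no gaps.
        if len(residues) == 1 and "-" not in residues:
            conserved.append((i, column[0]))
    return conserved


# Invented toy alignment, for illustration only.
toy_msa = ["MKVLHE",
           "MRVLHD",
           "MKILHE"]
print(conserved_columns(toy_msa))
```

Columns that stay identical across a whole family are the candidates for active-site residues or conserved motifs.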

6. Describe the requirements and processes involved in some protein modelling
techniques. How can we use bioinformatics to assist with drug design and
development?
Should describe homology modelling.
Molecular dynamics- the basic theory.
How we can do in silico screening: what we require for this (a structure or very good
model, and a library of drug-like compounds).
Describe Dock (or similar).
