
Bioinfo MCQS

This document provides a set of 10 multiple choice questions that assess knowledge of protein motifs and domain prediction in bioinformatics. The questions cover topics like the typical length of motifs and domains, common structural motifs, motif databases that use statistical models or regular expressions, and protein family databases.

Uploaded by

Samana Zaynab

This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Protein Motifs and Domain Prediction”.

1. What is the length of a motif, in terms of amino acid residues?


a) 30-60
b) 10-20
c) 70-90
d) 1-10

2. On average, what is the length of a typical domain?


a) About 100 residues
b) About 300 residues
c) About 500 residues
d) About 900 residues

3. Which of the following is false about the ‘loop’ structure in proteins?


a) They connect helices and sheets
b) They are more tolerant of mutations
c) They are more flexible and can adopt multiple conformations
d) They are never the components of active sites

4. Which of the common structural motifs are described wrongly?


a) β-hairpin – adjacent antiparallel strands
b) Greek key – 4 adjacent antiparallel strands
c) β-α-β – 2 parallel strands connected by helix
d) β-α-β – 2 antiparallel strands connected by helix

5. Which of the following least describes Long Loop β-hairpins?


a) They are often referred to as a ‘random coil’ conformation
b) Generally they are referred to as the β-meander supersecondary structure
c) Loop looks similar to the Greek Letter Ω
d) Wide-range of conformations with very specific sequence preferences

6. Motifs that can form the α/β horseshoe conformation are rich in which amino acid residue?
a) Proline
b) Arginine
c) Valine
d) Leucine

7. Which of the following wrongly describes protein domains?


a) They are made up of one secondary structure
b) Defined as independently foldable units
c) They are stable structures as compared to motifs
d) They are separated by linker regions

8. The helix-loop-helix structural motif is found in all of the following except ________
a) Scleraxis
b) Neurogenins
c) Transcription Factor 4
d) Leucine zipper

9. Which of the following is not the function of Short Linear Motifs?


a) Irreversible cleavage of the peptide at the SLiM
b) Reversible cleavage of the peptide at the SLiM
c) Moiety addition at targeted sites on SLiM
d) Structural modifications of the peptide backbone
10. In the zinc finger, which residues in this sequence motif form ligands to a zinc ion?
a) Cysteine and histidine
b) Cysteine and arginine
c) Histidine and proline
d) Histidine and arginine

Motif and Domain Databases Using Regular Expressions


1. When scanning for similarities in motifs, how do regular expression techniques work?
a) It represents a sequence family by a string of characters and further compares them
b) An algorithm similar to dynamic programming is used
c) Dot matrix analysis is used in this type of sequence analysis
d) Matrix analysis methods are used in this type

2. Which of the following best defines regular expressions?


a) They are made up of terms, operators and modifiers
b) They describe string or set of strings to find matching patterns
c) They are strictly restricted to alignment and corresponding score
d) They consist of set of rules for the connotations of various amino acid residues

3. In regular expressions, which of the following patterns is wrongly matched with its meaning?
a) [ ] – Or
b) { } – Not
c) ( ) – Repeats
d) Z – Any

4. In terminologies related to regular expressions which of the following is false about terms and
operators?
a) Terms are strings or substrings
b) Operators combine terms and expressions
c) Operators do not have precedence
d) Operators have precedence like arithmetic operators

5. In regular expressions, which of the following patterns is wrongly matched with its meaning?
a) ‘-’ – separator
b) < – N-terminal
c) > – C-terminal
d) ‘>>’ – end
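The pattern symbols listed in the questions above (separator, any residue, alternatives, exclusions, repeats, terminal anchors) can be tried out in code. The following is a minimal Python sketch that translates a simplified PROSITE-style pattern into a Python regular expression. The function name, the example pattern and the test sequences are all made up for illustration, and the translation covers only the symbols named here, not the full PROSITE syntax.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):    # '-' separates elements
        at_n_term = element.startswith("<")           # '<' anchors at the N-terminus
        at_c_term = element.endswith(">")             # '>' anchors at the C-terminus
        element = element.strip("<>")
        # Separate the residue specification from an optional repeat such as (3) or (2,4).
        parsed = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if parsed is None:
            raise ValueError(f"cannot parse element {element!r}")
        core, repeat = parsed.groups()
        if core == "x":                               # 'x' stands for any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"            # {P} means any residue except P
        # [ST] (S or T) is already valid regex syntax and passes through unchanged.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(("^" if at_n_term else "") + core + ("$" if at_c_term else ""))
    return "".join(parts)

# Hypothetical pattern and sequences, made up for illustration only.
regex = prosite_to_regex("<M-x(2,4)-[ST]-{P}-C>")
print(regex)
print(bool(re.search(regex, "MAAASGC")), bool(re.search(regex, "MAAASPC")))
```

The second test sequence differs from the first only by a proline at the excluded position, which is enough to turn a match into a non-match.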



6. Emotif uses which databases for alignment of sequences?
a) BLOCKS and PRINTS databases
b) PROSITE
c) BLOCKS
d) PRINTS

7. When analysing motif sequences, what is the major disadvantage of PROSITE?
a) The database constructs profiles to complement some of the sequence patterns
b) The functional information of these patterns is primarily based on published literature
c) Some of the sequence patterns are too short to be specific
d) Lack of specificity about probability and variation and relation between them

8. Which of the following is not a characteristic of fuzzy or approximate matches in regular expressions?
a) This method is able to include more variant forms of a motif with a conserved function
b) The rule of matching is based on observations, not actual assumptions
c) With the more relaxed matching, the noise level and false positives increase
d) The rule of matching is based on assumptions, not actual observations

9. Which of the following is not a characteristic of exact matches in regular expressions?


a) There must be a strict match of sequence patterns
b) Any variations in the query sequence from the predefined patterns are not allowed
c) Provide more permissive matching by allowing more flexible matching of residues of similar
biochemical properties
d) Searching a motif database using this approach results in either a match or non-match

10. What does the representation R.L.[EQD] mean?


a) An arginine- Amino acid- Leucine- Amino acid- Either aspartic acid, glutamic acid or glutamine
b) An arginine- Leucine- Either aspartic acid, glutamic acid or glutamine
c) An arginine- Leucine- Amino acid- Either aspartic acid, glutamic acid or glutamine
d) An arginine- Leucine- Aspartic acid and glutamic acid and glutamine
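The notation in the question above happens to be valid Python regular-expression syntax, so it can be checked position by position. In this small sketch, the helper name and the test sequence are hypothetical; `.` consumes exactly one residue of any kind and `[EQD]` consumes exactly one residue that must be E, Q or D.

```python
import re

# '.' matches any single residue; [EQD] matches one of E, Q or D.
MOTIF = re.compile(r"R.L.[EQD]")

def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence)]

# Hypothetical test sequence, not taken from any database.
hits = find_motif("MKRALPEGGRTLVQAA")
print(hits)
```

Every hit is five residues long, which shows that each symbol in the pattern, including each dot, stands for one sequence position.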

Motif and Domain Databases Using Statistical Models

1. Which of the following is not an advantage of statistical-model methods in analyzing protein motifs?
a) Sequence information from a multiple sequence alignment is preserved and expressed with probabilistic models
b) Statistical models allow partial matches and compensate for unobserved sequence patterns using
pseudo-counts
c) Statistical models have stronger predictive power than the regular expression based approach,
even when they are derived from a limited set of sequences
d) The comparative flexibility is less in case of these methods when compared to regular expressions
methods
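The idea of compensating for unobserved sequence patterns with pseudo-counts can be sketched in a few lines of Python. This is an illustrative toy, not the procedure of any particular database: the function names, the four-letter DNA alphabet (chosen only to keep the example small), the uniform background and the pseudocount of 1 are all assumptions.

```python
import math

def build_pssm(alignment, alphabet="ACGT", pseudocount=1.0, background=None):
    """Build a log-odds PSSM (one dict per column) from an ungapped alignment."""
    background = background or {a: 1 / len(alphabet) for a in alphabet}
    n_seqs = len(alignment)
    pssm = []
    for column in zip(*alignment):
        scores = {}
        for a in alphabet:
            # The pseudocount keeps residues never seen in this column
            # from getting probability zero (and a score of minus infinity).
            freq = (column.count(a) + pseudocount) / (n_seqs + pseudocount * len(alphabet))
            scores[a] = math.log2(freq / background[a])
        pssm.append(scores)
    return pssm

def score_segment(pssm, segment):
    """Sum the per-column log-odds scores of a segment as long as the PSSM."""
    return sum(col[res] for col, res in zip(pssm, segment))

# Hypothetical three-sequence alignment.
pssm = build_pssm(["ACGT", "ACGA", "ACTT"])
print(round(score_segment(pssm, "ACGT"), 2), round(score_segment(pssm, "TTTT"), 2))
```

Because every residue receives a finite score, a query that deviates from the alignment at some positions still gets a (lower) score rather than an outright rejection, which is the partial matching the options describe.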

2. For motif scanning, which of the following programs or databases holds regulatory sites curated from the scientific literature?
a) ENSEMBL
b) ORegAnno
c) MAST
d) Clover

3. Which of the following is not an advantageous feature or algorithm of the database PRINTS?
a) This program breaks down a motif into even smaller non-overlapping units called ‘fingerprints’,
which are represented by unweighted PSSMs
b) To define a motif, at least a majority of fingerprints are required to match with a query sequence
c) A query that has simultaneous high-scoring matches to a majority of fingerprints belonging to a
motif is a good indication of containing the functional motif
d) The difficulty to recognize short motifs when they reach the size of single fingerprints

4. In which of the following multipurpose packages is the Gibbs sampling algorithm used?


a) Consensus
b) BEST
c) AlignACE
d) PhyloCon

5. Which of the following is untrue of the BLOCKS database?


a) The alignments are automatically generated using the same data sets used for deriving the
BLOSUM matrices
b) The derived ungapped alignments, called ‘blocks’, are usually longer than motifs and are subsequently converted to PSSMs
c) A weighting scheme and pseudo counts are subsequently applied to the PSSMs to account for
underrepresented and unobserved residues in alignments
d) The functional annotation of blocks is not consistent with that for the motifs

6. Which of the following is false about the Pfam database and its algorithm?
a) Each motif or domain is represented by an HMM profile generated from the seed alignment of a
number of conserved homologous proteins
b) Since the probability scoring mechanism is more complex in HMM than in a profile-based approach
the use of HMM yields further increases in sensitivity of the database matches
c) Pfam-B only contains sequence families not covered in Pfam
d) The functional annotation of motifs in Pfam-A is often related to that in UNIPROT

7. Which of the following is false about the SMART database and its algorithm?
a) Contains HMM profiles constructed from manually refined protein domain alignments
b) Alignments in the database are built based on tertiary structures whenever available or based on
PSI-BLAST profiles
c) Alignments are further checked but not refined by human annotators before HMM profile
construction
d) SMART stands for Simple Modular Architecture Research Tool

8. Which of the following is false about the InterPro database and its algorithm?
a) InterPro is an integrated pattern database designed to unify multiple databases for protein
domains and functional sites
b) This database integrates information from PROSITE, Pfam, PRINTS, ProDom, and SMART databases
c) Only overlapping motifs and domains in a protein sequence derived by all five databases are
included
d) All the motifs and domains in a protein sequence derived by all five databases are included

9. Which of the following is false about CDART and its algorithm?
a) CDART is a domain search program that combines the results from RPS-BLAST, SMART, and Pfam
b) The program is now an integral part of the regular BLAST search function
c) CDART is a substitute for individual database searches
d) It stands for Conserved Domain Architecture Retrieval Tool

10. Point out the wrong or irrelevant mathematical method in motif analysis.
a) Enumeration
b) Probabilistic Optimization
c) Deterministic Optimization
d) Literature mining

Protein Family Databases

1. Which of the following statements about COG is incorrect regarding its features?
a) Currently, there are 4,873 clusters in the COG databases derived from unicellular organisms
b) It is constructed by comparing protein sequences encoded in forty-three completely sequenced
genomes, which are mainly from prokaryotes, representing thirty major phylogenetic lineages
c) The interface for sequence searching in the COG database is the COGnitor program, which is based
on gapped BLAST
d) It is a protein family database based on structural classification

2. Which of the following statements about InterPro is incorrect regarding its features?
a) Protein relatedness is defined by the P-values from the BLAST alignments
b) The most closely related sequences are grouped into the lowest level clusters
c) More distant protein groups are merged into higher levels of clusters
d) The outcome of this cluster merging is a tree-like structure of functional categories

3. Pfam is available at four locations around the world. Which of the following is not one of them?
a) UK
b) Sweden
c) US
d) Japan

4. Which of the following is not a member database of InterPro?


a) SCOP
b) HAMAP
c) PANTHER
d) Pfam

5. Which of the following statements about SCOP is incorrect regarding its features?
a) Proteins with the same shapes but having little sequence or functional similarity are placed in
different super families, and are assumed to have only a very distant common ancestor
b) Proteins having the same shape and some similarity of sequence and/or function are placed in
‘families’, and are assumed to have a closer common ancestor
c) SCOP was created in 1994 in the Centre of Protein Engineering and the University College London
d) It aims to determine the evolutionary relationship between proteins



6. What is the source of protein structures in SCOP and CATH?
a) Uniprot
b) Protein Data Bank
c) Ensembl
d) InterPro

7. Which of the following statements about SUPERFAMILY database is incorrect regarding its
features?
a) Sequences can be submitted in raw or FASTA format
b) Sequences must be submitted in FASTA format only
c) It searches the database using a superfamily, family, or species name plus a sequence, SCOP, PDB
or HMM ID’s
d) It has generated GO annotations for evolutionarily close domains and distant domains

8. Which of the following statements about the PRINTS and ProDom databases is incorrect regarding their features?
a) PRINTS is a compendium of protein fingerprints
b) Usually the motifs do not overlap, but are separated along a sequence, though they may be
contiguous in 3D-space
c) Current versions of ProDom are built using a novel procedure based on recursive BLAST searches
d) ProDom domain database consists of an automatic compilation of homologous domains

9. Which of the following statements about the CATH-Gene3D and HAMAP databases is incorrect regarding their features?
a) CATH-Gene3D describes protein families and domain architectures in complete genomes
b) In CATH-Gene3D the functional annotation is provided to proteins from single resource
c) HAMAP profiles are manually created by expert curators; they identify proteins that are part of well-conserved bacterial, archaeal and plastid-encoded protein families or subfamilies
d) HAMAP stands for High-quality Automated and Manual Annotation of microbial Proteomes

10. Which of the following statements about the PANTHER and TIGRFAMs databases is incorrect regarding their features?
a) TIGRFAMs provides a tool for identifying functionally related proteins based on sequence homology
b) TIGRFAMs is a collection of protein families, featuring curated multiple sequence alignments,
hidden Markov models (HMMs) and annotation
c) Hidden Markov models (HMMs) are not used in PANTHER
d) PANTHER is a large collection of protein families that have been subdivided into functionally
related subfamilies, using human expertise.

Global Sequence Alignment

1. When did Needleman and Wunsch first describe the algorithm for global alignment?
a) 1899
b) 1970
c) 1930
d) 1950

2. Which of the following does not describe dynamic programming?


a) The approach compares every pair of characters in the two sequences and generates an alignment,
which is the best or optimal
b) Global alignment algorithm is based on this method
c) Local alignment algorithm is based on this method
d) The method can be useful in aligning protein sequences to protein sequences only

3. Which of the following is not an advantage of Needleman-Wunsch algorithm?


a) New algorithmic improvements as well as increasing computer capacity make it possible to align a
query sequence against a large DB in a few minutes
b) Similar sequence regions are in the same order and orientation
c) This does not help in determining evolutionary relationship
d) If you have 2 genes that are already understood to be closely related, then this type of algorithm can
be used to understand them in further detail

4. Which of the following is not a disadvantage of Needleman-Wunsch algorithm?


a) This method is comparatively slow
b) It is memory intensive
c) This cannot be applied to genome-sized sequences
d) This method can be applied even to large sequences

5. Which of the following does not describe global alignment algorithm?


a) In initialization step, the first row and first column are subject to gap penalty
b) Score can be negative
c) In trace back step, beginning is with the cell at the lower right of the matrix and it ends at top left
cell
d) First row and first column are set to zero



6. Which of the following does not describe PAM matrices?
a) These matrices are used in optimal alignment scoring
b) It stands for Point Altered Mutations
c) It stands for Point Accepted Mutations
d) It was first developed by Margaret Dayhoff

7. Which of the following is untrue regarding the scoring system used in dynamic programming?
a) If the residues are same in both the sequences the match score is assumed as +5 which is added to
the diagonally positioned cell of the current cell
b) If the residues are not same, the mismatch score is assumed as -3
c) If the residues are not same, the mismatch score is assumed as 3
d) The score should be added to the diagonally positioned cell of the current cell

8. Which of the following does not describe global alignment algorithm?


a) Score can be negative in this method
b) It is based on dynamic programming technique
c) For two sequences of length m and n, the matrix to be defined should be of dimensions m+1 and
n+1
d) For two sequences of length m and n, the matrix to be defined should be of dimensions m and n
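The mechanics touched on in the global alignment questions above (matrix size, initialization, possible negative scores, traceback direction) can be explored with a short Python sketch of Needleman-Wunsch. The match (+5) and mismatch (-3) values follow the scoring question in this set; the linear gap penalty of -4, the function name and the test sequences are assumptions made for this illustration.

```python
def needleman_wunsch(a, b, match=5, mismatch=-3, gap=-4):
    """Return (score, aligned_a, aligned_b) for a global alignment with a linear gap penalty."""
    m, n = len(a), len(b)
    # One extra row and column represent aligning a prefix against nothing but gaps.
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = i * gap                    # first column accumulates gap penalties
    for j in range(1, n + 1):
        F[0][j] = j * gap                    # first row accumulates gap penalties
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # diagonal: align two residues
                          F[i - 1][j] + gap,     # up: gap in b
                          F[i][j - 1] + gap)     # left: gap in a
    # Traceback from the lower-right cell back to the top-left cell.
    i, j = m, n
    out_a, out_b = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return F[m][n], "".join(reversed(out_a)), "".join(reversed(out_b))

# Hypothetical toy sequences.
print(needleman_wunsch("ACGT", "AGT"))
print(needleman_wunsch("A", "C"))
```

The second call aligns two different single residues and returns a negative score, since a global alignment has to account for every residue even when nothing matches.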

9. Which of the following does not describe global alignment algorithm?


a) It attempts to align every residue in every sequence
b) It is most useful when the aligning sequences are similar and of roughly the same size
c) It is useful when the aligning sequences are dissimilar
d) It can use Needleman-Wunsch algorithm

10. Which of the following is wrong in case of substitution matrices?


a) They determine likelihood of homology between two sequences
b) They use system where substitutions that are more likely should get a higher score
c) They use system where substitutions that are less likely should get a lower score
d) BLOSUM-X type uses logarithmic identity to find similarity

Local Sequence Alignment

1. When did Smith and Waterman first describe the algorithm for local alignment?
a) 1950
b) 1970
c) 1981
d) 1925

2. Which of the following does not describe local alignment?


a) A local alignment aligns a substring of the query sequence to a substring of the target sequence
b) A local alignment is defined by maximizing the alignment score, so that deleting a column from
either end would reduce the score, and adding further columns at either end would also reduce the
score
c) Local alignments have terminal gaps
d) The substrings to be examined may be all of one or both sequences; if all of both are included then
the local alignment is also global

3. Which of the following does not describe local alignment algorithm?


a) Score can be negative
b) Negative score is set to 0
c) First row and first column are set to 0 in initialization step
d) In traceback step, beginning is with the highest score, it ends when 0 is encountered
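The local alignment steps named in the options above (zero initialization, a floor at zero, traceback from the highest score until a zero is reached) can be sketched as follows in Python. The scoring values, the linear gap penalty, the function name and the test sequences are assumptions chosen for illustration, not values prescribed by the source.

```python
def smith_waterman(a, b, match=5, mismatch=-3, gap=-4):
    """Return (best score, aligned substring of a, aligned substring of b)."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]    # first row and column stay at 0
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Any candidate below zero is replaced by 0, so a new local
            # alignment can start at this cell.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Traceback starts at the highest-scoring cell and stops at the first 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Hypothetical sequences that share only a short common region.
print(smith_waterman("TTTACGTAAA", "GGACGTCC"))
```

The flanking residues that do not match are simply left out of the result, which is the practical difference from a global alignment of the same two sequences.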

4. Local alignments are more useful when _____________


a) There are totally similar and equal length sequences
b) Dissimilar sequences are suspected to contain regions of similarity
c) Similar sequence motif with larger sequence context
d) Partially similar, different length and conserved region containing sequences

5. Which of the following does not describe BLOSUM matrices?


a) It stands for BLOcks SUbstitution Matrix
b) It was developed by Henikoff and Henikoff
c) The year it was developed was 1992
d) These matrices are logarithmic identity values

6. Which of the following is untrue regarding the gap penalty used in dynamic programming?
a) Gap penalty is subtracted for each gap that has been introduced
b) Gap penalty is added for each gap that has been introduced
c) The gap score defines a penalty given to alignment when we have insertion or deletion
d) Gap opening and gap extension penalties have been introduced for continuous gaps (five or more)

7. Among the following which one is not the approach to the local alignment?
a) Smith-Waterman algorithm
b) K-tuple method
c) Words method
d) Needleman-Wunsch algorithm

8. Which of the following does not describe k-tuple methods?


a) k-tuple methods are best known for their implementation in the database search tools FASTA and
the BLAST family
b) They are also known as words methods
c) They are basically heuristic methods to find local alignment
d) They are useful in small scale databases

9. Which of the following does not describe BLAST?


a) It stands for Basic Local Alignment Search Tool
b) It uses word matching like FASTA
c) It is one of the tools of the NCBI
d) Even if no words are similar, there is an alignment to be considered

10. Which of the following is untrue regarding BLAST and FASTA?


a) FASTA is faster than BLAST
b) FASTA is the most accurate
c) BLAST has limited choices of databases
d) FASTA is more sensitive for DNA-DNA comparisons

Motif Discovery in Unaligned Sequences

1. For what type of sequences is Gibbs sampling used?


a) Closely related sequences
b) Distantly related sequences
c) Distantly related sequences that share common motifs
d) Closely related sequences that share common motifs

2. Which of the following is untrue about Expectation Maximization (EM) method?


a) It is used to find hidden motifs
b) The method works by first making a random or guessed alignment of the sequences to generate a
trial PSSM
c) The trial PSSM is used to compare with each sequence individually
d) The log odds scores of the PSSM are modified at the end of the process

3. Which of the following is true about Expectation Maximization (EM) method?


a) The log odds scores of the PSSM are modified at the end of the process
b) The procedure stops prematurely if the scores reach convergence
c) The final result is not sensitive to the initial alignment
d) Local optimum is an advantage of EM method

4. MEME stands for _____________


a) Multiple Expectation Maximization for Motif Elicitation
b) Multiple Expectation Maximization for Motif Extraction
c) Mega Expectation Maximization for Motif Elicitation
d) Micro Expectation Maximization for Motif Extraction

5. In the web-based program MEME, the computation is a _____ step procedure.


a) one
b) two
c) three
d) four
6. Gibbs is a web-based program that uses the Gibbs sampling approach to look for _____ gap-free
segments for either DNA or protein sequences.
a) short, partially conserved
b) long, partially conserved
c) long, conserved
d) short, not conserved

7. A multiple sequence alignment or a motif is often represented by a graphic representation called a ______
a) logo
b) motto
c) algorithm
d) algo

8. The overall height of a logo position reflects how conserved the position is, and the _____ of each
letter in a position reflects the _______ of the residue in the alignment.
a) height, relative frequency
b) width, relative frequency
c) height, amplitude
d) width, amplitude
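How the heights in a sequence logo come about can be illustrated with a small Python sketch. It uses the commonly quoted information-content formula (maximum bits minus the column entropy) without any small-sample correction; the function name, the DNA alphabet and the toy alignment are assumptions made for this example.

```python
import math
from collections import Counter

def logo_heights(alignment, alphabet="ACGT"):
    """For each column, return {letter: height in bits}."""
    max_bits = math.log2(len(alphabet))          # 2 bits for DNA
    heights = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        freqs = {a: counts[a] / total for a in alphabet}
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        info = max_bits - entropy                # overall height of the stack
        # Each letter takes a share of the stack proportional to its frequency.
        heights.append({a: f * info for a, f in freqs.items()})
    return heights

# Hypothetical alignment: column 1 is fully conserved, column 2 is uniform.
heights = logo_heights(["AA", "AC", "AG", "AT"])
for position, column in enumerate(heights, start=1):
    print(position, {a: round(h, 2) for a, h in column.items()})
```

In this toy case the conserved column produces a single tall letter, while the column in which all four letters are equally frequent has no height at all.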

9. Conserved positions have _____ residues and bigger symbols.


a) fewer
b) more
c) maximum
d) minimum

10. _______ is an interactive program for generating sequence logos.


a) EMBOSS
b) WebLogo
c) LOGOLY
d) BLAST

Dot Matrix Sequence Comparison

1. Which of the following is not a software for dot plot analysis?


a) SIMMI
b) DOTLET
c) DOTMATCHER
d) LALIGN

2. Software for dot plot analysis performs several tasks. Which of the following is not performed by it?
a) Gap open penalty
b) Gap extend penalty
c) Expectation threshold
d) Change or mutate residues

3. For palindromic sequences, what is the structure of the dot plot?


a) 2 intersecting diagonal lines at the midpoint
b) One diagonal
c) Two parallel diagonals
d) No diagonal

4. For sequences that align significantly, what is the resulting structure on the plot?
a) Intercrossing lines
b) Crosses everywhere
c) Vertical lines
d) A diagonal and lines parallel to diagonal
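The patterns that the dot plot questions above refer to can be generated with a few lines of Python. This is a bare-bones sketch with a sliding window and an identity threshold; the function names, parameters and sequences are hypothetical and do not correspond to the options of any of the programs named in this set.

```python
def dot_matrix(seq_x, seq_y, window=1, threshold=1):
    """Return the set of (i, j) cells whose windows share at least `threshold` identities."""
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            identities = sum(seq_x[i + k] == seq_y[j + k] for k in range(window))
            if identities >= threshold:
                dots.add((i, j))
    return dots

def render(seq_x, seq_y, dots):
    """Draw the matrix as text: seq_x along the top, seq_y down the side."""
    lines = ["  " + seq_x]
    for j, residue in enumerate(seq_y):
        row = "".join("*" if (i, j) in dots else "." for i in range(len(seq_x)))
        lines.append(residue + " " + row)
    return "\n".join(lines)

# Hypothetical sequence compared with itself.
seq = "GATTACA"
print(render(seq, seq, dot_matrix(seq, seq)))
print()
print(render(seq, seq, dot_matrix(seq, seq, window=3, threshold=3)))
```

With a window of 1 the plot contains the main diagonal plus scattered off-diagonal dots from chance identities; raising the window and threshold to 3 removes that noise and leaves only the diagonal for this sequence.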

5. When was the dot matrix method first described?


a) 1959
b) 1966
c) 1970
d) 1982



6. Who were the inventors of this method?
a) Smith-Waterman
b) Margaret Preston
c) Gibbs and McIntyre
d) Needleman-Wunsch

7. Which of the following is true for EMBOSS Dottup?


a) Allows you to specify threshold
b) Doesn’t allow you to specify threshold
c) Doesn’t allow you to specify window size
d) If all cells in the window are identity, it colors in some specific cells in the window

8. Isolated dots that are not on the diagonal represent exact matches.
a) True
b) False

9. Vertical frame shifts show ______ while the horizontal ones show _______
a) insertion, insertion
b) insertion, deletion
c) deletion, deletion
d) deletion, insertion

10. The dot plot of repeating elements would show small crosses on the plot.


a) True
b) False

Dynamic Programming Algorithm for Sequence Alignment

1. Use of the dynamic programming method requires a scoring system for the comparison of symbol
pairs, and a scheme for GAP penalties.
a) True
b) False

2. After the derivation, the outputs of dynamic programming are ratios called even scores.
a) True
b) False

3. The matrices PAM250 and BLOSUM62 contain _______


a) positive and negative values
b) positive values only
c) negative values only
d) neither positive nor negative values, just the percentage

4. The higher the score of the alignment, _________


a) the more significant is the alignment
b) the less it resembles alignments in related proteins
c) the less significant is the alignment
d) the less it aligns with the related protein sequence

5. Gaps are added to the alignment because it ______


a) increases the matching of identical amino acids at subsequent portions in the alignment
b) increases the matching of dissimilar amino acids at subsequent portions in the alignment
c) reduces the overall score
d) enhances the area of the sequences
6. Which of the following is not a description of dynamic programming algorithm?
a) A method of sequence alignment
b) A method that can take gaps into account
c) A method that requires a manageable number of comparisons
d) This method doesn’t provide an optimal (highest scoring) alignment

7. Which of the following is not a site on the internet for alignment of sequence pairs?
a) BLASTX
b) BLASTN
c) SIM
d) BCM Search Launcher

8. Dayhoff PAM matrices are based on an evolutionary model of protein change, whereas BLOSUM
matrices are designed to identify members of the same family.
a) True
b) False

9. A feature of the dynamic programming algorithm is that the alignments obtained depend on the
choice of a scoring system for comparing character pairs and penalty scores for gaps.
a) True
b) False

10. Which of the following is untrue regarding dynamic programming algorithm?


a) The method compares every pair of characters in the two sequences and generates an alignment
b) The output alignment will include matched and mismatched characters and gaps in the two
sequences that are positioned so that the number of matches between identical or related characters
is the maximum possible
c) The dynamic programming algorithm provides a reliable computational method for aligning DNA
and protein sequences
d) This doesn’t allow making evolutionary predictions on the basis of sequence alignments

Use of Scoring Matrices and Gap Penalties in Sequence Alignments

1. In scoring matrices, for convenience, odds scores are converted to log odds scores.
a) True
b) False
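The conversion from odds scores to log odds scores mentioned in the statement above can be made concrete with a short Python sketch. The pair and background frequencies are invented numbers, and the base-2 logarithm, the scale factor of 2 and the rounding to integers are assumptions for this example rather than the recipe of any specific matrix.

```python
import math

def log_odds(q_ab: float, p_a: float, p_b: float, scale: float = 2.0) -> int:
    """Scaled, rounded log2 of the odds: observed pair frequency over chance expectation."""
    odds = q_ab / (p_a * p_b)
    return round(scale * math.log2(odds))

# Hypothetical frequencies, made up for illustration.
favoured = log_odds(q_ab=0.02, p_a=0.05, p_b=0.05)     # pair seen more often than by chance
disfavoured = log_odds(q_ab=0.001, p_a=0.05, p_b=0.05) # pair seen less often than by chance
print(favoured, disfavoured)

# Odds for independent positions multiply, so their logarithms add.
odds_1, odds_2 = 8.0, 0.4
assert math.isclose(math.log2(odds_1 * odds_2), math.log2(odds_1) + math.log2(odds_2))
```

The practical convenience is that an alignment score becomes a simple sum of per-position values, with positive entries for favoured substitutions and negative entries for disfavoured ones.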

2. Which of the following doesn’t describe PAM matrices?


a) This family of matrices lists the likelihood of change from one amino acid to another in homologous
protein sequences during evolution
b) There is presently no other type of scoring matrix that is based on such sound evolutionary
principles as are these matrices
c) Even though they were originally based on a relatively small data set, the PAM matrices remain a
useful tool for sequence alignment
d) It stands for Percent Altered Mutation

3. The assumption in this evolutionary model is that the amino acid substitutions observed over short
periods of evolutionary history can be extrapolated to longer distances.
a) True
b) False

4. Which of the following is untrue about the modification of PAM matrices?


a) At one time, the PAM250 scoring matrix was modified in an attempt to improve the alignment
obtained
b) All scores for matching a particular amino acid were normalized to the same mean and standard
deviation, and all amino acid identities were given the same score to provide an equal contribution for
each amino acid in a sequence alignment
c) This took place in 1976
d) These modifications were included as the default matrices for the GCG sequence alignment
programs in versions 8 and earlier and are optional in later versions

5. The Dayhoff model of protein evolution is not a Markov process.


a) True
b) False
6. Which of the following is true regarding the assumptions in the method of constructing the Dayhoff
scoring matrix?
a) it is assumed that each amino acid position is equally mutable
b) it is assumed that each amino acid position is not equally mutable
c) it is assumed that each amino acid position is not mutable at all
d) sites do not vary in their degree of mutability

7. The more conserved amino acids in similar proteins from different species are ones that play an
essential role in structure and function and the less conserved are in sites that can vary without
having a significant effect on function.
a) True
b) False

8. A gap opening penalty for any gap (g) and a gap extension penalty for each element in the gap (r)
are most often used, to give a total gap score wx, according to the equation ______
a) wx – rx = -g
b) wx = g – rx
c) wx = g + rx
d) wx + g + rx = 0
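A gap scheme with a separate opening and extension term can be tried numerically. The Python sketch below assumes the additive form in which the opening penalty g is charged once and the extension penalty r once per gapped position; the function name and the penalty values are hypothetical, and some programs count the first gapped position differently.

```python
def gap_score(length: int, g: float, r: float) -> float:
    """Total score w_x for a gap of `length` positions: g once, plus r per position."""
    if length < 1:
        raise ValueError("a gap has at least one position")
    return g + r * length

# Hypothetical penalties: opening -11, extension -1, compared with a flat -4 per position.
for x in (1, 2, 5, 10):
    print(x, gap_score(x, g=-11, r=-1), "affine vs linear", -4 * x)
```

With these numbers a single long gap costs less than under the flat scheme, while several short gaps cost more, which reflects the idea that one insertion or deletion event can span many residues.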

9. In the GCG and FASTA program suites, the scoring matrix itself is formatted in a way that includes
default ______
a) gap additions
b) alignment scores
c) score penalties
d) gap penalties

10. When an alignment is varied, gaps may be penalized heavily. The best scoring local
alignment between the sequences will then be one that optimizes the score between matches and
mismatches, without any gaps.
a) True
b) False

Assessing the Significance of Sequence Alignments

1. Analysis of the alignment scores of random sequences reveals that the scores follow a
distribution different from the normal distribution, called the _________
a) Gumbel equal value distribution
b) Gumbel extreme value distribution
c) Gumbel end value distribution
d) Gumbel distribution
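How a score is turned into a probability under a Gumbel-type distribution can be shown with a short Python sketch. It assumes the standard form P(S ≥ x) = 1 − exp(−e^(−λ(x − u))), where u is the mode (location) and λ the scale parameter; the function name and the parameter values are invented for illustration and are not fitted to any real data.

```python
import math

def gumbel_pvalue(score: float, u: float, lam: float) -> float:
    """Probability that a random alignment scores at least `score`."""
    return 1.0 - math.exp(-math.exp(-lam * (score - u)))

# Hypothetical parameters: mode u = 30, scale lam = 0.3.
for s in (30, 40, 50, 60):
    print(s, f"{gumbel_pvalue(s, u=30, lam=0.3):.5f}")
```

The probabilities fall off more slowly on the high-score side than a normal distribution with a similar centre would suggest, so judging an alignment score against a normal distribution would tend to overstate its significance.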

2. The statistical analysis of alignment scores is much better understood for ________ than for
_______
a) global alignments, local alignments
b) local alignments, global alignments
c) global alignments, any other alignment method
d) Needleman-Wunsch alignment, Smith-Waterman alignment

3. When random or unrelated sequences are compared using a global alignment method, they can
have ____________ reflecting the tendency of the global algorithm to match as many characters as
possible.
a) very low scores
b) very high scores
c) moderate scores
d) low scores

4. Which of the following is not related to the Needleman-Wunsch alignment algorithm?


a) Global alignment programs use this algorithm
b) The output is a positive number
c) Small changes in the scoring system can produce a different alignment
d) Changes in the scoring system can produce the same alignment

5. Waterman, in 1989, provided a set of means and standard deviations of global alignment scores
between random DNA sequences, using mismatch and gap penalties that produce a linear increase in
score with _______ a distinguishing feature of global alignments.
a) alignment score
b) sequence score
c) sequence length
d) scoring system

6. Who suggested that the global alignment scores between unrelated protein sequences followed
the extreme value distribution, similar to local alignment scores? And when?
a) Abagyan and Batalov, in 1981
b) Chvátal and Lipman, in 1984
c) Abagyan and Batalov, in 1997
d) Chvátal and Sankoff, in 1995

7. _______ analyzed the distribution of scores among 100 vertebrate nucleic acid sequences and
compared these scores with randomized sequences prepared in different ways.
a) Lipman, in 1984
b) Batalov, in 1964
c) Waterman, in 1987
d) Lipman, in 1967

8. If the random sequences were prepared in a way that maintained the local base composition by
producing them from overlapping fragments of sequence, the distribution of scores has a _______
standard deviation that is closer to the distribution of the natural sequences.
a) lowest
b) higher
c) lower
d) moderate

9. The GCG alignment programs have a RANDOMIZATION option, which shuffles the second sequence
and calculates similarity scores between the unshuffled sequence and each of the shuffled copies.
a) True
b) False

10. Dayhoff (1978-1983) devised a second method for testing the relatedness of two protein
sequences that can accommodate some local variation. Where is this method useful?
a) For finding repeated regions within a sequence
b) For finding similar regions that are in a different order in two sequences
c) For finding a small conserved region such as an active site
d) For finding huge regions within sequences
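
Worked sketch for the shuffling questions in this set: the idea of comparing a real alignment score with scores against shuffled copies of the second sequence can be illustrated as follows. This is a minimal sketch, not the GCG implementation. The function names, the simple match/mismatch/gap scoring values, and the use of a z-score (real score minus the mean of the shuffled scores, divided by their standard deviation) are assumptions made for the example.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score with a linear gap penalty (score only)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffle_test(seq1, seq2, n_shuffles=100, seed=0):
    """Score seq1 against shuffled copies of seq2.

    Returns (real_score, mean, standard_deviation, z_score).
    """
    rng = random.Random(seed)
    real = global_score(seq1, seq2)
    residues = list(seq2)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys order
        shuffled_scores.append(global_score(seq1, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    z = (real - mean) / sd if sd > 0 else float("nan")
    return real, mean, sd, z


if __name__ == "__main__":
    print(shuffle_test("ACGTACGTACGT", "ACGTACGTACGT", n_shuffles=50))
```

Simple shuffling preserves overall composition but not local composition, which is the distinction one of the questions above draws when it mentions building random sequences from overlapping fragments.
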
Bayesian Statistics

1. By whom and when were Bayesian methods first applied?
a) Smith-Waterman, 1981
b) Agarwal and States, 1996
c) Smith-Waterman, 1996
d) Agarwal and States, 1981

2. With the application of Bayesian methods, the most probable repeat length and evolutionary time
since the repeat was formed may be derived.
a) True
b) False

3. If the purpose is to calculate the probability of one event AND a second event, the odds scores for
the events are _________
a) added
b) multiplied
c) multiplied and added
d) subtracted

4. Another type of probability analysis calculates the odds score for one event OR a second event, or
for a series of events. In this case, the odds scores are _______
a) multiplied
b) subtracted
c) added and multiplied
d) added

5. In Bayesian methods, a difficulty with making estimates is that the estimate depends on the
following assumption: the mutation rate in sequences has been constant with time, and the rate of
mutation is the same for all nucleotides.
a) True
b) False
6. Another difficulty in Bayesian methods is deciding on the length of sequence that was duplicated.
a) True
b) False
7. A length and distance that gives the highest overall probability may then be determined. Such
alignments are initially found using ________
a) a particular scoring matrix only
b) an alignment algorithm only
c) an alignment algorithm and a particular scoring matrix
d) dot method

8. Which of the following features of Bayesian methods is a disadvantage?
a) A length and distance that gives the highest overall probability may be determined
b) They are used to calculate evolutionary distance
c) Computationally Bayesian methods are better
d) A specific mutational model is required

9. Zhu et al. (1998) devised a computer program called the Bayes block aligner, which in effect slides
____ sequences along each other to find the ______ ungapped regions or blocks.
a) two, least scoring
b) two, highest scoring
c) multiple, highest scoring
d) multiple, least scoring

10. Unlike the commonly used methods for aligning a pair of sequences, the Bayesian method
_______ using a particular scoring matrix or designated gap penalties.
a) does not depend on
b) depends on
c) is based on
d) involves
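
Worked sketch for the AND/OR questions in this set: the rules for combining scores can be illustrated with plain probabilities. The sketch assumes independent events for the AND case and mutually exclusive events for the OR case. All function names are hypothetical. It also shows why log-odds scores are convenient: multiplying odds corresponds to adding their logarithms.

```python
import math


def prob_and(p_a, p_b):
    """P(A and B), assuming A and B are independent."""
    return p_a * p_b


def prob_or(p_a, p_b):
    """P(A or B), assuming A and B are mutually exclusive."""
    return p_a + p_b


def log_odds_bits(odds):
    """Convert an odds score to a log-odds score in bits (base 2)."""
    return math.log2(odds)


if __name__ == "__main__":
    # Two independent positions with odds scores 4 and 8.
    combined = 4 * 8
    print(combined)                              # product of the odds
    print(log_odds_bits(4) + log_odds_bits(8))   # same result on the log scale
    print(log_odds_bits(combined))
```

This is why substitution matrices store log-odds values: the score of an alignment becomes a sum over positions rather than a product.
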
Sequence Homology Versus Sequence Similarity and Identity.

1. Which of the following is incorrect regarding pairwise sequence alignment?
a) The most fundamental process in this type of comparison is sequence alignment
b) It is an important first step toward structural and functional analysis of newly determined
sequences
c) This is the process by which sequences are compared by searching for common character patterns
and establishing residue-residue correspondence among related sequences
d) It is the process of aligning multiple sequences

2. Which of the following is incorrect about evolution?
a) The macromolecules can be considered molecular fossils that encode the history of millions of
years of evolution
b) The building blocks of these biological macromolecules, nucleotide bases, and amino acids form
linear sequences that determine the primary structure of the molecules
c) DNA and proteins are products of evolution
d) The molecular sequences barely undergo changes

3. The presence of evolutionary traces is because some of the residues that perform key functional
and structural roles tend to be preserved by natural selection; other residues that may be less crucial
for structure and function tend to mutate more frequently.
a) True
b) False
4. The degree of sequence variation in the alignment reveals evolutionary relatedness of different
sequences, whereas the conservation between sequences reflects the changes that have occurred
during evolution in the form of substitutions, insertions, and deletions.
a) True
b) False

5. If the two sequences share significant similarity, it is extremely ______ that the extensive similarity
between the two sequences has been acquired randomly, meaning that the two sequences must have
derived from a common evolutionary origin.
a) unlikely
b) possible
c) likely
d) relevant

6. Sometimes, it is also possible that two sequences have derived from a common ancestor, but may
have diverged to such an extent that the common ancestral relationships are not recognizable at the
sequence level.
a) True
b) False

7. Which of the following is incorrect regarding sequence homology?
a) Two sequences can have a homologous relationship even if they do not have a common origin
b) It is an important concept in sequence analysis
c) When two sequences are descended from a common evolutionary origin, they are said to have a
homologous relationship
d) When two sequences are descended from a common evolutionary origin, they are said to share
homology

8. Sequence similarity can be quantified using ________, whereas homology is a ______ statement.
a) percentages, quantitative
b) percentages, qualitative
c) ratios, qualitative
d) ratios, quantitative

9. Shorter sequences require higher cutoffs for inferring homologous relationships than longer
sequences.
a) True
b) False

10. Sequence similarity and sequence identity are synonymous for both nucleotide sequences and
protein sequences.
a) True
b) False
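
Worked sketch for the identity and similarity questions in this set: both quantities can be computed as percentages from an existing pairwise alignment. The sketch below is illustrative only. The residue grouping in SIMILAR_GROUPS is a hypothetical set of conservative substitution classes, and using the number of ungapped aligned columns as the denominator is one of several conventions (others divide by alignment length or by the length of the shorter sequence).

```python
# Hypothetical conservative-substitution groups, for illustration only.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity_similarity(aligned1, aligned2):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(aligned1, aligned2):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


if __name__ == "__main__":
    print(percent_identity_similarity("MKLV-A", "MRIVGA"))
```

For nucleotide sequences there are no conservative substitution classes of this kind, so under this scheme the two percentages coincide; for proteins they generally differ.
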
Methods.

1. The overall goal of pairwise sequence alignment is to find the best pairing of two sequences, such
that there is maximum correspondence among residues.
a) True
b) False
2. In local alignment, the two sequences to be aligned cannot be of unequal lengths.
a) True
b) False

3. Alignment algorithms, both global and local, are fundamentally similar and only differ in the
optimization strategy used in aligning similar residues.
a) True
b) False

4. In a dot matrix, two sequences to be compared are written in the _____________ of the matrix.
a) horizontal and vertical axes
b) 2 parallel horizontal axes
c) 2 parallel vertical axes
d) horizontal axis (one preceding another)

5. When the two sequences have substantial regions of similarity, many dots line up to form
contiguous _______ lines.
a) crossings on
b) horizontal
c) diagonal
d) vertical

6. A problem exists when comparing _____ sequences using the dot matrix method, namely, the
_______
a) small, amplification
b) large, amplification
c) small, high noise level
d) large, high noise level

7. If the selected window size is too long, sensitivity of the alignment is lost.
a) True
b) False

8. A sequence can be aligned with itself to identify internal repeat elements.
a) True
b) False

9. Self-complementarity of DNA sequences cannot be identified using a dot plot.
a) True
b) False

10. Which of the following is untrue about the dot plot method and its applications?
a) This method gives a direct visual statement of the relationship between two sequences
b) One of its advantages is the identification of sequence repeat regions based on the presence of
parallel diagonals of the same size vertically or horizontally in the matrix
c) It is not useful in identifying chromosomal repeats
d) The method can be used in identifying nucleic acid secondary structures through detecting self-
complementarity of a sequence.
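
Worked sketch for the dot matrix questions in this set: one sequence is placed along the horizontal axis and the other along the vertical axis, and a dot is drawn wherever a window of residues matches well enough. The code below is a minimal illustration. The function names and the `window` and `min_matches` parameters are assumptions made for the example; they stand in for the window size and stringency settings that real dot plot programs expose.

```python
def dot_matrix(seq_x, seq_y, window=1, min_matches=1):
    """Return a grid of booleans; rows follow seq_y, columns follow seq_x.

    A cell is True when at least `min_matches` of the `window` residues
    starting at that position are identical in both sequences.
    """
    rows = []
    for i in range(len(seq_y) - window + 1):
        row = []
        for j in range(len(seq_x) - window + 1):
            matches = sum(seq_y[i + k] == seq_x[j + k] for k in range(window))
            row.append(matches >= min_matches)
        rows.append(row)
    return rows


def render(rows):
    """Draw the matrix as text, with '*' for a dot and '.' for no dot."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in rows)


if __name__ == "__main__":
    # Comparing a sequence with itself: the main diagonal is always filled,
    # and internal repeats appear as additional parallel diagonals.
    print(render(dot_matrix("ACGACG", "ACGACG", window=3, min_matches=3)))
```

A larger window with a strict threshold removes isolated dots (noise) but can also hide short regions of similarity.
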
Statistical Significance of Sequence Alignment

1. The truly statistically significant sequence alignment will be able to provide evidence of homology
between the sequences involved.
a) True
b) False

2. By calculating alignment scores of a large number of ______ sequence pairs, a distribution model of
the ______ sequence scores can be derived.
a) related, randomized
b) unrelated, randomized
c) unrelated, unrandomized
d) related, unrandomized

3. Many studies have demonstrated that the distribution of similarity scores assumes a peculiar shape
that resembles a highly skewed normal distribution with a long tail on one side. The distribution
matches the _______
a) Gumbel elective value distribution
b) Gumbel extreme void distribution
c) Gumbel end value distribution
d) Gumbel extreme value distribution

4. Which of the following is a part of the statistical test of sequences?
a) An optimal alignment between two chosen sequences is obtained at the end
b) Unrelated sequences of the same length are then generated through a randomization process
c) Unrelated sequences of different lengths are then generated through a randomization process
d) Related sequences of the same length are then generated through a randomization process

5. The statistical test uses a randomization process in which one of the two given sequences is
randomly shuffled.
a) True
b) False

6. What is used to generate parameters for the extreme value distribution?
a) The pool of alignment scores from the shuffled sequences
b) A single score of a shuffled sequence
c) The pool of alignment scores from the unshuffled sequences
d) The basic optimal score computed at the beginning of the test

7. If the score is located in the extreme margin of the distribution, that means that the alignment
between the two sequences is ______ due to random chance and is thus considered ______
a) unlikely, significant
b) unlikely, insignificant
c) very likely, insignificant
d) very likely, significant

8. It is not known whether the Gumbel distribution applies equally well to gapped alignments.
a) True
b) False
9. Which of the following is untrue about the PRSS program?
a) It stands for Probability of Random Shuffles
b) It is a web-based program that can be used to evaluate the statistical significance of DNA or protein
sequence alignment
c) It first aligns two sequences using the Needleman-Wunsch algorithm and calculates the score
d) It holds one sequence in its original form and randomizes the order of residues in the other
sequence.

10. The major disadvantage of the PRSS program is that it doesn’t allow partial shuffling.
a) True
b) False
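
Worked sketch for the extreme value distribution questions in this set: a pool of alignment scores can be used to estimate the two Gumbel parameters, after which the position of a real score in the tail gives a P-value. The code below is a sketch under stated assumptions, not the procedure used by PRSS. It uses the method of moments (mean = mu + gamma * beta, variance = pi^2 * beta^2 / 6) rather than maximum likelihood, and the function names and sample scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(scores):
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    beta = sd * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """P(S >= score) under a Gumbel extreme value distribution."""
    return 1.0 - math.exp(-math.exp(-(score - mu) / beta))


if __name__ == "__main__":
    # Made-up pool of scores standing in for alignments to shuffled sequences.
    pool = [21, 24, 19, 26, 23, 22, 30, 25, 20, 27]
    mu, beta = fit_gumbel(pool)
    print(mu, beta)
    print(gumbel_p_value(45, mu, beta))  # a score far in the tail gives a small P-value
```

A score far out in the right-hand tail has a small P-value, meaning the alignment is unlikely to have arisen by chance.
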
Exhaustive Algorithms.

1. Related sequences are identified through database similarity searching. Because the process
generates multiple matching sequence pairs, it is often necessary to convert the numerous pairwise
alignments into a single alignment.
a) True
b) False

2. Multiple sequence alignment has a unique advantage: it reveals more biological information than
many pairwise alignments can.
a) True
b) False

3. Which of the following is not true of multiple sequence alignment?
a) Many conserved and functionally critical amino acid residues can be identified in a protein multiple
alignment
b) Multiple sequence alignment is also an essential prerequisite to carrying out phylogenetic analysis
of sequence families and prediction of protein secondary and tertiary structures
c) Multiple sequence alignment also has applications in designing degenerate polymerase chain
reaction (PCR) primers based on multiple related sequences
d) This method does not contribute much to degenerate polymerase chain reaction (PCR) primers
creation

4. The scoring function for multiple sequence alignment is based on the concept of sum of pairs (SP).
a) True
b) False

5. Which of the following scores is not considered while calculating the SP scores?
a) All possible pairwise matches
b) All possible mismatches
c) All possible gap costs
d) Number of gap penalties
6. Given a multiple alignment of three sequences, the sum of scores is calculated as the sum of the
dissimilarity scores of every pair of sequences at each position.
a) True
b) False
7. There are two approaches viz. exhaustive and heuristic approaches used in multiple sequence
alignment.
a) True
b) False

8. In a multidimensional search matrix, for aligning N sequences, an (N+2)-dimensional matrix must
be filled with alignment scores.
a) True
b) False

9. As the amount of computational time and memory space required increases exponentially with the
number of sequences, it makes the multidimensional search matrix method computationally
prohibitive to use for a large data set.
a) True
b) False

10. Which of the following is untrue about DCA?
a) It stands for Divide-and-Conquer Alignment
b) It works by breaking each of the sequences into two smaller sections
c) The breaking points during the process are determined based on regional similarity of the
sequences
d) If the sections are not short enough, further divisions are restricted as well
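
Worked sketch for the sum-of-pairs questions in this set: the SP score of a multiple alignment can be computed by scoring every pair of sequences at every column and adding the results. The code below is a minimal illustration. The function names and the match, mismatch, and gap values are assumptions made for the example (a real scorer would use a substitution matrix), and treating a gap aligned with a gap as zero is one common convention.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of characters from a single alignment column."""
    if x == "-" and y == "-":
        return 0  # gap against gap is ignored in this sketch
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment):
    """Sum pairwise scores over all sequence pairs and all columns."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    total = 0
    for col in range(length):
        for a, b in combinations(alignment, 2):
            total += pair_score(a[col], b[col])
    return total


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "AGGT"]))
```

For three sequences each column contributes three pair scores; for N sequences it contributes N(N-1)/2, which is cheap to evaluate. The expensive part of the exhaustive approach is searching over all possible alignments, not scoring one.
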

Heuristic Algorithms
1. Which of the following is untrue regarding the Progressive Alignment Method?
a) Progressive alignment depends on the stepwise assembly of multiple alignments and is heuristic in
nature
b) It speeds up the alignment of multiple sequences through a multistep process
c) It first conducts pairwise alignments for each possible pair of sequences using the
Needleman–Wunsch global alignment method and records these similarity scores from the pairwise
comparisons
d) Its drawback is it slows down the alignment of multiple sequences through a single step process

2. Clustal is a progressive multiple alignment program available either as a stand-alone or online
program.
a) True
b) False

3. Which of the following is untrue regarding the progressive alignment method?
a) The program also applies a weighting scheme to increase the reliability of aligning divergent
sequences (sequences with less than 25% identity)
b) This is done by down-weighting redundant and closely related groups of sequences in the
alignment by a certain factor
c) This scheme is useful in preventing similar sequences from dominating the alignment
d) This scheme is useful in enhancing similar sequences from dominating the alignment

4. Which of the following is not a drawback of the progressive alignment method?
a) The progressive alignment method is not suitable for comparing sequences of different lengths
because it is a global alignment–based method
b) With the use of affine gap penalties in this method, long gaps are not allowed, and, in some cases,
this may limit the accuracy of the method
c) With the use of affine gap penalties in this method, long gaps are allowed, and, in some cases, this
may limit the accuracy of the method
d) The final alignment result is also influenced by the order of sequence addition

5. Which of the following is untrue regarding T-Coffee?
a) It stands for Tree-based Consistency Objective Function for alignment Evaluation
b) It performs progressive sequence alignments as in Clustal.
c) The global pairwise alignment is not performed using the Clustal program.
d) The local pairwise alignment is generated by the Lalign program, from which the top ten scoring
alignments are selected

6. Which of the following is untrue about the iterative approach?
a) The iterative approach is based on the idea that an optimal solution can be found by repeatedly
modifying existing suboptimal solutions
b) Because the order of the sequences used for alignment is different in each iteration
c) This method is also heuristic in nature and does not have guarantees for finding the optimal
alignment
d) This method is not based on heuristic methods

7. Which of the following is untrue about PRRN?
a) PRRN is a web-based program that uses a double nested iterative strategy for multiple alignment
b) It performs multiple alignments through two sets of iterations: inner iteration and outer iteration
c) In the outer iteration, an initial random alignment is generated that is used to derive a UPGMA tree
d) In the inner iteration, the sequences are randomly divided into multiple groups

8. The major drawback of the progressive and iterative alignment strategies is that they are largely
global alignment based and may therefore fail to recognize conserved domains and motifs among
highly divergent sequences of varying lengths.
a) True
b) False

9. Which of the following is untrue about DIALIGN2?
a) It is a web-based program designed to detect local similarities
b) It is designed to detect global similarities
c) It does not apply gap penalties and thus is not sensitive to long gaps
d) The method breaks each of the sequences down to smaller segments and performs all possible
pairwise alignments between the segments

10. Match-Box compares segments of some of the nine residues of possible pairwise alignments.
a) True
b) False
Needleman-Wunsch Algorithm

1. Which of the following is not an objective of sequence comparison?
a) To observe patterns of conservation
b) To find the common motifs present in both sequences
c) To study the physical properties of molecules
d) To study evolutionary relationships
2. A dotplot is a visual and qualitative technique whereas the sequence alignment is an exact and
quantitative measure of similarity of alignments.
a) True
b) False

3. The global sequence alignment is suitable when the two sequences are of dissimilar length, with a
negligible degree of similarity throughout.
a) True
b) False

4. The alignment score is the sum of substitution scores and gap penalties in this type of algorithm.
a) True
b) False

5. The substitution matrices are rarely used in this type of matching.
a) True
b) False
6. Which of the following is untrue about Protein substitution matrices?
a) They are significantly more complex than DNA scoring matrices
b) They have the N x N matrices of the amino acids
c) Protein substitution matrices have quite important role in evolutionary studies
d) They are significantly less complex than DNA scoring matrices

7. In the Needleman-Wunsch algorithm, gaps are scored -2.
a) True
b) False

8. The number of possible global alignments between two sequences of length N is _____
a) $\frac{2^{N}}{\sqrt{\pi N}}$
b) $\frac{2^{2N}}{\sqrt{\pi N}}$
c) $\frac{2^{N-1}}{\sqrt{\pi N}}$
d) $\frac{2^{2N}}{\sqrt{N}}$

9. Which of the following is untrue about the Needleman-Wunsch algorithm?
a) It is an example of dynamic programming
b) Basic idea here is to build up the best alignment by using optimal alignments of larger
subsequences
c) It was first used by Saul Needleman and Christian Wunsch
d) It was first used in 1970

10. There are two types of matrices involved in the study: score matrices and trace matrices.
a) True
b) False
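
Worked sketch for this set: the questions above refer to dynamic programming, a score matrix, and a trace matrix. The code below is a minimal Needleman-Wunsch illustration, not a reference implementation. The function name, the match and mismatch values, the linear gap score of -2, and the tie-breaking order (diagonal, then up, then left) are all assumptions made for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # Score matrix holds the best score for each pair of prefixes.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Trace matrix records the move used: D = diagonal, U = up, L = left.
    trace = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
        trace[i][0] = "U"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        trace[0][j] = "L"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            best = max(diag, up, left)
            score[i][j] = best
            trace[i][j] = "D" if best == diag else ("U" if best == up else "L")
    # Follow the trace matrix back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = trace[i][j]
        if move == "D":
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif move == "U":
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "ACT"))
```

Each cell is filled from the optimal scores of smaller prefixes, so the table needs only (N+1) x (M+1) cells even though the number of possible alignments grows exponentially with sequence length.
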
