Eyrich Bioinformatics 2001
12 2001
BIOINFORMATICS APPLICATIONS NOTE Pages 1242–1243
Received on February 20, 2001; revised on May 28, 2001; accepted on July 4, 2001
© Oxford University Press 2001
EVA: evaluation of prediction servers
of these objectives have already been realised. We will extend EVA in three additional ways: (i) test more servers, (ii) refine the evaluation of threading servers, and (iii) add alternative structure alignment methods for evaluation. EVA is already downloading target sequences from PDB prior to the release of their structures.

CURRENT IMPLEMENTATION OF EVA

Results in four prediction categories. Currently, EVA evaluates four different categories of structure prediction servers (see EVA home page for URLs and list of servers): comparative modelling (3), threading (6), secondary structure prediction (9), and inter-residue contact prediction (4). Brief explanations about the methods are on the EVA web site.

Results are updated every week. Every day, EVA downloads the newest protein structures from PDB (Berman et al., 2000). The structures are added to a MySQL database, and sequences are extracted for every protein chain and sent to each server by META-PP (Eyrich and Rost, 2000). Predictions are collected and sent for evaluation to the EVA-satellites (comparative modelling: Rockefeller University, contacts: CNB Madrid, and all others: Columbia University). Depending on the category, the assessments are made available within hours to days. The central EVA site at CUBIC downloads all HTML pages produced by the satellites, and builds up the ‘latest week’ results that are then mirrored at the satellites (for a flowchart of EVA, see http://cubic.bioc.columbia.edu/eva/doc/flow.html).

Comparing: Identical data sets, major questions first! EVA compares methods based only on identical data sets. This approach is essential for reliably ranking methods. However, it reduces the number of available proteins since not all predictions are available for all servers. Another important feature of EVA is that it displays the results hierarchically, so that users get the ‘big picture’ first, followed by information at increasingly higher levels of detail upon request.

Methods are not ranked based on too few test proteins! Since prediction accuracy varies between proteins, published estimates for performance are averages over many proteins, with some standard deviation. We use this standard deviation to estimate the error of the average accuracy as a function of the test set size. This is justified, since different prediction methods typically have similar standard deviations. For example, when a method correctly predicts 75% of the residues in a set of 16 proteins with a standard deviation of 10%, a difference of less than 2.5% relative to another method (10/sqrt(16) = 2.5) is not significant. Thus, we cannot distinguish between 75% and 73% accuracy.

Resource with over 40 000 predictions. 2996 new protein structures have been added to PDB since EVA started in June 2000. The 2996 proteins were dissected into 3665 chains, 3130 (85%) of which were similar in sequence to known structures and 535 (15%) of which were new (less than 30% sequence identity over more than 100 residues). In comparative modelling, EVA evaluated over 6600 models with common subsets for 303 chains. In secondary structure prediction, EVA based its analysis on over 30 000 individual predictions; 127 chains were common to all methods, 348 to four methods. For both of these categories, EVA evaluated most of the existing servers in the field on the largest protein sets ever. Details about the evaluation are available on the EVA web site; details about the predictions will be published elsewhere.

Additional resources: PSI-BLAST alignments and sequence-unique subset of PDB. EVA also maintains a number of additional data resources. One resource is a continuously updated list giving the largest subset of sequence-unique proteins in PDB (no two proteins in the set share more than 33 identical residues over 100 residues aligned). This set now contains 2435 chains. Another resource contains over 5000 PSI-BLAST alignments for proteins added to PDB during the existence of EVA.

ACKNOWLEDGEMENTS

We are particularly grateful to Phil Bourne (UCSD) and Kevin Karplus (UCSC) for their support. We would also like to thank Arne Elofsson (Stockholm), Torsten Schwede, Nicolas Guex, and Manuel Peitsch (all three from Glaxo, Geneva) for helpful discussions, and Nigel Brown (MRC, London) for his program MView. Last but not least, we are grateful to the developers who permitted us to test their prediction servers. We apologise to all whose servers we evaluated for having had to remove their citations from this paper; they can be found at: http://cubic.bioc.columbia.edu/eva/doc/explain methods.html.

REFERENCES

Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The protein data bank. Nucleic Acids Res., 28, 235–242.
Eyrich,V. and Rost,B. (2000) The META-PredictProtein server. WWW document (http://cubic.bioc.columbia.edu/predictprotein/submit meta.html) CUBIC, Columbia University, Department of Biochemistry & Molecular Biophysics.
Fischer,D., Barret,C., Bryson,K., Elofsson,A., Godzik,A., Jones,D., Karplus,K.J., Kelley,L.A., MacCallum,R.M., Pawlowski,K., Rost,B., Rychlewski,L. and Sternberg,M. (1999) CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins, 3 (Suppl.), 209–217.
Rychlewski,L. and Fischer,D. (2000) LiveBench: continuous benchmarking of prediction servers. WWW document (http://BioInfo.PL/LiveBench/), IIMCB Warsaw.
Zemla,A., Venclovas,C. and Fidelis,K. (2001) Protein structure prediction center. http://PredictionCenter.llnl.gov/, Lawrence Livermore National Laboratory.
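Illustrative note on the paragraph ‘Methods are not ranked based on too few test proteins!’: the rule described there can be written out as a small calculation. The sketch below is not EVA's code. The function names are hypothetical, and it assumes that ‘not significant’ means a difference smaller than one standard error of the mean (standard deviation divided by the square root of the number of test proteins), which is how the worked example with 16 proteins reads.

```python
import math


def standard_error(std_dev: float, n_proteins: int) -> float:
    """Error of the average accuracy for a test set of n_proteins proteins."""
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return std_dev / math.sqrt(n_proteins)


def is_distinguishable(acc_a: float, acc_b: float,
                       std_dev: float, n_proteins: int) -> bool:
    """Assumed rule: two averages differ only if the gap reaches one standard error."""
    return abs(acc_a - acc_b) >= standard_error(std_dev, n_proteins)


def proteins_needed(gap: float, std_dev: float) -> int:
    """Smallest test set for which a given accuracy gap reaches one standard error."""
    if gap <= 0:
        raise ValueError("gap must be positive")
    return math.ceil((std_dev / gap) ** 2)


# Worked example from the text: 16 proteins, standard deviation of 10%.
print(standard_error(10.0, 16))                  # 2.5
print(is_distinguishable(75.0, 73.0, 10.0, 16))  # False
print(proteins_needed(2.0, 10.0))                # 25
```

Under these assumptions, separating 75% from 73% accuracy would need at least 25 common test proteins rather than 16, which is one way to see why comparisons on small common subsets are avoided.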