Abstract
Viral infection is a common public health problem. In recent decades, serious viral outbreaks have caused great loss of life and economic damage. Because viruses spread rapidly and mutate frequently, precise prevention and treatment are difficult. In the era of big data and artificial intelligence (AI), advances in bioinformatics techniques bring unprecedented opportunities for virus informatics, which contributes to the systems-level modeling of virus biology. This chapter introduces data resources, including virus-related databases and knowledgebases. It summarizes bioinformatics models and software tools for multiple sequence alignment, evolutionary analysis, and genome-wide research of viruses. It then discusses translational applications of recently developed data-driven and AI-assisted methods to viral cases such as SARS-CoV-2, HBV/HCV, and influenza virus. Finally, it highlights the concept and significance of virus informatics for both virus surveillance and health promotion.
Acknowledgments
This study was supported by the COVID-19 Research Projects of West China Hospital, Sichuan University (grant no. HX-2019-nCoV-057), the National Natural Science Foundation of China (grant nos. 32070671 and 31900490), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant no. 20KJB180010), and the regional innovation cooperation between Sichuan and Guangxi Provinces (grant no. 2020YFQ0019).
Competing Interests
The authors declare no conflict of interest.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Lin, Y., Qian, Y., Qi, X., Shen, B. (2022). Databases, Knowledgebases, and Software Tools for Virus Informatics. In: Shen, B. (eds) Translational Informatics. Advances in Experimental Medicine and Biology, vol 1368. Springer, Singapore. https://doi.org/10.1007/978-981-16-8969-7_1
DOI: https://doi.org/10.1007/978-981-16-8969-7_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8968-0
Online ISBN: 978-981-16-8969-7
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)