ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation

Zhang, Zuobai; Lu, Jiarui; Chenthamarakshan, Vijil; Lozano, Aurélie; Das, Payel; Tang, Jian

arXiv:2402.07955 (q-bio)

[Submitted on 10 Feb 2024]

Abstract: Protein function annotation is an important yet challenging task in biology. Recent deep learning advances show significant potential for accurate function prediction by learning from protein sequences and structures. Nevertheless, these predictor-based methods often overlook the modeling of protein similarity, an idea commonly employed in traditional approaches that use sequence or structure retrieval tools. To fill this gap, we first study the effect of inter-protein similarity modeling by benchmarking retriever-based methods against predictors on protein function annotation tasks. Our results show that retrievers can match or outperform predictors without large-scale pre-training. Building on these insights, we introduce ProtIR, a novel variational pseudo-likelihood framework designed to improve function predictors by incorporating inter-protein similarity modeling. The framework iteratively refines knowledge between a function predictor and a retriever, thereby combining the strengths of both. ProtIR improves on vanilla predictor-based methods by around 10%. It also performs on par with protein language model-based methods without requiring massive pre-training, which highlights the efficacy of the framework. Code will be released upon acceptance.
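
As a rough illustration of what a retriever-based annotator does in general, the sketch below transfers function labels from the most similar database proteins to a query protein. It is not the authors' ProtIR implementation, and it does not cover the iterative refinement step. The embeddings, the cosine-similarity retriever, the softmax weighting, the `temperature` parameter, the function names, and the labels such as "GO:A" are all assumptions made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_and_annotate(query, database, k=2, temperature=0.1):
    """Score function labels for `query` from its k most similar neighbours.

    `database` is a list of (embedding, set_of_labels) pairs (hypothetical format).
    Each neighbour votes for its labels with a softmax weight over similarity,
    so a label's score is the total weight of the neighbours that carry it.
    """
    # Rank database entries by similarity to the query and keep the top k.
    ranked = sorted(
        ((cosine(query, emb), labels) for emb, labels in database),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]

    # Softmax over the retained similarities (shifted by the max for stability).
    top = max(sim for sim, _ in ranked)
    weights = [math.exp((sim - top) / temperature) for sim, _ in ranked]
    total = sum(weights)

    scores = {}
    for weight, (_, labels) in zip(weights, ranked):
        for label in labels:
            scores[label] = scores.get(label, 0.0) + weight / total
    return scores


if __name__ == "__main__":
    # Toy database with made-up 2-D embeddings and placeholder labels.
    toy_db = [
        ([1.0, 0.0], {"GO:A"}),
        ([0.9, 0.1], {"GO:A", "GO:B"}),
        ([0.0, 1.0], {"GO:C"}),
    ]
    print(retrieve_and_annotate([1.0, 0.0], toy_db, k=2))
```

In this toy setup, a label shared by every retrieved neighbour receives the full weight, while a label carried by only some neighbours receives a fraction of it. A predictor, by contrast, would map the query's own features directly to label scores without consulting a database.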

Subjects:	Biomolecules (q-bio.BM); Machine Learning (cs.LG)
Cite as:	arXiv:2402.07955 [q-bio.BM]
	(or arXiv:2402.07955v1 [q-bio.BM] for this version)
	https://doi.org/10.48550/arXiv.2402.07955

Submission history

From: Zuobai Zhang
[v1] Sat, 10 Feb 2024 17:31:46 UTC (8,408 KB)
