research-article

LiMe: linear methods for pseudo-relevance feedback

Authors:

Daniel Valcarce,

Javier Parapar,

Álvaro Barreiro

SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing

Pages 678 - 687

https://doi.org/10.1145/3167132.3167207

Published: 09 April 2018

Abstract

Retrieval effectiveness has traditionally been pursued by improving the ranking models and by enriching the evidence about the information need beyond the original query. A successful method for producing improved rankings consists in expanding the original query. Pseudo-relevance feedback (PRF) has proved to be an effective method for this task in the absence of explicit user judgements about the initial ranking. This family of techniques obtains expansion terms from the top documents retrieved by the original query. PRF techniques usually exploit the relationship between terms and documents or between terms and queries. In this paper, we explore the use of linear methods for pseudo-relevance feedback. We present a novel formulation of the PRF task as a matrix decomposition problem, which we call LiMe. This factorisation involves the computation of an inter-term similarity matrix, which is used for expanding the original query. We use linear least squares regression with regularisation to solve the proposed decomposition under non-negativity constraints. We compare LiMe on five datasets against strong state-of-the-art baselines for PRF, showing that our proposal achieves improvements in terms of MAP, nDCG and robustness index.
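The abstract does not state the exact objective, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a matrix X whose rows are the query and its pseudo-relevant documents and whose columns are terms, and it learns a non-negative inter-term matrix W with a zero diagonal that approximately minimises ||X - XW||^2 + beta * ||W||^2, one column at a time, by coordinate descent. The function names, the parameter beta, the toy data and the choice of solver are all hypothetical.

```python
def fit_term_similarity(X, beta=0.1, sweeps=50):
    """Sketch: learn W >= 0 with diag(W) = 0 so that X is approximated by X W.

    X is a list of rows (query and pseudo-relevant documents), each a list
    of term weights. Each column of W is a ridge-regularised, non-negative
    least squares fit, solved here by cyclic coordinate descent.
    """
    n_terms = len(X[0])
    # Gram matrix G = X^T X, shared by all column-wise regressions.
    G = [[sum(row[i] * row[j] for row in X) for j in range(n_terms)]
         for i in range(n_terms)]
    W = [[0.0] * n_terms for _ in range(n_terms)]
    for j in range(n_terms):          # predict term j from the other terms
        w = [0.0] * n_terms
        for _ in range(sweeps):
            for i in range(n_terms):
                if i == j:            # zero diagonal: a term cannot explain itself
                    continue
                rho = G[i][j] - sum(G[i][k] * w[k]
                                    for k in range(n_terms) if k != i)
                # Closed-form coordinate update, clipped to stay non-negative.
                w[i] = max(0.0, rho / (G[i][i] + beta))
        for i in range(n_terms):
            W[i][j] = w[i]
    return W


def expansion_scores(query_row, W):
    """Score every term as the product of the query row with W."""
    n_terms = len(W)
    return [sum(query_row[i] * W[i][t] for i in range(n_terms))
            for t in range(n_terms)]


def top_expansion_terms(query_row, W, vocabulary, k):
    """Return the k highest-scoring terms (ties keep vocabulary order)."""
    scores = expansion_scores(query_row, W)
    ranked = sorted(range(len(vocabulary)), key=lambda t: -scores[t])
    return [vocabulary[t] for t in ranked[:k]]


# Toy data (hypothetical): first row is the query, the rest are documents.
vocabulary = ["apple", "fruit", "pie", "car"]
X = [
    [1, 0, 0, 0],
    [2, 1, 1, 0],
    [1, 2, 0, 0],
    [0, 0, 0, 1],
]
W = fit_term_similarity(X, beta=0.1)
scores = expansion_scores(X[0], W)
expansion = top_expansion_terms(X[0], W, vocabulary, k=2)
```

In this toy setting, terms that co-occur with the query term in the feedback set receive positive weights, while a term that never co-occurs with it receives none. In practice the expansion terms would be combined with the original query, and a dedicated non-negative least squares solver would likely replace the plain coordinate descent used here.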

Cited By

Das T, Rath S, Sengupta S (2024). Telling Apart: ML Framework Towards Cyber Attack and Fault Differentiation in Microgrids. 2024 IEEE 7th International Conference on Industrial Cyber-Physical Systems (ICPS), 1-6. Online publication date: 12-May-2024.
https://doi.org/10.1109/ICPS59941.2024.10639982
Khan T, Rashid U, Khan A (2024). End-to-end pseudo relevance feedback based vertical web search queries recommendation. Multimedia Tools and Applications. Online publication date: 21-Feb-2024.
https://doi.org/10.1007/s11042-024-18559-4
Valcarce D (2021). Information retrieval models for recommender systems. ACM SIGIR Forum 53:1, 44-45. Online publication date: 23-Mar-2021.
https://dl.acm.org/doi/10.1145/3458537.3458545

Index Terms

LiMe: linear methods for pseudo-relevance feedback
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
      1. Query reformulation
    2. Retrieval models and ranking

Information & Contributors

Information

Published In

SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing

April 2018

2327 pages

ISBN:9781450351911

DOI:10.1145/3167132

Conference Chairs:
Hisham M. Haddad (Kennesaw State University),
Roger L. Wainwright (University of Tulsa),
Richard Chbeir (University of Pau & Pays Adour, France)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SAC 2018

Sponsor:

SIGAPP

SAC 2018: Symposium on Applied Computing

April 9 - 13, 2018

Pau, France

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 11
Total Downloads: 123
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 1

Reflects downloads up to 19 Feb 2025

Citations

Cited By

Otero D, Parapar J, Barreiro Á, Hung C, Hong J, Bechini A, Song E (2021). The wisdom of the rankers. Proceedings of the 36th Annual ACM Symposium on Applied Computing, 672-680. Online publication date: 22-Mar-2021.
https://dl.acm.org/doi/10.1145/3412841.3441947
Das T, Shukla R, Sengupta S (2021). The Devil is in the Details: Confident & Explainable Anomaly Detector for Software-Defined Networks. 2021 IEEE 20th International Symposium on Network Computing and Applications (NCA), 1-5. Online publication date: 23-Nov-2021.
https://doi.org/10.1109/NCA53618.2021.9685157
Arampatzis A, Peikos G, Symeonidis S (2021). Pseudo relevance feedback optimization. Information Retrieval Journal. Online publication date: 25-May-2021.
https://doi.org/10.1007/s10791-021-09393-5
Valcarce D, Parapar J, Barreiro Á (2019). Document-based and term-based linear methods for pseudo-relevance feedback. ACM SIGAPP Applied Computing Review 18:4, 5-17. Online publication date: 15-Jan-2019.
https://dl.acm.org/doi/10.1145/3307624.3307626
Aklouche B, Bounhas I, Slimani Y (2019). BM25 Beyond Query-Document Similarity. String Processing and Information Retrieval, 65-79. Online publication date: 3-Oct-2019.
https://doi.org/10.1007/978-3-030-32686-9_5
Aklouche B, Bounhas I, Slimani Y (2019). Pseudo-Relevance Feedback Based on Locally-Built Co-occurrence Graphs. Advances in Databases and Information Systems, 105-119. Online publication date: 13-Aug-2019.
https://doi.org/10.1007/978-3-030-28730-6_7
Landin A (2019). Learning User and Item Representations for Recommender Systems. Advances in Information Retrieval, 337-342. Online publication date: 14-Apr-2019.
https://dl.acm.org/doi/10.1007/978-3-030-15719-7_45
