Passage Retrieval on Structured Documents Using Graph Attention Networks

Albarede, Lucas; Mulhem, Philippe; Goeuriot, Lorraine; Le Pape-Gardeux, Claude; Marie, Sylvain; Chardin-Segui, Trinidad

doi:10.1007/978-3-030-99739-7_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13186)

Included in the following conference series: European Conference on Information Retrieval

Abstract

Passage Retrieval systems aim to retrieve and rank small text units according to their estimated relevance to a query. A common practice is to consider the context in which a passage appears (its containing document, neighbouring passages, etc.) to improve its relevance estimation. In this work, we study the use of Graph Attention Networks (GATs), a graph node embedding method, to perform passage contextualization. More precisely, we first propose a document graph representation based on several inter- and intra-document relations. Then, we investigate two ways of applying GATs to this representation in order to incorporate contextual information for passage retrieval. We evaluate our approach on a Passage Retrieval task for structured documents: CLEF-IP2013. Our results show that our document graph representation, coupled with the expressive power of GATs, allows for a better context representation, leading to improved performance.
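
To make the idea of contextualizing a passage through its graph neighbourhood more concrete, the following is a minimal sketch of one single-head graph attention layer, following the general formulation of Graph Attention Networks (Veličković et al., 2018). It is not the authors' implementation. The toy graph, the node names, the three edge types (containment, reading order, a cross-document link), the two-dimensional feature vectors, and the parameters `W` and `a` are all hypothetical and chosen only for illustration; in practice `W` and `a` would be learned and the final non-linearity, omitted here, would be applied.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbours, W, a):
    """One single-head graph attention layer, without the output non-linearity.

    features:   dict node -> input vector
    neighbours: dict node -> list of adjacent nodes (a self-loop is added here)
    W:          shared weight matrix (out_dim x in_dim)
    a:          attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is the weight
    that node i gives to node j.
    """
    z = {n: matvec(W, h) for n, h in features.items()}  # projected features
    new_features, attention = {}, {}
    for i in features:
        hood = [i] + [j for j in neighbours.get(i, []) if j != i]
        # Unnormalised score: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(c * x for c, x in zip(a, z[i] + z[j]))) for j in hood
        ]
        # Numerically stable softmax over the neighbourhood of i
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(hood, alphas))
        # New representation: attention-weighted sum of neighbour projections
        new_features[i] = [
            sum(al * z[j][k] for al, j in zip(alphas, hood))
            for k in range(len(z[i]))
        ]
    return new_features, attention


# Hypothetical toy graph: documents d1, d2 and passages p1..p4.
features = {
    "d1": [1.0, 0.0], "p1": [0.9, 0.1], "p2": [0.2, 0.8], "p3": [0.5, 0.5],
    "d2": [0.0, 1.0], "p4": [0.1, 0.9],
}
edges = [
    ("d1", "p1"), ("d1", "p2"), ("d1", "p3"),  # intra-document: containment
    ("p1", "p2"), ("p2", "p3"),                # intra-document: reading order
    ("d2", "p4"),                              # intra-document: containment
    ("d1", "d2"),                              # inter-document link
]
neighbours = {n: [] for n in features}
for u, v in edges:  # treat every relation as undirected
    neighbours[u].append(v)
    neighbours[v].append(u)

W = [[1.0, 0.0], [0.0, 1.0]]   # illustrative weights, not learned
a = [0.5, -0.5, 1.0, 0.25]     # illustrative attention vector, not learned

contextual, attention = gat_layer(features, neighbours, W, a)
for node in ("p1", "p2", "p3", "p4"):
    print(node, [round(x, 3) for x in contextual[node]])
```

Each passage ends up with a representation that mixes its own features with those of its containing document and adjacent passages, weighted by attention. Stacking several such layers would let information travel further, for example from `d2` to `p2` through `d1`. A sketch like this treats all relations identically; distinguishing relation types would need separate parameters per edge type.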

Acknowledgement

This work has been partially supported by MIAI@Grenoble Alpes (ANR-19-P3IA-0003), as well as the Association Nationale de la Recherche et de la Technologie (ANRT).

Author information

Authors and Affiliations

Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 38000, Grenoble, France
Lucas Albarede, Philippe Mulhem & Lorraine Goeuriot
Schneider Electric Industries SAS, Rueil-Malmaison, France
Lucas Albarede, Claude Le Pape-Gardeux, Sylvain Marie & Trinidad Chardin-Segui

Corresponding author

Correspondence to Lucas Albarede.

Editor information

Editors and Affiliations

Martin Luther University Halle-Wittenberg, Halle, Germany
Matthias Hagen
Leiden University, Leiden, The Netherlands
Suzan Verberne
University of Glasgow, Glasgow, UK
Craig Macdonald
University of Duisburg-Essen, Essen, Germany
Christin Seifert
University of Stavanger, Stavanger, Norway
Krisztian Balog
Norwegian University of Science and Technology, Trondheim, Norway
Kjetil Nørvåg
University of Stavanger, Stavanger, Norway
Vinay Setty

About this paper

Cite this paper

Albarede, L., Mulhem, P., Goeuriot, L., Le Pape-Gardeux, C., Marie, S., Chardin-Segui, T. (2022). Passage Retrieval on Structured Documents Using Graph Attention Networks. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_2

DOI: https://doi.org/10.1007/978-3-030-99739-7_2
Published: 05 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-99738-0
Online ISBN: 978-3-030-99739-7
eBook Packages: Computer Science, Computer Science (R0)
