TaxoSBERT: Unsupervised Taxonomy Expansion Through Expressive Semantic Similarity

Margiotta, Daniele; Croce, Danilo; Basili, Roberto

doi:10.1007/978-3-031-39059-3_20

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1875))

Included in the following conference series:

International Conference on Deep Learning Theory and Applications

Abstract

Knowledge graphs are crucial resources for a wide range of document management tasks, such as text retrieval, classification, and natural language inference. Standard examples are large-scale lexical semantic graphs such as WordNet, which are useful for text tagging and sentence disambiguation.

Maintaining lexical taxonomies is a critical problem, as they must follow the evolution of language over time. Taxonomy expansion is therefore a critical semantic task: it allows existing resources to be extended with new properties and, when necessary, with new entries, i.e., taxonomy concepts. Previous work on this topic suggests neural learning methods that use the underlying taxonomy graph as a source of training evidence. This can be done through graph-based learning, where networks are trained to encode the underlying knowledge graph and to predict appropriate inferences.

This paper presents TaxoSBERT, a simple and effective way to model the taxonomy expansion problem as a retrieval task. It combines a robust semantic similarity measure with taxonomy-driven re-ranking strategies. The method is unsupervised and extremely efficient: the adopted similarity measures are trained on large-scale resources outside the target taxonomy. The experimental evaluation on two taxonomies shows surprising results, improving on far more complex state-of-the-art methods.
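The retrieval framing can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a pretrained sentence encoder, and the re-ranking rule (a candidate parent is supported by its most similar child, weighted by a hypothetical `alpha`) is an assumption chosen only to show how taxonomy structure could adjust a similarity ranking. All names and data are illustrative.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_parents(query_def, node_defs, parent_of, alpha=0.5):
    """Rank existing taxonomy nodes as candidate parents of a new concept.

    Step 1 (retrieval): score every node by similarity to the query definition.
    Step 2 (hypothetical re-rank): mix in the score of the node's best child,
    so a node whose children resemble the query moves up.
    """
    q = embed(query_def)
    sim = {n: cosine(q, embed(d)) for n, d in node_defs.items()}
    best_child = {n: 0.0 for n in node_defs}
    for child, parent in parent_of.items():
        best_child[parent] = max(best_child[parent], sim[child])
    score = {n: (1 - alpha) * sim[n] + alpha * best_child[n] for n in node_defs}
    return sorted(score, key=lambda n: (-score[n], n))

# Illustrative toy taxonomy (definitions invented for this sketch).
node_defs = {
    "animal": "a living organism that feeds on organic matter",
    "dog": "a domesticated carnivorous mammal that barks",
    "cat": "a small domesticated carnivorous mammal that purrs",
    "vehicle": "a machine used for transporting people or goods",
}
parent_of = {"dog": "animal", "cat": "animal"}
wolf = "a wild carnivorous mammal that hunts in packs"

plain = rank_parents(wolf, node_defs, parent_of, alpha=0.0)
reranked = rank_parents(wolf, node_defs, parent_of, alpha=0.5)
```

With `alpha=0.0` the ranking is pure similarity, so the sibling "dog" comes first. With the structural term enabled, "animal" moves to the top, which is the intended attachment point in this toy case. No training on the target taxonomy is involved in either step.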

Notes

1. See WordNet for sense 1 of the noun “center field”.
2. The source code is publicly available at https://github.com/crux82/TaxoSBERT.
3. Consider that, when using state-of-the-art Transformer-based architectures, such as the BERT-based ones, classifying a text pair requires encoding the pair, which is computationally expensive.
4. Notice that the hyponymy relation does not correspond to a perfect DAG in WordNet, as multiple inheritance is occasionally needed for some synsets (nodes). However, for the Taxonomy Enrichment task this assumption is always satisfied, so within the scope of this paper our definition is fully consistent.
5. https://huggingface.co/sentence-transformers/all-mpnet-base-v2.
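Note 4 mentions synsets with multiple inheritance. As a minimal sketch, assuming the taxonomy is available as a list of (child, parent) hyponymy edges, such nodes can be found by counting distinct parents. The function name and the toy edges are illustrative and not taken from the paper.

```python
from collections import defaultdict

def nodes_with_multiple_parents(edges):
    """Return the nodes that have more than one distinct parent (hypernym)."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    return sorted(c for c, ps in parents.items() if len(ps) > 1)

# Toy (child, parent) edges for illustration only.
edges = [
    ("dog", "canine"),
    ("dog", "domestic_animal"),
    ("canine", "carnivore"),
    ("cat", "feline"),
]
multi = nodes_with_multiple_parents(edges)
```

An empty result indicates that every node has at most one parent in the given edge list.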

Acknowledgements

We would like to thank the Istituto di Analisi dei Sistemi ed Informatica - Antonio Ruberti (IASI) for supporting the experimentations through access to dedicated computing resources. We acknowledge financial support from the PNRR MUR project PE0000013-FAIR.

Author information

Authors and Affiliations

University of Rome, Tor Vergata, Italy
Daniele Margiotta, Danilo Croce & Roberto Basili

Authors

Daniele Margiotta
Danilo Croce
Roberto Basili

Corresponding author

Correspondence to Daniele Margiotta.

Editor information

Editors and Affiliations

Université de Tours, Tours, France
Donatello Conte
Instituto de Telecomunicações and University of Lisbon, Lisbon, Portugal
Ana Fred
Ford Motor Company, Commerce Township, MI, USA
Oleg Gusikhin
University of Naples Federico II, Naples, Italy
Carlo Sansone

Cite this paper

Margiotta, D., Croce, D., Basili, R. (2023). TaxoSBERT: Unsupervised Taxonomy Expansion Through Expressive Semantic Similarity. In: Conte, D., Fred, A., Gusikhin, O., Sansone, C. (eds) Deep Learning Theory and Applications. DeLTA 2023. Communications in Computer and Information Science, vol 1875. Springer, Cham. https://doi.org/10.1007/978-3-031-39059-3_20

DOI: https://doi.org/10.1007/978-3-031-39059-3_20
Published: 31 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39058-6
Online ISBN: 978-3-031-39059-3
eBook Packages: Computer Science, Computer Science (R0)
