YAKE! Collection-Independent Automatic Keyword Extractor

Campos, Ricardo; Mangaravite, Vítor; Pasquali, Arian; Jorge, Alípio Mário; Nunes, Célia; Jatowt, Adam

doi:10.1007/978-3-319-76941-7_80

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10772)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

In this paper, we present YAKE!, a novel feature-based system for multilingual keyword extraction from single documents, which supports texts of different sizes, domains, and languages. Unlike most systems, YAKE! does not rely on dictionaries or thesauri, nor is it trained against any corpus. Instead, we follow an unsupervised approach that builds upon features extracted from the text itself, making it applicable to documents written in many different languages without the need for external knowledge. This can be beneficial for a large number of tasks and situations where access to training corpora is limited or restricted. In this demo, we offer an easy-to-use, interactive session where users from both academia and industry can try our system, either with a sample document or with their own text. As an add-on, we compare our extracted keywords against the output produced by the IBM Natural Language Understanding (IBM NLU) and Rake systems. The YAKE! demo is available at http://bit.ly/YakeDemoECIR2018. A Python implementation of YAKE! is also available in the PyPI repository (https://pypi.python.org/pypi/yake/).
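To make the idea of unsupervised, single-document extraction concrete, the following is a minimal toy sketch that ranks words using only statistics computed from the input text itself, with no dictionary, stopword list, or training corpus. The function name `toy_keywords`, the two features (term frequency and position of first occurrence), the length filter, and the scoring formula are all illustrative assumptions; they are not the features or the scoring function used by YAKE!.

```python
import re
from collections import Counter


def toy_keywords(text, top=5, min_len=4):
    """Rank single words from one document using only in-document statistics.

    Hypothetical illustration only: this is NOT the YAKE! algorithm.
    """
    # Tokenize on runs of Unicode letters, so no language-specific resources are needed.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return []

    freq = Counter(tokens)

    # Record the index of each word's first occurrence.
    first_pos = {}
    for i, tok in enumerate(tokens):
        first_pos.setdefault(tok, i)

    n = len(tokens)
    scores = {}
    for tok, f in freq.items():
        # Assumed heuristic: drop very short words instead of using a stopword list.
        if len(tok) < min_len:
            continue
        # Assumed formula: frequent words that appear early score higher.
        scores[tok] = f * (1.0 - first_pos[tok] / n)

    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top]


if __name__ == "__main__":
    sample = (
        "Keyword extraction finds keyword candidates. "
        "Keyword scoring uses statistics of the text."
    )
    for word, score in toy_keywords(sample, top=3):
        print(f"{word}\t{score:.3f}")
```

Because every quantity is derived from the document being processed, the same function can be applied to text in other languages without modification, which is the property the abstract emphasizes. A real system would combine more features and handle multi-word candidates.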

References

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., Jatowt, A.: YAKE! collection-independent automatic keyword extractor. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds.) ECIR 2018, LNCS, vol. 10772, pp. 806–810. Springer, Cham (2018)
Levenshtein, V.: Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Dokl. 10(8), 707–710 (1966)
Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: EMNLP 2004, pp. 404–411, Barcelona, Spain, 25–26 July 2004
Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. In: Text Mining: Theory and Applications (2010)
Turney, P.: Learning algorithms for keyphrase extraction. Inf. Retr. J. 2(4), 303–336 (2000)
Wan, X., Xiao, J.: Single document keyphrase extraction using neighborhood knowledge. In: AAAI 2008, pp. 855–860, 13–17 July 2008
Witten, I., Paynter, G., Frank, E., Gutwin, C., Nevill-Manning, C.: KEA: practical automatic keyphrase extraction. In: JCDL 2004, pp. 254–255, 7–11 June 1999

Acknowledgements

This work is partially funded by the ERDF through the COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT as part of project UID/EEA/50014/2013 and of project UID/MAT/00212/2013. It was also financed by MIC SCOPE (171507010) and by Project “TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020” which is financed by the NORTE 2020, under the Portugal 2020, and through the ERDF.

Author information

Authors and Affiliations

Polytechnic Institute of Tomar, Tomar, Portugal
Ricardo Campos
LIAAD – INESC TEC, Porto, Portugal
Ricardo Campos, Vítor Mangaravite, Arian Pasquali & Alípio Mário Jorge
DCC – FCUP, University of Porto, Porto, Portugal
Alípio Mário Jorge
University of Beira Interior, Covilhã, Portugal
Célia Nunes
Kyoto University, Kyoto, Japan
Adam Jatowt

Corresponding author

Correspondence to Ricardo Campos.

Editor information

Editors and Affiliations

Department of Informatics, Systems, and Communication, University of Milano-Bicocca, Milan, Italy
Gabriella Pasi
LIP6 – UPMC/CNRS, University Pierre et Marie Curie, Paris, France
Benjamin Piwowarski
University of Glasgow, Glasgow, United Kingdom
Leif Azzopardi
Technical University of Vienna, Vienna, Austria
Allan Hanbury

About this paper

Cite this paper

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A.M., Nunes, C., Jatowt, A. (2018). YAKE! Collection-Independent Automatic Keyword Extractor. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science, vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_80

DOI: https://doi.org/10.1007/978-3-319-76941-7_80
Published: 01 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76940-0
Online ISBN: 978-3-319-76941-7
eBook Packages: Computer Science, Computer Science (R0)
