Abstract
In recent years, a large volume of temporally oriented text has become available on the Internet with the advent of forums, blogs and social networks. This has given rise to a challenging new problem called future retrieval: extracting future temporal information that is known in advance from web sources in order to answer queries that combine text with a future temporal nature. This paper investigates whether web snippets can be used to build an intelligent web that detects expected future events whose dates are already known. A further objective is to identify the nature of future-related texts and to understand how temporal features affect the classification and clustering of three types of such texts: informative texts, scheduled texts and rumor texts. We conducted a comprehensive set of experiments, and the results show that web documents are a valuable source of future data that can be particularly useful for identifying and understanding the future temporal nature of a given implicit temporal query.
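As a purely illustrative sketch of what a temporal feature of a web snippet could look like, the Python fragment below collects the explicit four-digit years in a snippet that fall after a reference year. It is an assumption made for illustration and not the method used in the paper: the function name `future_years`, the year pattern and the example snippets are all hypothetical.

```python
import re
from typing import List

# Hypothetical pattern: explicit four-digit years from 1900 to 2099.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def future_years(snippet: str, reference_year: int) -> List[int]:
    """Return the sorted, distinct years in `snippet` later than `reference_year`."""
    years = {int(match) for match in YEAR_PATTERN.findall(snippet)}
    return sorted(year for year in years if year > reference_year)


if __name__ == "__main__":
    # Invented example snippets, used only to exercise the function.
    snippets = [
        "The 2014 World Cup will be held in Brazil.",
        "Released in 2009, with a sequel expected in 2012 or 2013.",
        "The band played its last concert in 2005.",
    ]
    for text in snippets:
        print(future_years(text, reference_year=2011), "-", text)
```

A count or list of such future dates is one possible input to a classifier or clustering algorithm; it only captures explicit year mentions and ignores relative expressions such as "next year".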
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Campos, R., Dias, G., Jorge, A. (2011). An Exploratory Study on the Impact of Temporal Features on the Classification and Clustering of Future-Related Web Documents. In: Antunes, L., Pinto, H.S. (eds) Progress in Artificial Intelligence. EPIA 2011. Lecture Notes in Computer Science, vol 7026. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24769-9_42
DOI: https://doi.org/10.1007/978-3-642-24769-9_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24768-2
Online ISBN: 978-3-642-24769-9
eBook Packages: Computer Science (R0)