Abstract
Detecting and tracking events in text stream data is critical for social networks and therefore attracts growing research effort. However, existing topic detection and tracking models have two major limitations: noise words and multiple sub-events. In this paper, a novel event detection and tracking algorithm, topic event detection and tracking (TEDT), is proposed to tackle these limitations by clustering the co-occurrent features of the underlying topics in the text stream data; the evolution of events is then analyzed for event tracking. Evaluation on two real datasets gives promising results, demonstrating that (1) the proposed TEDT algorithm is superior to the state-of-the-art topic model with respect to event detection, and (2) the proposed TEDT algorithm can successfully track event changes.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, C., Ye, Y., Zhang, X., Chu, D., Deng, S., Xu, X. (2014). Clustering Based Topic Events Detection on Text Stream. In: Nguyen, N.T., Attachoo, B., Trawiński, B., Somboonviwat, K. (eds) Intelligent Information and Database Systems. ACIIDS 2014. Lecture Notes in Computer Science, vol 8397. Springer, Cham. https://doi.org/10.1007/978-3-319-05476-6_5
DOI: https://doi.org/10.1007/978-3-319-05476-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05475-9
Online ISBN: 978-3-319-05476-6
eBook Packages: Computer Science (R0)