Abstract
Sliding-window aggregation summarizes a collection of recent streaming data, capturing the most recent happenings as well as some history. Algorithms for this problem must maintain an aggregate value as new data items are inserted into the window on arrival and old data items are evicted from the window on expiry. Supporting this efficiently poses algorithmic challenges, especially for non-invertible aggregation functions such as max, for which there is no way to “subtract off” expiring items. This chapter gives a brief overview of this area of research and explores a number of sliding-window aggregation algorithms, from simple to sophisticated. Real-world use cases are also given to showcase scenarios where sliding-window aggregation is applicable.
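To make the difficulty with non-invertible functions concrete, here is a minimal Python sketch of one well-known way to maintain a first-in, first-out window aggregate without any "subtract" operation: keep two stacks of partial aggregates. It is an illustration only and is not claimed to be one of the algorithms the chapter presents. The class name `TwoStacksWindow` and its methods `insert`, `evict`, and `query` are hypothetical names chosen for this example, and the sketch assumes the aggregation function is associative and has an identity element.

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStacksWindow(Generic[T]):
    """FIFO window aggregate for an associative operator (hypothetical sketch).

    No inverse ("subtract") is needed, so it works for max, min, etc.
    insert and query take O(1) time; evict takes amortized O(1) time.
    """

    def __init__(self, combine: Callable[[T, T], T], identity: T) -> None:
        self._combine = combine
        self._identity = identity
        # Each entry is (value, partial aggregate).
        # Front stack: top is the oldest item; its partial aggregate covers
        # that item and every newer item still in the front stack.
        self._front: List[Tuple[T, T]] = []
        # Back stack: top is the newest item; its partial aggregate covers
        # every item in the back stack, oldest to newest.
        self._back: List[Tuple[T, T]] = []

    def insert(self, value: T) -> None:
        """Add a newly arrived item to the window."""
        so_far = self._back[-1][1] if self._back else self._identity
        self._back.append((value, self._combine(so_far, value)))

    def evict(self) -> None:
        """Remove the oldest item from the window."""
        if not self._front:
            # Flip: move everything from back to front, rebuilding the
            # partial aggregates in the opposite direction.
            while self._back:
                value, _ = self._back.pop()
                newer = self._front[-1][1] if self._front else self._identity
                self._front.append((value, self._combine(value, newer)))
        if not self._front:
            raise IndexError("evict from an empty window")
        self._front.pop()

    def query(self) -> T:
        """Return the aggregate of all items currently in the window."""
        front = self._front[-1][1] if self._front else self._identity
        back = self._back[-1][1] if self._back else self._identity
        return self._combine(front, back)


# Usage: a sliding maximum, where expiring items cannot be subtracted off.
window = TwoStacksWindow(max, float("-inf"))
for item in (3, 1, 4):
    window.insert(item)
window.evict()        # the oldest item, 3, expires
window.query()        # max of the remaining items 1 and 4
```

The design choice here is to trade the missing inverse for stored partial aggregates: an eviction only pops a stack, and the occasional flip that moves all items from the back stack to the front stack is paid for by the insertions that preceded it. A single eviction can therefore take time linear in the window size in the worst case, even though the amortized cost is constant.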
© 2018 Springer International Publishing AG
About this entry
Cite this entry
Tangwongsan, K., Hirzel, M., Schneider, S. (2018). Sliding-Window Aggregation Algorithms. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_157-1
DOI: https://doi.org/10.1007/978-3-319-63962-8_157-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Living Reference Mathematics; Reference Module Computer Science and Engineering
Chapter history
Latest: Sliding-Window Aggregation Algorithms. Published: 17 March 2022. DOI: https://doi.org/10.1007/978-3-319-63962-8_157-2
Original: Sliding-Window Aggregation Algorithms. Published: 05 February 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_157-1