Abstract
Lexicographical order dependencies (ODs) describe the relationship between two lexicographical ordering specifications over lists of attributes, and have proved useful in optimizing queries that involve ordered attributes. To take full advantage of ODs, the data instance must satisfy the specified ODs. In practice, however, data are often found to violate given ODs, as recent studies on the discovery of ODs have shown. This highlights the need for data repairing techniques that restore the consistency of the data with respect to ODs. New challenges arise because ODs convey order semantics beyond functional dependencies and are specified on lists of attributes. In this paper, we make a first effort to develop techniques for repairing data that violate ODs. (1) We formalize the data repairing problem for ODs and prove that it is NP-hard in the size of the data. (2) Despite this intractability, we develop effective heuristic algorithms to address the problem. (3) We experimentally evaluate the effectiveness and efficiency of our algorithms, using both real-life and synthetic data.
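To make the notion of an OD violation concrete, the following is a minimal sketch in Python. It assumes the usual definition from the OD literature: a list of attributes X orders a list Y if, for every pair of tuples s and t, s preceding or equalling t in the lexicographical order on X implies the same on Y. The function name `od_violations` and the attributes `salary` and `tax` are hypothetical, chosen for illustration only. The sketch detects violations; it is not the repairing algorithm developed in the paper.

```python
from itertools import permutations


def od_violations(rows, lhs, rhs):
    """Return index pairs (i, j) whose rows violate the OD lhs -> rhs.

    Assumed definition: for all tuples s, t, if s <= t lexicographically
    on the attribute list lhs, then s <= t lexicographically on rhs.
    """
    def key(row, attrs):
        # Python compares tuples lexicographically, matching list semantics.
        return tuple(row[a] for a in attrs)

    bad = []
    # Ordered pairs are needed: ties on lhs must also be ties on rhs.
    for i, j in permutations(range(len(rows)), 2):
        s, t = rows[i], rows[j]
        if key(s, lhs) <= key(t, lhs) and not key(s, rhs) <= key(t, rhs):
            bad.append((i, j))
    return bad


# Illustrative data (not from the paper): tax should not decrease with salary.
rows = [
    {"salary": 1000, "tax": 100},
    {"salary": 2000, "tax": 250},
    {"salary": 3000, "tax": 200},
]
print(od_violations(rows, ["salary"], ["tax"]))  # [(1, 2)]
```

Here the second and third rows violate the dependency, since salary increases while tax decreases. A repair would have to modify at least one value in such a pair, for example raising the last tax value to 250 or more. This pairwise check is quadratic in the number of rows and is meant only to illustrate the semantics.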
Acknowledgements
This work is supported by NSFC 61572135, NSFC 61370157, the National High Technology Research and Development Program (863 Program) of China (2015AA050203), State Grid Research Project No. 52094016000A, the Shanghai Science and Technology Project (Nos. 16DZ1100200, 16DZ1110102), the Aircraft Risk Management Database Project, and the National Nonprofit Ocean Research Project (No. 201405031-04).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Qiu, Y., Tan, Z., Yang, K., Yang, W., Zhou, X., Guo, N. (2018). Repairing Data Violations with Order Dependencies. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_17
DOI: https://doi.org/10.1007/978-3-319-91458-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science (R0)