Abstract
In recent years, privacy breaches in published data have become a major concern. Removing personally identifying information alone is not sufficient to protect an individual's privacy. Privacy-preservation techniques for published data aim to prevent re-identification while retaining the useful information in the data. In this work, we propose a novel algorithm that handles sensitive items and quasi-identifier items separately in transactional data. The proposed algorithm maintains a privacy level of 1/k, or a stronger one, for transactional data. In numerical experiments, the proposed algorithm shows better running time and better data utility.
Cite this article
Tsai, YC., Wang, SL., Ting, IH. et al. Flexible sensitive K-anonymization on transactions. World Wide Web 23, 2391–2406 (2020). https://doi.org/10.1007/s11280-020-00798-8
DOI: https://doi.org/10.1007/s11280-020-00798-8