Abstract
High-utility itemset mining (HUIM) is an effective technique for discovering significant information in data. However, mining data that contains sensitive or private information raises privacy concerns. Therefore, privacy-preserving utility mining (PPUM) has recently become a critical research area. PPUM transforms a quantitative transactional database into a sanitised one so that utility mining algorithms cannot discover sensitive information. The sanitisation process can have several side effects, including the loss of non-sensitive information and the introduction of redundant information. Additionally, heuristic sanitisation algorithms have high running times. To minimise these side effects and lower the execution time of the hiding process, we propose the G-ILP algorithm, which uses GPU parallel programming for preprocessing and a new, efficient constraint satisfaction problem formulation for hiding data. The experimental evaluations of G-ILP show the algorithm’s efficiency in terms of running time and its ability to minimise side effects on large datasets.
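To make the terms concrete, the following is a minimal sketch of what it means for an itemset to be high-utility and for sanitisation to hide it. It is not the G-ILP algorithm: the toy database, the unit profits, the threshold `min_util`, and the hand-picked quantity change are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
Transaction = Dict[str, int]


def itemset_utility(db: List[Transaction], profit: Dict[str, int], itemset: Set[str]) -> int:
    """Sum quantity * unit profit over every transaction that contains the whole itemset."""
    total = 0
    for t in db:
        if all(t.get(item, 0) > 0 for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def is_high_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: Set[str], min_util: int) -> bool:
    """An itemset is high-utility when its utility reaches the threshold."""
    return itemset_utility(db, profit, itemset) >= min_util


# Assumed example values, not taken from the paper.
db = [
    {"a": 2, "b": 1},
    {"a": 1, "b": 3, "c": 1},
    {"b": 2, "c": 4},
]
profit = {"a": 5, "b": 2, "c": 1}
min_util = 20
sensitive = {"a", "b"}

# Hand-picked sanitisation for illustration: lower the quantity of "a" in the
# first transaction so the sensitive itemset falls below the threshold.
sanitised = [dict(t) for t in db]
sanitised[0]["a"] = 1

print(is_high_utility(db, profit, sensitive, min_util))         # before sanitisation
print(is_high_utility(sanitised, profit, sensitive, min_util))  # after sanitisation
```

In this toy case the utility of the non-sensitive itemset `{"b", "c"}` is unchanged by the edit, which is the kind of side-effect-free outcome a sanitisation method aims for; choosing such edits automatically on large databases is the hard part that the paper addresses.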














Acknowledgements
This research is funded by the University of Science, VNU-HCM, under grant number CNTT 2021-05.
About this article
Cite this article
Nguyen, D., Tran, MT. & Le, B. A new algorithm using integer programming relaxation for privacy-preserving in utility mining. Appl Intell 53, 25106–25118 (2023). https://doi.org/10.1007/s10489-023-04913-w
DOI: https://doi.org/10.1007/s10489-023-04913-w