Abstract
Market analysis is crucial for companies that want to stay competitive in an increasingly fierce market. A typical application is to find the most influential product, the one that attracts the largest number of customers, from a collection of candidate products. Previous work assumes that the candidates are randomly distributed. In many cases, however, the distribution of candidate products is subject to a set of constraints. In this paper, we study the most influential product problem under such distribution constraints. We consider both non-linear and linear constraints, under which the candidate products reside in a hyper-rectangle and on a hyper-plane of the data space, respectively. We use reverse skyline queries to define the most influential product as the candidate with the largest reverse skyline set. We propose a general framework that solves the problem efficiently by exploiting the candidate distribution. More specifically, we introduce a constraint-based filtering scheme that, through pre-processing based on the distribution constraints, prunes the search space and quickly identifies some reverse skyline points. We also propose a distance-based ordering technique, so that the results of processing one candidate can be used to prune data for subsequent candidates. By combining the filtering scheme and the ordering technique, we present two algorithms for handling the different constraint models. Experimental results on both real and synthetic datasets demonstrate the effectiveness and efficiency of the proposed algorithms.
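To make the problem statement concrete, the following is a minimal brute-force sketch of the baseline the abstract improves upon. It is not the filtering or ordering algorithm proposed in the paper. It assumes the standard definition of dynamic dominance (a product p dominates a candidate q with respect to a customer c if p is at least as close to c in every dimension and strictly closer in at least one), and the function names `dynamically_dominates`, `reverse_skyline`, and `most_influential` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dynamically_dominates(p: Point, q: Point, c: Point) -> bool:
    """Return True if p dynamically dominates q with respect to customer c."""
    # Per-dimension absolute distances from the customer's preferred point.
    dp = [abs(a - b) for a, b in zip(p, c)]
    dq = [abs(a - b) for a, b in zip(q, c)]
    no_worse = all(x <= y for x, y in zip(dp, dq))
    strictly_better = any(x < y for x, y in zip(dp, dq))
    return no_worse and strictly_better


def reverse_skyline(q: Point, products: Sequence[Point],
                    customers: Sequence[Point]) -> List[Point]:
    """Customers for whom no existing product dynamically dominates q."""
    return [c for c in customers
            if not any(dynamically_dominates(p, q, c) for p in products)]


def most_influential(candidates: Sequence[Point], products: Sequence[Point],
                     customers: Sequence[Point]) -> Point:
    """Candidate with the largest reverse skyline set (brute force)."""
    return max(candidates,
               key=lambda q: len(reverse_skyline(q, products, customers)))


if __name__ == "__main__":
    # Illustrative toy data, not taken from the paper's experiments.
    products = [(2.0, 2.0), (8.0, 8.0)]
    customers = [(1.0, 1.0), (3.0, 3.0), (9.0, 9.0), (5.0, 5.0)]
    candidates = [(4.0, 4.0), (9.0, 8.0)]
    print(most_influential(candidates, products, customers))
```

This baseline runs one dominance test per candidate, customer, and product. The constraint-based filtering and distance-based ordering described above aim to avoid most of that work.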
Acknowledgements
This research was supported by the Natural Science Foundation of Hunan Province under Grant Number 2016JJ3012.
Cite this article
Yin, B., Wei, X. & Liu, Y. Finding the most influential product under distribution constraints through dominance tests. Appl Intell 49, 723–740 (2019). https://doi.org/10.1007/s10489-018-1293-0
DOI: https://doi.org/10.1007/s10489-018-1293-0