Abstract
Web Usage Mining is the application of data mining techniques to large Web data repositories in order to extract usage patterns. As with many data mining application domains, the identification of patterns that are considered interesting is a problem that must be solved in addition to simply generating them. A necessary step in identifying interesting results is quantifying what is considered uninteresting in order to form a basis for comparison. Several research efforts have relied on manually generated sets of uninteresting rules. However, manual generation of a comprehensive set of evidence about beliefs for a particular domain is impractical in many cases. Generally, domain knowledge can be used to automatically create evidence for or against a set of beliefs. This paper develops a quantitative model based on support logic for determining the interestingness of discovered patterns. For Web Usage Mining, there are three types of domain information available: usage, content, and structure. This paper also describes algorithms for using these three types of information to automatically identify interesting knowledge. These algorithms have been incorporated into the Web Site Information Filter (WebSIFT) system, and examples of interesting frequent itemsets automatically discovered from real Web data are presented.
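As a rough illustration of how structure information might serve as evidence against which usage patterns are compared, the sketch below counts page pairs that co-occur frequently in user sessions and keeps only those pairs with no hyperlink between them. It is a minimal sketch under stated assumptions, not the WebSIFT algorithm or the support logic model: the function names, the toy sessions, the link list, and the `min_support` threshold are all hypothetical, and the "belief" used here (that pages visited together are directly linked) is only one plausible example of a structure-derived expectation.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(sessions, min_support):
    """Return {pair: support} for page pairs that co-occur in at least
    a min_support fraction of the sessions (usage information)."""
    counts = Counter()
    for pages in sessions:
        # Count each unordered pair at most once per session.
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    n = len(sessions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def unlinked_frequent_pairs(sessions, links, min_support):
    """Keep frequent pairs that have no hyperlink in either direction.

    Assumption for illustration only: site structure suggests that
    co-visited pages are directly linked, so a frequent pair that
    contradicts this expectation is flagged as potentially interesting.
    """
    linked = {frozenset(edge) for edge in links}
    return {
        pair: support
        for pair, support in frequent_pairs(sessions, min_support).items()
        if frozenset(pair) not in linked
    }


# Hypothetical toy data: four user sessions and the site's hyperlinks.
sessions = [
    ["/home", "/products", "/support"],
    ["/home", "/products"],
    ["/products", "/support"],
    ["/home", "/about"],
]
links = [("/home", "/products"), ("/home", "/about"), ("/home", "/support")]

interesting = unlinked_frequent_pairs(sessions, links, min_support=0.5)
print(interesting)
```

In this toy data, the pair `/products` and `/support` is frequent but not linked, so it is the only pair reported; the linked pair `/home` and `/products` is equally frequent but matches the structural expectation and is filtered out.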
Supported by NSF grant EHR-9554517
Supported by ARL contract DA/DAKF11-98-P-0359
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cooley, R., Tan, PN., Srivastava, J. (2000). Discovery of Interesting Usage Patterns from Web Data. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science, vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_10
DOI: https://doi.org/10.1007/3-540-44934-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67818-2
Online ISBN: 978-3-540-44934-8
eBook Packages: Springer Book Archive