
An Efficient Algorithm (Fufm) For Mining Frequent Item Sets

The document describes an efficient algorithm called Frequent Utility Frequent Mining (FUFM) for mining frequent item sets from transactional databases. FUFM finds all utility-frequent itemsets that satisfy given utility and support threshold constraints. It divides itemsets into four categories: high utility high frequency, high utility low frequency, low utility high frequency, and low utility low frequency. The algorithm works by generating candidate itemsets of increasing length and calculating their extended support to identify utility-frequent itemsets.

International Journal of Application or Innovation in Engineering & Management (IJAIEM)

Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 8, August 2013 ISSN 2319 - 4847

AN EFFICIENT ALGORITHM (FUFM) FOR MINING FREQUENT ITEM SETS


Nazeer.Shaik1, N.L. Prasanna2
1 Pursuing M.Tech in CSE at Vignan's LARA Institute of Technology and Science, Vadlamudi, Guntur Dist., A.P., India.
2 Asst. Prof., Department of CSE, Vignan's LARA Institute of Technology & Science, Vadlamudi, Guntur Dist., A.P., India.

ABSTRACT
As technology develops, data mining is turning to more advanced aspects. This paper deals with item set mining. Frequent item sets are those that occur often when mining a transactional database. Utility-based data mining is a new research area concerned with all types of utility factors in data mining processes and aimed at incorporating utility considerations into data mining tasks. An advanced topic in this field is fast utility mining, which aims to give accurate results. Frequent Utility Frequent Mining (FUFM) is the algorithm introduced here to retrieve item sets quickly from a transactional database. The main aim of this paper is to retrieve the frequent utility itemsets and to cluster those item sets by keyword or by number assignment. The results are displayed without any loss of data.

Keywords: Frequent Utility Frequent Mining (FUFM), UMining, Knowledge Discovery in Databases (KDD), UP growth.

1. INTRODUCTION
Data mining and knowledge discovery from databases have received much attention in recent years. Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, previously unknown and potentially useful patterns in data. These patterns are used to make predictions or classifications about new data, explain existing data, summarize the contents of a large database to support decision making, and provide graphical data visualization to aid humans in discovering deeper patterns. The main aim of this paper is to identify and group the frequently used item sets in a transactional database. When a database is audited, identifying the items that are purchased or collected frequently and clustering those frequent items makes for a better mining process.

2. BACKGROUND WORK
KDD: The KDD process comprises a few steps leading from raw data to some form of new knowledge. The volume of data contained in a database often exceeds the ability to analyze it efficiently, resulting in a gap between the collection of data and its understanding. A new concept is proposed for generating four kinds of itemsets, namely High utility and high frequent itemsets (HUHF), High utility and low frequent itemsets (HULF), Low utility and high frequent itemsets (LUHF) and Low utility and low frequent itemsets (LULF). These itemsets are generated using the basic framework of the FUM and FUFM algorithms. Customer Relationship Management (CRM) is incorporated into the system by generating a list of customers who are frequent buyers of these four kinds of itemsets.

3. OVERVIEW OF EXISTING SYSTEM


1. Traditional association rule mining (ARM) is used to identify frequently occurring patterns of item sets.
2. The ARM model treats all items in the database equally, considering only whether an item is present in a transaction or not.
3. However, frequency of occurrence may not express the semantics of applications, because the user's interest may be related to other factors, such as cost, profit, or aesthetic value.
4. For example, a sales manager may not be interested in frequent item sets that do not generate significant profit.

The frequent item set mining approach may not satisfy a sales manager's goal. The support measure reflects the statistical correlation of items, but it does not reflect their semantic significance. In other words, statistical correlation may not measure how useful an item set is in accordance with a user's preferences (i.e., profit). The profit of an item set depends not only on the support of the item set, but also on the prices of the items in that item set.

Frequent Utility Frequent Mining (FUFM) consists of different methods. They are as follows:
a. HUHF
b. HULF
c. LUHF
d. LULF
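The difference between support and profit described in this section can be illustrated with a small sketch. The transactions, item names and per-unit profits are hypothetical, and utility is assumed here to be quantity times unit profit, summed over the transactions that contain the item set; the paper does not give a formula, so this is only one plausible reading.

```python
# Hypothetical toy data: each transaction maps item -> quantity purchased.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 1, "milk": 1, "caviar": 1},
    {"caviar": 2, "milk": 1},
]
# Hypothetical per-unit profit of each item.
profit = {"bread": 1, "milk": 2, "caviar": 50}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if all(i in t for i in itemset))
    return hits / len(transactions)


def utility(itemset, transactions, profit):
    """Assumed utility: profit earned by the itemset's items, summed over
    the transactions that contain the whole itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


for itemset in [("bread", "milk"), ("caviar",)]:
    print(itemset, support(itemset, transactions),
          utility(itemset, transactions, profit))
```

In this toy data the pair bread and milk is the more frequent item set, yet the single item caviar earns far more profit, which is the situation the sales manager example describes.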

Frequent Utility-Frequent Mining (FUFM) finds all utility-frequent itemsets within the given utility and support threshold constraints. Utility-frequent itemsets are a special form of high utility itemsets using Selective Item Replication. Two divisions are maintained:
1. High Utility High Frequency (HUHF), High Utility Low Frequency (HULF)
2. Low Utility High Frequency (LUHF), Low Utility Low Frequency (LULF)

a. HUHF: High utility and high frequent itemsets are generated by incorporating support into the FUM algorithm. The first phase of this algorithm generates the high utility itemsets H. In the second phase, the support value is calculated for each itemset in H.
b. HULF: High utility and low frequent itemsets are generated using both the FUM and FUFM algorithms. The first phase generates the high utility itemsets using the FUM algorithm. In the second phase, high utility high frequent itemsets are generated using FUFM(HU). The HUHF itemsets are then excluded from the high utility itemsets, leaving the HULF itemsets.
c. LUHF: Low utility and high frequent itemsets are generated following the basic framework of the FUFM algorithm.
d. LULF: Low utility and low frequent itemsets are generated in two phases. In the first phase, the low utility itemsets are determined using exhaustive search. In the second phase, the low utility low frequent itemsets are generated from LU and LUHF using the set difference function.
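A minimal sketch of the four-way division, assuming the utility and support of every itemset have already been computed. The function name `categorize`, the sample values, and the use of plain support with `>=` comparisons are illustrative assumptions, not the authors' implementation; the set difference for LULF follows the description of LU and LUHF, and the one for HULF is assumed by analogy.

```python
def categorize(stats, min_util, min_sup):
    """Split itemsets into HUHF, HULF, LUHF and LULF.

    stats maps an itemset (frozenset) to a (utility, support) pair.
    """
    hu = {s for s, (u, _) in stats.items() if u >= min_util}    # high utility
    hf = {s for s, (_, sup) in stats.items() if sup >= min_sup}  # high frequency
    lu = set(stats) - hu                                         # low utility

    huhf = hu & hf
    hulf = hu - huhf   # assumed by analogy with LULF
    luhf = lu & hf
    lulf = lu - luhf   # set difference of LU and LUHF
    return {"HUHF": huhf, "HULF": hulf, "LUHF": luhf, "LULF": lulf}


# Hypothetical (utility, support) values, for illustration only.
stats = {
    frozenset({"caviar"}): (150, 0.50),
    frozenset({"caviar", "milk"}): (102, 0.25),
    frozenset({"bread", "milk"}): (12, 0.75),
    frozenset({"jam"}): (3, 0.25),
}
groups = categorize(stats, min_util=50, min_sup=0.5)
for name, itemsets in groups.items():
    print(name, sorted(sorted(s) for s in itemsets))
```

Every itemset lands in exactly one of the four groups, so the groups together partition the input.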

4. ALGORITHM FUFM
Task: Discovery of Utility Frequent Itemsets
Input: Database DB; constraints minUtil and minSup
Output: High Utility High Frequent itemsets (HUHF)
[1] L = 1
[2] Find the set of candidates of length L with support >= minSup
[3] Compute extended support for all candidates and output utility frequent itemsets
[4] L += 1
[5] Use the frequent itemset mining algorithm to obtain the new set of frequent candidates of length L from the old set of frequent candidates
[6] Stop if the new set is empty, otherwise go to step [3]

Algorithm Working Process
The steps above proved successful in finding the frequently occurring high utility itemsets. The outcome depends entirely on the threshold values assumed, and every stage is compared against them. The first step sets the candidate length L and compares the support of each candidate of that length with the minimum support; candidates whose support is greater than or equal to the minimum support are kept. The next step calculates the utility-frequent item sets from the extended support of the candidates and arranges the item sets in ascending order. The frequent item set mining algorithm then yields the frequent candidates of the new length L from the old set of frequent itemsets. If the new set is empty, the process stops; otherwise the calculation is repeated until an empty set is obtained, and the results are then recorded.
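The six steps can be sketched in Python as follows. This is an illustrative reading, not the authors' code. The paper does not define extended support, so the sketch assumes it is the fraction of transactions that contain the itemset and in which the itemset alone yields a utility of at least minUtil. An Apriori-style join stands in for "the frequent itemset mining algorithm" of step [5]. The database, profits and all function names are hypothetical.

```python
from itertools import combinations


def contains(transaction, itemset):
    return all(item in transaction for item in itemset)


def support(itemset, db):
    return sum(contains(t, itemset) for t in db) / len(db)


def extended_support(itemset, db, profit, min_util):
    """Assumed definition: share of transactions that contain the itemset
    and in which the itemset's own utility is at least min_util."""
    hits = 0
    for t in db:
        if contains(t, itemset):
            if sum(t[i] * profit[i] for i in itemset) >= min_util:
                hits += 1
    return hits / len(db)


def next_candidates(frequent, length):
    """Apriori-style join: unions of frequent sets that have the new length
    and whose (length - 1)-subsets are all frequent."""
    frequent = set(frequent)
    unions = {a | b for a in frequent for b in frequent if len(a | b) == length}
    return {c for c in unions
            if all(frozenset(s) in frequent
                   for s in combinations(c, length - 1))}


def fufm(db, profit, min_util, min_sup):
    length = 1                                                   # step [1]
    items = {i for t in db for i in t}
    candidates = {frozenset([i]) for i in items
                  if support([i], db) >= min_sup}                # step [2]
    result = []
    while candidates:                                            # step [6]
        for c in candidates:                                     # step [3]
            if extended_support(c, db, profit, min_util) >= min_sup:
                result.append(c)
        length += 1                                              # step [4]
        candidates = {c for c in next_candidates(candidates, length)
                      if support(c, db) >= min_sup}              # step [5]
    return result


# Hypothetical database: item -> quantity per transaction.
db = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 1, "milk": 1, "caviar": 1},
    {"caviar": 2, "milk": 1},
]
profit = {"bread": 1, "milk": 2, "caviar": 50}
found = fufm(db, profit, min_util=4, min_sup=0.5)
print(sorted(sorted(s) for s in found))
```

Under the assumed definition, extended support never exceeds plain support, which is why filtering the frequent candidates in step [3] does not miss any utility-frequent itemset.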

5. DATA FLOW DIAGRAM

The data flow diagram depicts the complete chain of steps for calculating and displaying the frequent itemsets. Comparing against the threshold values yields the frequent utility item sets as the result.

6. RESULTS
Step 1: Entries in the UMining algorithm

Step 2: Then open the FUM-F algorithm

Step 3: This opens four different mining algorithms.

1. HUHF - High Utility High Frequent Mining

To view customer details, press the customer detail button.

2. HULF - High Utility Low Frequent Mining

To view customer details, press the customer detail button.

3. LUHF - Low Utility High Frequent Mining

To view customer details, press the customer detail button.

4. LULF - Low Utility Low Frequent Mining

To view customer details, press the customer detail button.

7. CONCLUSION
The UMining and FUM algorithms are for mining all high utility item sets. FUFM and FUM-F algorithms use both the statistical and the utility measures. From the basic framework of these algorithms the different kinds of item sets namely high utility high frequent, high utility low frequent, low utility high frequent and low utility low frequent are generated. Then Customer Relationship Management (CRM) is incorporated into the system by tracking the customers who are frequent buyers of the different kinds of item sets.


AUTHOR PROFILE
Nazeer.Shaik, pursuing M.Tech in Computer Science Engineering at Vignan's LARA Institute Of Technology and Science, Vadlamudi, Guntur Dist., A.P., India. His research interests are Image Processing, Pattern Recognition and Data Mining. E-mail id: nazeer723@gmail.com.

N.L. Prasanna, Asst. Prof., Department of CSE, Vignan's LARA Institute of Technology & Science, Vadlamudi, Guntur Dist., A.P., India. Her research interests are Data Mining, Data Warehousing and Image Processing. Email id: prasanna.manu@gmail.com.
