IJSRD - International Journal for Scientific Research & Development| Vol. 2, Issue 08, 2014 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 181
Parallel Key Value Pattern Matching Model
R. Senthamil Selvi¹, Dr. T. Abdul Razak²
¹Assistant Professor, ²Associate Professor
¹,²Department of Computer Science
¹,²Jamal Mohamed College (Autonomous), Tiruchirappalli
Abstract— Mining frequent itemsets from the huge
transactional database is an important task in data mining.
Finding the frequent itemsets in a database is a key step
in extracting association rules, which describe
relationships among items in large datasets. Many
algorithms have been developed to find frequent itemsets.
This work summarizes them and presents a parallel key value
pattern matching model, which shards a large-scale mining
task into independent, parallel tasks. The model discovers
the frequent itemsets in the database, and its efficiency is
evaluated in terms of execution time. It also aims to avoid
high computational cost.
Keywords: Data mining, FP Growth, Frequent Item Set
Mining, Association rule Mining
I. INTRODUCTION
Data mining is a collection of processes for the efficient
discovery of previously unknown, valid, useful and
understandable patterns in large databases. The patterns
should be actionable so that they may be used in an
enterprise’s decision-making processes. Many software tools
are available to analyze the data in large databases.
Frequent pattern mining is an important concept in data
mining. An itemset is frequent if its support, the number of
transactions that contain it, meets a minimum support
threshold. Association rule mining discovers relations
between variables in large databases. A maximal frequent
itemset is a frequent itemset that has no frequent proper
superset. Its main purpose is to represent a large number of
results as a single pattern.
A closed frequent itemset is a frequent itemset that has no
proper superset with the same support. Items that occur
together can be linked to form such closed groups. For
example, take four items s1, s2, s3 and s4. If the first
three are linked to each other as (s1, s2), (s1, s3) and
(s2, s3), they form a group, and (s1, s2, s3) is a closed
itemset provided that no larger itemset has the same
support. Again, the main purpose is to represent a large
number of results as a single pattern.
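The definitions of support, maximal and closed frequent itemsets can be checked on a small example. The following is a minimal brute-force sketch; the toy transactions, the threshold and all function names are assumptions made for illustration and do not come from the paper's experiments.

```python
from itertools import combinations

# Hypothetical toy transactions; not data from the paper.
transactions = [
    {"s1", "s2", "s3"},
    {"s1", "s2", "s3"},
    {"s1", "s2"},
    {"s4"},
]
MIN_SUPPORT = 2  # assumed absolute support threshold


def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_support):
    """Brute-force enumeration; only sensible for tiny examples."""
    items = sorted(set().union(*db))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo), db)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


def closed_and_maximal(freq):
    """Split frequent itemsets into closed and maximal ones."""
    closed, maximal = set(), set()
    for itemset, sup in freq.items():
        supersets = [other for other in freq if itemset < other]
        if not supersets:
            maximal.add(itemset)  # no frequent proper superset
        if all(freq[other] < sup for other in supersets):
            closed.add(itemset)  # no proper superset with equal support
    return closed, maximal


freq = frequent_itemsets(transactions, MIN_SUPPORT)
closed, maximal = closed_and_maximal(freq)
```

In this toy database s4 is infrequent, (s1, s2, s3) is the only maximal itemset, and both (s1, s2) and (s1, s2, s3) are closed.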
II. MOTIVATION
Business data are stored on computers, which allows users
to navigate through the data in real time. The evolution of
data mining has been supported by three technologies: data
collection, high-performance computing and data mining
algorithms. Many algorithms exist for frequent itemset
mining, and each performs well in its own setting. In
particular, the FP-growth algorithm avoids the generation of
large numbers of candidate sets; its main idea is to
maintain a frequent pattern tree. However, none of the
existing frequent itemset mining algorithms uses the concept
of a parallel key value model, which distributes the work
across the program so that frequent pattern data can be
searched and retrieved easily.
III. RELATED WORK
Jiawei Han et al. [1] proposed the FP-growth approach for
mining frequent itemsets without candidate generation. It
relies on the FP-tree, an extended prefix-tree structure for
storing quantitative information about frequent patterns.
Several optimizations are also available to speed up
FP-growth.
Christian Borgelt [2] proposed a C implementation of the
FP-growth algorithm. Pruning is achieved by traversing the
levels of the FP-tree from top to bottom. In this
implementation, the initial FP-tree is built top-down from a
main-memory representation of the transaction database as a
simple list of integer arrays. With respect to item order,
FP-growth behaves in exactly the opposite way to Apriori,
whose implementations usually run faster if items are sorted
in ascending order.
Aiman Moyaid Said et al. [3] presented a comparative study
of FP-growth variations. FP-growth is an alternative to the
Apriori-based approach. It compresses the frequent items
into a frequent pattern tree, or FP-tree, which retains the
itemset information. Using this compact tree structure, the
FP-growth algorithm mines all the frequent itemsets.
B. Santhosh Kumar et al. [4] compared the memory usage and
running time of the Apriori and FP-growth algorithms.
FP-growth uses a compact data structure and eliminates
repeated database scans. Its advantages include completeness
and compactness.
Haoyuan Li et al. [5] built on FP-growth, which follows the
divide-and-conquer principle: it decomposes a mining task
into smaller tasks and avoids candidate generation entirely.
Their paper develops a parallel algorithm that reduces
memory use and computational cost on every machine. Earlier
work on parallelizing FP-growth suffered from high
communication cost. The paper proposes a MapReduce model of
the parallel FP-growth algorithm (PFP), which slices a
large-scale mining task into independent computational tasks
and maps them onto MapReduce jobs, achieving near-linear
speedup. It is based on a novel data and computation
distribution scheme that virtually eliminates communication
among computers. PFP is effective in mining tag-tag
associations and webpage-webpage associations to support
query recommendation or related search.
Bharat Gupta et al. [6] analyzed the FP-growth algorithm,
which compresses the database of frequent items into a
frequent pattern tree and then divides the compressed
database into a set of conditional databases. These are
mined recursively, in a number of steps of the same order of
magnitude as the number of frequent patterns. The FP-growth
technique constructs a conditional pattern base and a
conditional frequent pattern tree from the database entries
that satisfy the minimum support.
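The FP-tree and the conditional pattern base described above can be sketched in a few lines. This is a simplified illustration, not the implementation used in any of the cited papers; the class `Node`, the function names and the toy database are assumptions.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, a count and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(db, min_support):
    # First scan: support of every single item.
    counts = Counter(item for t in db for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    # Descending support; ties broken alphabetically for determinism.
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}

    root = Node(None, None)
    header = {item: [] for item in order}  # item -> nodes carrying that item
    # Second scan: insert each filtered, ordered transaction as a path.
    for t in db:
        path = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(item, header):
    """Prefix paths that lead to `item`, each with the count of the item node."""
    base = []
    for node in header[item]:
        prefix = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            prefix.append(parent.item)
            parent = parent.parent
        if prefix:
            base.append((prefix[::-1], node.count))
    return base


# Hypothetical toy database.
db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
root, header = build_fp_tree(db, min_support=2)
base_c = conditional_pattern_base("c", header)
```

Here `base_c` holds the three prefix paths that end in item `c`; a full FP-growth miner would build a conditional FP-tree from this base and recurse.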
Marek Wojciechowski et al. [7] adapted the common counting
method to work with the FP-Growth algorithm and evaluated
the efficiency of both methods when FP-Growth is used as the
basic mining algorithm. They consider the problem of
optimizing batches of frequent itemset queries using
multiple-query optimization methods such as common counting
and mine merge. These methods reduce the I/O cost by
executing tasks that are common to several queries only once
for the whole data. The experiments show that common
counting for FP-Growth reduces the overall processing time.
E R Naganathan et al. [8] addressed structured data mining,
a major research topic in data mining. One of the common
representations of structured data is the graph, which is an
alternative approach to modeling objects. Graph-based data
mining (GDM) offers a number of methods to mine the
relational aspects of data; it is the task of finding novel
and understandable graph-theoretic patterns in a graph
representation of data. The paper presents a new
normalization technique for the subgraphs obtained from the
FP-growth model. The authors suggest that it may serve as a
ranking scheme among the mined subgraphs and play a useful
role in subgraph applications.
IV. EXISTING PARALLEL FP-GROWTH MODEL
Parallel FP-Growth (PFP) mines the complete set of frequent
patterns by pattern fragment growth in parallel. It
generally depends on distributed machines, each of which
executes an independent group of mining tasks. The
FP-Growth algorithm runs much faster than Apriori, and the
parallel FP-Growth algorithm is faster still. PFP converts
the database into new databases of group-dependent
transactions so that the FP-trees built from different
group-dependent transactions are independent. This
eliminates the computational dependencies between machines.
PFP has also been shown to be promising for supporting
query recommendation for search engines.
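The conversion into group-dependent transactions can be sketched as follows. The sketch follows the grouping idea of PFP [5] as understood here: each transaction is ordered by the frequent-item list and, scanning from the right, the prefix ending at the first item seen for each group is sent to that group. The names `f_list`, `group_of` and the toy data are assumptions.

```python
from collections import defaultdict


def group_dependent_shards(db, f_list, group_of):
    """Map each transaction to the groups that need one of its prefixes.

    f_list   : frequent items in descending order of support
    group_of : item -> group id (assumed to be given)
    """
    rank = {item: r for r, item in enumerate(f_list)}
    shards = defaultdict(list)
    for t in db:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        emitted = set()
        # Scan right to left so that each group receives its longest prefix.
        for pos in range(len(ordered) - 1, -1, -1):
            gid = group_of[ordered[pos]]
            if gid not in emitted:
                emitted.add(gid)
                shards[gid].append(ordered[: pos + 1])
    return dict(shards)


# Hypothetical example: four frequent items split into two groups.
f_list = ["a", "b", "c", "d"]
group_of = {"a": 0, "b": 0, "c": 1, "d": 1}
db = [["a", "c", "d"], ["b", "a"], ["d", "b"]]
shards = group_dependent_shards(db, f_list, group_of)
```

Each shard can then be mined with an ordinary FP-growth run on its own machine, since every pattern ending in an item of a group can be counted from that group's shard alone.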
PFP addresses the resource challenges of the FP-Growth
algorithm: storage, computation distribution, costly
communication and the support threshold value. Given a
transaction database, PFP uses three MapReduce phases to
parallelize FP-Growth.
The PFP framework has five stages of computation:
sharding, parallel counting, grouping items, parallel
FP-Growth and aggregating. Parallel counting is a classical
application of the MapReduce approach.
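Sharding and parallel counting can be imitated on a single machine to show the map and reduce roles. The following is a minimal sketch using a thread pool in place of a MapReduce cluster; the helper names and the toy database are assumptions, and threads are used only to illustrate the structure, not to obtain real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def shard(db, n):
    """Split the database into n roughly equal parts."""
    return [db[i::n] for i in range(n)]


def count_shard(part):
    """Map step: count each item once per transaction within one shard."""
    local = Counter()
    for t in part:
        local.update(set(t))
    return local


def parallel_count(db, n_shards=2):
    """Reduce step: sum the per-shard counts into global supports."""
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = list(pool.map(count_shard, shard(db, n_shards)))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical toy database.
db = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
supports = parallel_count(db, n_shards=2)
```

Because every shard is counted independently, the result does not depend on how the transactions are split.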
PFP uses the MapReduce approach to shard a large-scale
mining task into independent computational tasks. It also
addresses the issues of memory use and fault tolerance. PFP
is therefore effective in mining tag-tag associations and
webpage-webpage associations to support query
recommendation. Its disadvantage is the reliance on
distributed machines, because each additional machine
increases the cost.
V. THE PROPOSED MODEL
This model uses frequent patterns and aims to work faster
than other methods. Two tasks are compared: XModel and
PModel. The experiments indicate that the PModel task works
faster than XModel and therefore saves computing time. The
model is intended to be suitable for all frequent itemset
mining algorithms.
XModel retrieves data through the user interface and
generates a frequent itemset. If the resulting items are
equal, the process ends. Otherwise it returns to the
temporary database and executes the whole process again
until the condition is true.
PModel retrieves data through the user interface and
distributes the work by key value to prepare the frequent
itemsets. The intersection of the frequent itemsets is then
computed, which produces another frequent list called the
F-list. The group of all F-lists is called the G-list. The
model then checks whether the G-list contains equal frequent
items; if so, the process ends. Otherwise the whole process
is repeated until the frequent itemsets are equal.
A. Searching Algorithm
The searching algorithm illustrated in Fig. 1 sorts the set
of items in descending order and connects to a database
using JdbcOdbcDriver; this is shown in steps 8 to 11. It
then checks whether the record set is null and moves to the
next record set; otherwise the record set is cleared. To
select a frequent item from the database, it executes a
query statement that selects the product from the
transaction table. After the selection is completed, the
database connection is closed; this is shown in steps 18 to
20. All items are then collected to generate groups.
Fig. 1: The Searching Algorithm
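The query-and-close pattern of the searching algorithm can be sketched as follows. The paper connects through JdbcOdbcDriver; this sketch substitutes Python's built-in sqlite3 module and an in-memory database so that it runs on its own. The table name `transaction_table`, its columns and the sample rows are assumptions.

```python
import sqlite3


def search_frequent_items(conn):
    """Return (product, frequency) pairs in descending order of frequency."""
    cur = conn.execute(
        "SELECT product, COUNT(*) AS freq FROM transaction_table "
        "GROUP BY product ORDER BY freq DESC, product ASC"
    )
    return cur.fetchall()


# Hypothetical sample rows: (transaction id, product).
rows = [
    (1, "bread"), (1, "milk"),
    (2, "bread"),
    (3, "bread"), (3, "milk"), (3, "eggs"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transaction_table (tid INTEGER, product TEXT)")
conn.executemany("INSERT INTO transaction_table VALUES (?, ?)", rows)
try:
    items = search_frequent_items(conn)
finally:
    conn.close()  # the connection is closed once the selection is done
```

Sorting is delegated to the database here; the result could equally be sorted in the program after the records are read.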
B. XModel Algorithm
Fig. 2: The XModel Algorithm
The XModel illustrated in Fig. 2 processes the whole
database in a single pass, which takes more time to execute.
It first reads the database, then selects all items and
processes them, so that the frequent items can be displayed
at the end of the program. The XModel also prints the
starting and ending times; this is shown in steps 14 to 18.
The total duration is the difference between the ending time
and the starting time, and it is reported in milliseconds.
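The timing measurement amounts to recording a start time and an end time around the task and taking their difference. A minimal sketch, assuming a hypothetical helper `timed_ms` and Python's `time.perf_counter` as the clock:

```python
import time


def timed_ms(task, *args):
    """Run `task(*args)` and return its result and the elapsed milliseconds."""
    start = time.perf_counter()
    result = task(*args)
    end = time.perf_counter()
    return result, (end - start) * 1000.0  # duration = end time - start time


# Usage with a stand-in task; a real run would pass the mining routine.
total, elapsed_ms = timed_ms(sum, range(1_000_000))
print(f"finished in {elapsed_ms:.1f} ms")
```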
C. PModel Algorithm
The PModel algorithm illustrated in Fig. 3 works as
follows:
(1) Scan the transaction database once to find all
frequent items and their supports.
(2) Sort the frequent items in descending order of their
support.
(3) Get the first transaction from the transaction
database. Remove all non-frequent items and list
the remaining items according to the order in the
sorted frequent items.
(4) Get the next transaction from the transaction
database. Remove all non-frequent items and list
the remaining items according to the order in the
sorted frequent items.
(5) Group all sorted frequent itemsets and display the
start time and end time.
(6) Continue with step 4 until all transactions of the
database are processed.
Fig. 3: The PModel Algorithm
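The six steps listed above can be sketched as a single function. This is an illustration under stated assumptions rather than the authors' program: grouping is modeled as counting identical filtered transactions, and the function name, threshold and toy data are hypothetical.

```python
import time
from collections import Counter


def pmodel_steps(db, min_support):
    start = time.perf_counter()
    # Step 1: one scan to find the items and their supports.
    support = Counter(item for t in db for item in set(t))
    # Step 2: frequent items in descending order of support.
    f_list = sorted(
        (i for i, c in support.items() if c >= min_support),
        key=lambda i: (-support[i], i),
    )
    rank = {item: r for r, item in enumerate(f_list)}
    groups = Counter()
    # Steps 3, 4 and 6: take each transaction in turn, drop the
    # non-frequent items and order the rest by the sorted frequent items.
    for t in db:
        kept = tuple(sorted((i for i in set(t) if i in rank), key=rank.get))
        if kept:
            groups[kept] += 1  # Step 5: group identical sorted itemsets.
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return f_list, groups, elapsed_ms


# Hypothetical toy database.
db = [
    ["milk", "bread", "jam"],
    ["bread", "milk"],
    ["bread", "eggs"],
    ["milk", "bread"],
]
f_list, groups, elapsed_ms = pmodel_steps(db, min_support=2)
```

With a threshold of 2, only bread and milk are frequent, so three transactions collapse into the group (bread, milk) and one into (bread).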
D. Architectural Diagram
Fig. 4: Parallel Key Value Pattern Matching Model Using
XModel
In Fig. 4, a single program searches the database for
data with the help of a single key value, so it produces a
large frequent itemset. Obtaining the exact frequent itemset
is difficult because the process takes a long time to
execute. If the user obtains equal frequent itemsets, the
process ends. Otherwise the process is repeated until the
user gets the exact frequent itemset.
Fig. 5: Parallel Key Value Pattern Matching using PModel
The architectural diagram for the Parallel Key Value
Pattern Matching model using PModel is illustrated in Fig.
5. The PModel contains several programs, each of which
searches the database with the help of its own key value.
Each program therefore yields some frequent itemsets, denoted
frequent set 1, frequent set 2 and frequent set 3. The model
then intersects frequent set 1 with frequent set 2, frequent
set 2 with frequent set 3, and frequent set 3 with frequent
set 1. These operations produce further frequent lists,
called F-list I, F-list II and F-list III respectively.
Grouping all the F-lists gives a frequent itemset, denoted
the “G-List”. If the user obtains equal frequent itemsets,
the process ends. Otherwise the process is repeated until
the user gets the exact frequent itemset.
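The pairwise intersections and the grouping into the G-List can be sketched with ordinary set operations. Two assumptions are made here because the text does not fix them: each frequent set is modeled as a set of item names, and grouping the F-lists is modeled as their union. All names and data are hypothetical.

```python
def f_lists(frequent_sets):
    """Intersect set 1 with 2, set 2 with 3, ..., and the last set with set 1."""
    n = len(frequent_sets)
    return [frequent_sets[i] & frequent_sets[(i + 1) % n] for i in range(n)]


def g_list(lists):
    """Group all F-lists; modeled here as their union (an assumption)."""
    return set().union(*lists)


# Hypothetical results of three key-value searches.
frequent_set_1 = {"bread", "milk", "eggs"}
frequent_set_2 = {"bread", "milk", "jam"}
frequent_set_3 = {"milk", "eggs", "tea"}

f1, f2, f3 = f_lists([frequent_set_1, frequent_set_2, frequent_set_3])
g = g_list([f1, f2, f3])
```

Items found by only one search, such as jam and tea in this example, drop out of every F-list and therefore out of the G-List.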
VI. RESULTS AND DISCUSSION
This section compares XModel and PModel in a table and a
graph. The proposed model is applied to transaction data.
The following table summarizes the results.
Number of     Finishing Time (Milliseconds)
Records       XModel      PModel
25            844         194
50            1567        1477
75            3010        2270
100           7929        3509
Table 1: Values for Comparison Graph
The execution times of XModel and PModel, measured in
milliseconds, depend on the number of records, as shown in
Table 1. For example, for 100 records the XModel takes 7929
milliseconds to execute, while the PModel takes only 3509
milliseconds.
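The relative gain can be made explicit by dividing the XModel time by the PModel time for each row of Table 1. The short sketch below does this arithmetic; the timings are copied from Table 1, while the variable and function names are assumptions.

```python
# (number of records, XModel ms, PModel ms), copied from Table 1.
TABLE_1 = [
    (25, 844, 194),
    (50, 1567, 1477),
    (75, 3010, 2270),
    (100, 7929, 3509),
]


def speedups(rows):
    """Speedup of PModel over XModel: XModel time divided by PModel time."""
    return {n: x_ms / p_ms for n, x_ms, p_ms in rows}


ratios = speedups(TABLE_1)
for n, ratio in ratios.items():
    print(f"{n:>3} records: PModel is {ratio:.2f}x faster")
```

The ratios are about 4.35, 1.06, 1.33 and 2.26 for 25, 50, 75 and 100 records, so the PModel is faster in every row, although the gain varies considerably with the number of records.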
Fig. 6: XModel Vs PModel
Fig. 6 shows the comparison graph of XModel vs.
PModel. The X-axis represents the number of records and the
Y-axis the execution time in milliseconds. The bar chart
contrasts the XModel and the PModel. The experiments show
that the Parallel Key Value Pattern Matching Model reduces
the overall processing time.
VII. CONCLUSION
The Parallel Key Value Pattern Matching Model is intended
for frequent itemset mining algorithms, which usually
operate on large-scale, distributed data. The results
demonstrate that the model is effective for discovering
frequent itemsets. Two methods were compared: the existing
model, denoted XModel, and the proposed model, denoted
PModel. The XModel takes more time to execute. The
comparison is based on speedup and efficiency, and the
PModel produced better results on both than the XModel.
REFERENCES
[1] Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao,
“Mining Frequent Patterns without Candidate
Generation: A Frequent-Pattern Tree Approach”,
Data Mining and Knowledge Discovery, 8, 53 – 87,
Kluwer Academic Publishers, Netherlands, 2004.
[2] Christian Borgelt, “An Implementation of the FP-
Growth Algorithm”, Department of Knowledge
Processing and Language Engineering, Germany,
2005.
[3] Aiman Moyaid Said, Dr. P. D. D. Dominic, Dr.
Azween B. Abdullah, “A Comparative Study of FP-
Growth Variations”, Department of Computer and
Information Sciences, International Journal of
Computer Science and Network Security, Vol.9,
No.5, Petronas, May 2009.
[4] B. Santhosh Kumar and K.V. Rukmani,
“Implementation of Web Usage Mining using Apriori
and FP Growth Algorithms”, Department of
Computer Science, Int. J. of Advanced Networking
and Applications, Vol. 1, Issue 06, Pages 400-404,
Ketti, The Nilgiris, Feb – April 2010.
[5] Haoyuan Li, Yi Wang, Dong Zhang, Ming Zhang,
Edward Chang, “PFP: Parallel FP-Growth for Query
Recommendation”, Google Beijing Research, China,
2010.
[6] Bharat Gupta and Dr. Deepak Garg, “FP-Tree Based
Algorithm Analysis: FP-Growth, COFI-Tree and CT-
PRO”, Department of Computer Science,
International Journal on Computer Science and
Engineering (IJCSE), ISSN: 0975-3397, Vol. 3, No.
7, Patiala, India, July 2011.
[7] Marek Wojciechowski, Krzysztof Galecki, and
Krzysztof Gawronek, “Concurrent Processing of
Frequent Itemset Queries using FP-Growth
Algorithm”, Department of Computer Science,
Poland.
[8] E R Naganathan, S. Narayanan and K. Ramesh
Kumar, “FP-growth Based new normalization for sub
graph ranking”, Department of Computer
Application, International Journal of Database
Management System (IJDMS), Vol. 3, No. 1, Tamil
Nadu, February 2011.
