Implementation of MRPrePost Parallel Algorithm based on Hadoop Platform for Large-Data Mining
The volume, velocity and variety of data have increased severalfold in the past few years. Conventional algorithms and techniques for mining data at this scale are inefficient: to reduce the number of candidates they consider only a large threshold value, which leaves much of the data unused and makes the resulting association rules inaccurate. Parallelization is therefore a viable option. This paper focuses on MRPrePost, a parallel algorithm based on the Hadoop platform and a hybrid of big data mining methods, which mines frequent item sets in order to derive association rules. MRPrePost follows a parallel design and improves on PrePost by adding a prefix pattern. It uniformly partitions the search space rather than the original database, which allows the algorithm to adapt efficiently to mining association rules from large data. The paper aims to implement the MRPrePost algorithm and to conduct experiments with it on a large data set on the Hadoop platform.
Keywords: Parallelization, Hadoop, MRPrePost algorithm, KDD (Knowledge Discovery in Databases), HDFS (Hadoop Distributed File System)
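The idea of partitioning the search space by prefix, rather than splitting the database, can be illustrated with a minimal single-process sketch. It is not the MRPrePost implementation: it counts item sets by plain enumeration instead of using PrePost's internal data structures, an in-memory dictionary stands in for the MapReduce shuffle, and all function names and the `max_len` cap are hypothetical choices made for the illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rank_frequent_items(transactions, min_count):
    """Return {item: rank} for frequent items, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [i for i in counts if counts[i] >= min_count]
    frequent.sort(key=lambda i: (-counts[i], i))
    return {item: rank for rank, item in enumerate(frequent)}


def map_transaction(transaction, rank):
    """Map step: emit (prefix item, items ranked before it) pairs."""
    items = sorted((i for i in set(transaction) if i in rank), key=rank.get)
    for k, item in enumerate(items):
        yield item, tuple(items[:k])


def reduce_prefix(prefix_item, rows, min_count, max_len):
    """Reduce step: mine only the item sets whose last item is prefix_item."""
    counts = Counter()
    for row in rows:
        for size in range(min(len(row), max_len - 1) + 1):
            for subset in combinations(row, size):
                counts[frozenset(subset + (prefix_item,))] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def mine(transactions, min_count, max_len=3):
    """Mine frequent item sets of up to max_len items."""
    rank = rank_frequent_items(transactions, min_count)
    groups = defaultdict(list)  # stands in for the MapReduce shuffle
    for t in transactions:
        for key, row in map_transaction(t, rank):
            groups[key].append(row)
    result = {}
    # Each key owns a disjoint slice of the search space, so on a cluster
    # these reduce calls could run independently on different nodes.
    for key, rows in groups.items():
        result.update(reduce_prefix(key, rows, min_count, max_len))
    return result
```

Calling `mine(transactions, min_count)` on a list of item lists returns a dictionary that maps each frequent item set (as a `frozenset`) to its support count. Because every item set is assigned to exactly one prefix item, no two reduce groups ever count the same item set, which is what makes the partitioning of the search space clean.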
Cite this Article
Aishwarya Rani M R, Shivanand R D. Implementation of MRPrePost Parallel Algorithm based on Hadoop Platform for Large-Data Mining. Recent Trends in Parallel Computing. 2017; 4(2): 10–20p.