
Comparative Study of Data Mining Tools

Nidhi Sharma, KL Bansal

Abstract


Data mining is the process of finding useful patterns in large amounts of data for a given application. It is a well-established technology with great potential to help companies focus on the most important information in their huge databases. It uses machine learning, statistical, and visualization techniques to discover and predict knowledge in a form that is easily understandable to the user. Classification is an important data mining technique that employs a set of pre-classified examples to develop a model that can classify data and discover relationships between independent and dependent variables. Classification comprises various algorithms, among them KNN, C4.5, ID3, and Rnd Tree, and different algorithms employ different theories to achieve this goal. Many data mining tools are available for extracting hidden information from structured and unstructured datasets. The tools compared here are WEKA, KEEL, KNIME, RAPIDMINER, ORANGE, and TANAGRA. Each data mining tool has its own pros and cons. The main consequence of this fact is formulated by the ‘no free lunch theorem’, which states that there is no universally best data mining tool. This creates the need to select the appropriate data mining tool for a given application domain. The objective is therefore to study and classify the performance of various existing data mining tools.
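
As an illustration of classification from pre-classified examples, the following is a minimal sketch of KNN, one of the algorithms named above. It is not taken from the article or from any of the compared tools: the function name `knn_predict`, the toy dataset `TRAIN`, and the choice of Euclidean distance with k = 3 are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples."""
    # Order the pre-classified examples by Euclidean distance to the query
    # and keep the k closest ones.
    neighbours = sorted(train, key=lambda example: dist(example[0], query))[:k]
    # Count the class labels of those neighbours and return the most common one.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical pre-classified examples: (feature vector, class label).
TRAIN = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.5), "high"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (1.8, 1.2)))
    print(knn_predict(TRAIN, (6.2, 6.8)))
```

Tree-based algorithms such as ID3 and C4.5 would instead build an explicit decision tree from the same kind of labelled examples; the sketch covers only the nearest-neighbour case.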

 

Cite this Article
Nidhi Sharma, Bansal KL. Comparative Study of Data Mining Tools. Journal of Advanced Database Management & Systems. 2015; 2(2): 35–41p.


Keywords


Data mining, C4.5, KNN, ID3, Rnd Tree, Weka, KNIME, RapidMiner, Tanagra, KEEL, Orange





