
Extracting High Dimensional Data by Using Fast Clustering Based Feature Subset Selection System

C. Koteeswaran, G.V. Ramesh Babu, M. Padmavathamma

Abstract



Feature selection involves identifying a subset of the most useful features that produces results comparable to those of the original, complete set of features. A feature selection system can be evaluated from both the efficiency and the effectiveness points of view. While efficiency concerns the time required to find a subset of features, effectiveness relates to the quality of that subset. Based on these criteria, a fast clustering-based feature selection system, FAST, is proposed and experimentally evaluated in this paper. FAST works in two steps. In the first step, features are divided into clusters using a graph-theoretic clustering method. In the second step, the most representative feature, that is, the one most strongly correlated with the target class, is selected from each cluster to form a subset of features. Because features in different clusters are relatively independent, the clustering-based approach of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt an efficient minimum spanning tree clustering method. The efficiency and effectiveness of the FAST system are evaluated through an experimental study.
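The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which correlation measure or which tree-partitioning rule the system uses, so the sketch assumes absolute Pearson correlation as the similarity measure and a fixed cut threshold on edge weights. The names `pearson`, `minimum_spanning_tree`, `select_features`, and `cut` are hypothetical and chosen only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(vx * vy)


def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense symmetric weight matrix.

    Returns the tree as a list of (weight, i, j) edges.
    """
    n = len(weights)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((weights[parent[u]][u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges


def select_features(columns, target, cut=0.5):
    """Return the indices of one representative feature per cluster.

    Assumption: edge weight is 1 - |correlation|, and tree edges heavier
    than `cut` are removed to split the tree into clusters.
    """
    n = len(columns)
    weights = [[1.0 - abs(pearson(columns[i], columns[j])) for j in range(n)]
               for i in range(n)]

    # Step 1: cluster features by cutting heavy edges of the spanning tree.
    root = list(range(n))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for w, i, j in minimum_spanning_tree(weights):
        if w <= cut:
            root[find(i)] = find(j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # Step 2: keep the feature most correlated with the target in each cluster.
    relevance = [abs(pearson(c, target)) for c in columns]
    return sorted(max(members, key=lambda i: relevance[i])
                  for members in clusters.values())


# Toy data: features 0 and 1 are nearly redundant, feature 2 is unrelated.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 13],
    [1, -1, 1, -1, 1, -1],
]
target = [1, 2, 3, 4, 5, 7]
print(select_features(columns, target))  # [1, 2]
```

On this toy data the two redundant features fall into one cluster and only the one more strongly correlated with the target is kept, while the unrelated feature forms its own cluster. The threshold `cut` is a tuning knob of this sketch only. A different similarity measure or partitioning rule would change which edges are removed.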

 

Keywords: FAST Clustering, Subset Selection Algorithm, Distributed Clustering, Spanning tree, Sensitivity Analysis

Cite this Article

Koteeswaran C, Ramesh Babu GV, Padmavathamma M. Extracting High Dimensional Data by Using Fast Clustering Based Feature Subset Selection System. Journal of Software Engineering Tools & Technology Trends. 2016; 3(3): 1–13p.




