
An Improved K-means Clustering Algorithm for Classification of Odor/Gas Sensor Data Using Normalized Cosine Distance Parameter

Shivam Choudhary, Ravi Kumar

Abstract


This paper presents a novel K-means clustering approach for classifying odors/gases (E-nose) that uses cosine distance as the distance parameter. A sensor array of five sensors is exposed to four different types of gases to collect data. The problem of assigning the data to their respective classes is treated as a K-means clustering task. Euclidean distance is usually used as the distance parameter to quantify the similarity between data belonging to the same class. This paper instead uses normalized cosine distance, because Euclidean distance suffers from misclassification when the classes are widely spread in the pattern space. Cosine distance measures the angle between data vectors. Employing the cosine distance measure resulted in better or similar classification performance compared with Euclidean distance over a ten-fold cross-validation scheme.
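The idea in the abstract can be illustrated with a minimal sketch of K-means that assigns points by cosine distance instead of Euclidean distance. This is not the authors' implementation: the abstract does not define "normalized cosine distance", so the sketch assumes the common form 1 - cos(angle) computed on unit-length vectors. The function names (normalize_rows, cosine_distance, kmeans_cosine), the random initialization, and the parameters n_iter and seed are all hypothetical choices made for illustration.

```python
import numpy as np


def normalize_rows(X):
    """Scale each row to unit L2 norm; all-zero rows are left unchanged."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)


def cosine_distance(X, C):
    """Pairwise cosine distance (1 - cosine similarity) between rows of X and C."""
    return 1.0 - normalize_rows(X) @ normalize_rows(C).T


def kmeans_cosine(X, k, n_iter=100, seed=0):
    """K-means with cosine-distance assignment (illustrative sketch only).

    Assumption: centroids are updated as the mean of the unit-normalized
    members of each cluster; the paper may use a different update rule.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Initialize centroids from k distinct, randomly chosen samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by angle, not by magnitude.
        new_labels = np.argmin(cosine_distance(X, centroids), axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: average direction of each non-empty cluster.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = normalize_rows(members).mean(axis=0)
    return labels, centroids
```

Because the assignment depends only on the angle between vectors, two responses that point in the same direction but differ in magnitude (for example, the same gas at different concentrations, under the assumption that concentration mainly scales the response) fall into the same cluster, which is the situation where Euclidean distance tends to split a widely spread class.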

 

Cite this Article
Shivam Choudhary, Ravi Kumar.
An Improved K-means Clustering Algorithm for Classification of Odor/Gas Sensor Data Using Normalized Cosine Distance Parameter. Journal of Image Processing & Pattern Recognition Progress. 2015; 2(2): 56–60p.


Keywords


E-nose, K-means clustering, Euclidean distance, cosine distance



