Research Article

Target Class Supervised Feature Subsetting

by  P. Nagabhushan, H. N. Meenakshi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 91 - Issue 12
Published: April 2014
Authors: P. Nagabhushan, H. N. Meenakshi
DOI: 10.5120/15932-5157

P. Nagabhushan and H. N. Meenakshi. Target Class Supervised Feature Subsetting. International Journal of Computer Applications 91, 12 (April 2014), 11-23. DOI=10.5120/15932-5157

                        @article{ 10.5120/15932-5157,
                        author  = { P. Nagabhushan and H. N. Meenakshi },
                        title   = { Target Class Supervised Feature Subsetting },
                        journal = { International Journal of Computer Applications },
                        year    = { 2014 },
                        volume  = { 91 },
                        number  = { 12 },
                        pages   = { 11-23 },
                        doi     = { 10.5120/15932-5157 },
                        publisher = { Foundation of Computer Science (FCS), NY, USA }
                        }
                        %0 Journal Article
                        %D 2014
                        %A P. Nagabhushan
                        %A H. N. Meenakshi
                        %T Target Class Supervised Feature Subsetting
                        %J International Journal of Computer Applications
                        %V 91
                        %N 12
                        %P 11-23
                        %R 10.5120/15932-5157
                        %I Foundation of Computer Science (FCS), NY, USA
Abstract

Dimensionality reduction may produce contradicting effects: the advantage of minimizing the number of features is coupled with the disadvantage of information loss, which can lead to incorrect classification or clustering. This becomes a problem when one tries to extract all classes present in a high-dimensional population. In real life, however, one is often not interested in examining all classes present in a high-dimensional space, but instead focuses on one, two, or a few classes at any given instant, depending on the purpose for which the data is analysed. This research work proposes to make dimensionality reduction more effective whenever one is interested specifically in a target class, not only by minimizing the number of features but also by enhancing classification accuracy, particularly with reference to that target class. The objective of this work, therefore, is to realize effective feature subsetting supervised by the specified target class. A multistage algorithm is proposed: in the first stage, least-desired features that do not contribute substantial information for extracting the target class are eliminated; in the second stage, redundant features are identified and removed; and in the final stage, a more optimal set of features is derived from the resultant subset. Suitable computational procedures are devised, and the reduced feature sets at different stages are subjected to validation. Performance is analysed through extensive experiments. The multistage procedure is also tested on the hyperspectral AVIRIS Indian Pines data set.
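The three stages described in the abstract can be illustrated with a minimal sketch. Note the scoring criteria used here (a mean-separation relevance score for the target class and pairwise correlation for redundancy) are illustrative assumptions only; the paper defines its own measures (sum of squared error, inference factor, stability factor), which are not reproduced on this page.

```python
import numpy as np

def multistage_feature_subset(X, y, target_class,
                              relevance_threshold=0.1,
                              redundancy_threshold=0.95):
    """Illustrative three-stage, target-class-supervised feature subsetting.

    X: (n_samples, n_features) array; y: class labels.
    The relevance and redundancy criteria below are placeholders,
    not the paper's actual measures.
    """
    in_target = (y == target_class)

    def relevance(f):
        # Separation of target vs. non-target means, scaled by spread.
        col = X[:, f]
        sep = abs(col[in_target].mean() - col[~in_target].mean())
        return sep / (col.std() + 1e-12)

    # Stage 1: eliminate least-desired features that contribute little
    # information toward extracting the target class.
    kept = [f for f in range(X.shape[1])
            if relevance(f) >= relevance_threshold]

    # Stage 2: remove redundant features, i.e. those highly correlated
    # with a feature already retained.
    selected = []
    for f in kept:
        if all(abs(np.corrcoef(X[:, f], X[:, g])[0, 1]) < redundancy_threshold
               for g in selected):
            selected.append(f)

    # Stage 3: order the survivors by relevance so that a shorter prefix
    # can serve as the final, more optimal subset.
    selected.sort(key=relevance, reverse=True)
    return selected
```

A usage example: with one strongly separating feature, an almost identical (redundant) copy of it, and a noise feature, stage 1 keeps the informative columns, stage 2 drops the near-duplicate, and stage 3 ranks what remains.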

Index Terms
Computer Science
Information Sciences
Keywords

Feature subsetting, Target Class, Sum of Squared Error, Stability Factor, Convergence, Inference Factor, Homogeneity.
