International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 166 - Issue 11
Published: May 2017
Authors: Dixa Saxena, S. K. Saritha, K. N. S. S. V. Prasad
Dixa Saxena, S. K. Saritha, K. N. S. S. V. Prasad. Survey Paper on Feature Extraction Methods in Text Categorization. International Journal of Computer Applications. 166, 11 (May 2017), 11-17. DOI=10.5120/ijca2017914145
@article{10.5120/ijca2017914145,
  author = {Dixa Saxena and S. K. Saritha and K. N. S. S. V. Prasad},
  title = {Survey Paper on Feature Extraction Methods in Text Categorization},
  journal = {International Journal of Computer Applications},
  year = {2017},
  volume = {166},
  number = {11},
  pages = {11-17},
  doi = {10.5120/ijca2017914145},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}
%0 Journal Article
%D 2017
%A Dixa Saxena
%A S. K. Saritha
%A K. N. S. S. V. Prasad
%T Survey Paper on Feature Extraction Methods in Text Categorization
%J International Journal of Computer Applications
%V 166
%N 11
%P 11-17
%R 10.5120/ijca2017914145
%I Foundation of Computer Science (FCS), NY, USA
As the world moves towards globalization, the volume of digitized text has grown rapidly, and organizing, categorizing and classifying that text has become a necessity. Poorly organized or sparsely categorized text can slow the response time of information retrieval. A further problem is the ‘curse of dimensionality’ (as termed by Bellman) [1], namely the inherent sparsity of high-dimensional spaces, which makes it difficult to search such a space for the possible presence of some unspecified structure. Addressing this is the task of feature reduction methods: they extract the most relevant information from the original data and represent it in a lower-dimensional space. This paper discusses the feature extraction methods that have been applied to text categorization, from the traditional bag-of-words model to less conventional neural network approaches.
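To make the sparsity problem and the idea of feature reduction concrete, the following is a minimal sketch rather than a method taken from the paper. It builds a bag-of-words document-term matrix for a toy corpus, measures how many entries are zero, and then shrinks the feature space with a simple document-frequency threshold. The corpus, the whitespace tokenizer, the function names and the `min_df` parameter are all assumptions chosen for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term j in document i."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: j for j, t in enumerate(vocab)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


def sparsity(matrix):
    """Fraction of zero entries in the document-term matrix."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0)
    return zeros / total if total else 0.0


def reduce_by_document_frequency(vocab, matrix, min_df=2):
    """Keep only terms that occur in at least min_df documents."""
    keep = [
        j for j in range(len(vocab))
        if sum(1 for row in matrix if row[j] > 0) >= min_df
    ]
    reduced_vocab = [vocab[j] for j in keep]
    reduced_matrix = [[row[j] for j in keep] for row in matrix]
    return reduced_vocab, reduced_matrix


# Hypothetical toy corpus: two finance-like and two sports-like documents.
docs = [
    "stocks fall as markets react",
    "markets rally as stocks recover",
    "team wins the final match",
    "the team loses the match",
]

vocab, matrix = bag_of_words(docs)
small_vocab, small_matrix = reduce_by_document_frequency(vocab, matrix, min_df=2)

print(len(vocab), "features, sparsity", round(sparsity(matrix), 2))
print(len(small_vocab), "features after reduction:", small_vocab)
```

Even on four short documents most matrix entries are zero, and the vocabulary grows with every new document. Document-frequency thresholding is only one of the simplest reduction strategies; it is used here because it needs no external library, not because the paper singles it out.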