Research Article

Spam Filtering using K mean Clustering with Local Feature Selection Classifier

by Anand Sharma, Vedant Rastogi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 108 - Number 10
Year of Publication: 2014
Authors: Anand Sharma, Vedant Rastogi
10.5120/18951-0096

Anand Sharma, Vedant Rastogi. Spam Filtering using K mean Clustering with Local Feature Selection Classifier. International Journal of Computer Applications. 108, 10 (December 2014), 35-39. DOI=10.5120/18951-0096

@article{ 10.5120/18951-0096,
author = { Anand Sharma and Vedant Rastogi },
title = { Spam Filtering using K mean Clustering with Local Feature Selection Classifier },
journal = { International Journal of Computer Applications },
issue_date = { December 2014 },
volume = { 108 },
number = { 10 },
month = { December },
year = { 2014 },
issn = { 0975-8887 },
pages = { 35-39 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume108/number10/18951-0096/ },
doi = { 10.5120/18951-0096 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:42:40.418733+05:30
%A Anand Sharma
%A Vedant Rastogi
%T Spam Filtering using K mean Clustering with Local Feature Selection Classifier
%J International Journal of Computer Applications
%@ 0975-8887
%V 108
%N 10
%P 35-39
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, we present a comprehensive review of recent developments in the application of machine learning algorithms to spam filtering, focusing on textual approaches. We introduce various spam filtering methods, from Naïve Bayes to hybrid methods, along with the types of filters recently used for spam filtering and the architecture of a spam filter. We then propose a technique that combines local feature selection with the K-means clustering algorithm in the classifier: the document frequency method is used for term selection, the bag-of-words model for feature extraction, and K-means clustering together with local-concentration-based extraction of content for classification. This method gives good results on all the parameters considered.
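
The pipeline named in the abstract (document-frequency term selection, bag-of-words features, K-means clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy messages, the `min_df` threshold, the whitespace tokenizer, and the choice of starting centroids are all assumptions made for illustration, and the local-concentration step is not shown.

```python
from collections import Counter

def tokenize(text):
    """Lowercase a message and split it on whitespace (assumed tokenizer)."""
    return text.lower().split()

def select_terms_by_df(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return sorted(t for t, n in df.items() if n >= min_df)

def bag_of_words(doc, vocab):
    """Represent a message as term counts over the selected vocabulary."""
    counts = Counter(tokenize(doc))
    return [counts[t] for t in vocab]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, init_centroids, iters=20):
    """Plain Lloyd-style K-means from caller-supplied starting centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy corpus: three spam-like and three legitimate messages.
docs = [
    "win free money now",
    "free money offer win prize",
    "claim free prize now",
    "meeting agenda for monday",
    "project meeting notes attached",
    "monday project agenda",
]
vocab = select_terms_by_df(docs, min_df=2)
vectors = [bag_of_words(d, vocab) for d in docs]
# Assumption: seed the two clusters with the first message of each group.
labels, centroids = kmeans(vectors, [vectors[0], vectors[3]])
print(vocab)
print(labels)
```

K-means is unsupervised, so the cluster indices carry no meaning by themselves; in a filter, each cluster would still have to be mapped to "spam" or "legitimate", for example by a majority vote over labeled training messages.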

Index Terms

Computer Science
Information Sciences

Keywords

Spam filtering, K-means