Research Article

Unbalanced Data Set- State-of-the-art and its Research Challenges

by  Deeksha Dhapola, Janmejay Pant
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 178 - Issue 15
Published: May 2019
Authors: Deeksha Dhapola, Janmejay Pant
10.5120/ijca2019918931

Deeksha Dhapola, Janmejay Pant. Unbalanced Data Set- State-of-the-art and its Research Challenges. International Journal of Computer Applications. 178, 15 (May 2019), 62-64. DOI=10.5120/ijca2019918931

@article{ 10.5120/ijca2019918931,
  author    = { Deeksha Dhapola and Janmejay Pant },
  title     = { Unbalanced Data Set- State-of-the-art and its Research Challenges },
  journal   = { International Journal of Computer Applications },
  year      = { 2019 },
  volume    = { 178 },
  number    = { 15 },
  pages     = { 62-64 },
  doi       = { 10.5120/ijca2019918931 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2019
%A Deeksha Dhapola
%A Janmejay Pant
%T Unbalanced Data Set- State-of-the-art and its Research Challenges
%J International Journal of Computer Applications
%V 178
%N 15
%P 62-64
%R 10.5120/ijca2019918931
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Real-world applications often encounter the problem of unbalanced datasets, which creates difficulties for machine learning methods. In this paper we survey the imbalanced dataset problem at the algorithmic level. By artificially oversampling and undersampling the data, some researchers have shown that modified SVMs, cost-sensitive classifiers, and class-orientation methods can perform well on imbalanced datasets. Work on the imbalance problem is also shifting toward hybrid algorithms.
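The resampling and cost-sensitive ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the function names `random_oversample`, `random_undersample`, and `balanced_class_weights` are hypothetical, the resampling is plain random duplication or removal, and the inverse-frequency weighting is one common heuristic assumed here for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so that every class
    is as small as the smallest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def balanced_class_weights(y):
    """Inverse-frequency weights (assumed heuristic): rare classes
    receive a larger misclassification cost."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Toy data: 8 majority samples (label 0) and 2 minority samples (label 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
weights = balanced_class_weights(y)
```

Oversampling keeps all the original data but repeats minority samples, undersampling discards majority samples, and the class weights leave the data untouched and instead change how heavily each class's errors would count in a cost-sensitive learner.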

Index Terms
Computer Science
Information Sciences
Keywords

cost-sensitive learning, imbalanced data set, modified SVM, oversampling, undersampling
