Research Article

Privacy and Utility Preserving Task Independent Data Mining

by  E. Poovammal, M. Ponnavaikko
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Issue 15
Published: February 2010
10.5120/313-480

E. Poovammal, M. Ponnavaikko. Privacy and Utility Preserving Task Independent Data Mining. International Journal of Computer Applications. 1, 15 (February 2010), 104-111. DOI=10.5120/313-480

Abstract

In today’s world of universal data exchange, there is a need to manage the risk of unintended information disclosure. Publishing data about individuals without revealing sensitive information about them is an important problem. K-anonymization is a popular approach to data publishing. The limitations of K-anonymity were addressed by methods such as L-diversity, T-closeness, and (alpha, K)-anonymity, but all of these methods take a universal approach that applies the same amount of privacy preservation to every person against linking attacks, which results in high information loss. Nor do they fully guarantee privacy, because they remain open to proximity and divergence attacks. Our approach is to design a micro data sanitization technique that preserves privacy against proximity and divergence attacks and also preserves the utility of the data for any type of mining task. The proposed approach applies a graded grouping transformation to numerical sensitive attributes and a mapping-table-based transformation to categorical sensitive attributes. We conduct experiments on the Adult data set and compare the results on the original and transformed tables to show that the proposed task-independent technique preserves privacy, information, and utility.
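The two transformations named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's grading scheme or mapping table, so everything below is an assumption made for illustration: the grade boundaries, the grade labels, the mapping table, and the function names `grade_numeric`, `map_categorical`, and `sanitize` are all hypothetical and should not be read as the authors' method.

```python
from bisect import bisect_right

# Hypothetical grade boundaries and labels (assumption: not taken from the paper).
GRADE_BOUNDS = [20000, 50000, 100000]
GRADE_LABELS = ["low", "medium", "high", "very high"]

# Hypothetical mapping table for a categorical sensitive attribute (assumption).
CATEGORY_MAP = {
    "flu": "respiratory",
    "bronchitis": "respiratory",
    "gastritis": "digestive",
    "ulcer": "digestive",
}


def grade_numeric(value, bounds=GRADE_BOUNDS, labels=GRADE_LABELS):
    """Replace a numeric sensitive value with the label of the grade it falls in."""
    # bisect_right counts how many boundaries are <= value,
    # which is the index of the grade containing the value.
    return labels[bisect_right(bounds, value)]


def map_categorical(value, table=CATEGORY_MAP, default="other"):
    """Replace a categorical sensitive value using a lookup table."""
    return table.get(value, default)


def sanitize(records, numeric_key, categorical_key):
    """Return transformed copies of the records; the input is left unchanged."""
    sanitized = []
    for record in records:
        copy = dict(record)
        copy[numeric_key] = grade_numeric(record[numeric_key])
        copy[categorical_key] = map_categorical(record[categorical_key])
        sanitized.append(copy)
    return sanitized


if __name__ == "__main__":
    rows = [
        {"age": 34, "salary": 15000, "disease": "flu"},
        {"age": 51, "salary": 120000, "disease": "ulcer"},
    ]
    for row in sanitize(rows, "salary", "disease"):
        print(row)
```

In this sketch only the sensitive attributes are transformed, and the other attributes are copied unchanged, so the published table keeps the same rows and columns as the original.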

Index Terms
Computer Science
Information Sciences
Keywords

Anonymization, Data Publishing, Data utility, Privacy management, micro data sanitization
