Research Article

A Privacy Measure for Data Disclosure to Publish Micro Data using (N,T) - Closeness

by  A. Sunitha, K. Venkata Subba Reddy, B. Vijayakumar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 51 - Issue 6
Published: August 2012
DOI: 10.5120/8047-1379

A. Sunitha, K. Venkata Subba Reddy, and B. Vijayakumar. A Privacy Measure for Data Disclosure to Publish Micro Data using (N,T) - Closeness. International Journal of Computer Applications 51, 6 (August 2012), 22-28. DOI=10.5120/8047-1379

@article{ 10.5120/8047-1379,
  author    = { A. Sunitha and K. Venkata Subba Reddy and B. Vijayakumar },
  title     = { A Privacy Measure for Data Disclosure to Publish Micro Data using (N,T) - Closeness },
  journal   = { International Journal of Computer Applications },
  year      = { 2012 },
  volume    = { 51 },
  number    = { 6 },
  pages     = { 22-28 },
  doi       = { 10.5120/8047-1379 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2012
%A A. Sunitha
%A K. Venkata Subba Reddy
%A B. Vijayakumar
%T A Privacy Measure for Data Disclosure to Publish Micro Data using (N,T) - Closeness
%J International Journal of Computer Applications
%V 51
%N 6
%P 22-28
%R 10.5120/8047-1379
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Closeness is described as a privacy measure, and its advantages are illustrated through examples and experiments on a real dataset. In this paper, closeness is verified by supplying different values for n and t. Government agencies and other organizations often need to publish microdata, e.g., medical data or census data, for research and other purposes. Typically, such data are stored in a table, and each record (row) corresponds to one individual. A common anonymization approach for publishing microdata is generalization, which replaces quasi-identifier values with values that are less specific but semantically consistent. As a result, more records come to share the same set of quasi-identifier values; an equivalence class of an anonymized table is defined as a set of records that have identical values for the quasi-identifiers.

To effectively limit disclosure, the disclosure risk of an anonymized table must be measured. To this end, k-anonymity is introduced as the property that each record is indistinguishable from at least k-1 other records with respect to the quasi-identifiers; that is, each equivalence class must contain at least k records. While k-anonymity protects against identity disclosure, it is insufficient to prevent attribute disclosure. To address this limitation, a new notion of privacy called l-diversity is introduced, which requires that the distribution of a sensitive attribute in each equivalence class have at least l "well-represented" values. One problem with l-diversity is that it is limited in its assumption of adversarial knowledge, an assumption that generalizes the specific background-knowledge and homogeneity attacks used to motivate l-diversity. In short, the k-anonymity requirement for publishing microdata is that each equivalence class contain at least k records, but k-anonymity cannot prevent attribute disclosure.
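The k-anonymity requirement above can be checked mechanically: group records by their quasi-identifier values and verify that every equivalence class contains at least k members. A minimal sketch, assuming a table of Python dictionaries (the column names and generalized values below are illustrative, not taken from the paper):

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every equivalence class (records sharing the same
    quasi-identifier values) contains at least k records."""
    classes = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(size >= k for size in classes.values())

# Toy generalized table: ZIP code and age are quasi-identifiers,
# disease is the sensitive attribute.
table = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "cancer"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
]
print(is_k_anonymous(table, ["zip", "age"], 2))  # True: both classes have 2 records
print(is_k_anonymous(table, ["zip", "age"], 3))  # False
```

This toy table is 2-anonymous but still vulnerable to attribute disclosure: the second equivalence class is uniform in its sensitive value, which is exactly the gap l-diversity and closeness address.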
The notion of l-diversity has been proposed to address this: l-diversity requires that each equivalence class have at least l well-represented values for each sensitive attribute. However, l-diversity has a number of limitations; in particular, it is neither necessary nor sufficient to prevent attribute disclosure. Due to these limitations, a new notion of privacy called "closeness" is proposed. First, the base model, t-closeness, is presented, which requires that the distribution of a sensitive attribute in any equivalence class be close to the distribution of the attribute in the overall table. Then a more flexible privacy model, called (n, t)-closeness, is proposed. The rationale for using
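For a categorical sensitive attribute with equal ground distances between values, the Earth Mover's Distance that t-closeness uses to compare distributions reduces to the variational distance, 0.5 × Σ|p_i − q_i|. A hedged sketch of the check under that simplification (the toy classes and thresholds are illustrative; the full model in the paper also covers hierarchical and numerical ground distances):

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}

def variational_distance(p, q):
    """EMD between two categorical distributions when all ground
    distances are equal: 0.5 * sum(|p_i - q_i|)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_t_closeness(equivalence_classes, sensitive, t):
    """Check that each equivalence class's sensitive-attribute
    distribution is within distance t of the overall distribution."""
    all_records = [r for ec in equivalence_classes for r in ec]
    overall = distribution([r[sensitive] for r in all_records])
    return all(
        variational_distance(distribution([r[sensitive] for r in ec]), overall) <= t
        for ec in equivalence_classes
    )

# Two equivalence classes; the overall disease distribution is 75% flu, 25% cancer.
classes = [
    [{"disease": "flu"}, {"disease": "flu"}],     # local: 100% flu   -> distance 0.25
    [{"disease": "flu"}, {"disease": "cancer"}],  # local: 50% / 50%  -> distance 0.25
]
print(satisfies_t_closeness(classes, "disease", 0.3))  # True
print(satisfies_t_closeness(classes, "disease", 0.2))  # False
```

The (n, t)-closeness refinement relaxes this by comparing each class against the distribution of a large enough (size at least n) natural superset of the class rather than the whole table.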

References
  • N. Li, T. Li, and S. Venkatasubramanian, "Closeness: A New Privacy Measure for Data Publishing," Proc. Int'l Conf. Data Eng. (ICDE), pp. 943-956, 2010.
  • D. Lambert, "Measures of Disclosure Risk and Harm," J. Official Statistics, vol. 9, pp. 313-331, 1993.
  • L. Sweeney, "k-Anonymity: A Model for Protecting Privacy," Int'l J. Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, pp. 557-570, 2002.
  • A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, "l-Diversity: Privacy Beyond k-Anonymity," Proc. Int'l Conf. Data Eng. (ICDE), p. 24, 2006.
  • N. Li, T. Li, and S. Venkatasubramanian, "t-Closeness: Privacy Beyond k-Anonymity and l-Diversity," Proc. Int'l Conf. Data Eng. (ICDE), pp. 106-115, 2007.
  • K. LeFevre, D. DeWitt, and R. Ramakrishnan, "Mondrian Multidimensional k-Anonymity," Proc. Int'l Conf. Data Eng. (ICDE), p. 25, 2006.
  • T. Li and N. Li, "Towards Optimal k-Anonymization," Data and Knowledge Eng., vol. 65, pp. 22-39, 2008.
  • R. J. Bayardo and R. Agrawal, "Data Privacy through Optimal k-Anonymization," Proc. Int'l Conf. Data Eng. (ICDE), pp. 217-228, 2005.
  • B. C. M. Fung, K. Wang, and P. S. Yu, "Top-Down Specialization for Information and Privacy Preservation," Proc. Int'l Conf. Data Eng. (ICDE), pp. 205-216, 2005.
  • T. Li, N. Li, and J. Zhang, "Modeling and Integrating Background Knowledge in Data Anonymization," Proc. Int'l Conf. Data Eng. (ICDE), 2009.
  • M. E. Nergiz, M. Atzori, and C. Clifton, "Hiding the Presence of Individuals from Shared Databases," Proc. ACM SIGMOD, pp. 665-676, 2007.
  • H. Park and K. Shim, "Approximate Algorithms for K-Anonymity," Proc. ACM SIGMOD, pp. 67-78, 2007.
  • V. Rastogi, S. Hong, and D. Suciu, "The Boundary between Privacy and Utility in Data Publishing," Proc. Int'l Conf. Very Large Data Bases (VLDB), pp. 531-542, 2007.
  • R. C.-W. Wong, J. Li, A. W.-C. Fu, and K. Wang, "(α, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing," Proc. ACM SIGKDD, pp. 754-759, 2006.
  • X. Xiao and Y. Tao, "Anatomy: Simple and Effective Privacy Preservation," Proc. Int'l Conf. Very Large Data Bases (VLDB), pp. 139-150, 2006.
  • X. Xiao and Y. Tao, "Personalized Privacy Preservation," Proc. ACM SIGMOD, pp. 229-240, 2006.
  • X. Xiao and Y. Tao, "m-Invariance: Towards Privacy Preserving Republication of Dynamic Datasets," Proc. ACM SIGMOD, pp. 689-700, 2007.
  • V. S. Iyengar, "Transforming Data to Satisfy Privacy Constraints," Proc. ACM SIGKDD, pp. 279-288, 2002.
  • D. Kifer and J. Gehrke, "Injecting Utility into Anonymized Datasets," Proc. ACM SIGMOD, pp. 217-228, 2006.
  • N. Koudas, D. Srivastava, T. Yu, and Q. Zhang, "Aggregate Query Answering on Anonymized Tables," Proc. Int'l Conf. Data Eng. (ICDE), pp. 116-125, 2007.
Index Terms
Computer Science
Information Sciences
Keywords

Privacy measure, k-anonymity, l-diversity, data anonymization, (n,t)-closeness
