A comparative study of various MDAV algorithms
Abstract
Microaggregation is an efficient perturbative Statistical Disclosure Control (SDC) technique for microdata protection. It is a unified approach and naturally satisfies k-anonymity without generalization or suppression of data. Various microaggregation techniques, both fixed-size and data-oriented, exist in the literature for univariate and multivariate data. These methods have been evaluated using the standard measures: Disclosure Risk (DR) and Information Loss (IL). Each time a new microaggregation technique was proposed, a better trade-off between the risk of disclosing data and data utility was achieved. Although an optimal method exists for univariate microaggregation, optimal multivariate microaggregation is an NP-hard problem. Consequently, several heuristics have been proposed, but no single method outperforms the others on all possible criteria. In this paper we study the various microaggregation techniques in order to gain detailed insight into how to design an efficient microaggregation method that satisfies all the criteria.
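To make the idea concrete, the following is a minimal sketch of fixed-size multivariate microaggregation in the style of MDAV (Maximum Distance to Average Vector), not the implementation evaluated in the paper. The function names (`mdav_groups`, `microaggregate`, `information_loss`) and the sample records are illustrative assumptions. Information loss is measured here as SSE/SST, a measure commonly used for microaggregation; whether the paper uses exactly this measure is an assumption.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def mdav_groups(records, k):
    """Partition record indices into groups of size k to 2k-1 (MDAV-style)."""
    if k < 1 or len(records) < k:
        raise ValueError("need 1 <= k <= number of records")
    remaining = list(range(len(records)))
    groups = []

    def take_group(seed):
        # Group the k remaining records nearest to the seed
        # (normally the seed itself plus its k-1 nearest neighbours).
        nearest = sorted(remaining, key=lambda i: dist(records[i], records[seed]))[:k]
        for i in nearest:
            remaining.remove(i)
        groups.append(nearest)

    def farthest_from(point):
        return max(remaining, key=lambda i: dist(records[i], point))

    # While at least 3k records remain, build two groups per pass:
    # one around the record farthest from the centroid (r),
    # one around the record farthest from r (s).
    while len(remaining) >= 3 * k:
        r = farthest_from(centroid([records[i] for i in remaining]))
        s = farthest_from(records[r])
        take_group(r)
        take_group(s)
    # With 2k to 3k-1 records left, build one more group of size k.
    if len(remaining) >= 2 * k:
        take_group(farthest_from(centroid([records[i] for i in remaining])))
    # Whatever is left (k to 2k-1 records) forms the last group.
    if remaining:
        groups.append(list(remaining))
    return groups


def microaggregate(records, k):
    """Replace every record by the centroid of its group."""
    groups = mdav_groups(records, k)
    masked = list(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            masked[i] = c
    return masked, groups


def information_loss(records, masked):
    """SSE / SST: within-group squared error over total squared error."""
    overall = centroid(records)
    sse = sum(dist(x, m) ** 2 for x, m in zip(records, masked))
    sst = sum(dist(x, overall) ** 2 for x in records)
    return sse / sst if sst else 0.0


# Hypothetical two-attribute microdata, chosen only for illustration.
records = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (9.0, 9.0), (9.2, 8.8), (8.9, 9.1), (5.0, 5.0)]
masked, groups = microaggregate(records, k=3)
loss = information_loss(records, masked)
```

Because every masked record is shared by at least k original records, the masked file satisfies k-anonymity on the aggregated attributes; a larger k lowers disclosure risk but generally raises `loss`, which is the trade-off the abstract refers to.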
This work is licensed under a Creative Commons Attribution 4.0 International License.