Performance Comparison between Two Interpretations of Missing Data using Matrix-Characterized Approximations

  • Myat Myat Min Faculty of Computer Science, University of Computer Studies, Mandalay, Myanmar
  • Thin Thin Soe Web Mining Lab, University of Computer Studies, Mandalay, Myanmar

Abstract

Data-quality (veracity) problems such as incomplete, inconsistent, vague, or noisy data pose a major challenge to data mining and data analysis. Rough set theory provides a dedicated tool for handling incomplete and imprecise data in information systems. In this paper, rough set based, matrix-represented approximations are presented for computing lower and upper approximations. The induced approximations are used as inputs to the data analysis system LERS (Learning from Examples based on Rough Sets) with the LEM2 (Learning from Examples Module, Version 2) rule induction algorithm. Analyses are performed on datasets whose missing values are interpreted as “do not care” conditions and on datasets whose missing values are interpreted as lost values. In addition, experiments on datasets with different percentages of missing values, using different thresholds, are provided. The experimental results show that the system performs better when missing data are characterized as “do not care” conditions than when they are represented as lost values.
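The difference between the two interpretations can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes the characteristic-relation definitions commonly used in the rough set literature (a lost value "?" matches no value, a "do not care" value "*" matches every value), uses singleton lower and upper approximations, and works on a hypothetical five-case table. The function names and the toy data are illustrative only.

```python
LOST, DONT_CARE = "?", "*"

def relation_matrix(rows):
    """Boolean matrix M where M[i][j] = 1 if case j is in the
    characteristic set of case i (assumed definition, see lead-in)."""
    n = len(rows)
    m = [[1] * n for _ in range(n)]
    for i, x in enumerate(rows):
        for j, y in enumerate(rows):
            for xv, yv in zip(x, y):
                if xv in (LOST, DONT_CARE):
                    continue  # a missing value in case i imposes no constraint
                if yv == DONT_CARE:
                    continue  # "do not care" matches any specified value
                if yv != xv:  # also covers yv == LOST: lost matches nothing
                    m[i][j] = 0
                    break
    return m

def approximations(m, concept):
    """Singleton lower and upper approximations of `concept`
    (a set of case indices) computed from the relation matrix."""
    n = len(m)
    g = [1 if j in concept else 0 for j in range(n)]  # concept as 0/1 vector
    hits = [sum(m[i][j] * g[j] for j in range(n)) for i in range(n)]
    size = [sum(row) for row in m]
    lower = {i for i in range(n) if hits[i] == size[i]}  # K(i) inside concept
    upper = {i for i in range(n) if hits[i] > 0}         # K(i) meets concept
    return lower, upper

def demo(missing):
    """Hypothetical table (temperature, headache); cases 0 and 1 form the concept."""
    rows = [
        ("high", "yes"),
        (missing, "yes"),
        ("high", "no"),
        ("normal", missing),
        ("normal", "no"),
    ]
    return approximations(relation_matrix(rows), {0, 1})

if __name__ == "__main__":
    print("lost values:  ", demo(LOST))
    print("do not care:  ", demo(DONT_CARE))
```

On this toy table the lost-value interpretation yields identical lower and upper approximations, {0, 1}, whereas the "do not care" interpretation yields a smaller lower approximation, {0}, and a larger upper approximation, {0, 1, 3}. This only shows how the two interpretations change the approximations passed to rule induction; it says nothing about the accuracy results reported in the paper.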

Published
2019-03-15
How to Cite
MIN, Myat Myat; SOE, Thin Thin. Performance Comparison between Two Interpretations of Missing Data using Matrix-Characterized Approximations. International Journal of Research and Engineering, [S.l.], v. 6, n. 2, p. 589-595, mar. 2019. ISSN 2348-7860. Available at: <https://digital.ijre.org/index.php/int_j_res_eng/article/view/375>. Date accessed: 22 aug. 2019. doi: https://doi.org/10.21276/ijre.2019.6.2.3.