Enhancing imputation techniques performance utilizing uncertainty aware predictors and adversarial learning

Wafaa Mustafa Hameed, Nzar A. Ali

Abstract


Missing data is a crucial problem when applying machine learning algorithms to real-world datasets. The objective of data imputation is to fill in the missing values of a dataset so that the result resembles the complete dataset as accurately as possible. Many methods have been proposed in the literature, differing mostly in their objective function and in the types of variables they consider. Traditional machine learning methods perform poorly when the relationships between features are nonlinear and complex. Recently, deep learning methods have been introduced to estimate the data distribution and generate values for missing entries. However, these methods were originally developed for large datasets and specialized data types such as image, video, and text. Adapting them to the small, structured datasets that are prevalent in real-world applications is therefore not straightforward and often yields unsatisfactory results. Moreover, neither type of method considers the uncertainty of the imputed data. We address these issues by developing a simple neural network-based architecture that works well with small tabular datasets and by using a novel adversarial strategy to estimate the uncertainty of imputed data. The estimated uncertainty scores of the features are then passed to the imputer module, which fills in the missing values while paying more attention to the more reliable feature values. The result is an uncertainty-aware imputer with promising performance. Extensive experiments on several real-world datasets confirm that the proposed methods considerably outperform state-of-the-art imputers, while their execution time is not costly compared with that of peer state-of-the-art methods.
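The idea of filling missing values while trusting reliable feature values more can be illustrated with a small sketch. This is not the authors' architecture: the neural imputer is swapped for a plain weighted nearest-neighbour rule, and the uncertainty scores, which the paper estimates adversarially, are simply given as an input array. The function name `weighted_knn_impute` and all parameter names are hypothetical.

```python
import numpy as np


def weighted_knn_impute(X_filled, observed, uncertainty, k=2):
    """Hypothetical helper: fill missing cells from the k nearest donor rows.

    X_filled    : (n, d) float array; missing cells already hold a rough guess
    observed    : (n, d) bool array; True where the value was actually observed
    uncertainty : (n, d) array in [0, 1]; 0 = fully trusted, 1 = ignored
    """
    reliability = 1.0 - uncertainty
    X_out = X_filled.copy()
    for i in range(X_filled.shape[0]):
        missing_cols = np.flatnonzero(~observed[i])
        if missing_cols.size == 0:
            continue
        # A feature counts in the distance only as much as both of the
        # compared values are reliable.
        w = reliability[i] * reliability
        w[:, missing_cols] = 0.0  # never compare on the cells being imputed
        total = w.sum(axis=1)
        sq_diff = (X_filled - X_filled[i]) ** 2
        dist = np.where(
            total > 0,
            np.sqrt((w * sq_diff).sum(axis=1) / np.maximum(total, 1e-12)),
            np.inf,  # rows sharing no reliable feature are never preferred
        )
        for j in missing_cols:
            donors = np.flatnonzero(observed[:, j])
            if donors.size == 0:
                continue  # no row observed this feature; keep the rough guess
            nearest = donors[np.argsort(dist[donors])[:k]]
            X_out[i, j] = X_filled[nearest, j].mean()
    return X_out


# Toy data (made up for illustration): two clusters, one missing cell.
X = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.2, 0.2],
    [10.0, 10.0, 10.0],
    [10.2, 10.2, 10.2],
    [0.1, 0.1, np.nan],
])
observed = ~np.isnan(X)
X_filled = np.where(observed, X, np.nanmean(X, axis=0))  # column-mean guess
uncertainty = np.where(observed, 0.0, 1.0)  # assumed scores, not estimated
X_imputed = weighted_knn_impute(X_filled, observed, uncertainty, k=2)
```

In this toy setting the missing cell is filled from the two rows of the nearby cluster rather than from the column mean. Intermediate uncertainty values between 0 and 1 would down-weight a feature in the distance instead of removing it entirely.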



DOI: http://dx.doi.org/10.21533/pen.v10i3.3110



Copyright (c) 2022 Wafaa Mustafa Hameed, Nzar A. Ali

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2303-4521

Digital Object Identifier DOI: 10.21533/pen
