Enhancing imputation techniques performance utilizing uncertainty aware predictors and adversarial learning
DOI: https://doi.org/10.21533/pen.v10.i3.669
Abstract
Missing data is a crucial problem when applying machine learning algorithms to real-world datasets. The objective of data imputation is to fill in the missing values of a dataset so that the result resembles the complete dataset as accurately as possible. Many methods have been proposed in the literature, differing mostly in their objective function and in the types of variables they consider. Traditional machine learning methods perform poorly when the relationships between features are nonlinear and complex. Recently, deep learning methods have been introduced to estimate the data distribution and generate values for missing entries. However, these methods were originally developed for large datasets and specialized data types such as image, video, and text. Adapting them to the small, structured datasets that are prevalent in real-world applications is therefore not straightforward and often yields unsatisfactory results. Moreover, neither type of method accounts for uncertainty in the imputed data. We address these issues by developing a simple neural network-based architecture that works well with small tabular datasets and by using a novel adversarial strategy to estimate the uncertainty of imputed data. The estimated uncertainty scores of the features are then passed to the imputer module, which fills the missing values while paying more attention to the more reliable feature values. The result is an uncertainty-aware imputer with promising performance. Extensive experiments on several real-world datasets confirm that the proposed methods considerably outperform state-of-the-art imputers, while their execution time is not costly compared with that of peer state-of-the-art methods.
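The idea of letting an imputer lean on more reliable features can be illustrated with a small sketch. This is not the neural architecture or adversarial training described in the abstract. It swaps in a plain weighted k-nearest-neighbour imputer, and the function name `weighted_knn_impute`, the `uncertainty` vector with scores in [0, 1], and the conversion `reliability = 1 - uncertainty` are all assumptions made for illustration only.

```python
import numpy as np

def weighted_knn_impute(X, uncertainty, k=3):
    """Fill NaNs in X from the k nearest rows, down-weighting uncertain features.

    X           : (n_rows, n_features) array; NaN marks a missing entry.
    uncertainty : (n_features,) scores in [0, 1]. Hypothetical input standing in
                  for the scores an uncertainty estimator would produce.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: reliability is simply the complement of uncertainty.
    reliability = 1.0 - np.asarray(uncertainty, dtype=float)
    observed = ~np.isnan(X)
    filled = X.copy()

    for i in range(X.shape[0]):
        for j in np.where(~observed[i])[0]:
            # Donor rows are those where feature j is observed.
            donors = np.where(observed[:, j])[0]
            if donors.size == 0:
                continue  # nothing to borrow from; leave the entry as NaN
            # Compare row i with each donor only on features both have observed.
            shared = observed[i] & observed[donors]
            diff = np.where(shared, X[donors] - X[i], 0.0)
            weight_sum = (shared * reliability).sum(axis=1)
            # Donors with no usable shared feature get an infinite distance.
            dist = np.full(donors.size, np.inf)
            ok = weight_sum > 0
            dist[ok] = np.sqrt(
                (reliability * diff[ok] ** 2).sum(axis=1) / weight_sum[ok]
            )
            nearest = donors[np.argsort(dist)[:k]]
            filled[i, j] = X[nearest, j].mean()
    return filled
```

Calling `weighted_knn_impute(X, uncertainty, k=2)` returns a copy of `X` with the missing entries filled and leaves the input untouched. Changing which features are marked as uncertain changes which rows count as neighbours, and therefore the imputed values, which is the behaviour the abstract attributes to an uncertainty-aware imputer.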
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.




