HANDLING MISSING VALUES IN NUMERIC DATASET USING MACHINE LEARNING TECHNIQUES: A REVIEW

Kamaljeet Kaur; Dr. Amrit Kaur; Dr. Navjot Kaur

Authors

Kamaljeet Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India
Dr. Amrit Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India
Dr. Navjot Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India

Keywords:

Data mining, Data classification, K-Nearest Neighbor, Decision Tree, Support Vector Machine, Naïve Bayes

Abstract

Pre-processing is an essential data mining task for ensuring the quality of the final product. It includes data preparation, cleaning, integration, transformation, reduction, and discretization. Missing values are a common problem that arises during data cleaning in many research fields. Filling in missing values, eliminating noise, and removing inconsistencies are therefore important steps in preparing the data. This paper reviews several classification methods, including their benefits and shortcomings. These methods are used in a variety of industries, including internet marketing, healthcare, social networking, finance, and insurance. The paper compares the accuracy of data imputation for machine learning classifiers such as Bayesian Networks, Decision Trees, K-Nearest Neighbors (KNN), and Support Vector Machines. Based on the findings, the Bayesian classifier appears to provide the most promising results when compared with the other classifiers.
