HANDLING MISSING VALUES IN NUMERIC DATASET USING MACHINE LEARNING TECHNIQUES: A REVIEW

Authors

  • Kamaljeet Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India
  • Dr. Amrit Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India
  • Dr. Navjot Kaur, Department of Computer Science and Engineering, Punjabi University Patiala, India

Keywords:

Data mining, Data classification, K-Nearest Neighbor, Decision Tree, Support Vector Machine, Naïve Bayes

Abstract

Pre-processing is an essential task in data mining for ensuring the quality of the final result. Pre-processing tasks include data preparation, cleaning, integration, transformation, reduction, and discretization. Missing values are a common problem that arises during data cleaning in many research fields. Filling in missing values, eliminating noise, and removing inconsistencies are therefore important steps in preparing the data. This paper reviews several classification methods, including their benefits and shortcomings. Such methods are used in a variety of industries, including internet marketing, healthcare, social networking, finance, and insurance. The paper compares the accuracy of data imputation for machine learning classifiers such as Bayesian Networks, Decision Trees, K-Nearest Neighbors (KNN), and Support Vector Machines. Based on the findings, the Bayesian classifier appears to provide the most promising results when compared to the other classifiers.
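
To make the idea of classifier-based imputation concrete, the following is a minimal sketch of KNN imputation for a numeric dataset. It is an illustration only and not the authors' implementation: the function name knn_impute, the sample data, and the choice of k = 2 are all assumptions. Each missing entry is replaced by the mean of that column over the k nearest complete rows, with distance measured on the columns that are observed in the incomplete row.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean of the k nearest complete rows.

    Hypothetical helper for illustration; assumes purely numeric columns.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to impute from")

    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue

        # Columns that are present in this incomplete row.
        observed = [i for i, v in enumerate(row) if v is not None]

        def dist(other):
            # Euclidean distance restricted to the observed columns.
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in observed))

        neighbours = sorted(complete, key=dist)[:k]
        new_row = [
            v if v is not None
            else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        filled.append(new_row)
    return filled


# Assumed toy data: the last row is missing its second value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 8.0, 7.0],
    [1.05, None, 3.1],
]
result = knn_impute(data, k=2)
print(result[3])
```

In this toy data the two rows closest to the incomplete row are the first two, so the missing value is filled with the average of 2.0 and 2.1. In practice the columns would usually be scaled before computing distances, since otherwise features with large ranges dominate the neighbour search.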

Published

2023-12-30

How to Cite

HANDLING MISSING VALUES IN NUMERIC DATASET USING MACHINE LEARNING TECHNIQUES: A REVIEW. (2023). JOURNAL PUNJAB ACADEMY OF SCIENCES, 23, 217-227. https://jpas.in/index.php/home/article/view/69
