Ahmed Hashesh

Posted on • Originally published at towardsdatascience.com

Data Preprocessing And Visualization In C++

Data preprocessing is the process of converting raw data into a format a computer can work with; it is the first step in any machine learning operation. Data collection is usually loosely controlled and may result in out-of-range values. Data preparation and filtering steps can take a considerable amount of processing time.
Data preprocessing includes:

  • Reading Data from files.
  • Data cleaning.
  • Instance selection.
  • Data standardization.
  • Data transformation.
  • Feature extraction and selection.

The product of data preprocessing is the final training set. In this article, I will address some of the data preprocessing steps in C++, as well as data visualization using the Matplotlib-Cpp library.

This article is part of a series discussing the implementation of machine learning basics using C++. Please follow

In this article, I will use the iris dataset as the example on which each operation is performed. Note that I will be using C++11 in this tutorial.

Reading Data from Files:

After downloading the iris.data file from here, let’s read the data with simple file-reading instructions and parse each type of data into a separate vector.
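As a rough illustration of that reading step, here is a minimal C++11 sketch. It is not the original article's code: the `IrisData` struct and the `parse_iris_line`, `read_iris` and `read_iris_file` functions are hypothetical names introduced here. It assumes each line of iris.data holds four comma-separated numeric values followed by a species label, and it silently skips blank or malformed lines.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container (not from the article): one vector per column.
struct IrisData {
    std::vector<float> sepal_length;
    std::vector<float> sepal_width;
    std::vector<float> petal_length;
    std::vector<float> petal_width;
    std::vector<std::string> species;
};

// Parse one line such as "5.1,3.5,1.4,0.2,Iris-setosa" and append it to `out`.
// Returns false (and leaves `out` untouched) for blank or malformed lines.
bool parse_iris_line(std::string line, IrisData& out) {
    // Tolerate Windows line endings.
    if (!line.empty() && line.back() == '\r') line.pop_back();

    std::stringstream ss(line);
    std::string field;
    std::vector<std::string> fields;
    while (std::getline(ss, field, ',')) fields.push_back(field);
    if (fields.size() != 5) return false;

    float values[4];
    try {
        for (int i = 0; i < 4; ++i) values[i] = std::stof(fields[i]);
    } catch (const std::exception&) {
        return false;  // a numeric field could not be converted
    }

    out.sepal_length.push_back(values[0]);
    out.sepal_width.push_back(values[1]);
    out.petal_length.push_back(values[2]);
    out.petal_width.push_back(values[3]);
    out.species.push_back(fields[4]);
    return true;
}

// Read every line from an already-open stream.
IrisData read_iris(std::istream& in) {
    IrisData data;
    std::string line;
    while (std::getline(in, line)) parse_iris_line(line, data);
    return data;
}

// Open the file at `path` and read it; throws if the file cannot be opened.
IrisData read_iris_file(const std::string& path) {
    std::ifstream file(path);
    if (!file) throw std::runtime_error("cannot open " + path);
    return read_iris(file);
}
```

Calling `read_iris_file("iris.data")` would then return the four numeric columns and the species labels as separate vectors. Separating the stream-based `read_iris` from the file-opening wrapper is a design choice made for this sketch so the parser can be exercised on an in-memory string.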

Link to the original article
