Introduction
With MindsDB, we can begin predicting in SQL right now. In order to predict the Life Expectancy on the basis of the options, we will fetch our dataset and create a model instantly.
This tutorial will teach you how to train a model to predict Life Expectancy based on many variables, such as Country, Status, lifeexpectancy, Infantdeath, Adultmortality etc. From Kaggle, we will use a Life Expectancy dataset.
Then we will log in to MindsDB Cloud, connect it to our data, train a model based on the dataset in our data, and attempt to predict mobile costs.
Importing Datasets as Tables in our Database
Step 1: After you've downloaded the Dataset, save the CSV file in a folder Make sure you have MySQL installed on your system, and that it is active. Use MySQL Workbench to test the Database connections and make sure everything is working.
Step 2: If the connection is successful, you can click on OK to save the connection. Then click on the connection and it should open the Localhost dashboard along with the query editor.
Step 3: Now click on Create a new schema in the connected server, provide a name for the schema and then hit Apply, and then click on Finish.
Step 4: After creating the database, select Schemas from the bottom left and then right-click on the schema you just created and select Table Data Import Wizard. A dialog box should appear for you to select the CSV file and click Next.
Click on Next to create a new table and give it a name as you like. Click Next again to click on Finish and then click Next again twice to complete the import. Now the table data import is complete and we are ready to manipulate the data in our table.
Step 5: Let us run a sample SELECT query to see whether we have all of our data in the table.
show databases;
use MindsDB_Demo;
select * from lifeexpectancydata;
This should return a list of rows from the lifeexpectancydata table in the MindsDB_Demo database. We are now ready with the database and the table to perform further operations.
Connecting the Database to MindsDB Cloud
In this tutorial we are using Life Expectancy dataset.Download the dataset from Kaggle https://www.kaggle.com/datasets/kumarajarshi/life-expectancy-who and then extract it and save the CSV file for later use.
Now allow us to start with our MindsDB Cloud account to require things additional.
Step 1: Login to your existing MindsDB Cloud account or signup for a brand new one here.
Step 2: After signing up or logging in, you will see the MindsDB Cloud Editor. A question can be written in the top panel, the results can be displayed in the bottom panel, and a number of training hub resources can be found in the right panel to make things easier for new users.
Step 3: Click on Add Data from the highest right and on consequent screen that seems, select Databases from left panel and so click on Mysql database.
Step 4: In this pre-filled query, we need to provide certain parameters to establish a connection to our local database. In order to that, we need to expose localhost:3306 to the internet so that we can establish a tcp/ip connection to interact with it.
Step 5: We have to now use ngrok to tunnel our localhost to the internet. This command should do the job for you.
ngrok tcp 3306
Make sure you signup for a ngrok account and connect with your account using the auth token you received after signing up so that you are not time-limited for your tunnelling. You can do the following using this command.
ngrok config add-authtoken YourAuthTokenHere
Step 6: We can now fill in the parameters in the query above to connect to the database and then hit the Run(Shift + Enter) button to fire the query. This query should execute successfully. Have a look at the snip below to learn more about each of the parameters.
Step 7: Now let's run a simple SELECT query now to make sure that MindsDB can now access the data in the database.
select * from life.lifeexpectancydata;
This should return all the records from the table lifeexpectancydata.
Training a Predictor Model
MindsDB makes it extremely easy to define a Predictor model and train it by using a simple SQL syntax.
Step 1: We will now use the CREATE PREDICTOR syntax to create the Predictor. The syntax will be like this.
CREATE PREDICTOR mindsdb.predictor_name (Your Predictor Name)
FROM database_name (Your Database Name)
(SELECT * FROM table_name LIMIT 10000) (Your Table Name)
PREDICT target_parameter; (Your Target Parameter)
The query get's executed and return successful in the terminal.
Step 2: It will take a little while for the model to be created and trained, but we can check its status using the syntax below. If the query returns Complete, the model is ready for use; otherwise, it will return Training.
SELECT status
FROM mindsdb.predictors
WHERE name='name_of_the_predictor_model';
Describing the Predictor Model
MindsDB provides a DESCRIBE statement that we can use to gain some insights into the Predictor Model. We can find more details about the model in the following three ways.
- By Features
- By Model
- By Model Ensemble
By Features
DESCRIBE mindsdb.predictor_model_name.features;
This statement is used to find out the type of encoders used on each column to train the model and the role of each of the columns for the model. A sample output for our model is posted below.
By Model
DESCRIBE mindsdb.predictor_model_name.model;
MindsDB uses multiple models internally to train the data and then select the most optimized one for the model to do the predictions. This statement simply lists out all the candidate models used to train the data along with other details. The model with 1 in its selected column is the one that is the most optimized and accurate.
By Model Ensemble
DESCRIBE mindsdb.predictor_model_name.ensemble;
With the above statement, we can simply query out a JSON object that lists out the multiple attributes used to select the best candidate model to do the predictions.
Querying the Model
The predictor model lifeexpectancy is now ready to predict the Life expectancy percentage for us. MindsDB provides simple SELECT query statements that we can fire to find the predictions based on the specified features.
Let's try predicting the failure rate based on a single feature. The SELECT statement would be as follows.
SELECT Lifeexpectancy
FROM mindsdb.lifeexpectancy
WHERE Status ='Developing';
This should predict the life expectancy percentage of the people living in Developing countries.
Now let's try predicting the failure rate with more than one parameter. The query should look something like this.
SELECT Lifeexpectancy
FROM mindsdb.lifeexpectancy
WHERE Status ='Developing' AND Country="Australia";
This should predict the life expectancy percentage of the people living in Developing country and specific to Australia.
Conclusion
It's time to wrap up the tutorial. As part of this tutorial, we created a MindsDB Cloud account, uploaded a dataset to the cloud interface, trained a predictor model with the dataset, and predicted the Life Expectancy percentage.
Using MindsDB, you can coach your own predictive models using datasets available on the market online. Don't be afraid to give it a try if you want to make all the predictions you want.
As a final note, please LIKE this page if you learn something new and interesting today and feel free to share your feedback below.
Top comments (0)