Introduction
In this tutorial, we will use MindsDB to predict the CO2 emissions associated with food production. MindsDB is an open-source tool that brings automated machine learning, and with it predictive capabilities, to your database.
To complete this tutorial, you need a working MindsDB connection, either locally or via cloud.mindsdb.com. You can use this guide to connect to MindsDB Cloud.
Data Setup
Connecting the data as a file
Follow the steps below to upload a file to MindsDB Cloud.
- Log in to your MindsDB Cloud account to open the MindsDB Editor.
- Navigate to the Add data section by clicking the Add data button located in the top right corner.
- Choose the File tab.
- Choose the Import File option.
- Upload the file Food_Production.csv, name the table used to store the file data (here it is Food_Production), and click the Save and Continue button.
- Once you are done uploading, you can query the data directly with the following statement:
SELECT * FROM files.Food_Production LIMIT 10;
Understanding the Dataset
Context
As the world’s population has expanded and gotten richer, the demand for food, energy, and water has seen a rapid increase. Not only has demand for all three increased, but they are also strongly interlinked: food production requires water and energy; traditional energy production demands water resources; agriculture provides a potential energy source. This article focuses on the environmental impacts of food. Ensuring everyone in the world has access to a nutritious diet in a sustainable way is one of the greatest challenges we face.
Content
This dataset contains the 43 most common foods grown across the globe and 23 columns describing their respective land use, water use, and carbon footprints.
Columns
- Land use change - Kg CO2 - equivalents per kg product
- Animal Feed - Kg CO2 - equivalents per kg product
- Farm - Kg CO2 - equivalents per kg product
- Processing - Kg CO2 - equivalents per kg product
- Transport - Kg CO2 - equivalents per kg product
- Packaging - Kg CO2 - equivalents per kg product
- Retail - Kg CO2 - equivalents per kg product
These columns represent greenhouse gas emissions per kg of food product (Kg CO2 - equivalents per kg product) at different stages in the lifecycle of food production.
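To make the per-stage columns more concrete, here is a small Python sketch that adds up the stage values for one product. It assumes that the Total_emissions column predicted later in this tutorial is the sum of the seven stage columns, which the text does not state explicitly. The numbers are the Rice feature values from the single-prediction query in this tutorial, not rows read from the CSV, and the function names are illustrative.

```python
# Illustrative only: per-stage emissions in kg CO2-equivalents per kg product.
# Values are the Rice inputs used in this tutorial's example prediction query.
stage_emissions = {
    "Land use change": 0.0,
    "Animal Feed": 0.0,
    "Farm": 1.4,
    "Processing": 0.1,
    "Transport": 0.1,
    "Packaging": 0.1,
    "Retail": 0.3,
}


def total_emissions(stages):
    """Sum emissions across all lifecycle stages (assumed definition of the total)."""
    return round(sum(stages.values()), 2)


def largest_stage(stages):
    """Return the name of the stage that contributes the most emissions."""
    return max(stages, key=stages.get)


print(total_emissions(stage_emissions))
print(largest_stage(stage_emissions))
```

If the assumption holds for your copy of the dataset, the same check can be run on a few uploaded rows to confirm that the stage columns and the total agree before training.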
Eutrophication – the pollution of water bodies and ecosystems with excess nutrients – is a major environmental problem. The runoff of nitrogen and other nutrients from agricultural production systems is a leading contributor.
Acknowledgments and Credits
Credit: Vivek
Original Source: *Environment Impact of Food Production*
Creating the Predictor
To begin, let’s create a predictor that uses the dataset columns to predict CO2 emissions. You can learn more about creating a predictor in the MindsDB documentation. The general syntax for creating a predictor is:
CREATE PREDICTOR mindsdb.[predictor_name]
FROM [integration_name]
(SELECT [sequential_column], [partition_column], [other_column], [target_column]
FROM [table_name])
PREDICT [target_column]
- CREATE PREDICTOR: Creates a predictor with the name predictor_name in the mindsdb database.
- FROM files: Points to the integration (here, files) that contains the table with the data.
- PREDICT: Specifies the column to predict.
CREATE PREDICTOR mindsdb.CO2_emission
FROM files (SELECT * FROM files.Food_Production)
PREDICT Total_emissions;
On execution we get:
Query successfully completed
Status of a Predictor
Training a predictor may take a couple of minutes. You can monitor the status of the predictor with this SQL command:
SELECT status
FROM mindsdb.predictors
WHERE name='CO2_emission';
Your output should be:
+------------+
| status |
+------------+
| complete |
+------------+
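If you run this check from a script rather than from the MindsDB Editor, the status query can be repeated until training finishes. The following is a minimal Python sketch of that idea, not an official MindsDB client: `run_query` is a hypothetical callable that you supply (it takes a SQL string and returns a list of row tuples), and treating a status of `error` as a failure is an assumption, since this tutorial only shows the `complete` status.

```python
import time


def wait_for_predictor(run_query, name, poll_seconds=10.0, timeout_seconds=600.0):
    """Poll mindsdb.predictors until the named predictor reports 'complete'.

    run_query is a hypothetical callable: it takes a SQL string and returns
    a list of rows (tuples). Plug in whatever client you use to reach MindsDB.
    """
    # Only allow simple identifiers, since the name is placed into the SQL text.
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unexpected predictor name: {name!r}")

    sql = f"SELECT status FROM mindsdb.predictors WHERE name='{name}';"
    deadline = time.monotonic() + timeout_seconds

    while True:
        rows = run_query(sql)
        status = rows[0][0] if rows else None
        if status == "complete":
            return status
        if status == "error":  # assumed failure status; not shown in this tutorial
            raise RuntimeError(f"training of {name} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{name} still {status!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

A call such as `wait_for_predictor(my_query_function, "CO2_emission")` would then return once the status shown above is reached, where `my_query_function` stands for your own database access code.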
Making Predictions
Now that the model is trained, we can use simple SQL queries to predict the target value from the feature values:
Making a Single Prediction
You can make predictions by querying the predictor as if it were a table. The SELECT statement below asks the CO2_emission model to predict Total_emissions for the chosen feature values.
SELECT Total_emissions, Total_emissions_explain
FROM mindsdb.CO2_emission
WHERE Food_Product = 'Rice'
AND Land_use_change = 0
AND Animal_Feed = 0
AND Farm = 1.4
AND Processing = 0.1
AND Transport = 0.1
AND Packaging = 0.1
AND Retail = 0.3;
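When the feature values come from application code, the same kind of query can be assembled programmatically. The Python sketch below is one possible way to do that under simple assumptions: `build_prediction_query` is a hypothetical helper, not part of MindsDB, and it only produces the SQL text; sending it to MindsDB is left to whatever client you use. The example passes just a subset of the features from the query above.

```python
def build_prediction_query(model, target, features):
    """Build a SELECT that asks a MindsDB model for one prediction."""

    def literal(value):
        # Numbers are emitted as-is; anything else is single-quoted,
        # with embedded single quotes doubled.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return repr(value)
        return "'" + str(value).replace("'", "''") + "'"

    conditions = "\nAND ".join(
        f"{column} = {literal(value)}" for column, value in features.items()
    )
    return (
        f"SELECT {target}, {target}_explain\n"
        f"FROM mindsdb.{model}\n"
        f"WHERE {conditions};"
    )


query = build_prediction_query(
    "CO2_emission",
    "Total_emissions",
    {"Food_Product": "Rice", "Farm": 1.4, "Processing": 0.1},
)
print(query)
```

The helper quotes string values but does not validate column or model names, so those should come from trusted code rather than user input.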
Making a Bulk Prediction
Now let’s make bulk predictions of CO2 emissions by joining our table with the model.
SELECT * FROM mindsdb.CO2_emission
JOIN files.Food_Production
LIMIT 100;
What’s Next?
Have fun while trying it out yourself!
- Star the MindsDB repository on GitHub.
- Sign up for a free MindsDB account.
- Engage with the MindsDB community on Slack or GitHub to ask questions and share your ideas and thoughts.
Give a like or a comment if this tutorial was helpful