Frameworks, libraries, and packages used in data science projects are updated regularly. When you re-run an old project, the versions now installed on your machine may no longer match the ones the project was written against, and the code can fail with version conflicts.
Fortunately, version conflicts can be resolved with the help of virtual environments.
As a data scientist, it is essential to know how to manage environments to develop and run various projects on your computer.
Prerequisite
- Familiarity with the command line interface. The tool you use depends on your operating system: Command Prompt or PowerShell on Windows, Terminal on macOS, or a terminal emulator on Linux distributions.
What is a virtual environment
A virtual environment is a self-contained space that holds the specific versions of the libraries, frameworks, and packages your project requires, isolated from whatever else is installed on your computer.
By creating a virtual environment, you can separate the dependencies and configurations required by your project from the system-level installations on your computer. This helps to avoid version conflicts and ensures that your project can run with the specific versions of the libraries you've specified in the environment.
There are various ways of creating virtual environments. In this article, we will discuss how to create and manage virtual environments for data science projects with conda.
What is conda
Conda is an open-source package management and environment management system that can be used to create different isolated development environments. Conda can be used in place of pipenv to create a virtual environment.
How to use conda
Now that you understand what a virtual environment is, it's time to see how you can create one for your project with conda.
Step 1: Download Anaconda
Anaconda is an open-source distribution of Python and other data science tools. It includes conda as its package and environment manager.
Anaconda installation varies depending on your operating system. Follow this guide to install Anaconda on your machine.
To verify that Anaconda is installed on your machine, open your terminal and run this command:
conda info
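You should see a summary of your conda installation, including the conda version, the active environment, and the location of the base environment.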
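If `conda info` prints more than you need, a shorter check is to ask only for the version. The sketch below assumes a POSIX-style shell such as Terminal on macOS or Linux. The `conda_status` variable is a name introduced here for illustration.

```shell
# Check whether the conda command is reachable from this shell.
if command -v conda >/dev/null 2>&1; then
    conda_status="found"
    conda --version   # prints the installed conda version
else
    conda_status="missing"
    echo "conda is not on PATH; check your Anaconda installation"
fi
```

If the second branch runs, the usual cause is that the terminal was opened before the installation finished, so opening a new terminal window is worth trying first.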
Step 2: Create an environment
Run this command to create an environment:
conda create -n your_environment_name
Replace your_environment_name with the desired name for your virtual environment.
conda will ask for confirmation with 'y' or 'n', which stand for yes and no respectively. Press y and hit Enter.
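By default, the command above creates an empty environment with no Python interpreter in it. A common variation, sketched below, names the Python version at creation time so the environment has its own interpreter. The version `3.11` is only an example, and `--yes` answers the confirmation prompt in advance.

```shell
ENV_NAME="your_environment_name"
PYTHON_VERSION="3.11"   # assumed example version; pick the one your project needs

if command -v conda >/dev/null 2>&1; then
    # --name is the long form of -n; --yes skips the y/n prompt.
    conda create --yes --name "$ENV_NAME" "python=$PYTHON_VERSION"
else
    echo "conda is not on PATH; install Anaconda first"
fi
```

Pinning the Python version here is what lets an old project keep running on the interpreter it was written for, even after the rest of your machine has moved on.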
Step 3: Activate the virtual environment
After you create the virtual environment, activate it using the following command:
conda activate your_environment_name
And that's it: you have successfully created and activated a virtual environment. Any package or framework you install in this environment is isolated from the rest of your machine, which helps you avoid version conflicts when running your project's code.
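To see the isolation in practice, you can install packages into the environment and list its contents. This is a minimal sketch that assumes the environment from Step 2 exists; `numpy` and `pandas` are example packages, not requirements of this article. It passes `--name` so the commands target that environment even when run from a script, where `conda activate` may not be set up.

```shell
ENV_NAME="your_environment_name"
PACKAGES="numpy pandas"   # example packages; replace with your project's needs

# conda sets CONDA_DEFAULT_ENV when an environment is active.
echo "Active environment: ${CONDA_DEFAULT_ENV:-none}"

if command -v conda >/dev/null 2>&1; then
    # Install only into the named environment.
    # $PACKAGES is left unquoted on purpose so each name is a separate argument.
    conda install --yes --name "$ENV_NAME" $PACKAGES
    # Show every package and version installed in that environment.
    conda list --name "$ENV_NAME"
else
    echo "conda is not on PATH; install Anaconda first"
fi
```

When the environment is already active in your terminal, `conda install numpy pandas` without `--name` installs into it as well.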
To deactivate the virtual environment created with conda, you can use the following command:
conda deactivate
Running this command will deactivate the currently active virtual environment, and you will return to the base environment or the global Python environment.
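Once you have several environments, it helps to see which ones exist and to remove the ones you no longer need. The sketch below lists environments and defines a small helper for deletion. `remove_env` is a hypothetical name introduced here, and it is deliberately not called, because removing an environment deletes everything installed in it.

```shell
ENV_NAME="your_environment_name"

# Hypothetical helper: delete an environment and all packages inside it.
remove_env() {
    conda env remove --name "$1"
}

if command -v conda >/dev/null 2>&1; then
    # Lists all environments; the active one is marked with an asterisk.
    conda env list
else
    echo "conda is not on PATH; install Anaconda first"
fi

# To delete the environment, deactivate it first, then run:
# remove_env "$ENV_NAME"
```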
Final thoughts
Isolating packages within virtual environments lets each project keep its own dependencies. You can install the packages a project requires without worrying about conflicts with other projects or with system-wide packages.
Overall, virtual environments provide a clean and controlled environment for each project, allowing you to manage dependencies, avoid version conflicts, and ensure reproducibility in running your code.
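One way to put the reproducibility point into practice is to save an environment's package list to a file that can be used to recreate it elsewhere. This is a sketch under the assumption that the environment from Step 2 exists; the file name `environment.yml` is a common convention rather than a requirement.

```shell
ENV_NAME="your_environment_name"
ENV_FILE="environment.yml"   # conventional file name; any path works

if command -v conda >/dev/null 2>&1; then
    # Write the environment's packages and versions to a YAML file.
    conda env export --name "$ENV_NAME" > "$ENV_FILE"
else
    echo "conda is not on PATH; install Anaconda first"
fi

# On another machine, recreate the environment from that file with:
# conda env create --file environment.yml
```

Keeping this file alongside your project's code gives collaborators, and your future self, a record of the versions the project last ran with.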
Thanks for following this article up to this point.
You can connect with me on Twitter and on LinkedIn. I can't wait to hear from you.
Top comments (2)
That was a wonderful read.
Virtual environment is the easiest way to secure your code from dependency errors.
The only annoying thing is when you need to install similar dependencies over and over again for every new environment you create 😀
Elated you found this insightful