Which Environment Is Yours?
Virtual environments are a relatively difficult thing for new programmers to understand. One problem I had in understanding them was that I couldn’t see the point: my environment already existed within macOS, I was using PyCharm and my code was running, so what else did I need?
However, as your career as a Data Scientist or Machine Learning Engineer progresses, you start running into these annoying-as-hell dependency conflicts between projects, and as a self-taught amateur in this space (like many readers here), it just takes forever to figure them out.
In what follows, I go through the most common virtual environments and why/when you should use which. To be honest, you should probably use Docker, as it’s the latest of these technologies and it’s what everyone is using (and if you’re interviewing, you’ll be asked about it). I talk about Docker here.
However, it’s super important to appreciate existing technology and how it works. Here it goes!
virtualenv (whose core functionality ships with Python 3 as the built-in venv module) was, and kind of still is, the default virtual environment tool for most programmers. You can install it using
pip as follows
pip install virtualenv
and once it’s installed, go to your chosen directory. To create a virtual environment there, run the following command:
python3 -m venv env
Before you can start installing or using packages in your virtual environment, you’ll need to activate it. Activating a virtual environment puts the environment-specific
python and
pip executables at the front of your shell’s PATH. On macOS and Linux, you can activate it with:
source env/bin/activate
And now that you’re in an activated virtual environment, you can start installing libraries as normal:
pip install requests
Finally, to make your repo reusable, create a record of everything that’s installed in your new environment by running
pip freeze > requirements.txt
If you are creating a new virtual environment from a requirements.txt file, you can run
pip install -r requirements.txt
If you open your requirements file, you will see one package per line, each pinned to its installed version.
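For example, after installing requests, the file will look something like this (the exact versions shown here are just illustrative, and the extra entries are requests’ own dependencies):

```text
certifi==2020.12.5
chardet==4.0.0
idna==2.10
requests==2.25.1
urllib3==1.26.4
```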
Finally, when you’re done, you can simply use the
deactivate command to leave the virtual environment. If you want to re-enter it later, just follow the same activation step above; there’s no need to re-create the virtual environment.
So far, we’ve had to manually create a virtual environment, then activate it, and then freeze the session into a requirements.txt file to make it portable. But what if we didn’t have to do this multi-step process ourselves?
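The whole workflow above can be sketched end to end (env and requests are just the example names used here, and the install step assumes you’re online):

```shell
# Create the environment in a ./env sub-directory
python3 -m venv env

# Activate it (macOS/Linux; on Windows use env\Scripts\activate)
source env/bin/activate

# Install packages as usual, then record exact versions for reuse
pip install requests
pip freeze > requirements.txt

# Leave the environment when you're done
deactivate
```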
While
venv is still the official virtual environment tool that ships with the latest versions of Python,
Pipenv is gaining ground in the Python community.
For example, with the
venv workflow we just described, in order to run multiple projects on the same computer you’d need:
- A tool for creating a virtual environment (like
venv or
virtualenv)
- A utility for installing packages (like
pip)
- A tool/utility for managing virtual environments (like
virtualenvwrapper or
pyenv)
Pipenv includes all of the above, and more, out of the box.
Pipenv handles dependency management really well compared to
pip and a requirements.txt file.
Pipenv works the same as pip when it comes to installing dependencies, and if you get a conflict you still have to resolve it yourself (although you can run
pipenv graph to view a full dependency tree, which should help).
But once you’ve solved the issue,
Pipfile.lock keeps track of all of your application’s interdependencies, including their versions, for each environment, so you can basically forget about them. This is a real step up.
To install
pipenv, you need to have pip installed first. Then run
pip install pipenv
Next, you create a new environment by running:
pipenv install
This will look for a
Pipfile; if one doesn’t exist, pipenv will create a new environment along with a fresh Pipfile.
To activate the environment, you can simply run the following command:
pipenv shell
To install new packages in this environment, you can simply run
pipenv install <package>
and
pipenv will automatically record the package in the
Pipfile.
You can also install a package for just the dev environment by calling
pipenv install <package> --dev
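For reference, the resulting Pipfile is a small TOML file along these lines (the package names and Python version shown here are just illustrative):

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.6"
```

Regular installs land under [packages], while anything installed with --dev lands under [dev-packages].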
And once you’re ready to ship to production, all you do is run:
pipenv lock
This will create/update your
Pipfile.lock, which you’ll never need to edit manually. You should always use the generated file. Now, once you get your code and
Pipfile.lock in your production environment, you should install the last successful environment recorded:
pipenv install --ignore-pipfile
This tells
pipenv to ignore the
Pipfile for installation and to use what’s in the
Pipfile.lock instead. Given this lock file,
pipenv will create the exact same environment you had when you ran
pipenv lock, sub-dependencies and all.
The lock file enables deterministic builds by taking a snapshot of all the versions of packages in an environment (similar to the result of a
pip freeze).
There you have it! Now we’ve compared
pipenv and
venv and shown that
pipenv is a much easier solution.
Anaconda is a distribution of Python that makes it simple to install packages, and it’s generally a good place for Python beginners to start. At the same time,
Anaconda has its own virtual environment system,
conda. Similar to the above, to create an environment:
conda create --name environment_name python=3.6
You can save all the info necessary to recreate the environment in a file by running
conda env export > environment.yml
To recreate the environment you can do the following:
conda env create -f environment.yml
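For reference, the exported environment.yml is a plain YAML file along these lines (the name, channels, and versions shown here are just illustrative):

```yaml
name: environment_name
channels:
  - defaults
dependencies:
  - python=3.6
  - numpy=1.19.2
  - pip
  - pip:
      - requests==2.25.1
```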
Last, you can activate your environment with the invocation:
conda activate environment_name
And deactivate it with:
conda deactivate
Environments created with
conda live by default in the
envs/ folder of your Conda directory.
Now, in my experience,
conda is OK, but I prefer the approach taken by
venv for two reasons. Firstly,
venv makes it easy to tell whether a project uses an isolated environment, because the environment lives in a sub-directory of that project. Secondly, it allows you to use the same name (like env) for every environment, meaning you can activate each one with the same command. That said, since
conda keeps environments in one central folder (rather than inside each project), it does make spinning up an environment slightly easier.
Docker is a tool that creates and runs
containers. A container packages up an image of an entire operating system, whereas
virtualenv only captures the dependency structure of your Python project. So, a
virtualenv only encapsulates Python dependencies. A
docker container encapsulates an entire OS.
Because of this, with a Python
virtualenv, you can easily switch between Python versions and dependencies, but you’re stuck with your host
OS. However with a
docker image, you can swap out the entire
OS — install and run Python on Ubuntu, Debian, Alpine, even Windows Server Core.
There are Docker images out there for every combination of OS and
Python version you can think of, ready to pull down and use on any system with Docker installed.
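As a sketch of what that looks like in practice, a minimal Dockerfile can pin both the OS and the Python version in a single image tag (the tag, file names, and entry point here are assumptions for illustration):

```dockerfile
# python:3.9-slim = a Debian-based OS image with Python 3.9 pre-installed
FROM python:3.9-slim

WORKDIR /app

# Install the project's pinned dependencies first (better layer caching)
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the rest of the code and define how the container starts
COPY . .
CMD ["python", "main.py"]
```

Swapping the FROM line swaps the entire OS underneath your code, which is exactly what virtualenv can’t do.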
If you think about each of the environments listed above, you’ll realise that there’s a natural divide between them.
Conda is better suited (naturally) for those who are using the
Anaconda distribution (so mostly for beginners in Python), whereas
pipenv and
venv are for individuals who are more seasoned and know the ropes. Of these two, if you’re building something from scratch, I’d really recommend going with
pipenv, as it was built with the difficulties of
venv in mind.
Docker is both easy to use and has such widespread recognition that you simply have to know how it works. All of these tools work out of the tin and do what they need to, but portability between operating systems is what makes
Docker the real standout: when it comes to production, you don’t need to worry about the
OS on your server, as the container has it all sorted for you.
Thanks for reading! If you have any questions, please let me know!
Keep up to date with my latest articles here!