Which Environment Is Yours?
Virtual environments are relatively difficult for new programmers to understand. My own stumbling block was that I could see my Python installation living inside a macOS framework, I was using PyCharm, and my code was running. What else did I need?
However, as your career as a Data Scientist or Machine Learning Engineer progresses, you run into annoying-as-hell dependency conflicts between projects, and as a self-taught amateur in this space (like many readers here), it takes forever to figure them out.
In what follows, I go through the most common virtual environment tools and when you should use each. To be honest, you should probably use Docker, as it’s what most teams are using (and if you’re interviewing, you’ll be asked about it). I talk about Docker here.
However, it’s super important to appreciate existing technology and how it works. Here it goes!
VENV
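venv was (and kind of still is) the default virtual environment tool for most Python programmers. It ships with Python 3 as part of the standard library, so there is nothing extra to install. It is often confused with virtualenv, a separate third-party tool that you would install with pip install virtualenv; you don’t need it for the commands below.
Go to your chosen directory and, to create a virtual environment, run the following command: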
python3 -m venv env
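The same step can also be scripted. Here is a minimal sketch, assuming Python 3’s standard-library venv module. The helper name create_env and the temporary directory are made up for illustration, and with_pip=False is used only to keep the demo quick (the command above does install pip into the environment).

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False skips bootstrapping pip; `python3 -m venv` bootstraps it by default.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    # Build the environment in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "env")
        print(cfg.read_text())
```

The pyvenv.cfg file it prints is what marks a directory as a virtual environment and records which Python it was created from.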
Before you can start installing or using packages in your virtual environment, you’ll need to activate it. Activating a virtual environment puts the environment-specific python and pip executables at the front of your shell’s PATH. On macOS or Linux:
source env/bin/activate
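If you’re ever unsure whether the Python you’re running belongs to a virtual environment, you can ask the interpreter itself. This is a rough sketch; the function names in_virtual_env and path_head are mine, not part of any library.

```python
import os
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


def path_head(environ=os.environ) -> str:
    """Return the first directory on PATH, where the shell looks for `python` first."""
    return environ.get("PATH", "").split(os.pathsep)[0]


if __name__ == "__main__":
    print("in virtual env:", in_virtual_env())
    print("first PATH entry:", path_head())
```

After activation, the first PATH entry should be the environment’s bin directory, which is why plain python and pip resolve to the environment’s copies.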
And now that you’re in an activated virtual environment, you can start installing libraries as normal:
pip install requests
To make your repo reusable, record everything that’s installed in your new environment by running:
pip freeze > requirements.txt
If you are creating a new virtual environment from a requirements.txt file, you can run
pip install -r requirements.txt
If you open your requirements file, you will see one package per line, each pinned to its installed version in the form package==version.
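To make that file format concrete, here is a small sketch that reads the pins back out of a requirements file. The sample contents and version numbers are hypothetical, and the parser only handles simple name==version lines, not the full requirements syntax.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Map package name -> pinned version for simple `name==version` lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# Hypothetical contents of a requirements.txt file.
SAMPLE = """\
# produced by: pip freeze > requirements.txt
requests==2.31.0
urllib3==2.0.7
"""

if __name__ == "__main__":
    for name, version in parse_pins(SAMPLE).items():
        print(name, version)
```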
Finally, to leave the virtual environment, run the deactivate command. If you want to re-enter it later, just activate it again as described above. There’s no need to re-create the virtual environment.
So far, we’ve had to manually create a virtual environment, activate it, and then freeze the installed packages into a requirements.txt file to make it portable. But what if we didn’t have to go through this multi-step process?
Enter pipenv.
PipEnv
While venv is still the official virtual environment tool that ships with Python, Pipenv is gaining ground in the Python community.
For example, with the venv workflow we just described, in order to run multiple projects on the same computer you’d need:
- A tool for creating a virtual environment (like venv)
- A utility for installing packages (like pip or easy_install)
- A tool/utility for managing virtual environments (like virtualenvwrapper or pyenv)
Pipenv includes all of the above, and more, out of the box.
Moreover, Pipenv handles dependency management really well compared to requirements.txt and pip freeze. Pipenv works the same as pip when it comes to installing dependencies, and if you get a conflict you still have to resolve it yourself (although you can run pipenv graph to view a full dependency tree, which should help).
But once you’ve solved the issue, Pipfile.lock keeps track of all of your application’s interdependencies, including their versions, for each environment, so you can basically forget about them. This is really a step up.
To install pipenv, you need to have pip first. Then run:
pip install pipenv
Next, you create a new environment by using
pipenv install
This looks for a Pipfile in the current directory. If there isn’t one, pipenv creates it along with a new virtual environment for the project.
To activate the environment, run the following command:
pipenv shell
To install new packages in this environment, use pipenv install <package>, and pipenv will automatically add the package to its project file, which is called Pipfile.
You can also install a package for just the dev environment by calling
pipenv install <package> --dev
And once you’re ready to ship to production, all you do is:
pipenv lock
This will create/update your Pipfile.lock, which you’ll never need to edit manually. You should always use the generated file. Now, once you get your code and Pipfile.lock into your production environment, you should install the last successful environment recorded:
pipenv install --ignore-pipfile
This tells pipenv to ignore the Pipfile for installation and use what’s in the Pipfile.lock. Given this Pipfile.lock, pipenv will create the exact same environment you had when you ran pipenv lock, sub-dependencies and all.
The lock file enables deterministic builds by taking a snapshot of all the versions of packages in an environment (similar to the result of a pip freeze).
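Pipfile.lock is a JSON file, so you can inspect that snapshot yourself. The sketch below reads the pinned versions from a heavily trimmed, hypothetical lock file; real ones also carry hashes and more metadata, and the helper name locked_versions is mine.

```python
import json

# Hypothetical, trimmed-down Pipfile.lock contents (package versions are made up).
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3.11"}},
  "default": {"requests": {"version": "==2.31.0"}},
  "develop": {"pytest": {"version": "==8.0.0"}}
}
"""


def locked_versions(lock_text: str, section: str = "default") -> dict[str, str]:
    """Return package -> pinned version for one section of a Pipfile.lock."""
    lock = json.loads(lock_text)
    # "default" holds regular packages, "develop" holds those installed with --dev.
    # Entries without a "version" key (e.g. some VCS installs) map to an empty string.
    return {name: entry.get("version", "") for name, entry in lock.get(section, {}).items()}


if __name__ == "__main__":
    print(locked_versions(SAMPLE_LOCK))
    print(locked_versions(SAMPLE_LOCK, "develop"))
```

Keeping dev-only packages in their own section is what lets a production install leave them out.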
There you have it! Now we’ve compared pipenv and venv and shown that pipenv is a much easier solution.
Conda Environment
Anaconda is a distribution of Python that makes it simple to install packages, and it’s generally a good place for Python beginners to start. Anaconda also has its own environment and package manager, conda. Similar to the above, to create the environment:
conda create --name environment_name python=3.6
You can save all the info necessary to recreate the environment in a file by calling
conda env export > environment.yml
To recreate the environment you can do the following:
conda env create -f environment.yml
You can activate your environment, using the name you gave it when creating it, with:
conda activate environment_name
And deactivate it with:
conda deactivate
Environments created with conda live by default in the envs/ folder of your Conda directory.
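When an environment is activated, conda records its name and location in the CONDA_DEFAULT_ENV and CONDA_PREFIX environment variables, which gives a quick way to check where you are from Python. A rough sketch; the function names and the example path are assumptions for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the activated conda environment, or None if there is none."""
    return environ.get("CONDA_DEFAULT_ENV") or None


def env_location(environ: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the directory of the activated conda environment, or None."""
    prefix = environ.get("CONDA_PREFIX")
    return Path(prefix) if prefix else None


if __name__ == "__main__":
    print("active environment:", active_conda_env())
    print("location:", env_location())
```

For a named environment, the location you get back should sit inside that envs/ folder.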
Now in my experience, conda is OK, but I prefer the approach taken by venv for two reasons. Firstly, it makes it easy to tell if a project utilises an isolated environment by including the environment as a sub-directory.
Further, it allows you to use the same name for all of your environments, meaning you can activate each with the same command. However, because conda keeps environments in one central folder (rather than creating them inside each project), it is easier to create an environment.
Docker
In a previous blog post I talk about Docker and go into detail explaining how to use it, so I won’t bore you here.
Docker is a tool for building and running containers. Each container is started from an image that packages an operating system’s userland (system libraries, tools, and your application), whereas a virtualenv only looks at the dependency structure of your Python project. So, a virtualenv only encapsulates Python dependencies. A Docker container encapsulates an entire OS userland, sharing only the kernel with the host.
Because of this, with a Python virtualenv, you can easily switch between Python versions and dependencies, but you’re stuck with your host OS. With a Docker image, however, you can swap out the entire OS: install and run Python on Ubuntu, Debian, Alpine, even Windows Server Core.
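One way to see the difference for yourself is to run the same small script on your host, inside a virtualenv, and inside a container. This is only a sketch and the function names are mine. In a virtualenv I’d expect just the prefix to change; in a container the distribution can change too, while the kernel release usually still reflects the host (or Docker’s VM), since containers share its kernel.

```python
import platform
import sys


def distro_name() -> str:
    """Best-effort Linux distribution name; 'unknown' when it is not available."""
    reader = getattr(platform, "freedesktop_os_release", None)  # added in Python 3.10
    if reader is None:
        return "unknown"
    try:
        return reader().get("PRETTY_NAME", "unknown")
    except OSError:  # no os-release file, e.g. on macOS or Windows
        return "unknown"


def describe_runtime() -> dict[str, str]:
    """Collect the facts that differ between host, virtualenv and container."""
    return {
        "python": platform.python_version(),
        "prefix": sys.prefix,           # changes inside a virtualenv
        "system": platform.system(),    # e.g. Linux, Darwin, Windows
        "release": platform.release(),  # kernel release
        "distro": distro_name(),        # can change inside a container
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```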
There are Docker images out there with every combination of OS and Python versions you can think of, ready to pull down and use on any system with Docker installed.
If you think about each of the environments listed above, you’ll realise that there’s a natural divide between them. Conda is better suited (naturally) for those who are using the Anaconda distribution (so mostly for beginners in Python), whereas pipenv and venv are for those individuals who are more seasoned and know the ropes. Of these two, if you’re starting something from scratch, I’d really recommend going with pipenv, as it’s been built with the difficulties of venv in mind.
However, Docker is both easy to use and so widely recognised that you just have to know how it works. All of these tools work out of the tin and do what they need to, but the portability between operating systems is what makes Docker the real standout: when it comes to production, you don’t need to worry about the OS on your server, as the container has it all sorted for you.
Thanks for reading! If you have any messages, please let me know!