You’re living in 1985 if you don’t use Docker for your Data Science Projects

What is it Docker and How to to use it with Python

Photo by Iswanto Arif on Unsplash

One of the hardest problems that new programmers face is understanding the concept of an ‘environment’. An environment is what you could say, the system that you code within. In principal it sounds easy, but later on in your career you begin to understand just how difficult it is to maintain.

The reason being is that libraries and IDE’s and even the Python Code itself goes through updates and version changes, then sometimes, you’ll update one library, and a separate piece of code will fail, so you’ll need to go back and fix it.

Moreover, if we have multiple projects being developed at the same time, there can be dependency conflicts, which is when things really get ugly as code fails directly because of another piece of code.

Also, say you want to share a project to a team mate working on a different OS, or even ship your project that you’ve built on your Mac to a production server on a different OS, would you have to reconfigure your code? Yes, you probably will have to.

So to mitigate any of these issues, containers were proposed as a method to separate projects and the environments that they exist within. A container is basically a place where an environment can run, separate to everything else on the system. Once you define what’s in your container, it becomes so much easier to recreate the environment, and even share the project with teammates.

Requirements

To get started, we need to install a few things to get set up:

Containerise a Python service

Let’s imagine we’re creating a Flask service called server.py and let’s say the contents of the file are as follows:

from flask import Flask
server = Flask(__name__)
@server.route("/")
def hello():
return "Hello World!"
if __name__ == "__main__":
server.run(host='0.0.0.0')

Now as I said above, we need to keep a record of the dependencies for our code so for this, we can create a requirements.txt file that can contain the following requirement:

Flask==1.1.1

So our package has the following structure:

app
├─── requirements.txt
└─── src
└─── server.py

The structure is pretty logical (source kept is kept in a separate directory). To execute our Python program, all is left to do is to install a Python interpreter and run it.

Now to run the program, we could run it locally but suppose we have 15 projects we’re working through — it makes sense to run it in a container to avoid any conflicts with any other projects.

Let’s move onto containerisation.

Photo by Victoire Joncheray on Unsplash

Dockerfile

To run Python code, we pack the container as a Docker image and then run a container based on it. So as follows:

  1. Create a Dockerfile that contains instructions needed to build the image
  2. Then create an image by the Docker builder
  3. The simple docker run <image> command then creates a container that is running an app

Analysis of a Dockerfile

A Dockerfile is a file that contains instructions for assembling a Docker image (saved as myimage):

# set base image (host OS)
FROM python:3.8
# set the working directory in the container
WORKDIR /code
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the content of the local src directory to the working directory
COPY src/ .
# command to run on container start
CMD [ "python", "./server.py" ]

A Dockerfile is compiled line by line so the builder generates an image layer and stacks it upon previous images.

We can also observe in the output of the build command the Dockerfile instructions being executed as steps.

$ docker build -t myimage .
Sending build context to Docker daemon 6.144kB
Step 1/6 : FROM python:3.8
3.8.3-alpine: Pulling from library/python

Status: Downloaded newer image for python:3.8.3-alpine
---> 8ecf5a48c789
Step 2/6 : WORKDIR /code
---> Running in 9313cd5d834d
Removing intermediate container 9313cd5d834d
---> c852f099c2f9
Step 3/6 : COPY requirements.txt .
---> 2c375052ccd6
Step 4/6 : RUN pip install -r requirements.txt
---> Running in 3ee13f767d05

Removing intermediate container 3ee13f767d05
---> 8dd7f46dddf0
Step 5/6 : COPY ./src .
---> 6ab2d97e4aa1
Step 6/6 : CMD python server.py
---> Running in fbbbb21349be
Removing intermediate container fbbbb21349be
---> 27084556702b
Successfully built 70a92e92f3b5
Successfully tagged myimage:latest

Then, we can see that the image is in the local image store:

$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
myimage latest 70a92e92f3b5 8 seconds ago 991MB

During development, we may need to rebuild the image for our Python service multiple times and we want this to take as little time as possible.

Note: Docker and virtualenv are quite similar but different. Virtualenv only allows you to switch between Python Dependencies but you’re stuck with your host OS. However with Docker, you can swap out the entire OS — install and run Python on any OS (think Ubuntu, Debian, Alpine, even Windows Server Core). Therefore if you work in a team and want to future proof your technology, use Docker. If you don’t care about it — venv is fine, but remember it’s not future proof. Please reference this if you still want more information.


There you have it! We’ve shown how to containerise a Python service. Hopefully, this process will make it a lot easier and gives your project a longer shelf life as it’ll be less likely to come down with code-bugs as dependencies change.

Thanks for reading, and please let me know if you have any questions!

Keep up to date with my latest articles here!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Powered by WordPress.com.

Up ↑

%d bloggers like this: