Containers and Docker

Guillaume Eynard-Bontemps, CNES (Centre National d’Etudes Spatiales - French Space Agency)

2020-11-16

Credits and thanks

Didn’t do this

Thanks to Florient Chouteau and Dennis Wilson for their work on this subject.

I took most of this content from their material:

Containers

Why containers?

  • How to get software to run reliably when moved from one computing environment to another
    • from a developer’s laptop to a test environment
    • from staging to prod
    • from one cloud provider to another
  • Packaging application + runtime as a single package
  • Abstract differences in OS and underlying hardware
  • Build once, run anywhere
  • Pet vs Cattle, at another level

Container vs VM

Container vs VM: similarities and drawbacks

Similarities

  • Isolated environments for applications
  • Movable between hosts

VM Drawbacks

  • Each VM contains a full OS => installation and resource overhead
  • Resources must be pre-allocated for each VM (=> wasted if not used)
  • Communication between VMs <=> communication between separate computers

Containers Drawbacks

  • Containers are Linux-based (but still work on Windows)
  • Isolation is not perfect, since containers share the host kernel
    • (with consequences for security and stability)
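
The shared-kernel point can be checked with a few lines of Python. This is a minimal sketch; the function name `kernel_info` is made up for illustration. Run it on a Linux host and then inside a container on that same host: both should report the same kernel release, whereas a VM would report its own guest kernel.

```python
import platform


def kernel_info():
    """Return (OS name, kernel release) as seen by the current process."""
    return platform.system(), platform.release()


if __name__ == "__main__":
    system, release = kernel_info()
    print(f"{system} kernel {release}")
```

On Windows or macOS, Docker typically runs containers inside a lightweight Linux VM, so the comparison only makes sense between containers, not against the desktop OS.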

Applications concept

Build, Ship, Run

Containers for Data science

Data Science is about reproducibility

  • Experimental science
  • Communicating results
  • Hand-offs to other teams
  • Deployment and versioning of models

So… containers?

  • … for deployment
  • … for standardized development environments
  • … for dependency management
  • … for complex / large scale workflows

Quiz

What’s a container?

  • Answer A: Docker
  • Answer B: A virtual Machine, kind of
  • Answer C: A software package that contains everything the software needs to run (system, apps, dependencies)

Docker

Docker

Docker is a solution that standardizes the packaging and execution of software in isolated environments (containers) that share resources and can communicate with each other.

Build, Share, and Run Any App, Anywhere

History

Docker

  • Created in 2013
  • Open Source (some parts)
  • Not a new idea but set a new standard
  • Docker is a company built around its main product (Docker Engine)
  • In charge of developing everything Docker (Docker Hub…), plus additional paid services

Under the hood

Docker is some fancy tech on top of Linux kernel features (such as namespaces and cgroups) that make containers possible


Using Docker in practice

Vocabulary of Docker

  • Layer: Set of read-only files used to provision the system
  • Image: Read-only “snapshot” (or blueprint) of an environment, made of layers. Can inherit from another image. Images have a name and a tag
  • Container: Read-write instance of an image
  • Dockerfile: Description of the process used to build an image
  • Container registry: Repository of Docker images
  • Docker Hub: The main container registry, run by Docker (docker.com)
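
To make the "name and tag" part concrete, here is a minimal Python sketch of how an image reference such as `python:3.7` splits into the two. The helper `split_image_reference` is hypothetical (it is not part of any Docker API), and it deliberately ignores digest references of the form `name@sha256:...`. It does reflect one real Docker convention: when no tag is given, the tag `latest` is assumed.

```python
def split_image_reference(reference):
    """Split 'name[:tag]' into (name, tag); the tag defaults to 'latest'.

    Simplification: digests ('name@sha256:...') are not handled.
    """
    # Only a ':' located after the last '/' separates the tag, so a registry
    # port such as 'localhost:5000/my-image' is not mistaken for a tag.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, "latest"


print(split_image_reference("python:3.7"))
print(split_image_reference("my-image"))
print(split_image_reference("localhost:5000/my-image:1.0"))
```

One practical consequence: an image built as `my-image:1.0` is not found by a bare `my-image`, because that means `my-image:latest`.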

Workflow

workflow

Layers, Container, Image

layers
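
The relation between read-only layers and a read-write container can be imitated in plain Python with `collections.ChainMap`. This is only an analogy under simplifying assumptions: real Docker stacks directories with a union filesystem, not dictionaries. The layer contents and the helper `run_container` are invented for the example.

```python
from collections import ChainMap

# Read-only image layers, modelled as path -> content mappings.
base_layer = {"/bin/python": "python 3.7", "/etc/conf": "default"}
pip_layer = {"/site-packages/torch": "torch"}


def run_container(*image_layers):
    """Return a container view: a fresh writable layer on top of the image layers."""
    writable = {}
    # ChainMap looks keys up left to right and writes only to the first mapping.
    return ChainMap(writable, *reversed(image_layers))


c1 = run_container(base_layer, pip_layer)
c2 = run_container(base_layer, pip_layer)

# The change lands in c1's writable layer only.
c1["/etc/conf"] = "modified in c1"

print(c1["/etc/conf"])
print(c2["/etc/conf"])
print(base_layer["/etc/conf"])
```

Both containers share the same image layers, yet a write in one is invisible to the other and never touches the image itself.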

Layer / Image Analogy

Docker:

FROM python:3.6
RUN pip install torch ipython
CMD ipython

docker build -f Dockerfile -t my-image:1.0 .
docker run -it my-image:1.0

Python:

class BaseImage:
    def __init__(self, a):
        self.a = a

# An image can inherit from another image, like a subclass
class NewImage(BaseImage):
    def __init__(self, a, b):
        super().__init__(a=a)
        self.b = b

# A container is an instance of an image
container = NewImage(a=0, b=1)

Dockerfile

  • Used to build Images
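FROM python:3.7
# Environment variable, available at build time and at run time
ENV MYVAR="HELLO"
RUN pip install torch
# COPY and ADD both copy files from the build context into the image
COPY my-conf.txt /app/my-conf.txt
ADD my-file.txt /app/my-file.txt
# Documents the port the application listens on (does not publish it)
EXPOSE 9000
WORKDIR /workdir
# The user must exist before switching to it
RUN useradd --create-home myuser
USER myuser
# The exec form does not expand variables, so bash -c does the expansion
ENTRYPOINT ["/bin/bash", "-c"]
CMD ["echo ${MYVAR}"]

docker build -f Dockerfile -t my-image:1.0 .
docker run my-image:1.0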
  • Reproducible (if you include static data)
  • Can be put under version control (simple text file)
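
When the build is scripted, for instance from a Python workflow, it helps to derive the build and run commands from the same name and tag so they cannot drift apart. The sketch below only assembles the argument lists; the helpers `build_command` and `run_command` are hypothetical, while the flags `-f`, `-t` and `--rm` are standard Docker CLI options.

```python
import shlex


def build_command(image, tag, dockerfile="Dockerfile", context="."):
    """Return the argv list for `docker build` with an explicit name:tag."""
    return ["docker", "build", "-f", dockerfile, "-t", f"{image}:{tag}", context]


def run_command(image, tag):
    """Return the argv list for `docker run` on exactly the tag that was built."""
    return ["docker", "run", "--rm", f"{image}:{tag}"]


for argv in (build_command("my-image", "1.0"), run_command("my-image", "1.0")):
    print(shlex.join(argv))
```

To actually execute a command, each list could be passed to `subprocess.run(argv, check=True)` on a machine where Docker is installed.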

Architecture

Registry

  • Local registry: All images/containers on your machine
  • https://hub.docker.com/
  • GCP Container Registry
  • Social Dimension (share docker images to speed up development/deployment)
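
Sharing an image through a registry comes down to giving it a registry-qualified name and pushing it. The following is a minimal sketch, assuming a GCP Container Registry style reference (`gcr.io/<project>/<image>:<tag>`); the project name `my-project` and both helper functions are invented for illustration, and the code only prints the commands rather than running them.

```python
def qualified_reference(registry, namespace, name, tag):
    """Compose 'registry/namespace/name:tag'."""
    return f"{registry}/{namespace}/{name}:{tag}"


def ship_commands(local_ref, remote_ref):
    """Return the argv lists that tag a local image and push it to a registry."""
    return [
        ["docker", "tag", local_ref, remote_ref],
        ["docker", "push", remote_ref],
    ]


# "my-project" is a placeholder project name.
remote = qualified_reference("gcr.io", "my-project", "my-image", "1.0")
for argv in ship_commands("my-image:1.0", remote):
    print(" ".join(argv))
```

Someone else with access to the registry can then fetch the image with `docker pull` followed by the same reference.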

Alternatives: Singularity

“Docker for HPC”

  • No root daemon process
  • Better (?) security
    • Rootless build and run
  • Better isolation between users, “just” a process
  • Bridge between Docker images and Singularity images
  • OCI compliant

Alternatives: Podman

“Rootless Docker for Red Hat”

  • No root daemon
  • No need to be root
  • Fully compliant with Docker images (and OCI)
  • Better security and isolation
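
Because Podman accepts the same images and largely the same command-line arguments as Docker, a script can often pick whichever tool is installed. A minimal sketch, assuming only that the executables are named `docker` and `podman`; the helpers `find_engine` and `run_argv` are hypothetical.

```python
import shutil


def find_engine(candidates=("docker", "podman")):
    """Return the first container CLI found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def run_argv(engine, image):
    """Build a `run` command; the same arguments work for either CLI."""
    return [engine, "run", "--rm", image]


engine = find_engine()
if engine is None:
    print("No container engine found on PATH")
else:
    print(" ".join(run_argv(engine, "python:3.7")))
```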

Quiz

What’s the typical Docker workflow?

  • Answer A: Pull Build Run
  • Answer B: Pull Run
  • Answer C: Build Ship Run
  • Answer D: Build Ship Push Run Pull

Hands on Docker

Play with Docker