1. What is Docker

The first virtual machine I set up dates back to the VMware era. With VMware, you could install Linux on a Windows machine, which was very useful. Knowing nothing about Linux at the time, I had to google everything: “how to start X window”, “how to restart a service”, “how to install software”. Running Linux like an ordinary application let me conveniently switch back to the browser to search online. Perhaps the only drawbacks were that the virtual machine ran slower than the host machine, and that the video card could not be used by VMware back then.

As we step into the era of cloud computing, with machine performance greatly improved, the concept of virtualization has become more and more popular. We now have virtualization at different levels, giving you even more flexibility. Docker, specifically, is much more practical than many alternatives. Unlike VMware, which simulates the whole machine, Docker only virtualizes at the application level; you can think of it as a sandbox (container) that provides all the library and OS-related files your application needs to run. In the past, if you wanted to install a piece of software, you had to download it, compile it in the local environment, and carefully examine the versions of “gcc”, “libc”, and other libraries, which was not always easy. If you have ever tried to install numpy yourself, you will understand why bundled Python distributions are so popular (e.g. Anaconda, Python(x,y)). Nor do you need to worry anymore about whether you can install another piece of software that uses a different library version on the same machine. Now everything is simple: you just start the Docker service and the software containers, and you don’t have to worry about conflicts, because different containers run independently.

It also provides additional flexibility. For a web server, imagine setting up many encapsulated micro-services in many Docker containers instead of on one single machine. Then whenever one service runs into a problem, you can restart just that container instead of the whole machine, so the rest of the services are not affected. Upgrading is also simple: one script can upgrade all of the services without restart delays.

Lastly, since Docker virtualizes at a higher level, its performance is much better than low-level simulation, although still below that of the real machine. But considering the convenience and flexibility it brings, Docker is still a great tool!

I should probably mention that the “Windows Subsystem for Linux”, or WSL, in Windows 10 is not the same concept. WSL is possible only because of the structure of the Windows NT system: the NT core can have many different “fronts”. Usually we “mount” the Windows “front” on the NT core, so the machine behaves as Windows. If you “mount” a Linux “front”, it becomes a Linux machine; if you mount an “OS/2 Warp” front, it becomes an IBM OS/2 machine. But software you develop in one “front” cannot be executed in another “front”; each program is still compiled within its own “front”.

2. Install Docker

Docker can be installed on Windows, Linux, or Mac, and once Docker is installed, the same container can run inside Docker on any OS or machine. The container can encapsulate anything, from a Linux system to a piece of software or just an environment. Combined with the git-like service provided by Docker Hub, any changes you make to the container can be committed and pushed back to your Docker Hub account. Then, if you want to switch to another machine, a simple “docker pull” (comparable to “git pull”) will replicate your working environment exactly where you left off.
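The commit-and-push workflow described above can be sketched as the following command sequence (this requires a running Docker daemon; “myuser/myenv” is a hypothetical Docker Hub repository name, and ${container_id} stands for the ID shown by “docker ps”):

```shell
# Save the current state of a running container as a new image:
docker commit ${container_id} myuser/myenv:latest

# Log in and push the image to your Docker Hub account:
docker login
docker push myuser/myenv:latest

# On another machine, pull it back down to replicate the environment:
docker pull myuser/myenv:latest
```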

Install Docker in Windows

You need the Windows 10 Pro edition to install Docker for Windows. Windows 10 Pro supports Hyper-V, which is essential for Docker. However, if you don’t have that edition, say your Windows is the Home or student edition, you can use substitutes like Docker Toolbox or Minikube, and the only difference is the performance.

One thing to mention: Docker needs a Type-1 hypervisor, which cannot be turned on at the same time as the Type-2 hypervisor used by some other virtual machine products (e.g. Oracle VirtualBox). One solution is to restart the machine every time you switch to the other type. Tedious, but it works. Another solution would be to first install a virtual machine and then install Docker inside that virtual machine – problem solved. This is the convenience of virtualization. The only problem is the performance, since it is virtualization on top of another virtualization, but in some cases this solution is the only choice. In that case, try Minikube – an existing solution.

  • Make sure Hyper-V is turned on: bcdedit /set hypervisorlaunchtype auto
  • Also turn on hardware virtualization in the BIOS (VT-x for Intel or AMD-V for AMD). You can check with:

systeminfo

Once Docker is installed, verify that it works:

docker ps

You should see an empty list, since you haven’t run any containers yet. But if there is an error like “error during connect …”, it means the Docker daemon is not running. Try starting it manually.

Install Docker in Linux/CentOS 7

Using the package manager yum is simple (or apt-get/apt for other Linux distributions):

# CentOS
yum install docker

Then we can use the following to start the service and have it start itself at boot:

# CentOS 7
systemctl start docker.service
systemctl enable docker.service
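Assuming the daemon started successfully, a quick sanity check looks like this (a sketch; it requires a working network connection and a running Docker daemon):

```shell
# Confirm the service is active:
systemctl status docker.service

# Pull and run Docker's standard smoke-test image;
# it prints a greeting and exits if everything is set up correctly:
docker run hello-world
```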

Install Docker in macOS

As in Windows, there are some requirements for macOS. You will need:

  • A 2010 or newer Mac, with Intel’s hardware Memory Management Unit (MMU).
  • OS X 10.10.3 Yosemite or newer (or macOS).
  • At least 4 GB of RAM.
  • No VirtualBox installation earlier than version 4.3.30 on your system. If you have one, you’ll need to uninstall it. (This is again about Type-1 and Type-2 hypervisors.)

To install Docker, use Homebrew:

brew cask install docker

Or download Docker from https://hub.docker.com/editions/community/docker-ce-desktop-mac , open the dmg file, and drag the app to the Applications folder.

3. Install a sample container - Ubuntu

We first define some concepts:

Host: the real machine that runs Docker;

Image: the encapsulation of the desired object;

Container: an image under execution, or “an instance of the image”. Once an image is created, you can pull it to a local machine multiple times to create several containers;

Repository: as on GitHub, the remote place to store images.

  • Pull an image from the repository to the local machine:

docker pull ${image_URL}:${image_tag}

The URL is where the image is stored. By default, Docker Hub is used as the remote repository:

docker pull ubuntu:16.04

  • Check local images:

docker image ls

Once you have downloaded the image, you can start it, attach to it, or stop it. Since we add the “-it” parameter to “docker run”, you will be attached to the container once it is running; you can see that your prompt has changed.

docker run -it ${image_id}
#or
docker run -it ${Repository}:${Tag}

#example:
docker run -it ubuntu:16.04

On the other hand, you can also start the container only and attach to it later:

#start it in the background (detached), with a terminal allocated:
docker run -dit ubuntu:16.04

#attach to it later, using the container ID shown by "docker ps":
docker attach ${container_id}

Check whether it is running:

#A running container is shown as "Up"; a stopped one as "Exited"

docker ps -a

If you are attached to the container, it will stop when you “exit”. Or, if you didn’t attach to it, you can use the following to stop it:

docker stop ${container_id}
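Putting the steps above together, a typical session looks like the following sketch (it requires a running Docker daemon; ${container_id} stands for the ID printed by “docker ps”):

```shell
docker pull ubuntu:16.04        # download the image
docker run -dit ubuntu:16.04    # start a container in the background
docker ps                       # list running containers; note the container ID
docker attach ${container_id}   # attach to it; typing "exit" stops the container
docker ps -a                    # the stopped container now shows "Exited"
docker rm ${container_id}       # remove the stopped container when done
```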

4. Install a deep learning environment “Deepo”

Now we are going to install a container that has everything set up for you — the image is called “Deepo”. In this case, you don’t have to start by installing Linux, Python, and deep learning frameworks like PyTorch, Theano, MXNet, etc. They all come readily installed in this image, and all you have to do is pull it from Docker Hub. If you want to make changes, just remember to push your modified version to your account on Docker Hub.

Deepo (https://hub.docker.com/r/ufoym/deepo) actually has two versions, a CPU version and a GPU version; you can refer to its Docker Hub page for details. Here we just install the CPU version. After starting Docker, we begin with:

docker pull ufoym/deepo:cpu

You are all set. Remember how to run a container?

docker run -it ufoym/deepo:cpu bash

Sharing data between the host machine (the machine that runs Docker) and the container is simple; just modify the start command to:

docker run -it -v /host/data:/data -v /host/config:/config ufoym/deepo:cpu bash

This will map the /host/data folder on the host to the /data folder in the container (and, likewise, /host/config to /config).
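A quick way to convince yourself the mapping works is to create a file on the host side and read it from inside the container (a sketch; it assumes a running Docker daemon and that the /host/data folder exists on the host):

```shell
# On the host, put a file into the mapped folder:
echo "hello from host" > /host/data/test.txt

# Start the container with the bind mount and open a shell:
docker run -it -v /host/data:/data ufoym/deepo:cpu bash

# Inside the container, the same file is visible through the mapped path:
cat /data/test.txt
```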

Now you can try it out inside the container:

python -c "import caffe"   # the Python binding
caffe --version            # the command-line tool

Of course, if you don’t want the all-in-one solution, you can choose the pieces you like and pull them together to form an image to run as a container. Refer to the “Build your own customized image with Lego-like modules” section on their Docker Hub page.
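To illustrate the general idea (this is not Deepo’s actual module generator; the base image and package names below are assumptions for the sketch), a customized image is usually described by a Dockerfile and then built locally:

```shell
# Write a minimal, hypothetical Dockerfile that adds numpy on top of a base image:
cat > Dockerfile <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y python3-pip
RUN pip3 install numpy
EOF

# Then build and run it (requires a running Docker daemon):
# docker build -t my-custom-env .
# docker run -it my-custom-env bash
```

Each line of the Dockerfile adds one “module” on top of the previous layer, which is exactly the Lego-like composition Deepo’s generator automates for you.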

One last thing, but also important: this project is under the MIT license, so feel free to modify it and enjoy!