<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jibin Liu</title>
    <description>The latest articles on DEV Community by Jibin Liu (@jibinliu).</description>
    <link>https://dev.to/jibinliu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F101377%2Fd385e183-818c-42d9-8f43-fd54a35bca0f.jpeg</url>
      <title>DEV Community: Jibin Liu</title>
      <link>https://dev.to/jibinliu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jibinliu"/>
    <language>en</language>
    <item>
      <title>How to persist data in docker container</title>
      <dc:creator>Jibin Liu</dc:creator>
      <pubDate>Sat, 15 Sep 2018 07:13:35 +0000</pubDate>
      <link>https://dev.to/jibinliu/how-to-persist-data-in-docker-container-2m72</link>
      <guid>https://dev.to/jibinliu/how-to-persist-data-in-docker-container-2m72</guid>
      <description>

&lt;h2&gt;TL;DR&lt;/h2&gt;

&lt;p&gt;Containers are supposed to be lightweight. Baking unnecessary data into them makes them slow to create and run. Docker provides several ways to mount storage from the host machine into containers, and volumes are the most commonly used. A volume can persist application data and can also be shared between multiple containers. (Local volumes cannot be shared between Docker services, though; for that you need shared storage.)&lt;/p&gt;

&lt;h2&gt;Background&lt;/h2&gt;

&lt;p&gt;I heard about Docker and containers a while ago, but I'm new to using them. I only recently started exploring them because they make it easier to build web services and deploy them on multiple operating systems. (They are fantastic tools!)&lt;/p&gt;

&lt;p&gt;One of my web services creates, updates, and activates separate virtual environments, then runs a task in the requested one. Different requests sometimes need different virtual environments. The &lt;code&gt;requirements.txt&lt;/code&gt; file for each virtual environment is synced from time to time, and then &lt;code&gt;pip install&lt;/code&gt; is called to update the environment. &lt;code&gt;pip install&lt;/code&gt; can take a while, so it should be called as few times as possible. That means the web service needs to persist the virtual environments so that when it restarts, it doesn't have to repeat the create/update jobs.&lt;/p&gt;

&lt;p&gt;This raises a problem: every time a new image is built for the web service, it doesn't contain the virtual environments stored in the old container, so the service starts very cold. My first thought was to commit the changes from the old container into the new image. However, this greatly increases the size of the image and the container.&lt;/p&gt;

&lt;p&gt;After a few hours of digging through the Docker documentation, I realized that I had been thinking of containers as "fully self-contained", when they are actually more powerful when they work together with the host machine.&lt;/p&gt;

&lt;h2&gt;Solution&lt;/h2&gt;

&lt;p&gt;Docker provides three ways to mount data into a container: volumes, bind mounts, and tmpfs mounts [1].&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Volumes&lt;/strong&gt; are stored in a part of the host filesystem that Docker manages. Other applications should not modify that location.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bind mounts&lt;/strong&gt; can point anywhere on the host filesystem, and other applications can modify their contents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tmpfs mounts&lt;/strong&gt; live only in the host's memory and are never written to the host filesystem.&lt;/li&gt;
&lt;/ul&gt;
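
&lt;p&gt;To make the difference concrete, here is a minimal Python sketch that builds the value passed to &lt;code&gt;docker run --mount&lt;/code&gt; for each of the three types. The helper name &lt;code&gt;mount_flag&lt;/code&gt; and the example paths &lt;code&gt;/srv/data&lt;/code&gt;, &lt;code&gt;/app/data&lt;/code&gt;, and &lt;code&gt;/tmp/cache&lt;/code&gt; are made up for illustration; only the &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;source&lt;/code&gt;, and &lt;code&gt;target&lt;/code&gt; keys come from Docker's &lt;code&gt;--mount&lt;/code&gt; syntax.&lt;/p&gt;

```python
def mount_flag(kind, target, source=None):
    """Build the value passed to `docker run --mount` for one mount.

    Hypothetical helper for illustration; it only formats a string.
    """
    if kind not in ("volume", "bind", "tmpfs"):
        raise ValueError("unknown mount type: " + kind)
    parts = ["type=" + kind]
    if kind == "tmpfs":
        # tmpfs mounts live in memory, so they have no source on the host.
        if source is not None:
            raise ValueError("tmpfs mounts do not take a source")
    elif source is not None:
        # Volume name for type=volume, host path for type=bind.
        parts.append("source=" + source)
    elif kind == "bind":
        raise ValueError("bind mounts need a host path as source")
    parts.append("target=" + target)
    return ",".join(parts)


if __name__ == "__main__":
    print(mount_flag("volume", "/root/.virtualenv", source="virtualenv"))
    print(mount_flag("bind", "/app/data", source="/srv/data"))
    print(mount_flag("tmpfs", "/tmp/cache"))
```

&lt;p&gt;A volume mount without a &lt;code&gt;source&lt;/code&gt; is allowed here because Docker then creates an anonymous volume.&lt;/p&gt;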

&lt;p&gt;Generally speaking, volumes are the go-to solution for most data persistence problems in a container. A volume can be created explicitly with the &lt;code&gt;docker volume create&lt;/code&gt; command, or implicitly when starting a container.&lt;/p&gt;

&lt;h2&gt;Example: my solution&lt;/h2&gt;

&lt;p&gt;The Docker documentation for volumes is &lt;a href="https://docs.docker.com/v17.09/engine/admin/volumes/volumes/"&gt;here&lt;/a&gt; [2].&lt;/p&gt;

&lt;h4&gt;1. Create volume&lt;/h4&gt;

&lt;p&gt;First, let's create a volume named &lt;code&gt;virtualenv&lt;/code&gt; to store the virtual environments.&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker volume create virtualenv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can inspect the volume with the following command:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker volume inspect virtualenv
&lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"CreatedAt"&lt;/span&gt;: &lt;span class="s2"&gt;"2018-09-15T05:29:36Z"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Driver"&lt;/span&gt;: &lt;span class="s2"&gt;"local"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Labels"&lt;/span&gt;: &lt;span class="o"&gt;{}&lt;/span&gt;,
        &lt;span class="s2"&gt;"Mountpoint"&lt;/span&gt;: &lt;span class="s2"&gt;"/var/lib/docker/volumes/virtualenv/_data"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Name"&lt;/span&gt;: &lt;span class="s2"&gt;"virtualenv"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Options"&lt;/span&gt;: &lt;span class="o"&gt;{}&lt;/span&gt;,
        &lt;span class="s2"&gt;"Scope"&lt;/span&gt;: &lt;span class="s2"&gt;"local"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;2. Create container&lt;/h4&gt;

&lt;p&gt;The structure of the example app looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Dockerfile&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;main.py&lt;/code&gt;: the entrypoint&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;create_env.sh&lt;/code&gt; (used to create another virtual environment)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;main.py&lt;/code&gt; checks whether the virtual environment "my_env" exists and creates it if it doesn't. We're going to mount the volume created above as the &lt;code&gt;~/.virtualenv&lt;/code&gt; folder in the container.&lt;/p&gt;

&lt;p&gt;I use the following Dockerfile to create a minimal Python image:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.7&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . /app&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;virtualenv
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["python", "./main.py"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;code&gt;main.py&lt;/code&gt; looks like this:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;subprocess&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'/root/.virtualenv/my_env'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'my_env already exists'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s"&gt;'bash'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'create_env.sh'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'my_env created'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'__main__'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
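
&lt;p&gt;A slightly more defensive variant of this check might look for the interpreter inside the environment rather than the directory alone, and stop if the creation script fails. This is only a sketch: the function name &lt;code&gt;ensure_env&lt;/code&gt; and its parameters are my own invention, and it assumes the environment keeps its interpreter at &lt;code&gt;bin/python&lt;/code&gt;, as in the listing of the volume later in this post.&lt;/p&gt;

```python
import os
import subprocess

ENV_DIR = "/root/.virtualenv/my_env"


def ensure_env(env_dir=ENV_DIR, create_cmd=("bash", "create_env.sh")):
    """Create the environment if its interpreter is missing.

    Returns True when this call created the environment, False when it
    was already there.
    """
    # Looking for bin/python instead of the directory alone guards against
    # a half-created environment left behind by an interrupted run.
    if os.path.exists(os.path.join(env_dir, "bin", "python")):
        return False
    # check=True raises CalledProcessError if the command exits non-zero,
    # so a failed creation is never reported as a success.
    subprocess.run(list(create_cmd), check=True)
    return True
```

&lt;p&gt;In &lt;code&gt;main()&lt;/code&gt;, the return value of &lt;code&gt;ensure_env()&lt;/code&gt; would decide which of the two messages to print.&lt;/p&gt;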



&lt;p&gt;And the one-line &lt;code&gt;create_env.sh&lt;/code&gt;:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/.virtualenv/ &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; virtualenv my_env
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;3. Start container with volume mounted&lt;/h4&gt;

&lt;p&gt;First, we build the Python image:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker build &lt;span class="nt"&gt;-t&lt;/span&gt; docker-data-persistence &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Then, to mount the volume, we use the &lt;code&gt;--mount&lt;/code&gt; flag:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mount&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;virtualenv,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.virtualenv &lt;span class="se"&gt;\&lt;/span&gt;
  docker-data-persistence

Using base prefix &lt;span class="s1"&gt;'/usr/local'&lt;/span&gt;
New python executable &lt;span class="k"&gt;in&lt;/span&gt; /root/.virtualenv/my_env/bin/python
Installing setuptools, pip, wheel...done.
my_env created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;As we can see above, the first run of the container creates the virtual environment "my_env" because it doesn't exist in the volume yet. If we run it a second time, it reports that "my_env" already exists:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mount&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;virtualenv,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.virtualenv &lt;span class="se"&gt;\&lt;/span&gt;
  docker-data-persistence

my_env already exists
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
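
&lt;p&gt;The same volume can also help with the &lt;code&gt;pip install&lt;/code&gt; cost mentioned in the background. One possible approach, sketched below under my own assumptions, is to store a hash of &lt;code&gt;requirements.txt&lt;/code&gt; next to the environment and reinstall only when the hash changes. The function names and the stamp file name &lt;code&gt;.requirements.sha256&lt;/code&gt; are hypothetical; they are not part of the example app above.&lt;/p&gt;

```python
import hashlib
import os

STAMP_NAME = ".requirements.sha256"  # hypothetical file name


def requirements_digest(requirements_path):
    """Return the SHA-256 hex digest of a requirements file."""
    with open(requirements_path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def needs_install(env_dir, requirements_path):
    """Return True when pip install should run for this environment."""
    stamp_path = os.path.join(env_dir, STAMP_NAME)
    if not os.path.exists(stamp_path):
        # No record of a previous install in the volume.
        return True
    with open(stamp_path) as handle:
        recorded = handle.read().strip()
    return recorded != requirements_digest(requirements_path)


def record_install(env_dir, requirements_path):
    """Store the digest in the environment after a successful install."""
    with open(os.path.join(env_dir, STAMP_NAME), "w") as handle:
        handle.write(requirements_digest(requirements_path))
```

&lt;p&gt;Because the stamp file lives inside the mounted volume, it survives container restarts and image rebuilds just like the environment itself.&lt;/p&gt;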



&lt;h4&gt;4. Inspect the volume&lt;/h4&gt;

&lt;p&gt;We can look at the files in the volume (in a hacky way [3]) to verify its contents:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mount&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;virtualenv,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.virtualenv &lt;span class="se"&gt;\&lt;/span&gt;
  docker-data-persistence &lt;span class="se"&gt;\&lt;/span&gt;
  find /root/.virtualenv/my_env/bin

/root/.virtualenv/my_env/bin
/root/.virtualenv/my_env/bin/python3
/root/.virtualenv/my_env/bin/activate.csh
/root/.virtualenv/my_env/bin/easy_install-3.7
/root/.virtualenv/my_env/bin/python
/root/.virtualenv/my_env/bin/python-config
/root/.virtualenv/my_env/bin/easy_install
/root/.virtualenv/my_env/bin/python3.7
/root/.virtualenv/my_env/bin/activate
/root/.virtualenv/my_env/bin/pip
/root/.virtualenv/my_env/bin/activate.fish
/root/.virtualenv/my_env/bin/pip3
/root/.virtualenv/my_env/bin/wheel
/root/.virtualenv/my_env/bin/activate_this.py
/root/.virtualenv/my_env/bin/pip3.7
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;5. Delete the volume&lt;/h4&gt;

&lt;p&gt;To delete the volume, we can use &lt;code&gt;docker volume rm &amp;lt;volume-name&amp;gt;&lt;/code&gt;. However, a volume can't be deleted while any container uses it, even if that container has exited:&lt;/p&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;➤ docker volume &lt;span class="nb"&gt;rm &lt;/span&gt;virtualenv
Error response from daemon: remove virtualenv: volume is &lt;span class="k"&gt;in &lt;/span&gt;use - &lt;span class="o"&gt;[&lt;/span&gt;dc4425b806a67a9002d68703cdd9854feba44e43d591278b4eb2869f43c0da6d]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
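
&lt;p&gt;One way around this error is to remove the containers that reference the volume first. The sketch below is an assumption about how that cleanup could be scripted, not something from the original app: it lists the containers with the &lt;code&gt;volume&lt;/code&gt; filter of &lt;code&gt;docker ps&lt;/code&gt;, removes them with &lt;code&gt;docker rm&lt;/code&gt;, and then removes the volume. The function names are hypothetical. Be careful: this deletes the containers and all data in the volume.&lt;/p&gt;

```python
import subprocess


def containers_using_cmd(volume):
    """Command that lists the IDs of all containers (running or exited)
    that have the given volume mounted."""
    return ["docker", "ps", "--all", "--quiet", "--filter", "volume=" + volume]


def remove_volume(volume, run=subprocess.run):
    """Remove the containers that reference a volume, then the volume.

    `run` is injectable so the command sequence can be checked without
    a Docker daemon. Running containers must be stopped beforehand,
    because plain `docker rm` refuses to remove them.
    """
    listing = run(containers_using_cmd(volume), check=True,
                  capture_output=True, text=True)
    container_ids = listing.stdout.split()
    if container_ids:
        run(["docker", "rm"] + container_ids, check=True)
    run(["docker", "volume", "rm", volume], check=True)
    return container_ids
```

&lt;p&gt;Calling &lt;code&gt;remove_volume("virtualenv")&lt;/code&gt; on the host would run the three commands in order and return the IDs of the containers it removed.&lt;/p&gt;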



&lt;h2&gt;References&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[1] &lt;a href="https://docs.docker.com/v17.09/engine/admin/volumes/"&gt;https://docs.docker.com/v17.09/engine/admin/volumes/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;[2] &lt;a href="https://docs.docker.com/v17.09/engine/admin/volumes/volumes/"&gt;https://docs.docker.com/v17.09/engine/admin/volumes/volumes/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;[3] &lt;a href="https://stackoverflow.com/questions/34803466/how-to-list-the-content-of-a-named-volume-in-docker-1-9"&gt;https://stackoverflow.com/questions/34803466/how-to-list-the-content-of-a-named-volume-in-docker-1-9&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


</description>
      <category>docker</category>
      <category>container</category>
      <category>datapersistence</category>
    </item>
  </channel>
</rss>
