DEV Community


Mount S3 Objects to Kubernetes Pods

Ant(on) Weiss on January 31, 2022

This post describes how to mount an S3 bucket to all the nodes in an EKS cluster and make it available to pods as a hostPath volume. Yes, we're awa...
Eldad Assis

Cool setup. Have you tested its speed?

Ant(on) Weiss

No, speed wasn't a consideration here. The main motivation was providing an easy and transparent way to upload files and make them accessible to pods. S3 gives users an easy and secure UI for that. Goofys is supposedly quite performant compared to other FUSE implementations (e.g. s3fs), but we haven't benchmarked this ourselves.

Eldad Assis

Thx! Would love to know numbers if you ever do try it :-)

Oleksii Smiichuk

Hi, can I set multiple bucketNames?
I need to interact with a few S3 buckets for different tasks.

Ant(on) Weiss

Hi @oleksiihead,
no support for this right now.
To add it, one would need to do something like:

  1. modify the Dockerfile to replace the container startup command with an entrypoint script that mounts the buckets in a loop.
  2. modify the Helm chart to receive a dictionary of bucket names and mount points and pass these into the DaemonSet.

If you get to do this - please submit a PR.
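The two steps above could be sketched roughly like this. Note this is a hypothetical entrypoint, not part of the current chart: the `BUCKETS` variable and the `bucketName:mountPath` pair format are my assumptions for how the Helm chart might render its values dictionary.

```shell
#!/bin/sh
# Hypothetical entrypoint.sh: expects BUCKETS as a space-separated list of
# "bucketName:mountPath" pairs, e.g. "uploads:/var/s3fs-uploads logs:/var/s3fs-logs",
# rendered by the Helm chart from a dictionary of buckets and mount points.
set -e

for pair in $BUCKETS; do
  bucket="${pair%%:*}"       # part before the first colon
  mount_point="${pair#*:}"   # part after the first colon
  mkdir -p "$mount_point"
  # -f keeps goofys in the foreground; background each mount, then wait on all
  ./goofys --region "$AWS_REGION" -f "$bucket" "$mount_point" &
done

wait
```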

Lidor Ettinger

Amazing approach!
Thanks for sharing it in detail.

Big Bunny • Edited

I tried to use the shared method to complete the entire demo, but unfortunately it didn't work, because the goofys mount directory (/var/s3fs) in the DaemonSet is not the same as the directory I want to share with the host (/var/s3fs:shared):

/otomato # df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/nvme0n1p1           50.0G      4.7G     45.3G   9% /var/s3fs:shared
poc-s3goofys-source       1.0P         0      1.0P   0% /var/s3fs

Is there any configuration I missed?

Daemonset.yaml


apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: s3-mounter
  name: s3-mounter
  namespace: otomount
spec:
  selector:
    matchLabels:
      app: s3-mounter
  template:
    metadata:
      labels:
        app: s3-mounter
    spec:
      serviceAccountName: s3-mounter
      containers:
      - name: mounter 
        image: otomato/goofys
        securityContext:
          privileged: true
        command: ["/bin/sh"]
        args: ["-c", "mkdir -p /var/s3fs && ./goofys --region xxxxx -f poc-s3goofys-source /var/s3fs"]
        volumeMounts:
          - name: devfuse
            mountPath: /dev/fuse
          - name: mntdatas3fs
            mountPath: /var/s3fs:shared
      volumes:
        - name: devfuse
          hostPath:
            path: /dev/fuse
        - name: mntdatas3fs
          hostPath:
            path: /mnt/s3data
Ant(on) Weiss

What's your node OS? Is mount propagation enabled in the container runtime? See this note here: kubernetes.io/docs/concepts/storag...
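One thing worth checking in the config above: Kubernetes does not parse a `:shared` suffix out of `mountPath`; mount propagation is configured through the separate `mountPropagation` field on the volumeMount. A sketch of the relevant part of the DaemonSet spec, reusing the paths from the config in question:

```yaml
        volumeMounts:
          - name: devfuse
            mountPath: /dev/fuse
          - name: mntdatas3fs
            mountPath: /var/s3fs
            # Bidirectional propagation makes mounts created inside the
            # container visible on the host (requires a privileged container)
            mountPropagation: Bidirectional
```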

dirai09

I have tried this and the other similar option mentioned in this blog: blog.meain.io/2020/mounting-s3-buc.... In neither case was mounting to the hostPath successful on a cluster managed by AWS EKS.

Ant(on) Weiss

Hi @dirai09, this was originally tested on AWS EKS. I haven't tested it since, but it should in theory still work. What is the error you're getting when trying to mount the hostPath?
Also - can you share your config in a gist?

Randy Gupta • Edited

Nice approach. However, you might want to have a look at JuiceFS:

github.com/juicedata/juicefs

It performs quite well thanks to its combination with Redis, and it is made with Kubernetes in mind.

Sarav AK

Thanks for the wonderful suggestion @randy

behrooz hasanbeygi

With a high number of files it will fail you, due to the nature of the S3 API: for small files, the HTTP response overhead will be bigger than the files themselves.

I think mounting S3 is a bad idea. If you have enough development resources, it's better to write a client in your code that connects directly to S3 and caches the list of S3 files ... for better performance.
But it's a fun thing to do. Also, CephFS with the RADOS Gateway will give you better performance in Kubernetes.

Ant(on) Weiss

Good to know. Not an issue in our case - we have a small number of large files there. And I agree it's not such a great idea in general - both performance-wise and because of the hidden complexity. But it solved our specific itch and may help others solve theirs.

dirai09

Hi,
I don't think I am able to mount the volumes on the hostPath. Am I missing something here?

Jane

I ran into an issue where goofys doesn't reload the content of a small txt file. It updates the timestamp, though. Do you know what could be wrong?
I run goofys inside a container.