Ceph S3 Object Storage from Fluentd (EFK stack)
Fluentd's documentation can be hard to follow when it comes to using Ceph's S3-compatible object storage as a log destination. This post shows how to store logs in Ceph's S3 object storage using Fluentd.
Ceph Storage with Rook
Follow the steps provided in Rook’s Github documentation for setting up Rook with Ceph storage.
https://github.com/rook/rook/blob/master/Documentation/ceph-quickstart.md
Setting Up EFK stack on Kubernetes cluster
The easiest way is to clone the official Kubernetes git repo:
git clone https://github.com/kubernetes/kubernetes.git
Navigate to kubernetes/cluster/addons/fluentd-elasticsearch/ to find the deployment YAMLs for:
- Elasticsearch (StatefulSet)
- Fluentd (DaemonSet)
- Kibana
cd kubernetes/cluster/addons/fluentd-elasticsearch/
kubectl create -f es-service.yaml
kubectl create -f es-statefulset.yaml
kubectl create -f fluentd-es-configmap.yaml
kubectl create -f fluentd-es-ds.yaml
kubectl create -f kibana-deployment.yaml
kubectl create -f kibana-service.yaml
**Note:** For development/testing purposes you can edit kibana-service.yaml and set the Service `type` to **NodePort** to expose the Kibana dashboard outside the cluster.
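As a sketch, the relevant part of kibana-service.yaml would then look like this (only `type` changes; the other fields reflect the upstream manifest):

```yaml
# kibana-service.yaml (excerpt)
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
spec:
  type: NodePort   # default is ClusterIP; NodePort exposes it on every node
  ports:
    - port: 5601
      protocol: TCP
      targetPort: ui
  selector:
    k8s-app: kibana-logging
```

After applying, `kubectl -n kube-system get svc kibana-logging` shows the assigned node port.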
The out_s3 output plugin writes records to the Amazon S3 cloud object storage service; by default, it creates files on an hourly basis.
Besides AWS itself, Fluentd's out_s3 also supports S3-compatible object storage implementations. Ceph provides S3-compatible object storage functionality with an interface that covers a large subset of the Amazon S3 RESTful API.
Installation
out_s3 is included in td-agent by default.
**Note:** Fluentd gem users need to install the fluent-plugin-s3 gem (`fluent-gem install fluent-plugin-s3`). For details, refer to the Plugin Management article.
Example Configuration
This config pushes the logs of all services running in the cluster to Ceph's S3 object storage in JSON format:
<match **>
  @type s3
  aws_key_id CEPH_S3_KEY_ID
  aws_sec_key CEPH_S3_SECRET_KEY
  s3_bucket CEPH_S3_BUCKET_NAME
  s3_endpoint CEPH_S3_URL_WITH_STORE_NAME
  # some Ceph setups also need path-style bucket addressing:
  # force_path_style true
  path logs
  # objects are gzipped by default; store_as json keeps them as plain JSON
  store_as json
  <buffer tag,time>
    @type file
    path /var/log/fluent/s3
    timekey 3600 # 1-hour partition
    timekey_wait 10m
    timekey_use_utc true
    chunk_limit_size 256m
  </buffer>
</match>
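The bucket credentials used above come from a Rook object store user. Assuming the default my-store object store from the Rook docs, a user can be created with a manifest like this (a sketch; the names follow the Rook examples):

```yaml
# object-user.yaml -- creates an S3 user on the my-store object store;
# Rook then generates a Secret named rook-ceph-object-user-my-store-my-user
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  name: my-user
  namespace: rook-ceph
spec:
  store: my-store
  displayName: fluentd-logs-user
```

Apply it with `kubectl create -f object-user.yaml`; the generated Secret holds the AccessKey and SecretKey used below.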
You can connect to Ceph's S3 using the s3cmd tool.
Install s3cmd:
sudo apt-get update
sudo apt-get install s3cmd
To consume the S3 storage, export the connection details:
export AWS_HOST=<host>
export AWS_ENDPOINT=<endpoint>
export AWS_ACCESS_KEY_ID=<accessKey>
export AWS_SECRET_ACCESS_KEY=<secretKey>
Host: The DNS host name where the rgw service is found in the cluster. Assuming you are using the default rook-ceph cluster, it will be rook-ceph-rgw-my-store.rook-ceph.
Endpoint: The endpoint where the rgw service is listening. Run kubectl -n rook-ceph get svc rook-ceph-rgw-my-store, then combine the clusterIP and the port.
Access key: kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep AccessKey | awk '{print $2}' | base64 --decode
Secret key: kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep SecretKey | awk '{print $2}' | base64 --decode
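The base64 --decode step matters because Kubernetes Secrets store their data base64-encoded. A minimal local illustration of that step (the encoded string is a sample, not a real key):

```shell
# Secrets hold base64-encoded values; decode them before exporting.
ENCODED_KEY='QUtJQUVYQU1QTEU='   # sample value, not a real access key
echo "$ENCODED_KEY" | base64 --decode
# prints: AKIAEXAMPLE
```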
**s3cmd:** listing files in the S3 bucket
s3cmd ls s3://S3_BUCKET_NAME --no-ssl --host=$AWS_HOST
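Instead of passing --host on every call, s3cmd can also read the connection details from a ~/.s3cfg file. A minimal sketch, assuming the host and keys gathered above (host_bucket is shown path-style here; adjust it if your RGW serves virtual-host-style buckets):

```ini
[default]
access_key = <accessKey>
secret_key = <secretKey>
host_base = <host>
host_bucket = <host>/%(bucket)
use_https = False
```

With this in place, `s3cmd ls s3://S3_BUCKET_NAME` works without extra flags.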
Summary
We have deployed the EFK stack on Kubernetes alongside a Rook-managed Ceph storage cluster, created a Ceph object store, configured Fluentd's out_s3 plugin to push logs to it, and accessed the storage with the s3cmd tool.