Adnan Rahić for Sematext

Posted on Mar 9, 2020 • Originally published at sematext.com

Running and Deploying Elasticsearch Operator on Kubernetes

#devops #kubernetes #elasticsearch #showdev

Have you ever grown tired of running the same kubectl commands again and again? Well, the good folks over at the Kubernetes team understand you. With the addition of custom resources and the operator pattern, you can now make use of extensions, or addons as I like to call them, to the Kubernetes API that help you manage applications and components.

Operators follow Kubernetes principles, including the control loop. The Operator Pattern sets out to help DevOps teams manage a service or set of services by automating repeatable tasks.

This article will show you the pros and cons of using the Operator Pattern versus StatefulSets, which I covered in our previous tutorial, Running and Deploying Elasticsearch on Kubernetes. It will also guide you through installing and running the Elasticsearch Operator on a Kubernetes cluster, and show you how to quickly set up basic monitoring with the Sematext Elasticsearch monitoring integration. You can also explore the Kubernetes monitoring integration on your own.

Keep in mind, there are no silver bullets. Both solutions are valid, but are useful for different scenarios. At Sematext we're using the StatefulSet approach, and it's working great for us.

The Elasticsearch Operator I'll be using in this tutorial is the official Operator from Elastic. It automates the deployment, provisioning, management, and orchestration of Elasticsearch on Kubernetes.

With that out of the way, let's jump into the tutorial!

What are Kubernetes Operators?

Operators are extensions to Kubernetes that use custom resources to manage applications. By using the CustomResourceDefinition (CRD) API resource, you can define custom resources. In this tutorial you'll learn how to create a custom resource in a separate namespace.

When you define a CRD object, it creates a new custom resource with a name and schema that you specify. What's so cool about this? Well, you don't have to write a custom configuration to handle the custom resource. The Kubernetes API does it all for you. It serves and handles the storage of your custom resource.
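To make that concrete, here's a minimal sketch of what a CRD manifest can look like. The Backup kind, the example.com group, and the schedule field are all hypothetical names picked for illustration; none of them are part of ECK. The sketch assumes Kubernetes 1.16 or newer, where the apiextensions.k8s.io/v1 API is available, and it only writes the manifest to a file so you can read it before applying anything.

```shell
# Write a hypothetical CRD manifest to a local file.
# "Backup", "example.com" and "schedule" are made-up names for illustration.
cat > backup-crd.yaml <<'EOF'
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name must be <plural>.<group>
  name: backups.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              schedule:
                type: string
EOF

# Apply it once you have a cluster to test against:
# kubectl apply -f backup-crd.yaml
```

Once a CRD like this is applied, the API server should serve the new resource type just like a built-in one, so a command such as kubectl get backups would work without any extra server-side code.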

The point of using the Operator Pattern is to help you, the DevOps engineer, automate repeatable tasks. It captures how you can write code to automate a task beyond what Kubernetes itself provides.
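If the control loop idea feels abstract, this toy sketch may help. It compares a desired replica count with an observed one and prints the action a controller would take. The reconcile function is a hypothetical name, and this is only an illustration of the idea, not how ECK or any real Operator is implemented.

```shell
# Toy reconcile step: compare desired state with observed state and
# print the action a controller would take to close the gap.
reconcile() {
  desired="$1"
  observed="$2"
  if [ "$observed" -lt "$desired" ]; then
    echo "scale up by $((desired - observed))"
  elif [ "$observed" -gt "$desired" ]; then
    echo "scale down by $((observed - desired))"
  else
    echo "in sync"
  fi
}

reconcile 3 1   # prints: scale up by 2
reconcile 3 3   # prints: in sync
```

A real controller runs a step like this over and over, reading the desired state from the custom resource and the observed state from the cluster.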

You deploy an Operator by adding the Custom Resource Definition and Controller to your cluster. The Controller will normally run outside of the control plane, much as you would run any containerized application. More about that a bit further down. Let me explain what the Elasticsearch Operator is first.

What is the Elasticsearch Operator?

The Elasticsearch Operator automates the process of managing Elasticsearch on Kubernetes.

There are a few different Elasticsearch Operators you can choose from. Some of them are made by active open-source contributors, but only one is written and maintained by Elastic.

I won't go into details about any of them except for the official ECK Operator built by Elastic. For the rest of this tutorial, I'll demo how to manage and run this particular Elasticsearch Operator.

ECK simplifies deploying the whole Elastic stack on Kubernetes, giving you tools to automate and streamline critical operations. You can add, remove, and update resources with ease. Like playing with Lego bricks, changing things around is incredibly simple. It also makes it much easier to handle operational and cluster administration tasks. Here's what it streamlines:

Managing multiple clusters
Upgrading versions
Scaling cluster capacity
Changing cluster configuration
Dynamically scaling storage
Scheduling backups

Why Use the Elasticsearch Operator: Pros and Cons?

When I first learned about the Operator Pattern, I had an overwhelming feeling of hype. I wanted it to be better than the "old" way. I was hoping the added automation would make managing and deploying applications on Kubernetes much easier. I was literally hoping it would be the same breakthrough as Helm.

In the end, it's not. Well, at least not yet. If you compare the stars of the most popular Helm charts that configure Elasticsearch StatefulSets versus the official Elasticsearch Operator, they're neck-and-neck. We still seem to be a bit conflicted about what to use.

Elasticsearch Operator vs. StatefulSet

The Elasticsearch Operator essentially creates an additional namespace that houses tools to automate the process of creating Elasticsearch resources in your default namespace. It's literally an addon you add to your Kubernetes system to handle Elasticsearch-specific resources.

This gives you more automation but also abstracts away things you might need more fine-tuned control over. Configuring your own StatefulSets can often be the better approach because this is the way the community is used to configuring Elasticsearch clusters. It also gives you more control.

However, the Operator can do things that are not available with plain StatefulSets. It uses Kubernetes resources in the background to automate your work with some additional features:

S3 snapshots of indexes
Automatic TLS - the operator automatically generates secrets
Spread loads across zones
Support for Kibana and Cerebro
Instrumentation with statsd
Secure by default, with encryption enabled and password protected
Official Operator maintained by Elastic

Why Use the Elasticsearch Operator?

If you want to get up and running quickly, choose the Operator. You'll get all of this out of the box:

Elasticsearch, Kibana and APM Server deployments
TLS certificates management
Safe Elasticsearch cluster configuration & topology changes
Persistent volumes usage
Custom node configuration and attributes
Secure settings keystore updates

However, keep in mind there are downsides.

Why Stay Away From the Elasticsearch Operator?

Like with any new and exciting tool, there are a few issues. The biggest one is that it's a totally new tool you need to learn. Here are my reasons for staying away from the Operator:

An additional tool to learn
Additional Kubernetes resources in a separate namespace to worry about
Additional resources create overhead
Less fine-tuned control

Most of what the Elasticsearch Operator offers is already available with prebuilt Helm charts.

With that out of the way, let's start building something!

How to Run and Deploy the Elasticsearch Operator on Kubernetes

Installing the Elasticsearch Operator is as simple as running one command. Don't believe me? Follow along and find out for yourself.

Prerequisites

To follow along with this tutorial you’ll need a few things first:

A Kubernetes cluster with role-based access control (RBAC) enabled.
- Ensure your cluster has enough resources available. If it doesn’t, scale your cluster by adding more Kubernetes Nodes. You’ll deploy a 3-Pod Elasticsearch cluster, so I’d suggest you have 3 Kubernetes Nodes with at least 4GB of RAM and 10GB of storage.
The kubectl command-line tool installed on your local machine, configured to connect to your cluster. You can read more about how to install kubectl in the official documentation.

Installing the Elasticsearch Operator

This command will install custom resource definitions and the Operator with RBAC rules:

kubectl apply -f https://download.elastic.co/downloads/eck/1.0.0/all-in-one.yaml
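Since the same manifest URL is used again when you delete the Operator at the end of this tutorial, it can help to keep the version in one place. Here's a small sketch of that idea. The variable names ECK_VERSION and ECK_MANIFEST are my own, not something ECK defines.

```shell
# Keep the ECK version in one variable so install and delete
# always point at the same manifest.
ECK_VERSION="1.0.0"
ECK_MANIFEST="https://download.elastic.co/downloads/eck/${ECK_VERSION}/all-in-one.yaml"

echo "$ECK_MANIFEST"

# Install the Operator:
# kubectl apply -f "$ECK_MANIFEST"
# Remove it again later:
# kubectl delete -f "$ECK_MANIFEST"
```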

Once you've installed the Operator, you can check the resources by running this command:

kubectl -n elastic-system get all

[Output]
NAME                     READY   STATUS   RESTARTS   AGE
pod/elastic-operator-0   1/1     Running   0         18s

NAME                             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/elastic-webhook-server   ClusterIP   10.96.52.149   <none>        443/TCP   19s

NAME                               READY   AGE
statefulset.apps/elastic-operator   1/1     19s

As you can see, the Operator lives in the elastic-system namespace. You can follow the logs of the Operator's StatefulSet with this command:

kubectl -n elastic-system logs -f statefulset.apps/elastic-operator

A better way of monitoring logs on a cluster-level is to add the Sematext Operator to collect these logs and send them to a central location, alongside performance metrics about your Elasticsearch cluster. It’s pretty straightforward.

kubectl apply -f https://raw.githubusercontent.com/sematext/sematext-operator/master/bundle.yaml

cat <<EOF | kubectl apply -f -
apiVersion: sematext.com/v1alpha1
kind: SematextAgent
metadata:
  name: sematext-agent
spec:
  region: <"US" or "EU">
  containerToken: YOUR_CONTAINER_TOKEN
  logsToken: YOUR_LOGS_TOKEN
  infraToken: YOUR_INFRA_TOKEN
EOF

These two commands are all you need, and you’re set to go. Next up, let's take a look at the CRDs that were created as well.

kubectl get crd

[Output]
NAME                                           CREATED AT
apmservers.apm.k8s.elastic.co                  2020-02-05T15:46:33Z
elasticsearches.elasticsearch.k8s.elastic.co   2020-02-05T15:46:33Z
kibanas.kibana.k8s.elastic.co                  2020-02-05T15:46:33Z

These are the APIs you'll have access to, in order to streamline the process of creating and managing Elasticsearch resources in your Kubernetes cluster. Next up, let's deploy an Elasticsearch cluster.

Deploying the Elasticsearch Cluster

Once the Operator is installed, you'll get access to the elasticsearch.k8s.elastic.co/v1 API. Now you can spin up an Elasticsearch server in no time. Run this command to create an Elasticsearch cluster with a single node:

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
EOF

Give it a minute to start. You can check the cluster health during the creation process:

kubectl get elasticsearch

[Output]
NAME           HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch   green    1       7.5.2     Ready   61s
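If you'd rather not re-run that command by hand, you could poll the health in a loop. This is a rough sketch, assuming the HEALTH column is backed by the .status.health field of the resource. The wait_for_health function name, the retry count, and the delay are all my own choices.

```shell
# Poll an ECK resource until its reported health is green.
# Usage: wait_for_health KIND NAME [TRIES] [DELAY_SECONDS]
# Assumes the health is exposed at .status.health on the resource.
wait_for_health() {
  kind="$1"
  name="$2"
  tries="${3:-30}"
  delay="${4:-5}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    health=$(kubectl get "$kind" "$name" -o jsonpath='{.status.health}' 2>/dev/null)
    if [ "$health" = "green" ]; then
      echo "$kind/$name is green"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $kind/$name" >&2
  return 1
}

# Example, once the cluster above has been created:
# wait_for_health elasticsearch elasticsearch
```

The function returns a non-zero exit code on timeout, so it can also gate later steps in a script.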

You now have a running Elasticsearch Pod, which is tied to a StatefulSet in the default namespace. Alongside this, you also have two Services you can expose to access the Pod.

kubectl get all

[Output]
NAME                             READY   STATUS   RESTARTS   AGE
pod/elasticsearch-es-default-0   1/1     Running   0         2m18s

NAME                               TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/elasticsearch-es-default   ClusterIP   None           <none>       <none>     2m18s
service/elasticsearch-es-http     ClusterIP   10.96.192.180   <none>        9200/TCP   2m19s
service/kubernetes                 ClusterIP   10.96.0.1       <none>        443/TCP   2d2h

NAME                                       READY   AGE
statefulset.apps/elasticsearch-es-default   1/1     2m18s

To make sure your Pod is working, check its logs:

kubectl logs elasticsearch-es-default-0
...

If you see logs streaming in, you know it's working. Both Services are of type ClusterIP (the elasticsearch-es-default one is headless, so it has no IP of its own), and credentials are generated for you automatically.

First, open another terminal window and expose the elasticsearch-es-http Service there, so you can access it from your local machine:

kubectl port-forward service/elasticsearch-es-http 9200

A default user named elastic is automatically created with the password stored in a Kubernetes secret. Back in your initial terminal window, run this command to retrieve the password:

PASSWORD=$(kubectl get secret elasticsearch-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode)
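The base64 --decode step is there because Kubernetes stores Secret values base64-encoded. Keep in mind that this is an encoding, not encryption. Here's a tiny round trip with a made-up value (example-password is not a real credential) that shows what the pipe is doing:

```shell
# Encode a made-up value the way it would be stored in a Secret,
# then decode it again like the command above does.
encoded=$(printf '%s' 'example-password' | base64)
decoded=$(printf '%s' "$encoded" | base64 --decode)

echo "$decoded"   # prints: example-password
```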

Use curl to test the endpoint:

curl -u "elastic:$PASSWORD" -k "https://localhost:9200"

[Output]
{
   "name" : "elasticsearch-es-default-0",
   "cluster_name" : "elasticsearch",
   "cluster_uuid" : "7auDvcXLTwqLmXfBcAXIqg",
   "version" : {
       "number" : "7.5.2",
       "build_flavor" : "default",
       "build_type" : "docker",
       "build_hash" : "8bec50e1e0ad29dad5653712cf3bb580cd1afcdf",
       "build_date" : "2020-01-15T12:11:52.313576Z",
       "build_snapshot" : false,
       "lucene_version" : "8.3.0",
       "minimum_wire_compatibility_version" : "6.8.0",
       "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
   "tagline" : "You Know, for Search"
}
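If you plan to poke at more endpoints, a small wrapper saves some typing. This is just a convenience sketch: es_get is a hypothetical helper name, and it assumes the port-forward from above is still running and that PASSWORD is set in your shell.

```shell
# Thin wrapper around curl for the port-forwarded cluster.
# $1 is an API path such as /_cluster/health
# -k skips certificate verification, same as the curl command above.
es_get() {
  curl -s -k -u "elastic:${PASSWORD}" "https://localhost:9200$1"
}

# Examples, with the port-forward running:
# es_get '/_cluster/health?pretty'
# es_get '/_cat/nodes?v'
```

The _cluster/health and _cat/nodes endpoints are handy later on, for instance to confirm that new nodes have joined after you scale the cluster.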

Hey presto! It works. This might be good for starters, but the cluster only has one Pod. Let's spice things up a bit and add a few more.

Upgrade and Configure the Elasticsearch Cluster

Any edits you make to the configuration will automatically upgrade the cluster. The Operator will try to apply all the configuration changes you give it, except for existing volume claims, which cannot be resized. Make sure your Kubernetes cluster has enough resources to handle any resizing you do.

If you want to have 3 Pods, run this command:

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
EOF

This will bump up the Pod count. Check out this sample to see all the configuration options. Let's check if our Pods have updated:

kubectl get all

[Output]
NAME                             READY   STATUS   RESTARTS   AGE
pod/elasticsearch-es-default-0   1/1     Running   0         25m
pod/elasticsearch-es-default-1   1/1     Running   0         3m8s
pod/elasticsearch-es-default-2   1/1     Running   0         2m46s

NAME                               TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/elasticsearch-es-default   ClusterIP   None           <none>       <none>     25m
service/elasticsearch-es-http     ClusterIP   10.96.192.180   <none>        9200/TCP   25m
service/kubernetes                 ClusterIP   10.96.0.1       <none>        443/TCP   2d2h

NAME                                       READY   AGE
statefulset.apps/elasticsearch-es-default   3/3     25m

Awesome! Our cluster is starting to look nice! By default, the cluster you deployed only allocates a persistent volume of 1 GB for storage, using the default storage class defined for the Kubernetes cluster.

Here's a sample of what adding more storage looks like:

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 4Gi
        storageClassName: standard
EOF

You'll most likely want to have more control over this for production workloads. Check out the Volume claim templates for more information.
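To see which claims a template like this should produce, it helps to know the naming scheme. This sketch assumes the usual StatefulSet convention of claim template name plus Pod name, combined with the Pod names shown in the output earlier. The pvc_names function is a hypothetical helper, so treat the result as a guess to compare against kubectl get pvc.

```shell
# Print the PersistentVolumeClaim names you would expect for a nodeSet,
# assuming the StatefulSet convention <claim template>-<pod name>.
# Usage: pvc_names CLAIM_TEMPLATE CLUSTER NODESET COUNT
pvc_names() {
  claim="$1"
  cluster="$2"
  nodeset="$3"
  count="$4"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "${claim}-${cluster}-es-${nodeset}-${i}"
    i=$((i + 1))
  done
}

pvc_names elasticsearch-data elasticsearch default 3
# prints:
# elasticsearch-data-elasticsearch-es-default-0
# elasticsearch-data-elasticsearch-es-default-1
# elasticsearch-data-elasticsearch-es-default-2
```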

How to Run and Deploy Kibana with the Elasticsearch Operator

This Operator is called ECK (Elastic Cloud on Kubernetes) for a reason. It covers more than Elasticsearch and comes packaged with Kibana. In one of the sections above we ran this command:

kubectl get crd

[Output]
NAME                                           CREATED AT
apmservers.apm.k8s.elastic.co                  2020-02-05T15:46:33Z
elasticsearches.elasticsearch.k8s.elastic.co   2020-02-05T15:46:33Z
kibanas.kibana.k8s.elastic.co                  2020-02-05T15:46:33Z

Check it out. You have a kibana.k8s.elastic.co/v1 API as well. This is what you'll use to create your Kibana instance.

Go ahead and specify a Kibana instance and reference your Elasticsearch cluster:

cat <<EOF | kubectl apply -f -
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
EOF

Give it a second to spin up the Pod. Similar to Elasticsearch, you can retrieve details about Kibana instances with this simple command:

kubectl get kibana

[Output]
NAME           HEALTH   NODES   VERSION   AGE
elasticsearch   green    1       7.5.2     2m31s

Wait until the health is green, then check the Pods:

kubectl get pod --selector='kibana.k8s.elastic.co/name=elasticsearch'

[Output]
NAME                               READY   STATUS   RESTARTS   AGE
elasticsearch-kb-5f568dcdb6-xd55w   1/1     Running   0         3m19s

When the Pods are up and running as well, you can go ahead and set up access to Kibana. A ClusterIP Service is automatically created for Kibana:

kubectl get service elasticsearch-kb-http

[Output]
NAME                   TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
elasticsearch-kb-http   ClusterIP   10.96.199.44   <none>        5601/TCP   4m24s

Once again, open up another terminal window, and use kubectl port-forward to access Kibana from your local machine:

kubectl port-forward service/elasticsearch-kb-http 5601

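Open https://localhost:5601 in your browser. Expect a certificate warning, since the certificate is self-signed by default (the same reason the earlier curl command used -k). Log in as the elastic user. Get the password with this command: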

kubectl get secret elasticsearch-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode; echo

Once you're signed in, you'll see the Kibana quickstart screen.

There you have it. You've added a Kibana instance to your Kubernetes cluster.

Cleaning Up and Deleting the Elasticsearch Operator

With all resources installed and working, you should see something like this when running kubectl get all:

NAME                                   READY   STATUS   RESTARTS   AGE
pod/elasticsearch-es-default-0          1/1     Running   0         13m
pod/elasticsearch-kb-5f568dcdb6-xd55w   1/1     Running   0         11m

NAME                               TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/elasticsearch-es-default   ClusterIP   None           <none>       <none>     13m
service/elasticsearch-es-http     ClusterIP   10.96.168.225   <none>        9200/TCP   13m
service/elasticsearch-kb-http     ClusterIP   10.96.199.44   <none>        5601/TCP   11m
service/kubernetes                 ClusterIP   10.96.0.1       <none>        443/TCP   2d3h

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/elasticsearch-kb   1/1     1            1           11m

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/elasticsearch-kb-5f568dcdb6   1         1         1       11m

NAME                                       READY   AGE
statefulset.apps/elasticsearch-es-default   1/1     13m

Way to go, you've configured an Elasticsearch cluster with Kibana using the Elasticsearch Operator! But, what if you need to delete resources? Easy. Run two commands and you're done.

First, delete all Elastic resources from all namespaces:

kubectl delete elastic --all --all-namespaces

Then, delete the Operator itself:

kubectl delete -f https://download.elastic.co/downloads/eck/1.0.0/all-in-one.yaml

That's it, all clean!
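If you want to double-check the cleanup, you could count the CRDs that still belong to Elastic. This is a small sketch. count_elastic_crds is a hypothetical helper that simply counts input lines containing k8s.elastic.co, which matches the CRD names listed earlier.

```shell
# Count lines on stdin that mention an Elastic CRD group.
# grep -c exits non-zero when the count is 0, so "|| true" keeps
# the function from failing in scripts that use "set -e".
count_elastic_crds() {
  grep -c 'k8s\.elastic\.co' || true
}

# Example against a live cluster:
# kubectl get crd -o name | count_elastic_crds
```

A result of 0 suggests the Operator's CRDs are gone. Anything higher means some definitions are still registered.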

Final Thoughts About the Elasticsearch Operator

In this tutorial you've learned about the Kubernetes Operator pattern, and how to run and deploy the Elasticsearch Operator on a Kubernetes cluster. You've also scaled up the number of Elasticsearch Pods on the cluster, and installed Kibana.

With this knowledge on top of what you learned in part 1 of this series, you can make a decision whether to use a Helm chart with StatefulSets or the Elasticsearch Operator.

Why bother learning Operators?

In the last year we've witnessed a huge increase in popularity for the Operator Pattern. Right now, the official Elasticsearch Operator has the same number of stars on GitHub as the most popular Elasticsearch Helm chart. This popularity will seemingly continue to grow.

What can you do now? Contribute! Learn even more about Kubernetes, and give back to the community. These projects are open-source for a reason. Help them grow!

Hope you guys and girls enjoyed reading this as much as I enjoyed writing it. If you liked it, feel free to hit the share button so more people will see this tutorial. Until next time, be curious and have fun.
