Horizontal scaling of Rails apps made simple
Note: This article was originally published in April 2020
Sidekiq + k8s
Running Rails applications with Sidekiq in Kubernetes allows for the decoupling of background and web processes to take advantage of Kubernetes’ inherent scalability. A typical implementation would look something like this:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: redis-secret
type: Opaque
data:
  REDIS_PASSWORD: Zm9vYmFy
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-rails-app-web
  labels:
    app.kubernetes.io/name: cool-rails-app-web
    workload-type: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: cool-rails-app-web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: cool-rails-app-web
    spec:
      containers:
      - name: cool-rails-app
        image: bgroupe/cool-rails-app:latest
        command: ["bundle"]
        args:
        - "exec"
        - "puma"
        - "-b"
        - "tcp://0.0.0.0:3000"
        - "-t"
        - "1:1"
        - "-w"
        - "12"
        - "--preload"
        env:
        - name: REDIS_HOST
          value: redis
        - name: REDIS_ADDRESS
          value: redis:6379
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: REDIS_PASSWORD
        ports:
        - name: http
          containerPort: 3000
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /lbcheck
            port: http
        readinessProbe:
          httpGet:
            path: /lbcheck
            port: http
      imagePullSecrets:
      - name: mysecretkey
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-rails-app-sidekiq
  labels:
    app.kubernetes.io/name: cool-rails-app-sidekiq
    workload-type: sidekiq
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: cool-rails-app-sidekiq
  template:
    metadata:
      labels:
        app.kubernetes.io/name: cool-rails-app-sidekiq
    spec:
      containers:
      - name: cool-rails-app
        image: bgroupe/cool-rails-app:latest
        command: ["bundle"]
        args:
        - "exec"
        - "sidekiq"
        - "-q"
        - "cool_work_queue"
        - "-i"
        - "0"
        env:
        - name: REDIS_HOST
          value: redis
        - name: REDIS_ADDRESS
          value: redis:6379
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: REDIS_PASSWORD
      imagePullSecrets:
      - name: mysecretkey
```
Sidekiq is multi-threaded, and its default of 25 threads is ample for most use cases. When throughput increases, however, we may need to scale the number of processes out horizontally while leaving currently processing jobs undisturbed. Kubernetes provides the Horizontal Pod Autoscaler (HPA) controller and resource out of the box to scale pods based on resource metrics collected by the metrics-server add-on. It also supports custom metrics by way of adapters, the most popular being the Prometheus adapter. Using that technique, if we wanted to scale our Sidekiq workers based on aggregated queue size, we would typically need to:
1. Install and configure Prometheus somewhere.
2. Install and configure a Prometheus Redis exporter pointed at our Sidekiq Redis instance, making sure it exposes the key and list length for each Sidekiq queue we want to monitor.
3. Install the k8s-prometheus-adapter in our cluster and configure it to adapt metrics from our Prometheus server.
4. Deploy an HPA spec with a custom metric targeted at our Prometheus metrics adapter.
As a general rule it's wise to set up Prometheus monitoring anyway, but it is a considerable amount of work, with many moving pieces to maintain, just to periodically check the length of a Redis list.
Enter: KEDA
KEDA, or Kubernetes-based Event Driven Autoscaling, is a lightweight operator for HPAs that acts as a metrics server, adapting events from a myriad of data sources. Sidekiq stores enqueued jobs in a Redis list and, luckily, there is a KEDA scaler specifically for scaling based on the length of a Redis list. To use KEDA, you create a custom resource called a ScaledObject, with a dead-simple spec. When the ScaledObject is registered, the KEDA operator generates an HPA targeting your deployment. These are considerably fewer pieces to achieve the same effect.
KEDA is fairly straightforward to install, and there is very little customization required. I prefer to install with Helm, but you can also install via the manifest examples provided in the KEDA GitHub repo:

```shell
git clone https://github.com/kedacore/keda && cd keda
kubectl apply -f deploy/crds/keda.k8s.io_scaledobjects_crd.yaml
kubectl apply -f deploy/crds/keda.k8s.io_triggerauthentications_crd.yaml
kubectl apply -f deploy/
```
This will install the operator and register the ScaledObject CRD, along with an additional CRD called TriggerAuthentication, which is used for providing auth mechanisms to the operator and reusing credentials between ScaledObjects.
The Setup
Creating the scalers
```yaml
apiVersion: keda.k8s.io/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
spec:
  secretTargetRef:
  - parameter: password
    name: redis-secret
    key: REDIS_PASSWORD
---
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: sidekiq-worker
  labels:
    app: cool-rails-app-sidekiq
    deploymentName: cool-rails-app-sidekiq
spec:
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 1
  maxReplicaCount: 10
  scaleTargetRef:
    deploymentName: cool-rails-app-sidekiq
  triggers:
  - type: redis
    metadata:
      address: REDIS_ADDRESS
      listName: queue:cool_work_queue
      listLength: "500"
    authenticationRef:
      name: redis-auth
```
Let's say we have a shared Redis database that multiple applications connect to, protected with a password. Authentication can be provided directly on the ScaledObject, but if we store our credentials in a Kubernetes secret, we can use a TriggerAuthentication object to delegate auth and share the same mechanism between multiple scaling resources. Here, our TriggerAuthentication resource references a secret called redis-secret, which contains a REDIS_PASSWORD key; that is all we need to authenticate to Redis. In the ScaledObject, we reference the TriggerAuthentication resource with the authenticationRef key.
Now for the ScaledObject: KEDA supports scaling both Kubernetes deployment and job resources. Since Sidekiq runs as a deployment, our ScaledObject configuration is very simple:
```yaml
spec:
  # How often to check the data source.
  pollingInterval: 30
  # How long to wait after the last trigger fired before scaling
  # back down to the minimum replica count.
  cooldownPeriod: 300
  # The minimum number of replicas desired for the deployment.
  # (Note: KEDA supports scaling to/from 0 replicas.)
  minReplicaCount: 1
  # The maximum number of replicas to scale to.
  maxReplicaCount: 10
  # The deployment we want to scale.
  scaleTargetRef:
    deploymentName: cool-rails-app-sidekiq
```
The trigger portion contains our data source and scaler type. This is also where you could supply a Redis password directly for authentication. The only tricky part: these sensitive values must be env vars referenced by the container of the target deployment.
```yaml
triggers:
- type: redis
  metadata:
    address: REDIS_ADDRESS # name of an env var in host:port format
    listName: queue:cool_work_queue
    listLength: "500"
  authenticationRef:
    name: redis-auth
```
The key that Sidekiq writes for the queue list is prefixed with queue:, and the queue name is declared when the Sidekiq process is started. Let's say our jobs are relatively fast, so we only need to start scaling when our queue hits 500. The list length must be declared as a quoted string, or the CRD validations will fail on creation.
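As a quick illustration, the trigger above boils down to building the queue:<name> key and comparing its list length against the threshold. This is a pure-Ruby sketch with illustrative helper names (queue_key, scale_up?), not a real Redis connection; a live check would run the equivalent of redis.llen(queue_key("cool_work_queue")):

```ruby
# Sketch of the check KEDA's redis scaler performs on each polling interval.
def queue_key(queue_name)
  # Sidekiq stores each queue as a Redis list under this key.
  "queue:#{queue_name}"
end

def scale_up?(list_length, threshold: 500)
  list_length >= threshold
end

puts queue_key("cool_work_queue") # => "queue:cool_work_queue"
puts scale_up?(120)               # => false: queue is short, keep current replicas
puts scale_up?(900)               # => true: queue has backed up, add workers
```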
Let's apply our resources and watch the KEDA operator generate an HPA on our behalf:
```shell
kubectl apply -f scaled-object-with-trigger-auth.yaml

kubectl get hpa
NAME                      REFERENCE                           TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
keda-hpa-sidekiq-worker   Deployment/cool-rails-app-sidekiq   0/500 (avg)   1         10        1          10s
```
That's it. We now have an HPA managed by KEDA that will scale our Sidekiq workers on queue length. Any changes to the HPA, like editing the list length, are made by applying the ScaledObject; it's that simple.
Testing
To see it in action, we can generate load on our Sidekiq instance with a fake job. The job will be acked, sleep for a random amount of time, and print a message.
```ruby
class AckTest
  include Sidekiq::Worker

  sidekiq_options queue: :cool_work_queue

  def perform(msg)
    puts "ACK:"
    sleep rand(180)
    puts "MSG: #{msg}"
  end
end
```
To run this, open a Rails console in your web pod and paste in the class definition. Then enqueue a large batch:

```ruby
1000.times { AckTest.perform_async("Let's scale up") }
```
Within the 30-second polling interval, you should see the HPA fire up a handful of extra Sidekiq pods, which will start pulling work off the queue. Once the queue drops back below the threshold, the extra pods stick around for the cool-down period (in this case, 5 minutes) before being scaled back down.
Tuning
Now that we can spin up Sidekiq workers roughly based on throughput, our worker pods will be spinning up and tearing down dynamically. The HPA scales using the following algorithm:

```
desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
```
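As a sanity check, the formula is easy to evaluate by hand. A small Ruby sketch, with illustrative numbers assuming our target of 500 jobs per replica:

```ruby
# Evaluate the HPA scaling formula for a few queue depths.
def desired_replicas(current_replicas, current_metric, desired_metric)
  (current_replicas * (current_metric.to_f / desired_metric)).ceil
end

puts desired_replicas(1, 1000, 500) # => 2: queue at 1000 doubles the workers
puts desired_replicas(4, 250, 500)  # => 2: as the queue drains, workers scale back down
```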
It stands to reason that, depending on how long the average job takes to complete, the HPA will begin scaling workers back down as the queue drains. To ensure we are not terminating processes in the middle of performing work, we need to add some buffer time to the shutdown. We have a couple of options:
- We can wait an arbitrary amount of time for the worker to finish processing before it receives a SIGTERM. This is achieved by adding a terminationGracePeriodSeconds field to the deployment spec and using our best guess for how long to delay the termination signal.
- The preferable option is to delegate shutdown to Sidekiq's internal mechanism. In Kubernetes, this is done by adding a pre-stop hook: we tell the Sidekiq process to stop accepting jobs from the queue and attempt to complete only the jobs it is currently performing, within a given amount of time. We can also set a process-wide shutdown timeout on startup.
Our deployment previously started Sidekiq like this:
```yaml
spec:
  containers:
  - name: cool-rails-app
    image: bgroupe/cool-rails-app:latest
    command: ["bundle"]
    args:
    - "exec"
    - "sidekiq"
    - "-q"
    - "cool_work_queue"
    - "-i"
    - "0"
```
We need to add a few more options. The first is the timeout option (-t), which specifies how long workers are allowed to finish their jobs during shutdown; let's set it to 60 seconds. The second is a pidfile path (-P). Since we run only one Sidekiq process per container, specifying the name and path of the pidfile lets us reference it later in the shutdown process without having to search the file system.
```yaml
    ...
    - "-P"
    - "/tmp/sidekiq.pid"
    - "-t"
    - "60"
```
Let's add the pre-stop hook under the lifecycle options of the container spec:
```yaml
spec:
  containers:
  - name: cool-rails-app
    image: bgroupe/cool-rails-app:latest
    lifecycle:
      preStop:
        exec:
          command:
          - "sidekiqctl"
          - "stop"
          - "/tmp/sidekiq.pid"
          - "120"
```
The final argument supplied to the sidekiqctl stop command is the kill timeout: the overall deadline after which the Sidekiq process is forcibly stopped. This obviously needs to be longer than the timeout option supplied at startup, or the process will be killed while jobs are still running. In this example we've set it to twice the shutdown timeout. (Note that the default Kubernetes termination grace period is only 30 seconds and includes the pre-stop hook's execution, so you will also want to raise terminationGracePeriodSeconds above 120.) Now we can ensure that we are allowing the maximum amount of time for work to be completed. If your app has long-running jobs, tweak these timeouts as you see fit. From the Sidekiq docs:
Any workers that do not finish within the timeout are forcefully terminated and their messages are pushed back to Redis to be executed again when Sidekiq starts up.
Epilogue
Many other asynchronous work queues inspired by Sidekiq use Redis list-based queues in a similar fashion, making this scaling pattern applicable outside of a Rails context. Recent Sidekiq versions expose a more specific metric for worker throughput called "queue latency": how long the oldest job in the queue has been waiting, which gives a better idea of how long jobs are taking to complete. Deriving this value requires some computation, so the pattern we've just implemented is insufficient for it. Luckily, KEDA supports writing custom scaler integrations, and rolling your own is fairly straightforward. I will cover building this scaler in a future article.
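To make "queue latency" concrete, here is a pure-Ruby sketch of how it can be derived from the oldest job's payload. (Sidekiq exposes this directly as Sidekiq::Queue#latency; the payload below is fabricated for illustration.)

```ruby
require "json"

# Sidekiq job payloads carry an "enqueued_at" epoch timestamp. Latency is
# simply how long the oldest job (the tail of the Redis list) has waited.
def queue_latency(oldest_job_json, now: Time.now.to_f)
  job = JSON.parse(oldest_job_json)
  enqueued_at = job.fetch("enqueued_at", now)
  now - enqueued_at
end

# Fabricated payload: the oldest job was enqueued 42 seconds before "now".
payload = JSON.generate("class" => "AckTest", "enqueued_at" => 1_000.0)
puts queue_latency(payload, now: 1_042.0) # => 42.0
```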
KEDA is a wonderfully simplified framework for leveraging Kubernetes autoscaling features and supports a whole host of other event sources. Give it a try.