Let’s say you just started a new job as a DevOps engineer/developer/SRE at a company that makes a smart speaker (think Amazon Echo or Google Home). The device is a hit, and you quickly find yourself with a million customers, each with a single device at home. Sounds great, right? The only problem left is: how do you handle deployments to a million devices spread all across the world?
- You could go the way most old-school vendors do and publish a package on the company website for end users to download and install themselves, but in this day and age that high-maintenance approach will quickly lose you customers to competitors.
- You could build a self-updating mechanism into your codebase, but that will require a lot of maintenance and developer hours, and even then it will likely lead to problems and failures down the road.
- You could containerize the codebase, create a single-node Kubernetes cluster on each smart speaker, and federate all of them into one huge cluster (required because Kubernetes supports neither this scale nor latency-tolerant workers), but that wastes enormous resources just to keep all of those clusters running.
- You could use Nebula Container Orchestrator, which was designed to solve exactly this kind of distributed orchestration need.
As you may have guessed from the title, I want to discuss the last option on the list.
Nebula Container Orchestrator aims to help devs and ops treat IoT devices just like distributed Dockerized apps. It aims to act as a Docker orchestrator for IoT devices as well as for distributed services such as CDNs or edge computing that can span thousands (or even millions) of devices worldwide, and it does it all while being open source and completely free.
Different requirements lead to different orchestrators
When you think about it, a distributed orchestrator has the following requirements:
- It needs to be latency tolerant: if the IoT devices are distributed, each one connects to the orchestrator over the Internet on a connection that might not always be stable or fast.
- It needs to scale out to handle thousands (or even hundreds of thousands) of IoT devices; massive-scale deployments are quickly becoming more and more common.
- It needs to run on multiple architectures: a lot of IoT devices use ARM boards.
- It needs to be self-healing: you don’t want to run across town to reset a device every time there is a little glitch, do you?
- Code needs to be coupled to the hardware: if your company manufactures both the smart speaker from the example above and a smart fridge, you need to guarantee that each codebase runs only on the device it was intended for (no packing different apps onto the same device in the IoT use case).
This is quite different from the big three orchestrators (Kubernetes, Mesos & Swarm), which are designed to pack as many different apps/microservices as possible onto the same servers in a single (or relatively few) data centers. As a result, none of them provides a truly latency-tolerant connection, and the scalability of Swarm & Kubernetes is limited to a few thousand workers.
Nebula was designed around a stateless, RESTful manager microservice that provides a single point for managing the cluster and a single endpoint all containers check for updates. Configuration changes carry a Kafka-inspired monotonic ID and are distributed in a pull-based model, which ensures that changes to any application managed by Nebula reach all managed devices at the same time and that every device always converges on the latest version of the configuration (thanks to the monotonic ID). All data is stored in MongoDB, the single source of truth for the system. On the device side, a worker container on each device is in charge of starting, stopping, and changing the other containers running on that device. Because each component can be scaled out independently, Nebula can grow as much as you require.
You can read more about Nebula’s architecture at https://nebula.readthedocs.io/en/latest/architecture/
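To make the pull-based flow concrete, here is a minimal Python sketch of the check-in logic described above. It is illustrative only: the real worker talks HTTP to the manager and drives Docker, so `apply_config`, `check_in`, and the in-memory state here are stand-ins, not Nebula’s actual code.

```python
def apply_config(config):
    """Stand-in for restarting containers to match the new config."""
    print(f"applying app_id {config['app_id']}: {config['docker_image']}")

def check_in(manager_state, last_seen_id):
    """One worker check-in: pull the latest config, act only on a newer ID."""
    config = manager_state  # in reality: an HTTP GET against the manager API
    if config["app_id"] > last_seen_id:
        apply_config(config)
        return config["app_id"]
    return last_seen_id  # ID unchanged, so nothing to do: no restart needed

# Simulated check-ins: the second sees no change, the third sees an update.
last_id = 0
last_id = check_in({"app_id": 1, "docker_image": "nginx"}, last_id)
last_id = check_in({"app_id": 1, "docker_image": "nginx"}, last_id)
last_id = check_in({"app_id": 2, "docker_image": "httpd:alpine"}, last_id)
```

Because the ID only ever grows, a device that was offline for a week and a device that checked in five seconds ago both converge on the same configuration the next time they pull.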
Nebula features
As it was designed from the ground up for distributed systems, Nebula has a few neat features that allow it to control distributed IoT systems:
- Designed to scale out in all of its components (IoT devices, API layer & MongoDB all scale out)
- Able to manage millions of IoT devices
- Latency tolerant: even if a device goes offline, it will re-sync when it comes back online
- Dynamically add/remove managed devices
- Fast & easy code deployments: a single API call with the new container image tag (or other configuration changes) pushes it to all devices running that app
- Simple install: MongoDB & a stateless API are all it takes for the management layer, and a single container with a few envvars on each IoT device you want to manage takes care of the worker layer
- Single API endpoint to manage all devices
- Allows control of multiple device types with the same Nebula orchestrator (multiple apps & device_groups)
- Not limited to IoT, also useful for other types of distributed systems
- API, Python SDK & CLI control available
A little example
The following command will install a Nebula cluster for you to play with and will create an example app as well; it requires Docker, curl & docker-compose to be installed:
curl -L "https://raw.githubusercontent.com/nebula-orchestrator/docs/master/examples/hello-world/start_example_nebula_cluster.sh" -o start_example_nebula_cluster.sh && sudo sh start_example_nebula_cluster.sh
But let’s go over what this command does to better understand the process:
- The script downloads and runs a docker-compose.yml file which creates:
  - A MongoDB container: the backend DB where the current state of Nebula apps is saved.
  - A manager container: a RESTful API endpoint. This is where the admin manages Nebula from & where devices pull the latest configuration state to match against their current state.
  - A worker container: this normally runs on the IoT devices, and only one is needed per device, but as this is just an example it runs on the same server as the management layer components.
- It’s worth mentioning the “DEVICE_GROUP=example” environment variable set on the worker container; this variable controls which Nebula apps will be attached to the device (loosely similar to the pod concept in other orchestrators).
- The script then waits for the API to become available.
- Once the API is available, the script sends the following two commands:
curl -X POST \
  http://127.0.0.1/api/v2/apps/example \
  -H 'authorization: Basic bmVidWxhOm5lYnVsYQ==' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "starting_ports": [{"81":"80"}],
    "containers_per": {"server": 1},
    "env_vars": {},
    "docker_image": "nginx",
    "running": true,
    "volumes": [],
    "networks": ["nebula"],
    "privileged": false,
    "devices": [],
    "rolling_restart": false
}'
This command creates an app named “example” and configures it to run an nginx container listening on port 81. As you can see, it can also control other parameters usually passed to the docker run command, such as envvars, networks or volume mounts.
curl -X POST \
  http://127.0.0.1/api/v2/device_groups/example \
  -H 'authorization: Basic bmVidWxhOm5lYnVsYQ==' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "apps": ["example"]
}'
This command creates a device_group, also named “example”, & attaches the app named “example” to it.
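The same two calls can be issued from Python. Below is a minimal sketch using only the standard library; the endpoint, credentials, and payloads are taken verbatim from the curl examples above, and the actual HTTP requests are left as comments so the sketch runs without a live manager (sending them would require e.g. the `requests` package):

```python
import base64
import json

def basic_auth(user, password):
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

headers = {
    "authorization": basic_auth("nebula", "nebula"),  # default nebula/nebula
    "content-type": "application/json",
}

# Payload of the app-creation call, identical to the curl example above.
app_config = {
    "starting_ports": [{"81": "80"}],
    "containers_per": {"server": 1},
    "env_vars": {},
    "docker_image": "nginx",
    "running": True,
    "volumes": [],
    "networks": ["nebula"],
    "privileged": False,
    "devices": [],
    "rolling_restart": False,
}

# Payload of the device_group-creation call.
device_group_config = {"apps": ["example"]}

# With a live manager you would now send:
# requests.post("http://127.0.0.1/api/v2/apps/example",
#               headers=headers, data=json.dumps(app_config))
# requests.post("http://127.0.0.1/api/v2/device_groups/example",
#               headers=headers, data=json.dumps(device_group_config))
```

Note that the opaque `Basic bmVidWxhOm5lYnVsYQ==` header in the curl examples is nothing more than the base64 encoding of `nebula:nebula`, the example cluster’s default credentials.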
- After the app & device_group are created via the Nebula API, the worker container will pick up the changes to the device_group it has been configured to be part of (“example” in this case) and will start an Nginx container on the server. You can run “docker logs worker” to see the Nginx image being downloaded before it starts (this might take a bit if you’re on a slow connection), and after it completes you can browse to http://<your_server_ip>:81/ to see it running.
Now that we have a working Nebula system running, we can start playing around with it to see its true strengths:
- We can add more remote workers by running a worker container on them:
sudo docker run -d --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --env DEVICE_GROUP=example \
  --env REGISTRY_HOST=https://index.docker.io/v1/ \
  --env MAX_RESTART_WAIT_IN_SECONDS=0 \
  --env NEBULA_MANAGER_AUTH_USER=nebula \
  --env NEBULA_MANAGER_AUTH_PASSWORD=nebula \
  --env NEBULA_MANAGER_HOST=<your_manager_server_ip_or_fqdn> \
  --env NEBULA_MANAGER_PORT=80 \
  --env nebula_manager_protocol=http \
  --env NEBULA_MANAGER_CHECK_IN_TIME=5 \
  --name nebula-worker nebulaorchestrator/worker
It’s worth mentioning that many of the envvars passed in the command above are optional (with sane defaults) & that there is no limit on how many devices we can run this command on; at some point you might have to scale out the managers and/or the backend DB, but those aren’t limited either.
- We can change the container image on all devices with a single API call; for example, let’s replace the container image with Apache:
curl -X PUT \
  http://127.0.0.1/api/v2/apps/example/update \
  -H 'authorization: Basic bmVidWxhOm5lYnVsYQ==' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{"docker_image": "httpd:alpine"}'
- Similarly we can also update any parameter of the app, such as env_vars, privileged permissions, volume mounts, etc. The full list of API endpoints, as well as the Python SDK & the CLI, is available at the documentation page at https://nebula.readthedocs.io/en/latest/
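A sketch of such a partial update in Python, under the same assumptions as the curl examples (local manager, default nebula/nebula credentials); note that only the keys being changed are sent to the /update endpoint, everything else keeps its current value:

```python
import base64
import json

# Only the fields being changed go in the update payload; the env_vars
# value here is a hypothetical example, not part of the original guide.
update = {
    "docker_image": "httpd:alpine",
    "env_vars": {"GREETING": "hello"},
}

headers = {
    "authorization": "Basic " + base64.b64encode(b"nebula:nebula").decode(),
    "content-type": "application/json",
}

body = json.dumps(update)
# With a live manager you would now send (e.g. with the `requests` package):
# requests.put("http://127.0.0.1/api/v2/apps/example/update",
#              headers=headers, data=body)
```

On the next check-in every worker in the device_group sees the new monotonic ID and swaps nginx for Apache, whether it is in the same rack or on another continent.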
Hopefully this little guide has shown you the need for an IoT Docker orchestrator and its use cases. Should you find yourself interested in reading more, you can visit the Nebula Container Orchestrator site at https://nebula-orchestrator.github.io/ or skip right ahead to the documentation at https://nebula.readthedocs.io