DEV Community

Discussion on: I'm an ops person. Ask me anything!

#benaryorg

That's a tough question ;-)

There's good and bad news for you.
The bad news is: I didn't really self-learn it.
I'd been using SSH to connect to my server, and netcat to copy files across the LAN from my PC to my roommate's, long before encountering real distributed systems.
When I did encounter them they seemed very natural to me so I didn't have to look up much at all.
The rest I learned from my then-colleagues and casually talking to people.

The good news, however, is that it's all pretty easy if you look at it neither from a top-level perspective nor from the lowest-level one, but from somewhere in between.

Example

Let's take a pretty common example: a distributed shop system.

So what's the opposite of a distributed system?
A single monolithic one, right? Let's transform one into the other.

Single Server

Luckily, our example shop system is (not advertising anything here) Magento, Oxid, or Shopware.
All of those are written in PHP, so they run pretty much out of the box. We just set up an Apache webserver (for simplicity and compatibility's sake) with mod_php. It really just boils down to installing it and telling it where the docroot is.

Database

The installer is up and running and we're navigating the configuration menu.
The installer asks you for database credentials.
What do you do?

So, we wanted to set up a distributed system, right?
We're going to need a central database, because the data's supposed to be the same everywhere, obviously.

We spin up another server, install MySQL, and create a user with permission to connect from the PHP server.
This can cause problems if you can't trust your network, i.e. if the servers communicate over a public wire: we would be sending all the data in plaintext over the internet, so we'd need some encryption in place. There are solutions for that: VPNs (OpenVPN, tinc), encrypted connections, or middleware (I think MaxScale as a local installation speaking TLS to the backend would work, and as an aside make failover easier). So let's not worry about that and just say that in our example the wire is secure.

We can now tell our PHP-server to connect and start doing things.
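To make the "user with permission to connect from the PHP server" part concrete, here is a minimal sketch that builds the two SQL statements involved. The helper name, the user, the password, the IP address and the database name are all made up for illustration; the optional REQUIRE SSL clause is one way to cover the untrusted-wire case mentioned above.

```python
def mysql_user_statements(user, password, app_host, database, require_tls=False):
    """Build SQL that creates a user who may only connect from app_host.

    Hypothetical helper: all names and values passed in are illustrative.
    """
    def quote(value):
        # Escape backslashes and single quotes for a MySQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"

    # MySQL accounts are 'user'@'host' pairs, so the host part restricts
    # where this user may connect from.
    account = f"{quote(user)}@{quote(app_host)}"
    create = f"CREATE USER {account} IDENTIFIED BY {quote(password)}"
    if require_tls:
        # Refuse unencrypted connections for this account.
        create += " REQUIRE SSL"

    # Backticks quote the database identifier; double any embedded backtick.
    db = "`" + database.replace("`", "``") + "`"
    grant = f"GRANT ALL PRIVILEGES ON {db}.* TO {account}"
    return [create + ";", grant + ";"]


if __name__ == "__main__":
    for statement in mysql_user_statements("shop", "s3cret", "10.0.0.10", "shop"):
        print(statement)
```

You would run the resulting statements as an administrative user on the database server and then enter the same credentials in the shop installer.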

Going Distributed

So, we want two of these servers, right?
A problem arises: most systems have things like cronjobs, admin interfaces, file uploads, etc., which should only be triggered on a single server.
But we also want two servers so one of them can break.

Let's have a single authoritative server, the one we're already using.

Then we set up two other servers, which also run Apache and mod_php and to which we can copy files from the authoritative server (rsync over SSH comes to mind).
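The copy step could look roughly like the sketch below, which pushes the docroot from the authoritative server to each application server with rsync over SSH. The hostnames, the "deploy" user and the docroot path are assumptions for the example, and it presumes key-based SSH login is already set up.

```python
import subprocess


def build_rsync_command(docroot, host, user="deploy"):
    """Return the rsync argv that mirrors docroot to the same path on host.

    Hypothetical helper; user and host are placeholders.
    """
    # A trailing slash makes rsync copy the directory's contents
    # rather than the directory itself.
    source = docroot.rstrip("/") + "/"
    target = f"{user}@{host}:{source}"
    # -a preserves permissions and timestamps, --delete removes files that
    # no longer exist on the authoritative server, -e ssh picks the transport.
    return ["rsync", "-a", "--delete", "-e", "ssh", source, target]


def sync_docroot(docroot, hosts):
    """Push the docroot to every application server, stopping on the first error."""
    for host in hosts:
        subprocess.run(build_rsync_command(docroot, host), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for host in ["app1.internal", "app2.internal"]:
        print(" ".join(build_rsync_command("/var/www/shop", host)))
```

Note that --delete is a deliberate choice here: the application servers become exact mirrors, so nothing should ever be written to their docroots directly.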

Load Balancing

Now how do we teach our browser to talk to all three of these servers?
We don't!

We create another server.
That server will be responsible for distributing the requests. All servers involved speak HTTP, so it boils down to forwarding requests. That's good: let's use something that does exactly that.
Nginx is a good choice.

So nginx is "terminating" the connection, meaning that clients connect to it rather than to the application servers directly.

There are several things which become pretty easy then:

  • routing traffic for /admin (or wherever the admin interface is) to the authoritative server
  • HTTPS (single point to install certs)
  • placing additional things between nginx and the appservers

One thing that we want to do is route all requests for the admin interface to the authoritative (admin) server.
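The routing decision itself is tiny. The sketch below is not nginx configuration, just a model of what the load balancer decides per request: anything under /admin goes to the authoritative server, everything else is handed out round-robin. The hostnames are invented, and sending ordinary traffic only to the two application servers is an assumption; you could just as well add the authoritative server to the pool.

```python
import itertools

# Hypothetical hostnames for the three backends in the example.
ADMIN_SERVER = "admin.internal"
APP_SERVERS = ["app1.internal", "app2.internal"]
ADMIN_PREFIX = "/admin"

# Endless iterator that alternates between the application servers.
_round_robin = itertools.cycle(APP_SERVERS)


def pick_backend(path):
    """Choose the backend for a request path (without the query string)."""
    # Match /admin itself and everything below it, but not e.g. /administrator.
    if path == ADMIN_PREFIX or path.startswith(ADMIN_PREFIX + "/"):
        return ADMIN_SERVER
    return next(_round_robin)


if __name__ == "__main__":
    for path in ["/", "/cart", "/admin/orders", "/product/42"]:
        print(path, "->", pick_backend(path))
```

A real load balancer would also skip backends that stop responding, which is what lets one of the application servers break without anybody noticing.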

After we've done that, we can make changes and upload pictures on the admin server, then sync its docroot to the application servers and be done with it.

Further Things To Do

Use memcached or Redis for sessions.
Set up a database slave for failover.
Put Varnish in between the load balancer and the backends.
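Why the sessions need to move out of the application servers can be shown with a toy model. In the sketch below, SessionStore is an in-memory stand-in for Redis or memcached (it is not either library's API), and AppServer is a made-up class representing one PHP backend. Because both backends talk to the same store, it doesn't matter which one the load balancer picks for a given request.

```python
import copy


class SessionStore:
    """In-memory stand-in for a shared session store (Redis, memcached, ...)."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Return a copy so callers must save() to publish their changes.
        return copy.deepcopy(self._data.get(session_id, {}))

    def save(self, session_id, session):
        self._data[session_id] = copy.deepcopy(session)


class AppServer:
    """Hypothetical application server that keeps no session state itself."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)

    def cart(self, session_id):
        return self.store.load(session_id).get("cart", [])


if __name__ == "__main__":
    shared = SessionStore()
    app1, app2 = AppServer("app1", shared), AppServer("app2", shared)
    app1.add_to_cart("abc", "book")   # first request lands on app1
    print(app2.cart("abc"))           # next request lands on app2
```

With sessions kept in local files on each server instead, the second request would see an empty cart.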

Final Words

Having a distributed system is pretty easy, if you:

  • can easily disable administrative services on non-authoritative servers
  • distribute state using some sort of database, be it MySQL, memcached, or whatever
  • have a protocol that is widely supported, so you have a wide range of choices for software