DEV Community

Sam Gould


Web Development for Data Scientists: Core functionality and a DevSecOps foundation

In my last post I outlined a number of different approaches for setting up a simple website architecture. I mentioned that websites are typically hosted on Cloud VM/VPS or managed hosting solutions, and recommended some follow-up actions which take a Hello World site to a more robust production level. In this post I'm going to expand on some of these follow-up actions with a focus on bolstering our DevSecOps so that we have a solid foundation from which to build. Some points will be specific to WordPress sites, but most of the DevSecOps applies in general (to Linux servers). Before I get to DevSecOps however, I'm going to quickly cover my overall approach to task management and implementation of some core site elements.

How do I manage my web development tasks?

Personally, I find it important to maintain structured notes and TODO lists for my projects (not just web dev). I am using the following structure which I created to manage my web development tasks:

  • Deployment

  • Plugins

  • Design

  • Content

  • Integrations

  • DevOps

  • Server management and security

Deployment, Plugins and Design

What is quite nice about this task management structure, besides being essentially MECE and easy to understand, is that at the beginning it can roughly be followed in order. In the first post, we covered Deployment of the Hello World site (in particular I used DigitalOcean's 1-click WordPress launcher to set up a Droplet VM which hosts samgould.net). The next step for me was to install a WordPress theme (which is very easy), essentially a Plugin, and then to configure the Design options. While not the focus of this article, the basic actions I would cover here are (all found under Appearance > Customise in WordPress):

  • Define a tagline

  • Upload a 512x512 favicon

  • Upload a logo (150x150 suggested but you can also use a banner)

  • Pick a nice colour and font scheme

  • Define a sitewide header and footer (NB: it's possible to remove the theme copyright text by editing the PHP code in the relevant theme file. For me this is wp-content/themes/theme-name/includes/template-tags.php inside the site root directory /var/www/your-site/, where your-site may depend on your Apache virtual host config. However, I'm not 100% sure if this is permitted by copyright and/or the theme license (GPL v3 for me), so I decided to be a good citizen and leave it in place. It is interesting to know that this is where the processing happens under the hood, though.)

  • Create a public nickname for the admin user, to be used in the default WordPress template ("content posted by X"); or it can be removed completely.

  • Set a site timezone (under Settings > General).

Content

You would probably then want to add some basic content to your site! I added stuff like:

  • Basic homepage

  • About Me / Bio page with links to socials

  • Core functionality - a blog and list of freelance services for me; you might want an e-commerce store or something else

Another important piece of functionality is enabling people to contact you. Of course it is easy to write your email address or link to other socials, but it looks quite professional to be able to receive email to you@yourdomain. I discovered that, while it is easy to receive emails like this (very simple to set up via your domain registrar), sending emails requires an SMTP server. This is commonly achieved via paid or free tier plugins but could also be self-hosted.

Integrations

The only work I have done here is beginning to automate the way I share my content across different socials. I have started posting to DEV.to and am using a script I wrote which reformats WordPress markup so it is ready to post. This is somewhat tangential and should not be considered a necessary step to setting up your website.

DevOps

In the context of this article, by DevOps what I really mean is: "how do I manage my code and other site assets in a way which is flexible for development and resilient to failure?". In particular: version control and backups. I would also echo DigitalOcean's best practice advice and first set up a non-root user for any work you do on the server. In the code below, we will call this user non_root_user.
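A minimal sketch of that first step, assuming a Debian/Ubuntu server where you are currently logged in as root. The function name create_non_root_user is only illustrative; adduser and usermod are the standard tools.

```shell
# Sketch: create a sudo-capable user to work as instead of root.
# Assumes Debian/Ubuntu and a root shell. The function name is illustrative.
create_non_root_user() {
  local user="${1:-non_root_user}"
  # Interactive: prompts for a password and optional profile details.
  adduser "$user" || return 1
  # Add the new user to the sudo group so it can run admin commands.
  usermod -aG sudo "$user"
}
```

You would call it as create_non_root_user non_root_user, then log out and reconnect as the new user before doing any further work.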

Version control

At this stage, all you really need is a Git repo to hold relevant files from your site codebase. For my WordPress site, this is simply custom theme and plugin code (and at this stage I don't even have any custom plugins). As mentioned in the Design section above, the site root on your server may depend on your web server config; mine depends on Apache virtual hosts and can be confirmed by looking at which sites are listed in /etc/apache2/sites-enabled. As a data scientist you (should) probably know how to set up a Git repo, but there were a few gotchas which I discovered along the way:

  • cd /var/www/your-site/

  • git init

  • gotcha #1: sudo chown -R non_root_user /var/www/your-site/ (explanation here)

  • gotcha #2: if you try to interact with the filesystem from outside the server (e.g. upload media files via the web UI) then you will now get errors because the default www-data worker no longer owns the files. After you finish working with the git repo you can reset it with sudo chown -R www-data /var/www/your-site/. I'm not sure if there is a better way to handle this…

  • gotcha #3: nano .gitignore (this is a gotcha in the sense that it is crucially important that you define and save the correct .gitignore before your first commit. Do not commit passwords/secrets to your repo! In particular wp-config.php must be excluded. I used Sal Ferrarello's "surgical" .gitignore as my starting point.)

  • git add .

  • git commit -m "First commit"

  • Create a repo, e.g. on GitHub, and do git remote add origin https://github.com/your-github-username/your-repo.git

  • git branch -M main

  • git push origin main

  • gotcha #4: although the prompt asks for a password, GitHub no longer accepts your account password over HTTPS; you need to enter a Personal Access Token instead.
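The local part of the steps above can be sketched as a single function. This is a rough outline rather than a definitive setup: init_site_repo is an illustrative name, and the two ignore rules are a placeholder standing in for a fuller WordPress .gitignore such as the "surgical" one mentioned above.

```shell
# Sketch: put a site root under version control without committing secrets.
# init_site_repo is an illustrative name; pass your own site root.
init_site_repo() (
  # The function body runs in a subshell, so the caller's working
  # directory is left unchanged.
  cd "$1" || exit 1
  git init || exit 1
  # Placeholder rules (an assumption, not a complete WordPress .gitignore):
  # keep the config file with secrets and uploaded media out of the repo.
  [ -f .gitignore ] || printf '%s\n' 'wp-config.php' 'wp-content/uploads/' > .gitignore
  git add . && git commit -m "First commit" && git branch -M main
)
```

After running init_site_repo /var/www/your-site as the user who owns the files, git ls-files is a quick way to confirm that wp-config.php is not tracked before adding the remote and pushing.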

Congrats, you have now synced your mutable (WordPress) site files to a Git repo. But this is not every component of the site - we also need to create a backup of our content and config. For that we will use phpMyAdmin (although there are multiple ways to do this).

Database backup

What is phpMyAdmin?

As WordPress puts it:

"An administrator’s tool of sorts, phpMyAdmin is a PHP script meant for giving users the ability to interact with their MySQL databases. WordPress stores all of its information in the MySQL database and interacts with the database to generate information within your WordPress site. A “raw” view of the data, tables and fields stored in the MySQL database is accessible through phpMyAdmin."

How to use phpMyAdmin to back up a WordPress site

Step 1: install phpMyAdmin: sudo apt install phpmyadmin

  • With an Apache server, you may need to run these additional commands: sudo ln -s /etc/phpmyadmin/apache.conf /etc/apache2/conf-available/phpmyadmin.conf && sudo a2enconf phpmyadmin.conf && sudo service apache2 reload. I only needed to run the apache2 reload for some reason.

  • With an Nginx server, you may need to run these additional commands: sudo ln -s /usr/share/phpmyadmin /var/www/your-site/phpmyadmin && sudo nginx -s reload

phpMyAdmin can then be accessed via https://yourdomain/phpmyadmin (NB: see the 'Server management and security' section below for instructions on enabling https, or get it automatically from the DigitalOcean 1-click installer). You can log in with the MySQL root user created during the DigitalOcean 1-click deployment when our LAMP stack was configured (see the first post), or create a new database user with appropriate permissions.

Step 2: run a simple backup using phpMyAdmin. This is very simple to do - instructions can be found here.
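If you prefer the command line, a similar backup can be taken with mysqldump. Note that this swaps in a different tool from the phpMyAdmin export described above. It is a minimal sketch: the function name, the default output directory, and the database and user names in the usage note are all assumptions to replace with your own.

```shell
# Sketch: dump a WordPress database to a timestamped .sql file.
# Uses mysqldump rather than the phpMyAdmin export; names are illustrative.
backup_wordpress_db() {
  local db_name="$1" db_user="$2" out_dir="${3:-$HOME/backups}"
  local out_file
  out_file="$out_dir/$db_name-$(date +%Y%m%d-%H%M%S).sql"
  mkdir -p "$out_dir" || return 1
  # -p makes mysqldump prompt for the password instead of taking it on the
  # command line; --single-transaction dumps InnoDB tables from a
  # consistent snapshot.
  mysqldump --single-transaction -u "$db_user" -p "$db_name" > "$out_file" || return 1
  # Print the path of the dump so it can be copied elsewhere.
  echo "$out_file"
}
```

For example, backup_wordpress_db wordpress backup_user would write a file like wordpress-20240101-120000.sql under ~/backups (both names here are hypothetical). Keep the dump outside the web root and out of the Git repo.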

Congrats, you have now backed up pretty much everything you need to be able to restore your (WordPress) site in case something breaks in production!

What about CI/CD?

A pattern for deploying different codebase versions is blue/green deployment. This is possible to do, for example using Apache virtual hosts, but in my opinion is overkill at this stage.

Server management and security

One of the first, most important steps to securing your server is implementing a firewall to block unwanted traffic. The simplest configuration tool is the Uncomplicated Firewall (UFW) and the suggested configuration is to allow only SSH (port 22, rate limited), HTTP (port 80), and HTTPS (port 443) access.
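The suggested configuration could look roughly like this with UFW. It is a sketch assuming SSH listens on the default port 22; configure_firewall is an illustrative name.

```shell
# Sketch: allow only SSH (rate limited), HTTP and HTTPS through UFW.
# Assumes SSH is on the default port 22. The function name is illustrative.
configure_firewall() {
  # Drop unsolicited inbound traffic; let the server reach out freely.
  sudo ufw default deny incoming || return 1
  sudo ufw default allow outgoing || return 1
  # SSH on port 22, rate limited to slow down repeated connection attempts.
  sudo ufw limit 22/tcp || return 1
  # Web traffic over HTTP and HTTPS.
  sudo ufw allow 80/tcp || return 1
  sudo ufw allow 443/tcp || return 1
  # Turn the firewall on, then print the active rules.
  sudo ufw enable || return 1
  sudo ufw status verbose
}
```

The order matters: the SSH rule is added before the firewall is enabled, so that enabling it does not cut off the session you are working in.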

With a firewall in place (i.e. with only the appropriate ports exposed), you can further secure your website by enabling encrypted connections over HTTPS (TLS, formerly SSL). This protects user privacy and data, and has additional benefits like preferential treatment by search engines. The simplest way to implement this protection is by using the Certbot tool, which handles the certificate issuance and renewal processes.
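A minimal sketch of a Certbot setup, assuming an Apache server on Debian/Ubuntu where Certbot is installed from the apt packages, and DNS records for both the bare domain and the www subdomain already pointing at the server. enable_https is an illustrative name.

```shell
# Sketch: obtain and install a Let's Encrypt certificate for Apache.
# Assumes Debian/Ubuntu packages and working DNS for both hostnames.
enable_https() {
  local domain="$1"
  sudo apt install -y certbot python3-certbot-apache || return 1
  # Requests a certificate and updates the Apache virtual host to use it.
  sudo certbot --apache -d "$domain" -d "www.$domain" || return 1
  # Simulate a renewal to check that automatic renewal would succeed.
  sudo certbot renew --dry-run
}
```

You would call it with your own domain, e.g. enable_https yourdomain. On an Nginx server the equivalent plugin and flag would differ, so treat this as Apache-specific.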

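We can block further unwanted traffic by adding protection against brute-force and denial-of-service style attacks. The DigitalOcean 1-click installer for WordPress uses two layers, namely fail2ban, which temporarily bans IP addresses that show repeated failed login attempts (and should be implemented on all architectures), and disabling XML-RPC, a WordPress-specific endpoint that is a common target for such attacks.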
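If your server did not come with fail2ban preconfigured, installing it might look like the sketch below. It assumes Debian/Ubuntu with systemd and that the package's sshd jail is active, which I believe is the packaged default but is worth verifying on your own system. install_fail2ban is an illustrative name.

```shell
# Sketch: install fail2ban and check its SSH jail.
# Assumes Debian/Ubuntu with systemd and the packaged sshd jail.
install_fail2ban() {
  sudo apt install -y fail2ban || return 1
  # Start the service now and on every boot.
  sudo systemctl enable --now fail2ban || return 1
  # Show the SSH jail, including any currently banned addresses.
  sudo fail2ban-client status sshd
}
```

The last command is also a handy one to rerun later, since it shows how many addresses have been banned so far.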

Finally, we want to make sure that our server stays updated to mitigate any potential exploits. When using a Cloud VM/VPS, your Cloud provider will most likely push messages into the SSH console so that upon login you can see if you need to run updates (sudo apt update && sudo apt upgrade). Since vulnerabilities should be patched in a timely manner, it is sometimes recommended to run updates in an automated and unattended manner (i.e. as soon as they are available).
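On Debian/Ubuntu, one common way to get automated updates is the unattended-upgrades package. The sketch below assumes that distribution family; enable_unattended_upgrades is an illustrative name.

```shell
# Sketch: enable automatic package updates on Debian/Ubuntu.
# The function name is illustrative.
enable_unattended_upgrades() {
  sudo apt update || return 1
  sudo apt install -y unattended-upgrades || return 1
  # Interactive prompt: answer "Yes" to turn on automatic updates.
  sudo dpkg-reconfigure --priority=low unattended-upgrades
}
```

Automatic updates trade some control for timeliness, so it is still worth logging in periodically to check that nothing has broken.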

Conclusion

We have completed our first batch of activities from each section of my task management framework, taking us from a simple Hello World to a site which is version controlled, backed up and better secured against attackers. Congratulations!

Disclaimer: I am not a security professional and you should DYOR. You are responsible for the security of your server/website.

This content was originally posted on samgould.net
