<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ash Walker</title>
    <description>The latest articles on DEV Community by Ash Walker (@signalwalker).</description>
    <link>https://dev.to/signalwalker</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3726536%2F91d4d802-f6b6-4943-a054-bcb80d3be675.png</url>
      <title>DEV Community: Ash Walker</title>
      <link>https://dev.to/signalwalker</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/signalwalker"/>
    <language>en</language>
    <item>
      <title>Adding Comments To My Site</title>
      <dc:creator>Ash Walker</dc:creator>
      <pubDate>Thu, 17 Oct 2024 14:30:25 +0000</pubDate>
      <link>https://dev.to/signalwalker/adding-comments-to-my-site-4aa4</link>
      <guid>https://dev.to/signalwalker/adding-comments-to-my-site-4aa4</guid>
      <description>&lt;p&gt;I've been wanting to add a comment section to the site but I don't want to use any third-party services, so I hadn't done so until after finding out about &lt;a href="https://comentario.app/" rel="noopener noreferrer"&gt;Comentario&lt;/a&gt;, which is an open-source comment server that I'm now self-hosting &lt;a href="https://comments.ashwalker.net/" rel="noopener noreferrer"&gt;an instance of&lt;/a&gt;. This &lt;em&gt;would&lt;/em&gt; have been pretty simple to set up if I were using Debian/Ubuntu (for which Comentario has already been packaged) or if I were willing to use Docker, but my server's running &lt;a href="https://nixos.org/" rel="noopener noreferrer"&gt;NixOS&lt;/a&gt;, which didn't have a Comentario package.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/signalwalker/nix.pkg.comentario" rel="noopener noreferrer"&gt;Until now!&lt;/a&gt; I made one myself.&lt;/p&gt;

&lt;p&gt;This ended up being much more complicated than I initially assumed because, while building the &lt;em&gt;backend&lt;/em&gt; was pretty easy, the &lt;em&gt;frontend&lt;/em&gt; is built with &lt;a href="https://yarnpkg.com/" rel="noopener noreferrer"&gt;Yarn&lt;/a&gt;, which isn't supported very well by the build tools available in Nixpkgs. I ended up having to go through a lot of trial-and-error to figure out exactly what build tools the frontend depended on -- for example, &lt;a href="https://gohugo.io/" rel="noopener noreferrer"&gt;Hugo&lt;/a&gt; is used, but the &lt;a href="https://docs.comentario.app/en/installation/building/" rel="noopener noreferrer"&gt;docs&lt;/a&gt;, as of 2024-10-17, don't mention it.&lt;/p&gt;

&lt;p&gt;I initially used &lt;a href="https://nixos.org/manual/nixpkgs/unstable/#javascript-yarn2nix-mkYarnPackage" rel="noopener noreferrer"&gt;&lt;code&gt;mkYarnPackage&lt;/code&gt;&lt;/a&gt;, because that seemed like the obvious answer, but apparently that's not very useful for building web frontends, so I wasted a lot of time trying to get that to work until I just gave up and wrote a custom build script.&lt;/p&gt;

&lt;p&gt;The main issue ended up being the &lt;code&gt;yarn run generate&lt;/code&gt; step, which uses &lt;a href="https://openapi-generator.tech/" rel="noopener noreferrer"&gt;OpenAPI Generator&lt;/a&gt;. Nixpkgs &lt;em&gt;does&lt;/em&gt; provide it, but it tries to download a specific version of itself at runtime, so I had to write a script to patch the build config to use the Nixpkgs version.&lt;/p&gt;

&lt;p&gt;Its NixOS module is pretty standard for this sort of thing (just a simple systemd service and a Nginx vhost), and adding the web components to the site template was trivial, so at least everything else was easy.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>nix</category>
    </item>
    <item>
      <title>How to Block Scrapers on Every Nginx Virtualhost in NixOS</title>
      <dc:creator>Ash Walker</dc:creator>
      <pubDate>Wed, 18 Sep 2024 19:34:00 +0000</pubDate>
      <link>https://dev.to/signalwalker/how-to-block-scrapers-on-every-nginx-virtualhost-in-nixos-166n</link>
      <guid>https://dev.to/signalwalker/how-to-block-scrapers-on-every-nginx-virtualhost-in-nixos-166n</guid>
      <description>&lt;p&gt;A couple months ago I realized that a lot of my home bandwidth was being eaten by AI scrapers constantly refreshing the login screen of the &lt;a href="https://jellyfin.org" rel="noopener noreferrer"&gt;Jellyfin&lt;/a&gt; instance I host for my friends on my home server. Regardless of one's opinions about the ethicality of LLMs, the scrapers gathering training data for them are bad for the ecosystem and they're making me pay extra money to Comcast, so: here's how to block them in &lt;a href="https://nginx.org/" rel="noopener noreferrer"&gt;Nginx&lt;/a&gt; (as long as you're using &lt;a href="https://nixos.org" rel="noopener noreferrer"&gt;NixOS&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;First, &lt;a href="https://github.com/ai-robots-txt/ai.robots.txt/blob/main/robots.txt" rel="noopener noreferrer"&gt;here&lt;/a&gt; is a &lt;code&gt;robots.txt&lt;/code&gt; file containing a list of user agent strings for common scrapers. If you want to add that to a vhost, you can add this to the vhost config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;location =/robots.txt {
  alias /path/to/the/robots.txt/file;
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Web crawlers are &lt;em&gt;supposed&lt;/em&gt; to respect rules set in &lt;code&gt;robots.txt&lt;/code&gt; files, but they sometimes ignore them (either through malice or by mistake), so it's also useful to block them entirely.&lt;/p&gt;

&lt;p&gt;All you have to do to block a specific user agent in Nginx is to add something like this to the server config, where "GPTBot", "Amazonbot" and "Bytespider" are the user agent strings you want to block:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if ($http_user_agent ~* "(GPTBot|Amazonbot|Bytespider)") {
  return 444;
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;("444" isn't a real HTTP status code; &lt;a href="https://nginx.org/en/docs/http/request_processing.html" rel="noopener noreferrer"&gt;Nginx uses it internally to signal that it should drop the connection without a response.&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Nginx, as far as I know, doesn't let you set common configuration settings shared by all vhosts, so, if you've got more than one vhost, you'll have to do a lot of copy-and-pasting. &lt;em&gt;Nix&lt;/em&gt;, however, makes that (relatively) simple.&lt;/p&gt;

&lt;p&gt;The naive way to do this in NixOS would be something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services.nginx.virtualHosts = let
  robots = ["GPTBot" "Amazonbot" "Bytespider"];
  rules = lib.concatStringsSep "|" robots;
  robotsTxt = let
    agentsStr = pkgs.lib.concatStringsSep "\n" (map (agent: "User-agent: ${agent}") robots);
  in pkgs.writeText "robots.txt" ''
    ${agentsStr}
    Disallow: /
  '';
in {
  "vhost-A" = {
    # ... other config ...
    locations."=/robots.txt".alias = "${robotsTxt}";
    extraConfig = ''
      if ($http_user_agent ~* "(${rules})") {
        return 444;
      }
    '';
  };
  "vhost-B" = {
    # ... other config ...
    locations."=/robots.txt".alias = "${robotsTxt}";
    extraConfig = ''
      if ($http_user_agent ~* "(${rules})") {
        return 444;
      }
    '';
  };
  # ... and so on
};

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But that gets tedious and it's easy to forget to add the rules to a specific vhost. Instead, you can override the &lt;code&gt;services.nginx.virtualHosts&lt;/code&gt; module to automatically apply the rules for you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;let
  robots = ["GPTBot" "Amazonbot" "Bytespider"];
  rules = lib.concatStringsSep "|" robots;
  robotsTxt = let
    agentsStr = pkgs.lib.concatStringsSep "\n" (map (agent: "User-agent: ${agent}") robots);
  in pkgs.writeText "robots.txt" ''
    ${agentsStr}
    Disallow: /
  '';
in {
  options = with lib; {
    services.nginx.virtualHosts = mkOption {
      type = types.attrsOf (types.submodule {
        config = {
          locations."=/robots.txt" = lib.mkDefault {
            alias = robotsTxt;
          };
          extraConfig = ''
            if ($http_user_agent ~* "(${rules})") {
              return 444;
            }
          '';
        };
      });
    };
  };
  config = {
    # normal nginx vhost config goes here
  };
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because that overrides the submodule used by &lt;code&gt;virtualHosts.&amp;lt;name&amp;gt;&lt;/code&gt;, this configuration will automatically apply to every vhost, including ones defined by external modules.&lt;/p&gt;
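
&lt;p&gt;As a rough sketch of what that means in practice (the vhost names and paths below are made up for illustration): a vhost defined in some other module gets both the &lt;code&gt;robots.txt&lt;/code&gt; location and the user agent check without mentioning either. Since the location is wrapped in &lt;code&gt;mkDefault&lt;/code&gt;, an ordinary definition on a specific vhost should take priority over it, and since &lt;code&gt;extraConfig&lt;/code&gt; is a lines-type option, per-vhost &lt;code&gt;extraConfig&lt;/code&gt; should be merged with the blocking rule rather than replacing it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  services.nginx.virtualHosts = {
    # Hypothetical vhost: inherits the default robots.txt and the 444 rule.
    "blog.example.org" = {
      root = "/var/www/blog";
    };
    # Hypothetical vhost that serves its own robots.txt instead.
    "files.example.org" = {
      root = "/var/www/files";
      # A normal-priority definition overrides the mkDefault location.
      locations."=/robots.txt".alias = "/var/www/files/robots.txt";
      # Merged with the user agent check, not a replacement for it.
      extraConfig = ''
        client_max_body_size 1g;
      '';
    };
  };
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;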

&lt;h2&gt;Addendum, 2024-09-24&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/SignalWalker/nix.nginx.vhost-defaults" rel="noopener noreferrer"&gt;I wrote a NixOS module&lt;/a&gt; implementing this, including automatically getting the block list from &lt;a href="https://github.com/ai-robots-txt/ai.robots.txt" rel="noopener noreferrer"&gt;ai-robots-txt&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;small&gt;&lt;br&gt;
Apparently, the NixOS manual does actually obliquely reference that you can type-merge submodules, in the &lt;a href="https://nixos.org/manual/nixos/unstable/#sec-option-types-submodule" rel="noopener noreferrer"&gt;documentation for &lt;code&gt;types.deferredModule&lt;/code&gt;&lt;/a&gt;.&lt;br&gt;
&lt;/small&gt;&lt;/p&gt;

</description>
      <category>nix</category>
      <category>nginx</category>
    </item>
    <item>
      <title>Rebuilding the Website</title>
      <dc:creator>Ash Walker</dc:creator>
      <pubDate>Tue, 17 Sep 2024 00:00:00 +0000</pubDate>
      <link>https://dev.to/signalwalker/rebuilding-the-website-10a1</link>
      <guid>https://dev.to/signalwalker/rebuilding-the-website-10a1</guid>
      <description>&lt;p&gt;Because &lt;a href="https://cohost.org/staff/post/7611443-cohost-to-shut-down" rel="noopener noreferrer"&gt;Cohost is shutting down&lt;/a&gt;, I decided to rebuild my whole website so I could use it as a blog. I already have an &lt;a href="https://social.ashwalker.net/Ash" rel="noopener noreferrer"&gt;ActivityPub server&lt;/a&gt;, but the software running it (&lt;a href="https://akkoma.dev/AkkomaGang/akkoma/" rel="noopener noreferrer"&gt;Akkoma&lt;/a&gt;) doesn't really work well for long-form posting.&lt;/p&gt;

&lt;p&gt;I hadn't been using any sort of site generator for this website -- the whole thing was entirely manual, which involved a lot of copy-and-pasting and would've made something like an RSS feed a Sisyphean effort.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fashwalker.net%2Fimg%2FZcw3a-dUpo-1850.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fashwalker.net%2Fimg%2FZcw3a-dUpo-1850.webp" alt="A screenshot of the old version of this website." title="The old version of this website." width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ideally, I'd have written some sort of bespoke HTTP server in Rust, with on-demand page generation, image processing, and ActivityPub support, but I, unfortunately, have a life outside of yak shaving trivial projects, so, instead, I'm just using &lt;a href="https://www.11ty.dev/" rel="noopener noreferrer"&gt;Eleventy&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This server runs &lt;a href="https://nixos.org/" rel="noopener noreferrer"&gt;Nix&lt;/a&gt;, though, &lt;del&gt;so I've also set up a relatively simple NixOS module defining a couple systemd services to build the site whenever it detects a change to its source folder. (I'm doing it this way instead of just building it once during &lt;code&gt;nixos-rebuild&lt;/code&gt; so that I can post things without having to go in and rebuild the whole system every time.)&lt;/del&gt; Never mind, I had insomnia last night and I no longer want to deal with getting Eleventy to work in a systemd unit installed through Nix; it's just building in a derivation for now.&lt;/p&gt;

&lt;p&gt;(Make sure not to use Git LFS in a Nix flake repo; &lt;a href="https://github.com/NixOS/nix/issues/10079" rel="noopener noreferrer"&gt;it's illegal as of Nix 2.20.&lt;/a&gt; This has been causing problems for like 2 hours and I only just now discovered why.)&lt;/p&gt;

&lt;p&gt;Eleventy is pretty simple to use, so most of the work here has been in wrangling Nix and NodeJS to cooperate with each other and in dealing with weird CSS edge cases (like the tiny gap underneath images in posts that I can't seem to get rid of).&lt;/p&gt;

&lt;p&gt;There are a couple extra things I'd like to get to (like CSS for mobile &amp;amp; a dark theme), but this is good enough for now -- I've been at this long enough that I'm typing &lt;code&gt;;&lt;/code&gt; instead of &lt;code&gt;.&lt;/code&gt; at the end of my sentences, so I think it's time to work on something else.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://git.ashwalker.net/Ash/ashwalker.net/src/tag/v2.0.0" rel="noopener noreferrer"&gt;Here's the source code&lt;/a&gt;, if anyone's interested.&lt;/p&gt;

</description>
      <category>web</category>
      <category>nix</category>
    </item>
  </channel>
</rss>
