<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: catatsuy</title>
    <description>The latest articles on DEV Community by catatsuy (@catatsuy).</description>
    <link>https://dev.to/catatsuy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F42659%2F727807c8-3987-4502-8757-67dd99c89449.jpg</url>
      <title>DEV Community: catatsuy</title>
      <link>https://dev.to/catatsuy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/catatsuy"/>
    <language>en</language>
    <item>
      <title>I Built lls, a Go CLI to List 33.12 Million Files</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 29 Mar 2026 06:56:46 +0000</pubDate>
      <link>https://dev.to/catatsuy/i-built-lls-a-go-cli-to-list-3312-million-files-4bm9</link>
      <guid>https://dev.to/catatsuy/i-built-lls-a-go-cli-to-list-3312-million-files-4bm9</guid>
      <description>&lt;p&gt;Sometimes a problem looks simple at first.&lt;/p&gt;

&lt;p&gt;In my case, I needed a complete file list from a huge directory on storage mounted over NFS from an application server. At first, this sounded like something existing tools should be able to handle. But once the number of files became extremely large, that assumption stopped being true.&lt;/p&gt;

&lt;p&gt;I eventually built a Go CLI called &lt;code&gt;lls&lt;/code&gt; to solve this problem.&lt;/p&gt;

&lt;p&gt;This was not a toy project. I built &lt;code&gt;lls&lt;/code&gt; to solve a real production problem, and in the end it was able to list &lt;strong&gt;33.12 million files&lt;/strong&gt; from a single directory on NFS.&lt;/p&gt;

&lt;p&gt;Repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/lls" rel="noopener noreferrer"&gt;https://github.com/catatsuy/lls&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I will explain what failed, why I decided to use the Linux &lt;code&gt;getdents64&lt;/code&gt; system call directly, how the implementation works, and how &lt;code&gt;lls&lt;/code&gt; finally solved the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;The directory I had to deal with was on storage mounted over NFS, and it contained an extremely large number of files.&lt;/p&gt;

&lt;p&gt;If a directory is small, &lt;code&gt;ls&lt;/code&gt; and &lt;code&gt;find&lt;/code&gt; are usually enough. But once the number of files becomes too large, even getting a complete file list becomes difficult. And when the directory is on NFS instead of local storage, the situation can become even worse.&lt;/p&gt;

&lt;p&gt;What I needed was simple in theory: get the full list of files and finish successfully. In practice, that turned out to be the hard part.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;ls -U1&lt;/code&gt; and &lt;code&gt;find&lt;/code&gt; could not finish
&lt;/h2&gt;

&lt;p&gt;The first thing I tried was &lt;code&gt;ls -U1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I disabled sorting because sorting is a well-known reason &lt;code&gt;ls&lt;/code&gt; becomes painful on huge directories. But even &lt;code&gt;ls -U1&lt;/code&gt; could not finish. The number varied from run to run, but at best it stopped after outputting about &lt;strong&gt;6 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I did not fully investigate why it stopped, but I suspected the storage server might have stopped responding.&lt;/p&gt;

&lt;p&gt;Next, I tried &lt;code&gt;find&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I thought &lt;code&gt;find&lt;/code&gt; might handle more entries than &lt;code&gt;ls&lt;/code&gt;, but it also failed. The result also varied, but at best it output about &lt;strong&gt;12 million lines&lt;/strong&gt; before it stopped responding.&lt;/p&gt;

&lt;p&gt;At that point, I was almost ready to give up. I started thinking I might have to output part of the file list, delete files in multiple rounds, and somehow work around the problem manually.&lt;/p&gt;

&lt;p&gt;But I wanted a real solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built &lt;code&gt;lls&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Around that time, I found an article describing how someone listed a directory containing 8 million files by calling &lt;code&gt;getdents&lt;/code&gt; directly with a large buffer. That was the key idea I needed. The article showed the C approach, but not a ready-to-use implementation, so I decided to build my own tool in Go.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html" rel="noopener noreferrer"&gt;http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Go, I could call Linux-specific system calls through the &lt;code&gt;syscall&lt;/code&gt; package. That meant I could stay in Go, avoid cgo, and still work directly with the kernel interface I needed.&lt;/p&gt;

&lt;p&gt;That was how &lt;code&gt;lls&lt;/code&gt; started.&lt;/p&gt;

&lt;p&gt;The point of &lt;code&gt;lls&lt;/code&gt; was not to replace &lt;code&gt;ls&lt;/code&gt; in general. It was a narrow tool for one difficult job: keep reading directory entries from a huge directory until the end.&lt;/p&gt;

&lt;h2&gt;
  
  
  System calls and &lt;code&gt;getdents64&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;On Linux, userland programs ask the kernel to do work through system calls. Directory reading is no exception.&lt;/p&gt;

&lt;p&gt;For this problem, the important system call was &lt;code&gt;getdents64&lt;/code&gt;. The older &lt;code&gt;getdents&lt;/code&gt; exists, but &lt;code&gt;getdents64&lt;/code&gt; was added because the original interface did not handle large filesystems and large file offsets well. In Go, the function exposed as &lt;code&gt;syscall.Getdents&lt;/code&gt; uses &lt;code&gt;getdents64&lt;/code&gt;, which was exactly what I needed here.&lt;/p&gt;

&lt;p&gt;The returned data is not a high-level file list. It is raw directory-entry data packed into a byte buffer.&lt;/p&gt;

&lt;p&gt;Conceptually, the data corresponds to a structure with fields such as inode number, offset, record length, type, and a null-terminated file name.&lt;/p&gt;

&lt;p&gt;That detail matters, because if you use &lt;code&gt;getdents64&lt;/code&gt; directly, you have to parse the buffer yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  My first idea: one large buffer
&lt;/h2&gt;

&lt;p&gt;My first idea was simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;allocate a buffer based on the directory size&lt;/li&gt;
&lt;li&gt;call &lt;code&gt;getdents64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;print each file name to standard output&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The directory size here was the same value you can see with &lt;code&gt;ls -dl&lt;/code&gt;. The idea was that this value should be large enough to hold the full result. If the buffer was smaller than what was really needed, the output would be incomplete.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;lls&lt;/code&gt; also had a &lt;code&gt;-buf-size&lt;/code&gt; option so I could adjust the size manually, and a &lt;code&gt;-debug&lt;/code&gt; option to show how much of the buffer was actually used.&lt;/p&gt;

&lt;p&gt;However, on the real directory, this did not work as expected.&lt;/p&gt;

&lt;p&gt;The directory size reported by &lt;code&gt;ls -dl&lt;/code&gt; was over 2 GB, and running &lt;code&gt;lls&lt;/code&gt; with that default buffer size produced &lt;code&gt;EINVAL&lt;/code&gt;. After trying different values, I found that &lt;code&gt;2147483647&lt;/code&gt; worked but &lt;code&gt;2147483648&lt;/code&gt; did not. Later, I concluded that the buffer length had to fit in a signed 32-bit integer: &lt;code&gt;2147483647&lt;/code&gt; is exactly the maximum &lt;code&gt;int32&lt;/code&gt; value, which explains why the call failed one byte beyond it.&lt;/p&gt;

&lt;p&gt;Even after increasing the buffer size as far as it would go, this approach was not a real solution. The point was not “make one call with a bigger buffer.” The real fix was to change the design.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real fix: call &lt;code&gt;getdents64&lt;/code&gt; repeatedly
&lt;/h2&gt;

&lt;p&gt;The real fix was to stop thinking in terms of one huge call.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;getdents64&lt;/code&gt; can be called repeatedly on the same file descriptor. Each call continues from where the previous one stopped, so if you keep calling it until it returns &lt;code&gt;0&lt;/code&gt;, you eventually read every directory entry.&lt;/p&gt;

&lt;p&gt;This became the key change in &lt;code&gt;lls&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of relying on a single enormous buffer, &lt;code&gt;lls&lt;/code&gt; now uses a reasonable buffer and keeps calling &lt;code&gt;syscall.Getdents&lt;/code&gt; until the directory is fully consumed. That change made it possible to list &lt;strong&gt;all 33.12 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That was the point where &lt;code&gt;lls&lt;/code&gt; became a practical tool for extremely large directories rather than a one-shot experiment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core implementation
&lt;/h2&gt;

&lt;p&gt;The implementation in &lt;code&gt;lls&lt;/code&gt; is built around &lt;code&gt;syscall.Dirent&lt;/code&gt;, which Go defines on Linux with fields like inode number, offset, record length, type, and a fixed-size name field.&lt;/p&gt;

&lt;p&gt;The core loop is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;allocate a buffer&lt;/li&gt;
&lt;li&gt;call &lt;code&gt;syscall.Getdents(int(f.Fd()), buf)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;if the return value is &lt;code&gt;0&lt;/code&gt;, stop&lt;/li&gt;
&lt;li&gt;otherwise parse the returned bytes entry by entry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most important part is how the parsing works.&lt;/p&gt;

&lt;p&gt;The returned buffer contains multiple directory entries. Each entry has a variable size, so the code cannot move by a fixed structure size. Instead, it casts the current position in the buffer to &lt;code&gt;*syscall.Dirent&lt;/code&gt;, reads &lt;code&gt;Reclen&lt;/code&gt;, and moves forward by that many bytes.&lt;/p&gt;

&lt;p&gt;That is how it walks through the buffer correctly.&lt;/p&gt;

&lt;p&gt;The code also checks &lt;code&gt;Ino&lt;/code&gt;. If the inode number is &lt;code&gt;0&lt;/code&gt;, the entry is skipped, because that means the file no longer exists.&lt;/p&gt;

&lt;p&gt;For file names, the implementation uses the &lt;code&gt;Name&lt;/code&gt; field from &lt;code&gt;syscall.Dirent&lt;/code&gt;. In Go this is &lt;code&gt;[256]int8&lt;/code&gt;, so the code first treats it as bytes and then converts the bytes before the terminating null into a string.&lt;/p&gt;

&lt;p&gt;In other words, the implementation stays intentionally close to the kernel interface:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;call &lt;code&gt;getdents64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;interpret the returned bytes as directory entries&lt;/li&gt;
&lt;li&gt;move forward using &lt;code&gt;Reclen&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;extract the file name&lt;/li&gt;
&lt;li&gt;repeat until &lt;code&gt;getdents64&lt;/code&gt; returns &lt;code&gt;0&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this worked
&lt;/h2&gt;

&lt;p&gt;One especially useful detail is that libc's &lt;code&gt;readdir&lt;/code&gt; implementation often uses a fixed internal buffer. In one of the articles I read, the example used a 2048-byte buffer internally. If your directory is huge, that means a large number of system calls just to read through it.&lt;/p&gt;

&lt;p&gt;You cannot easily change that buffer size from outside, which is why directly calling &lt;code&gt;getdents64&lt;/code&gt; yourself can make sense in an extreme case like this.&lt;/p&gt;

&lt;p&gt;That does not mean low-level code is always better. It only means this particular problem was narrow enough, and extreme enough, that the lower-level interface matched the problem better than a general-purpose tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  The result
&lt;/h2&gt;

&lt;p&gt;In the end, the comparison looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ls -U1&lt;/code&gt;: about &lt;strong&gt;6 million files&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;find&lt;/code&gt;: about &lt;strong&gt;12 million lines&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lls&lt;/code&gt;: &lt;strong&gt;33.12 million files&lt;/strong&gt; listed successfully&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That was the result that mattered.&lt;/p&gt;

&lt;p&gt;This was not just an experiment in system programming. It solved a real production problem on NFS. I needed the full file list, and &lt;code&gt;lls&lt;/code&gt; made that possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I built &lt;code&gt;lls&lt;/code&gt; because standard tools could not finish the job on a huge NFS-mounted directory.&lt;/p&gt;

&lt;p&gt;The important ideas were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use the Linux &lt;code&gt;getdents64&lt;/code&gt; interface through Go's &lt;code&gt;syscall.Getdents&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;parse the returned directory entries directly&lt;/li&gt;
&lt;li&gt;advance through the buffer using &lt;code&gt;Reclen&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;keep calling the system call until it returns &lt;code&gt;0&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That approach finally made it possible to list &lt;strong&gt;33.12 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you want to look at the code, here is the repository again:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/lls" rel="noopener noreferrer"&gt;https://github.com/catatsuy/lls&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Useful references:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://man7.org/linux/man-pages/man2/getdents.2.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man2/getdents.2.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pkg.go.dev/syscall" rel="noopener noreferrer"&gt;https://pkg.go.dev/syscall&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>linux</category>
      <category>nfs</category>
      <category>cli</category>
    </item>
    <item>
      <title>Designing a File Tampering Detection Tool for a Legacy PHP Application</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Fri, 20 Mar 2026 05:06:17 +0000</pubDate>
      <link>https://dev.to/catatsuy/designing-a-file-tampering-detection-tool-for-a-legacy-php-application-21p6</link>
      <guid>https://dev.to/catatsuy/designing-a-file-tampering-detection-tool-for-a-legacy-php-application-21p6</guid>
      <description>&lt;p&gt;I work on a legacy PHP application that runs on AWS EC2. The application is deployed from a deploy server with &lt;code&gt;rsync&lt;/code&gt;. In this environment, I needed a practical way to detect file tampering on application servers.&lt;/p&gt;

&lt;p&gt;Existing tools did not fit this deployment model well, so I built a small Go tool called &lt;code&gt;kekkai&lt;/code&gt; and open-sourced it. In this post, I want to explain not only the design choices, but also the implementation and operational details that mattered in practice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;https://github.com/catatsuy/kekkai&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The environment
&lt;/h2&gt;

&lt;p&gt;This application has these characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it runs on AWS EC2&lt;/li&gt;
&lt;li&gt;it is a legacy PHP application&lt;/li&gt;
&lt;li&gt;dependencies are installed on a deploy server&lt;/li&gt;
&lt;li&gt;the application is deployed with &lt;code&gt;rsync&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a common setup for older PHP applications. I wanted a solution that fits this environment instead of assuming container images or immutable deployments.&lt;/p&gt;

&lt;h2&gt;
  
  
  The basic model
&lt;/h2&gt;

&lt;p&gt;The model is simple.&lt;/p&gt;

&lt;p&gt;First, the deploy server calculates hashes for files and creates a manifest. The manifest can be stored either in S3 or in a local file. Then the application server verifies local files against that manifest.&lt;/p&gt;

&lt;p&gt;The tool has two main commands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;generate&lt;/code&gt;: create a manifest from the current files&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;verify&lt;/code&gt;: compare current files with the manifest&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted the data flow to stay easy to understand. The deploy server creates the trusted data, and the application server only reads it and verifies local files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manifest structure
&lt;/h2&gt;

&lt;p&gt;The manifest contains these values for each file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;path&lt;/li&gt;
&lt;li&gt;SHA-256 hash&lt;/li&gt;
&lt;li&gt;file size&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also contains the exclude rules used at generation time.&lt;/p&gt;

&lt;p&gt;I wanted the manifest itself to describe what should be checked. I did not want verification behavior to depend on extra local configuration on the application server.&lt;/p&gt;

&lt;p&gt;This is also why &lt;code&gt;verify&lt;/code&gt; does not accept additional exclude rules. If the application server is compromised, I do not want it to be able to silently skip more files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I only hash file contents
&lt;/h2&gt;

&lt;p&gt;I only check file contents. I do not check timestamps or other metadata.&lt;/p&gt;

&lt;p&gt;The reason is simple: metadata changes too easily. Normal operational work can change timestamps even when the file contents stay the same. If a tool alerts on that, it creates noise, and eventually people stop trusting the alerts.&lt;/p&gt;

&lt;p&gt;This is also why I did not want an approach that archives the whole source tree into a tar file and hashes that tar file. A tar file can change for reasons that do not mean the application code was tampered with. I wanted the tool to fail only when the content of an actual file changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  How &lt;code&gt;generate&lt;/code&gt; works
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;generate&lt;/code&gt; command walks the target directory and creates manifest entries one by one.&lt;/p&gt;

&lt;p&gt;For regular files, it reads the file, calculates a SHA-256 hash, and stores the path, hash, and file size in the manifest.&lt;/p&gt;

&lt;p&gt;Exclude rules are applied at this stage. I made this choice on purpose. The deploy server is the side that creates trusted data, so exclude handling should be fixed there.&lt;/p&gt;

&lt;p&gt;After all entries are collected, the manifest is written either to a local file or to S3.&lt;/p&gt;

&lt;p&gt;I also made &lt;code&gt;generate&lt;/code&gt; flexible enough to work even when some excluded directories do not exist on the deploy server. That helps in real deployment environments where some paths only exist on application servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  How &lt;code&gt;verify&lt;/code&gt; works
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;verify&lt;/code&gt; command loads the manifest first. Then it walks the target directory and compares each current file with the manifest entry.&lt;/p&gt;

&lt;p&gt;It checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;whether the path exists in the manifest&lt;/li&gt;
&lt;li&gt;whether the file type matches&lt;/li&gt;
&lt;li&gt;whether the file size matches&lt;/li&gt;
&lt;li&gt;whether the calculated hash matches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also detects files that exist in the manifest but are missing on disk.&lt;/p&gt;

&lt;p&gt;When verification fails, the command exits with a non-zero status. It also writes error details to standard error, including the path that failed. This makes the tool easy to integrate with monitoring systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  How symlinks are handled in Go
&lt;/h2&gt;

&lt;p&gt;Symlinks needed special handling.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; does not follow symlinks. Instead, it verifies the symlink itself.&lt;/p&gt;

&lt;p&gt;The implementation is roughly like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;os.Lstat&lt;/code&gt; to check whether the entry is a symlink&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;os.Readlink&lt;/code&gt; to read the target path string&lt;/li&gt;
&lt;li&gt;add a &lt;code&gt;symlink:&lt;/code&gt; prefix to that string&lt;/li&gt;
&lt;li&gt;calculate the SHA-256 hash of that prefixed string&lt;/li&gt;
&lt;li&gt;during verification, check both the file type and the stored hash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This lets the tool detect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a changed symlink target path&lt;/li&gt;
&lt;li&gt;a type change between a regular file and a symlink&lt;/li&gt;
&lt;li&gt;added or removed symlinks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This design is intentional. If the symlink target path stays the same but the target file contents change, that is outside the scope of this check. I accepted that trade-off because I wanted predictable behavior and simple logic.&lt;/p&gt;

&lt;p&gt;I also do not cache symlink verification results. The hashed input is only a short string, so the cost is small.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I support both S3 and local files
&lt;/h2&gt;

&lt;p&gt;The manifest itself must be protected.&lt;/p&gt;

&lt;p&gt;If an attacker can modify both the application files and the manifest, verification becomes meaningless. That is why the main production model stores the manifest in S3 instead of next to the application files.&lt;/p&gt;

&lt;p&gt;At the same time, I also wanted local file output. Without that, even simple tests would require AWS credentials. So the tool supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 for production&lt;/li&gt;
&lt;li&gt;local files for testing and development&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also recommend being careful with local manifest output. If you deploy that manifest into the same target directory, &lt;code&gt;verify&lt;/code&gt; can fail because the manifest itself appears as an unexpected file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Protecting the manifest with S3 and IAM
&lt;/h2&gt;

&lt;p&gt;Using S3 also makes it easier to separate permissions.&lt;/p&gt;

&lt;p&gt;The application side only needs &lt;code&gt;GetObject&lt;/code&gt;.&lt;br&gt;
The deploy side only needs &lt;code&gt;PutObject&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That separation is useful because the deploy server and the application servers have different roles. If needed, S3 features such as versioning can also help protect the manifest further.&lt;/p&gt;

&lt;p&gt;I also recommend keeping &lt;code&gt;base-path&lt;/code&gt; fixed in production and managing it explicitly. Since &lt;code&gt;base-path&lt;/code&gt; and &lt;code&gt;app-name&lt;/code&gt; become part of the S3 path, this helps avoid accidentally overwriting production data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why I chose SHA-256
&lt;/h2&gt;

&lt;p&gt;For this kind of verification, I needed a hash function with the right security properties. I did not want to use a weak fast hash that would make it easier to replace a file with another input that matches the stored hash.&lt;/p&gt;

&lt;p&gt;In security terms, the important property here is second-preimage resistance.&lt;/p&gt;

&lt;p&gt;I considered SHA-256 and SHA-512. I chose SHA-256 because it is standard, well known, and easy to justify. I also did not see a meaningful advantage from SHA-512 for source-code-sized files in this use case.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I reduced production load
&lt;/h2&gt;

&lt;p&gt;Performance was the hardest practical problem.&lt;/p&gt;

&lt;p&gt;Hashing a large codebase uses CPU, memory, and I/O. If the verification tool itself harms production stability, that defeats the purpose. Because of that, I added several controls.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;GOMAXPROCS&lt;/code&gt; and workers
&lt;/h3&gt;

&lt;p&gt;First, I rely on normal Go controls such as &lt;code&gt;GOMAXPROCS&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; also has a &lt;code&gt;--workers&lt;/code&gt; option to control how many files are hashed in parallel. By default, it uses the same value as &lt;code&gt;GOMAXPROCS&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This helps, but it is not enough. Even with one worker, the process can still keep one CPU core busy when many files are processed.&lt;/p&gt;
&lt;h3&gt;
  
  
  I/O rate limiting with &lt;code&gt;golang.org/x/time/rate&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To make the tool safer in production, I added I/O rate limiting with &lt;code&gt;golang.org/x/time/rate&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of only limiting concurrency, I also limit how fast the tool reads file data. This makes it possible to slow verification down on purpose and reduce the production impact.&lt;/p&gt;

&lt;p&gt;The core idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create a limiter&lt;/li&gt;
&lt;li&gt;read file data in chunks&lt;/li&gt;
&lt;li&gt;wait on the limiter before each chunk&lt;/li&gt;
&lt;li&gt;write the chunk into the hasher&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach gave me the most flexible control. In practice, this mattered more than worker limits alone.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; exposes this through the &lt;code&gt;--rate-limit&lt;/code&gt; option. Of course, if the value is too small, verification will become very slow, so this needs to be tuned carefully.&lt;/p&gt;
&lt;h3&gt;
  
  
  Cache
&lt;/h3&gt;

&lt;p&gt;I also added a local cache to make repeated verification faster.&lt;/p&gt;

&lt;p&gt;The cache stores file metadata and can skip hash calculation when &lt;code&gt;mtime&lt;/code&gt;, &lt;code&gt;ctime&lt;/code&gt;, and file size have not changed. Here, &lt;code&gt;ctime&lt;/code&gt; means file change time, not creation time.&lt;/p&gt;

&lt;p&gt;I know that metadata-based skipping is not a perfect security check by itself. That is why the cache is only an optimization layer.&lt;/p&gt;

&lt;p&gt;There is also some risk that the cache file itself could be tampered with. Because of that, the default behavior is to recalculate hashes with a 10% probability even when the cache says the file is unchanged. This probability can be changed with &lt;code&gt;--verify-probability&lt;/code&gt;. If it is set to &lt;code&gt;0&lt;/code&gt;, hash recalculation is skipped as long as the cache metadata still matches.&lt;/p&gt;

&lt;p&gt;The cache also includes the hash of the cache file itself. If tampering is detected, the cache is disabled. Also, files under &lt;code&gt;/tmp&lt;/code&gt; may eventually be deleted, so the cache can be rebuilt naturally over time.&lt;/p&gt;
&lt;h2&gt;
  
  
  Go implementation notes
&lt;/h2&gt;

&lt;p&gt;I also made a few implementation choices to reduce overhead in Go itself.&lt;/p&gt;

&lt;p&gt;When hashing many files, I do not want to allocate a new hasher for every file if I can avoid it. So I reuse &lt;code&gt;hash.Hash&lt;/code&gt; with &lt;code&gt;Reset()&lt;/code&gt; instead of calling &lt;code&gt;sha256.New()&lt;/code&gt; every time.&lt;/p&gt;

&lt;p&gt;The same idea applies to buffers. I reuse the &lt;code&gt;io.CopyBuffer&lt;/code&gt; buffer for each worker, instead of allocating a new buffer per file.&lt;/p&gt;

&lt;p&gt;This matters because &lt;code&gt;sha256.New()&lt;/code&gt; is not free, and repeated allocations across many files and workers increase GC cost and cache misses.&lt;/p&gt;

&lt;p&gt;One important detail is that &lt;code&gt;hash.Hash&lt;/code&gt; is not goroutine-safe. So if hashing is done in parallel, each worker needs its own hasher and buffer.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I run it on the deploy server
&lt;/h2&gt;

&lt;p&gt;In production, the deploy process is implemented with shell scripts. The deploy server installs dependencies first and then runs &lt;code&gt;rsync&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Because of that, I run &lt;code&gt;kekkai generate&lt;/code&gt; at the end of the deploy script.&lt;/p&gt;

&lt;p&gt;A typical command looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kekkai generate &lt;span class="nt"&gt;--target&lt;/span&gt; /var/www/app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-bucket&lt;/span&gt; &lt;span class="s1"&gt;'kekkai-test'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--base-path&lt;/span&gt; production &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--app-name&lt;/span&gt; kekkai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--exclude&lt;/span&gt; &lt;span class="s2"&gt;".git/**"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This stores the manifest as &lt;code&gt;production/kekkai/manifest.json&lt;/code&gt; in the specified S3 bucket.&lt;/p&gt;

&lt;p&gt;At this stage, it is important to list every directory that must be ignored, such as log directories or NFS mount points. Since exclude rules are stored in the manifest, mistakes here will affect later verification.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I run it on application servers
&lt;/h2&gt;

&lt;p&gt;The minimum command on an application server is simple:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kekkai verify &lt;span class="nt"&gt;--target&lt;/span&gt; /var/www/app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-bucket&lt;/span&gt; &lt;span class="s1"&gt;'kekkai-test'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--base-path&lt;/span&gt; production &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--app-name&lt;/span&gt; kekkai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;In real production, I also care about these points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the application server must not be able to write to S3&lt;/li&gt;
&lt;li&gt;I want alerts on failure&lt;/li&gt;
&lt;li&gt;I want to limit load on EC2&lt;/li&gt;
&lt;li&gt;I want to use the cache to reduce execution time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the application side only gets &lt;code&gt;s3:GetObject&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Monitoring and alerts
&lt;/h2&gt;

&lt;p&gt;I run verification as a periodic check from our monitoring system.&lt;/p&gt;

&lt;p&gt;If I alert on a single failure, I may get alerts during deployment. That would create false positives and reduce trust in alerts. So I only alert after repeated failures.&lt;/p&gt;

&lt;p&gt;Timeout is also important. Full verification can take several minutes, so the monitoring side needs a longer timeout than a normal health check.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why I use &lt;code&gt;systemd-run&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;I also use &lt;code&gt;systemd-run&lt;/code&gt; when running &lt;code&gt;kekkai verify&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The reason is simple: I do not want this check to run with strong privileges or compete too aggressively with the main application.&lt;/p&gt;

&lt;p&gt;A real example looks like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd-run &lt;span class="nt"&gt;--quiet&lt;/span&gt; &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--pipe&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;oneshot &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;CPUQuota&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;25% &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;CPUWeight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;50 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;PrivateTmp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;User&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nobody &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/bash &lt;span class="nt"&gt;-lc&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s1"&gt;'nice -n 10 ionice -c2 -n7 /usr/local/bin/kekkai verify --s3-bucket kekkai-test --app-name app --base-path production --target /var/www/app --use-cache --rate-limit 10485760 2&amp;gt;&amp;amp;1'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;There are several reasons for this setup.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;User=nobody&lt;/code&gt; makes the command run as a low-privilege user&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nice&lt;/code&gt; and &lt;code&gt;ionice&lt;/code&gt; reduce CPU and I/O priority&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CPUQuota&lt;/code&gt; and &lt;code&gt;CPUWeight&lt;/code&gt; reduce CPU usage further through cgroup control&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PrivateTmp=no&lt;/code&gt; is necessary if I want to use the cache in &lt;code&gt;/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is easy to miss. If &lt;code&gt;PrivateTmp=no&lt;/code&gt; is not set, the process gets a different private &lt;code&gt;/tmp&lt;/code&gt;, and the cache file cannot be reused.&lt;/p&gt;

&lt;p&gt;Go 1.25 is also relevant here. Before Go 1.25, even when cgroup limits were applied, &lt;code&gt;GOMAXPROCS&lt;/code&gt; could still reflect the host machine's CPU count. Go 1.25 made &lt;code&gt;GOMAXPROCS&lt;/code&gt; cgroup-aware by default, so I target Go 1.25 or later in &lt;code&gt;kekkai&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Alert contents
&lt;/h2&gt;

&lt;p&gt;When verification fails, &lt;code&gt;kekkai&lt;/code&gt; writes the error to standard error, including the affected path.&lt;/p&gt;

&lt;p&gt;Some monitoring systems include standard output in notifications, so I redirect standard error to standard output when needed. That way, a notification to Slack or another channel can include the actual file path that failed verification.&lt;/p&gt;

&lt;p&gt;This makes investigation much faster.&lt;/p&gt;
&lt;h2&gt;
  
  
  Production results
&lt;/h2&gt;

&lt;p&gt;In production, the application has about 17,000 files including dependencies.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;manifest generation takes a few seconds&lt;/li&gt;
&lt;li&gt;verification takes about 4 to 5 minutes with &lt;code&gt;--rate-limit 10485760&lt;/code&gt; (10 MB/s)&lt;/li&gt;
&lt;li&gt;with &lt;code&gt;--use-cache&lt;/code&gt;, a cache hit can reduce that to about 25 seconds&lt;/li&gt;
&lt;li&gt;verification runs once per hour on application servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This difference is intentional. I want &lt;code&gt;generate&lt;/code&gt; to finish quickly as part of deployment, but I want &lt;code&gt;verify&lt;/code&gt; to run slowly and safely on production servers. Even if verification takes about five minutes, running it once per hour is enough for this use case.&lt;/p&gt;
&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;I did not want to build a large security platform. I wanted a small tool that fits a specific real-world environment: a legacy PHP application on EC2, deployed with &lt;code&gt;rsync&lt;/code&gt;, with a deploy server and application servers playing different roles.&lt;/p&gt;

&lt;p&gt;That focus shaped both the design and the implementation: content-only hashing, strict exclude rules, explicit symlink handling, S3 and IAM for manifest protection, local cache with probabilistic re-verification, rate limiting, and safe execution with &lt;code&gt;systemd-run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you work with a similar deployment model, this approach may be useful for you too. I have open-sourced the tool on GitHub as &lt;code&gt;catatsuy/kekkai&lt;/code&gt;.&lt;/p&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/catatsuy" rel="noopener noreferrer"&gt;
        catatsuy
      &lt;/a&gt; / &lt;a href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;
        kekkai
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A lightweight Go tool for detecting file tampering by comparing content-based hashes stored securely in S3.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Kekkai&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;A simple and fast Go tool for file integrity monitoring. Detects unauthorized file modifications caused by OS command injection and other attacks by recording file hashes during deployment and verifying them periodically.&lt;/p&gt;

&lt;p&gt;The name "Kekkai" comes from the Japanese word 結界 (kekkai), meaning "barrier" - a protective boundary that keeps unwanted things out, perfectly representing this tool's purpose of protecting your files from tampering.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Takumi, the AI offensive security engineer&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://flatt.tech/oss/gmo/trampoline" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3a15e3e01ccd3cd6a238fdf4a7d1b23953ed233640439974047f80f7aaafe0fe/68747470733a2f2f666c6174742e746563682f6173736574732f696d616765732f6261646765732f676d6f2d6f73732e737667" height="24px"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Design Philosophy&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;Kekkai was designed to solve specific challenges in production server environments:&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Why Kekkai?&lt;/h3&gt;

&lt;/div&gt;

&lt;p&gt;Traditional tools like &lt;code&gt;tar&lt;/code&gt; or file sync utilities (e.g., &lt;code&gt;rsync&lt;/code&gt;) include metadata like timestamps in their comparisons, causing false positives when only timestamps change. In environments with heavy NFS usage or dynamic log directories, existing tools become difficult to configure and maintain.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Core Principles&lt;/h3&gt;

&lt;/div&gt;


&lt;ol&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Content-Only Hashing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hashes only file contents, ignoring timestamps and metadata&lt;/li&gt;
&lt;li&gt;Detects actual content changes, not superficial modifications&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Immutable Exclude&lt;/strong&gt;…&lt;/p&gt;


&lt;/li&gt;

&lt;/ol&gt;
&lt;/div&gt;
&lt;br&gt;
  &lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;





</description>
      <category>go</category>
      <category>php</category>
      <category>security</category>
      <category>aws</category>
    </item>
    <item>
      <title>Why I, as Someone Who Likes MySQL, Now Want to Recommend PostgreSQL</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 15 Mar 2026 05:48:21 +0000</pubDate>
      <link>https://dev.to/catatsuy/why-i-as-someone-who-likes-mysql-now-want-to-recommend-postgresql-2a8i</link>
      <guid>https://dev.to/catatsuy/why-i-as-someone-who-likes-mysql-now-want-to-recommend-postgresql-2a8i</guid>
      <description>&lt;p&gt;I like MySQL. I have used it for a long time, and I have also operated it in on-premises environments.&lt;/p&gt;

&lt;p&gt;However, since joining my current company, I have had more opportunities to use PostgreSQL. At first, I honestly felt a lot of resistance to it. I had used MySQL for so long, so part of it was just habit, and I think I was also more wary of PostgreSQL than I needed to be.&lt;/p&gt;

&lt;p&gt;But as I actually used it, I gradually started to see what was good about PostgreSQL. These days, if someone asks me which one I would choose for a new project, I have come to feel that I would want to choose PostgreSQL.&lt;/p&gt;

&lt;p&gt;Because I have used MySQL for a long time, I also know the rough edges that older MySQL had. At the same time, I think it is inaccurate to talk about MySQL today based only on old impressions. If you configure &lt;code&gt;sql_mode&lt;/code&gt; properly, you can avoid many dangerous behaviors, and MySQL 8 added a large number of features.&lt;/p&gt;

&lt;p&gt;Also, this time I want to compare current MySQL and PostgreSQL on the assumption that they will run in the cloud, rather than based on impressions from the on-premises era. Some of the things that used to be described as disadvantages of PostgreSQL are no longer very important issues now.&lt;/p&gt;

&lt;p&gt;This is not a story about “MySQL is bad.” It is also not a story like “the philosophy of PostgreSQL is beautiful.”&lt;/p&gt;

&lt;p&gt;If I write only the conclusion, it is these two points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Things that used to be considered disadvantages of PostgreSQL have become much less important. The feature gap has narrowed a lot, and under the assumption of managed services, there are more things you do not need to worry about.&lt;/li&gt;
&lt;li&gt;On the other hand, from the perspective of application implementation, there are still points where PostgreSQL is clearly better.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article, I will organize the discussion from that perspective.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things that used to be considered disadvantages of PostgreSQL are now much less significant
&lt;/h2&gt;

&lt;p&gt;In older comparisons, PostgreSQL’s weaknesses were often said to be heavier operations and awkwardness around DDL.&lt;/p&gt;

&lt;p&gt;But I think bringing those points up as-is today is a bit outdated.&lt;/p&gt;

&lt;p&gt;MySQL has become very strong at online DDL, but at least for everyday tasks like adding columns, I no longer see a clear difference between MySQL and PostgreSQL. Partitioning is also no longer as big an issue as it once felt.&lt;/p&gt;

&lt;p&gt;Also, operational topics specific to PostgreSQL, such as &lt;code&gt;VACUUM&lt;/code&gt;, come up much less often when you assume managed services, because users have far fewer situations where they need to handle them directly. I do not think it is very fair to bring comparisons from the old on-premises era, where you had to manage everything yourself, directly into the current cloud era.&lt;/p&gt;

&lt;p&gt;The differences around replication have also become less visible recently, because managed services have become the mainstream, and there are more parts that users do not directly touch. I feel there are fewer situations than before where I strongly notice an advantage on the MySQL side.&lt;/p&gt;

&lt;p&gt;In other words, some of the things that used to be valid reasons not to recommend PostgreSQL have now become much weaker.&lt;/p&gt;

&lt;h2&gt;
  
  
  Even so, PostgreSQL is stronger for application implementation
&lt;/h2&gt;

&lt;p&gt;This is the main point.&lt;/p&gt;

&lt;p&gt;MySQL 8 closed a lot of the gap. Even so, when I look at things from the standpoint of someone actually writing applications, there are still reasons why PostgreSQL is easier to recommend.&lt;/p&gt;

&lt;h2&gt;
  
  
  First, the things MySQL 8 added and narrowed the gap on
&lt;/h2&gt;

&lt;p&gt;I want to make this clear first. The following are things that used to be described as PostgreSQL advantages, but are no longer decisive because MySQL 8 added them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CHECK&lt;/code&gt; constraints&lt;/li&gt;
&lt;li&gt;Window functions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SKIP LOCKED&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point, it is not fair to talk about these as strengths that only PostgreSQL has.&lt;/p&gt;

&lt;p&gt;However, as I will explain later, “window functions themselves were added in MySQL 8” and “being able to naturally bring window functions into update processing” are different things. I still think PostgreSQL is much easier to work with for the latter.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;INSERT IGNORE&lt;/code&gt; is not a replacement for &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This is a feature MySQL did not originally have, and it is one of the fairly big reasons why I recommend PostgreSQL.&lt;/p&gt;

&lt;p&gt;MySQL has &lt;code&gt;INSERT IGNORE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;However, this is hard to treat as a replacement for &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;PostgreSQL’s &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; is, basically, a feature that explicitly says: “Do not insert only when a unique constraint conflict occurs.” What you want to do becomes SQL exactly as it is.&lt;/p&gt;

&lt;p&gt;By contrast, MySQL’s &lt;code&gt;INSERT IGNORE&lt;/code&gt; is not a dedicated feature only for ignoring duplicates. It is a feature that turns errors into warnings and continues processing, so it is too broad for the use case of “I only want to ignore duplicates.”&lt;/p&gt;

&lt;p&gt;This difference may look small, but in practice it is quite large.&lt;/p&gt;

&lt;p&gt;It makes the behavior easier to read during review, and it makes it less likely that unintended invalid input will be silently accepted.&lt;/p&gt;
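
&lt;p&gt;As a sketch of the difference (assuming a &lt;code&gt;users&lt;/code&gt; table with a unique constraint on &lt;code&gt;email&lt;/code&gt;), the two look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- PostgreSQL: skip the insert only on a unique conflict,
-- and only on the constraint you name
INSERT INTO users(name, email)
VALUES('catatsuy', 'catatsuy@example.com')
ON CONFLICT (email) DO NOTHING;

-- MySQL: IGNORE downgrades many kinds of errors to warnings,
-- not just duplicate-key errors
INSERT IGNORE INTO users(name, email)
VALUES('catatsuy', 'catatsuy@example.com');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;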

&lt;h2&gt;
  
  
  &lt;code&gt;RETURNING&lt;/code&gt; is very powerful
&lt;/h2&gt;

&lt;p&gt;This is also a feature MySQL did not originally have, and it is another fairly big reason why I recommend PostgreSQL.&lt;/p&gt;

&lt;p&gt;In PostgreSQL, you can use &lt;code&gt;INSERT/UPDATE/DELETE ... RETURNING&lt;/code&gt;. Because you can return the changed result right there, you can naturally complete “get the result of the change” in a single statement.&lt;/p&gt;

&lt;p&gt;For example, you can do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'catatsuy'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'catatsuy@example.com'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;RETURNING&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you have this, the following become very natural:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;receive the inserted ID directly&lt;/li&gt;
&lt;li&gt;receive default values or stored values directly&lt;/li&gt;
&lt;li&gt;return the updated row as-is and use it as the API response&lt;/li&gt;
&lt;li&gt;receive the result of an upsert directly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To be honest, the &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt;-based style that is common in MySQL is quite limiting.&lt;/p&gt;

&lt;p&gt;The information you can get back is narrow, and basically centered on getting a numeric &lt;code&gt;AUTO_INCREMENT&lt;/code&gt; ID.&lt;/p&gt;

&lt;p&gt;You cannot naturally receive arbitrary columns from the inserted result, and you also cannot return the completed row as-is including default values and generated columns.&lt;/p&gt;

&lt;p&gt;For example, what you may want is something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you use UUIDs as primary keys, so you want to return them directly&lt;/li&gt;
&lt;li&gt;you want to return the completed row including generated columns and default values&lt;/li&gt;
&lt;li&gt;you want to pass the entire inserted row directly to the next step&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You cannot do this with &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In addition, even when multiple rows are inserted in a single statement, &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt; does not return the inserted result as-is. What you can get is only the first &lt;code&gt;AUTO_INCREMENT&lt;/code&gt; value.&lt;/p&gt;

&lt;p&gt;So if you want to handle the result of a multi-row &lt;code&gt;INSERT&lt;/code&gt; directly in the application, it is inconvenient. With PostgreSQL’s &lt;code&gt;RETURNING&lt;/code&gt;, you can return the inserted rows directly, and this difference is very large.&lt;/p&gt;
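
&lt;p&gt;For example (a sketch, assuming the same &lt;code&gt;users&lt;/code&gt; table as above), a multi-row insert hands back one row per inserted record:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Returns every inserted row, not just the first generated ID
INSERT INTO users(name, email)
VALUES('alice', 'alice@example.com'),
      ('bob', 'bob@example.com')
RETURNING id, name, created_at;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;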

&lt;p&gt;“Being able to return the changed result directly” is not just a convenience feature. It affects the very way you structure application implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;VALUES&lt;/code&gt; helps in real implementation
&lt;/h2&gt;

&lt;p&gt;This is not about a feature that MySQL entirely lacks. Rather, PostgreSQL lets you use &lt;code&gt;VALUES&lt;/code&gt; much more naturally.&lt;/p&gt;

&lt;p&gt;It is easy to create a small constant table on the spot, join with it, and connect it directly to update processing. When you have this, you do not need to push half-baked temporary-table-like processing out to the application side.&lt;/p&gt;

&lt;p&gt;For example, when you want to join a small number of values received from the application and update based on them, in PostgreSQL you can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pro'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'free'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'team'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of processing comes up very often in real implementation.&lt;/p&gt;

&lt;p&gt;For example, you may want to pass a small set of master-like values on the spot and update with them, or send a group of values received from an API directly into SQL.&lt;/p&gt;

&lt;p&gt;As another example, it is also natural to treat a group of values received from the application as a join target:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is not that MySQL cannot do similar things at all. Since MySQL 8.0.19, it has had the &lt;code&gt;VALUES&lt;/code&gt; statement, and it can be treated as a table value constructor.&lt;/p&gt;

&lt;p&gt;However, in MySQL you need to write it with &lt;code&gt;ROW(...)&lt;/code&gt;, and if you leave column names as they are, they become things like &lt;code&gt;column_0&lt;/code&gt; and &lt;code&gt;column_1&lt;/code&gt;. It feels a bit different from PostgreSQL, where you create a small constant table on the spot, give it natural column names, and flow it directly into a &lt;code&gt;JOIN&lt;/code&gt; or &lt;code&gt;UPDATE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For example, in MySQL the same idea would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column_0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is not a flashy feature, but this kind of thing affects the ease of everyday implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Being able to bring window functions into update processing is powerful
&lt;/h2&gt;

&lt;p&gt;This part is important.&lt;/p&gt;

&lt;p&gt;Window functions themselves were added in MySQL 8.&lt;/p&gt;

&lt;p&gt;So it is wrong to talk about window functions themselves as a PostgreSQL-only strength.&lt;/p&gt;

&lt;p&gt;However, in PostgreSQL, by combining them with &lt;code&gt;WITH&lt;/code&gt; and &lt;code&gt;UPDATE ... FROM&lt;/code&gt;, it is easy to bring the result of window functions naturally into update processing. I think there is still a difference here.&lt;/p&gt;

&lt;p&gt;For example, if you want to set a flag only on the latest row for each user, you can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rn&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;is_latest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Window functions themselves do exist in MySQL 8.&lt;/p&gt;

&lt;p&gt;However, PostgreSQL is much more natural when it comes to connecting them to this kind of update logic.&lt;/p&gt;

&lt;p&gt;This is not just a convenience feature for analytics. It is a genuine asset in application implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Partial indexes are a clear feature difference
&lt;/h2&gt;

&lt;p&gt;This is something I can clearly describe as a feature difference.&lt;/p&gt;

&lt;p&gt;MySQL has never had partial indexes, and it still does not have them today.&lt;/p&gt;

&lt;p&gt;PostgreSQL has partial indexes, and you can create an index only on some rows, such as with &lt;code&gt;WHERE deleted_at IS NULL&lt;/code&gt;. This fits very well with soft delete patterns, and it is also useful for managing records by status.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_users_active_email&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, if you have a table using soft deletes and you only want to speed up “search by email address among active users,” you can write that directly.&lt;/p&gt;

&lt;p&gt;In MySQL, you can do something similar using generated columns or functional indexes. However, that is not a replacement for partial indexes.&lt;/p&gt;

&lt;p&gt;PostgreSQL partial indexes put only the rows that satisfy a condition like &lt;code&gt;WHERE deleted_at IS NULL&lt;/code&gt; into the index. In other words, unnecessary rows are excluded from the index from the beginning.&lt;/p&gt;

&lt;p&gt;By contrast, MySQL generated columns and functional indexes basically evaluate an expression for all rows and then index that result. If you design the expression well, you can use them for similar purposes, but they do not directly express “an index that physically stays small by containing only some rows.”&lt;/p&gt;

&lt;p&gt;So in terms of size, update cost, and clarity of intent, PostgreSQL’s partial indexes are more straightforward. The MySQL side can be used as a workaround, but it is hard to say it has the same feature.&lt;/p&gt;
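
&lt;p&gt;To make the contrast concrete, the closest MySQL workaround I know of looks like this (a sketch; the &lt;code&gt;active_email&lt;/code&gt; column name is made up for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- MySQL: the expression is evaluated and indexed for every row;
-- soft-deleted rows still occupy index entries (as NULLs)
ALTER TABLE users
  ADD COLUMN active_email VARCHAR(255)
    GENERATED ALWAYS AS (IF(deleted_at IS NULL, email, NULL)) STORED;

CREATE INDEX idx_users_active_email ON users(active_email);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;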

&lt;p&gt;This is not just a difference in how the SQL feels to write. It is a real feature difference, and a reason to recommend PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Foreign keys are much better in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;This is my personal impression, but I feel there are many people in the MySQL world who think foreign keys are unnecessary, while in the PostgreSQL world there are many people who think foreign keys are necessary.&lt;/p&gt;

&lt;p&gt;I think this comes not so much from a difference in philosophy, but from differences in how easy they are to test with, how easy they are to operate, and how hard they make it for bugs to enter.&lt;/p&gt;

&lt;h3&gt;
  
  
  PostgreSQL supports deferred constraints
&lt;/h3&gt;

&lt;p&gt;In PostgreSQL, foreign keys can be made &lt;code&gt;DEFERRABLE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is extremely important.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;author_id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;books_author_fk&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;DEFERRABLE&lt;/span&gt; &lt;span class="k"&gt;INITIALLY&lt;/span&gt; &lt;span class="k"&gt;DEFERRED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because of this, you can delay constraint checks until the end of the transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This SQL works in PostgreSQL.&lt;/p&gt;

&lt;p&gt;The parent row does not exist partway through the transaction, but that is fine as long as consistency holds at commit time.&lt;/p&gt;

&lt;p&gt;What is this useful for?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;loading data that includes circular references&lt;/li&gt;
&lt;li&gt;creating complex test data&lt;/li&gt;
&lt;li&gt;migration processes where the order is temporarily reversed&lt;/li&gt;
&lt;li&gt;bulk inserts and replacement operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These situations come up routinely in real work.&lt;/p&gt;

&lt;p&gt;And whether you can express them naturally makes a real difference.&lt;/p&gt;
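&lt;p&gt;The circular-reference case is worth a concrete sketch. Suppose two tables reference each other (the table and column names here are hypothetical, not from the examples above); with deferred constraints, the rows can be inserted in either order inside a transaction:&lt;/p&gt;

```sql
CREATE TABLE departments (
  id      bigint PRIMARY KEY,
  head_id bigint NOT NULL
);

CREATE TABLE employees (
  id      bigint PRIMARY KEY,
  dept_id bigint NOT NULL,
  CONSTRAINT employees_dept_fk FOREIGN KEY (dept_id)
    REFERENCES departments (id) DEFERRABLE INITIALLY DEFERRED
);

-- The second foreign key closes the cycle:
ALTER TABLE departments
  ADD CONSTRAINT departments_head_fk FOREIGN KEY (head_id)
    REFERENCES employees (id) DEFERRABLE INITIALLY DEFERRED;

-- Neither row can exist strictly "first", but a deferred check
-- only requires consistency at COMMIT:
BEGIN;
INSERT INTO departments (id, head_id) VALUES (1, 10);
INSERT INTO employees   (id, dept_id) VALUES (10, 1);
COMMIT;
```

&lt;p&gt;Without deferrable constraints, a pair of tables like this cannot be populated at all without temporarily dropping or disabling one of the foreign keys.&lt;/p&gt;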

&lt;h3&gt;
  
  
  MySQL forces strict ordering
&lt;/h3&gt;

&lt;p&gt;MySQL does not have this mechanism.&lt;/p&gt;

&lt;p&gt;In MySQL, &lt;code&gt;NO ACTION&lt;/code&gt; is treated the same as &lt;code&gt;RESTRICT&lt;/code&gt;: the check always happens immediately.&lt;/p&gt;

&lt;p&gt;In other words, you cannot structure processing in the form of “it is okay as long as it is consistent in the end.” You are always constrained by ordering rules where the parent must come first and the child must come after.&lt;/p&gt;
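&lt;p&gt;To see the difference concretely, here is the earlier books/authors transaction replayed against MySQL (a sketch; the exact error text varies by version):&lt;/p&gt;

```sql
CREATE TABLE authors (
  id BIGINT PRIMARY KEY
);

CREATE TABLE books (
  id        BIGINT PRIMARY KEY,
  author_id BIGINT NOT NULL,
  CONSTRAINT books_author_fk FOREIGN KEY (author_id)
    REFERENCES authors (id)
);

START TRANSACTION;
-- Fails immediately with ERROR 1452 ("a foreign key constraint fails"),
-- even though the matching parent row would be inserted before COMMIT:
INSERT INTO books (id, author_id) VALUES (1, 100);
INSERT INTO authors (id) VALUES (100);
COMMIT;
```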

&lt;p&gt;This may look like a small issue, but it makes loading test data and writing migration processes much harder.&lt;/p&gt;

&lt;p&gt;For example, in test code, if you want to load fixtures spanning multiple tables without worrying too much about order, in PostgreSQL you can write them so that consistency is satisfied by the end of the transaction. In MySQL you cannot, so you always have to manage fixture insertion order strictly.&lt;/p&gt;

&lt;p&gt;If using foreign keys makes testing more troublesome, I think it is natural that a culture emerges where people say, “Then let’s stop using foreign keys.”&lt;/p&gt;

&lt;h3&gt;
  
  
  MySQL makes it easy to escape by disabling foreign keys
&lt;/h3&gt;

&lt;p&gt;In MySQL, you can disable constraints with &lt;code&gt;foreign_key_checks=0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This looks convenient, but it is quite dangerous.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;foreign_key_checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;foreign_key_checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any inconsistent data inserted while the checks are disabled simply stays in the table.&lt;/p&gt;

&lt;p&gt;Re-enabling the checks does not make MySQL go back and verify the rows that were inserted in the meantime.&lt;/p&gt;

&lt;p&gt;With this behavior, it is easy to have accidents where the checks are turned off for the convenience of a test or a migration, and inconsistent data is brought in as-is.&lt;/p&gt;
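&lt;p&gt;If you suspect this has already happened, one way to find orphaned child rows is a plain anti-join (a generic sketch, reusing the books/authors example from above):&lt;/p&gt;

```sql
-- Child rows whose parent no longer exists (orphans):
SELECT b.id, b.author_id
FROM books b
LEFT JOIN authors a ON a.id = b.author_id
WHERE a.id IS NULL;
```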

&lt;p&gt;PostgreSQL has a tool that says, “Keep the constraints, but delay the check timing.”&lt;/p&gt;

&lt;p&gt;MySQL tends to go in the direction of “turn off the constraints themselves.”&lt;/p&gt;

&lt;p&gt;This difference is quite large.&lt;/p&gt;

&lt;h3&gt;
  
  
  You can see the difference even from a foreign key example alone
&lt;/h3&gt;

&lt;p&gt;For example, consider an ordinary pair of tables with a parent-child relationship.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;orders_user_fk&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;DEFERRABLE&lt;/span&gt; &lt;span class="k"&gt;INITIALLY&lt;/span&gt; &lt;span class="k"&gt;DEFERRED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In PostgreSQL, even during tests, you can write something like inserting into &lt;code&gt;orders&lt;/code&gt; first and then inserting into &lt;code&gt;users&lt;/code&gt; later, as long as it is within a transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This flexibility helps a lot with fixture creation, data migration, and simplifying test code.&lt;/p&gt;

&lt;p&gt;MySQL does not have this.&lt;/p&gt;

&lt;p&gt;So in MySQL it is easier for people to drift toward thinking, “Foreign keys are in the way, so turn them off,” or “Guarantee it in the application,” while in PostgreSQL it is easier to drift toward thinking, “Let’s use foreign keys properly.”&lt;/p&gt;

&lt;p&gt;The reason PostgreSQL foreign keys are better is not simply that they have more features.&lt;/p&gt;

&lt;p&gt;A big part of it is that they make it easier to write tests, migrations, and data loading while keeping the constraints intact, and as a result, it becomes easier to actually use foreign keys in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  In OSS MySQL, you cannot do vector operations
&lt;/h2&gt;

&lt;p&gt;Recently, I think this is probably the reason most often mentioned for adopting PostgreSQL.&lt;/p&gt;

&lt;p&gt;PostgreSQL has &lt;code&gt;pgvector&lt;/code&gt;, which not only allows you to store vectors, but also lets you use distance operations and similarity search directly from applications. It also has indexes for nearest-neighbor search, so it is easy to use directly in implementation.&lt;/p&gt;
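&lt;p&gt;As a rough sketch of what that looks like in practice (table and column names are hypothetical; this assumes the &lt;code&gt;pgvector&lt;/code&gt; extension is available on the server):&lt;/p&gt;

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
  id        bigserial PRIMARY KEY,
  embedding vector(3)
);

INSERT INTO items (embedding)
VALUES ('[1, 2, 3]'), ('[4, 5, 6]');

-- Nearest neighbors by L2 distance; pgvector also provides
-- <#> (negative inner product) and <=> (cosine distance):
SELECT id, embedding <-> '[2, 3, 4]' AS distance
FROM items
ORDER BY embedding <-> '[2, 3, 4]'
LIMIT 5;

-- An approximate-nearest-neighbor index (HNSW) keeps this fast at scale:
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);
```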

&lt;p&gt;By contrast, on the OSS side, MySQL added a &lt;code&gt;VECTOR&lt;/code&gt; type in MySQL 9.0, which has shipped only as an Innovation Release; no LTS release includes it yet. More importantly, the distance functions are provided only in MySQL HeatWave on OCI and MySQL AI, not in MySQL Commercial or Community. In other words, the OSS edition can store vectors but cannot operate on them, so it is not really usable for this. That is a clear difference from PostgreSQL + &lt;code&gt;pgvector&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Character sets and collations are still more complicated in MySQL
&lt;/h2&gt;

&lt;p&gt;This is also very important.&lt;/p&gt;

&lt;p&gt;I still think character sets and collations are more likely to cause trouble in MySQL than in PostgreSQL.&lt;/p&gt;

&lt;p&gt;However, this is not only a problem with MySQL itself. It includes frameworks, connectors, and default settings as well.&lt;/p&gt;

&lt;p&gt;There are well-known examples in Japan such as the so-called “Haha-Papa problem” and “Sushi-Beer problem.”&lt;/p&gt;

&lt;p&gt;Both of these are not so much character encoding problems as collation problems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Haha-Papa problem is when strings that look different are treated as the same because of collation rules.&lt;/li&gt;
&lt;li&gt;The Sushi-Beer problem is when comparisons involving emoji and similar characters do not behave the way you intuitively expect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What makes these problems troublesome is that “we changed it to utf8mb4, so we are done” is not enough. In reality, you need to understand both the character set and the collation.&lt;/p&gt;
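&lt;p&gt;A hedged sketch of how these show up in MySQL (results depend on the server version; the behavior below is as commonly reported for 8.0):&lt;/p&gt;

```sql
-- "Haha-Papa": under the legacy utf8mb4_unicode_ci collation,
-- kana that differ only by dakuten/handakuten compare equal:
SELECT 'ハハ' = 'パパ' COLLATE utf8mb4_unicode_ci AS haha_papa;
-- commonly returns 1 (treated as equal)

-- "Sushi-Beer": under utf8mb4_general_ci, all supplementary
-- characters (including emoji) share one weight and compare equal:
SELECT '🍣' = '🍺' COLLATE utf8mb4_general_ci AS sushi_beer;
-- commonly returns 1 (treated as equal)

-- The MySQL 8.0 default utf8mb4_0900_ai_ci fixes the emoji case,
-- but, being accent-insensitive, it still conflates the kana above:
SELECT '🍣' = '🍺' COLLATE utf8mb4_0900_ai_ci AS sushi_beer_fixed;
-- commonly returns 0 (distinct)
```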

&lt;p&gt;And MySQL 8, rather than simplifying this, actually added more to think about. New collations were introduced, and they coexist with the older ones, so “the old style,” “the post-MySQL-8 style,” and “framework defaults” do not always line up.&lt;/p&gt;

&lt;p&gt;In other words, MySQL 8 certainly improved some things, but because old and new styles now coexist as a result of those improvements, the overall situation has in some ways become even more chaotic.&lt;/p&gt;

&lt;p&gt;I think this is not so much because MySQL itself is bad, but because it has a long history and has evolved while carrying compatibility with it.&lt;/p&gt;

&lt;p&gt;Still, from the point of view of an application developer, that complexity directly becomes an entry point for accidents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In the past, PostgreSQL had some clear weaknesses too.&lt;/p&gt;

&lt;p&gt;But now, many of them have become much less significant. The feature gap has narrowed, and under the assumption of managed services, there are more things users do not need to think about directly.&lt;/p&gt;

&lt;p&gt;On the other hand, from the point of view of application implementation, there are still reasons why PostgreSQL is easier to recommend.&lt;/p&gt;

&lt;p&gt;The especially big ones are these.&lt;/p&gt;

&lt;h3&gt;
  
  
  Things MySQL 8 added and narrowed the gap on
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CHECK&lt;/code&gt; constraints&lt;/li&gt;
&lt;li&gt;Window functions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SKIP LOCKED&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Clear reasons to recommend PostgreSQL
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;RETURNING&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;VALUES&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;being able to bring window functions into update processing&lt;/li&gt;
&lt;li&gt;partial indexes&lt;/li&gt;
&lt;li&gt;the maturity of foreign keys&lt;/li&gt;
&lt;li&gt;vector operations through &lt;code&gt;pgvector&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;being less likely to cause trouble around character sets and collations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I like MySQL.&lt;/p&gt;

&lt;p&gt;I have used it for a long time, and I still think it is a good database that is easy to get good performance from.&lt;/p&gt;

&lt;p&gt;Even so, if the question is which one I would adopt for a new project today, the one I would recommend is PostgreSQL.&lt;/p&gt;

&lt;p&gt;That is not because MySQL is bad.&lt;/p&gt;

&lt;p&gt;It is because even now, after many of its old weaknesses have been filled in, I still think PostgreSQL has an advantage when it comes to ease of application implementation.&lt;/p&gt;

</description>
      <category>mysql</category>
      <category>postgres</category>
      <category>database</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
