<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: catatsuy</title>
    <description>The latest articles on DEV Community by catatsuy (@catatsuy).</description>
    <link>https://dev.to/catatsuy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F42659%2F727807c8-3987-4502-8757-67dd99c89449.jpg</url>
      <title>DEV Community: catatsuy</title>
      <link>https://dev.to/catatsuy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/catatsuy"/>
    <language>en</language>
    <item>
      <title>Let's Encrypt short-lived certificates are quite strict, so you should use an ARI-capable client</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 19 Apr 2026 07:24:36 +0000</pubDate>
      <link>https://dev.to/catatsuy/lets-encrypt-short-lived-certificates-are-quite-strict-so-you-should-use-an-ari-capable-client-j5a</link>
      <guid>https://dev.to/catatsuy/lets-encrypt-short-lived-certificates-are-quite-strict-so-you-should-use-an-ari-capable-client-j5a</guid>
      <description>&lt;p&gt;Let's Encrypt short-lived certificates are much harder than they look if you think of them as just a shorter version of 90-day certificates.&lt;/p&gt;

&lt;p&gt;If you issue and renew multiple certificates for multiple subdomains in a short interval, you can hit certificate issuance rate limits more easily. Short-lived certificates increase the number of renewals, so these limits become much more visible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/docs/rate-limits/" rel="noopener noreferrer"&gt;https://letsencrypt.org/docs/rate-limits/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  You should test in the staging environment first
&lt;/h2&gt;

&lt;p&gt;For development and testing, you should use the staging environment instead of production. It has the same kind of behavior, but the limits are much looser. That makes it safer when you are still deciding how to split certificates and how to renew them.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment&lt;/th&gt;
&lt;th&gt;ACME directory URL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Staging&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://acme-staging-v02.api.letsencrypt.org/directory&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://acme-v02.api.letsencrypt.org/directory&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/docs/staging-environment/" rel="noopener noreferrer"&gt;https://letsencrypt.org/docs/staging-environment/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Short-lived certificates hit rate limits more easily
&lt;/h2&gt;

&lt;p&gt;Short-lived certificates are valid for only 160 hours. Let's Encrypt recommends renewing them every 3 days. That means many more renewals than 90-day certificates, so rate limits become much easier to hit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/2026/01/15/6day-and-ip-general-availability" rel="noopener noreferrer"&gt;https://letsencrypt.org/2026/01/15/6day-and-ip-general-availability&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/docs/faq/" rel="noopener noreferrer"&gt;https://letsencrypt.org/docs/faq/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The strictest limit here is the one for the same exact set of domain names: 5 certificates per 7 days. If you keep issuing certificates for the same set of names, you get close to that limit quickly.&lt;/p&gt;

&lt;p&gt;Also, this limit does not fully reset all at once after 7 days. Let's Encrypt says the ability to request new certificates for the same exact set of identifiers refills at a rate of 1 certificate every 34 hours. With short-lived certificates, the renewal interval is short, so this refill speed matters too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/docs/rate-limits/" rel="noopener noreferrer"&gt;https://letsencrypt.org/docs/rate-limits/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you split certificates by subdomain, the number of certificates grows. And because all subdomains under the same registered domain count against the same per-domain limit, there is less headroom than it first seems.&lt;/p&gt;
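&lt;p&gt;As a rough sketch of the arithmetic (the 3-day renewal interval and the 1-certificate-per-34-hours refill rate come from the Let's Encrypt documentation linked above; the doubling for RSA plus ECDSA is discussed in the next section):&lt;/p&gt;

```shell
# Renewing one exact set of names every 3 days:
renewals_per_week=$(( (7 * 24) / (3 * 24) ))
echo "renewals per 7 days: ${renewals_per_week}"       # 2

# The same-names bucket refills at 1 certificate per 34 hours:
refill_per_week=$(( (7 * 24) / 34 ))
echo "bucket refills per 7 days: ${refill_per_week}"   # 4

# Issuing both RSA and ECDSA doubles every renewal:
echo "with RSA+ECDSA: $(( renewals_per_week * 2 )) certificates per 7 days"
```

With a 5-certificates-per-7-days limit, that doubled count already leaves very little margin for retries or re-issuance.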

&lt;h2&gt;
  
  
  It gets worse if you issue both RSA and ECDSA certificates
&lt;/h2&gt;

&lt;p&gt;If you use both RSA and ECDSA, you need two certificates for the same domain names.&lt;/p&gt;

&lt;p&gt;That means the number of certificates doubles immediately.&lt;/p&gt;

&lt;p&gt;With short-lived certificates, the renewal interval is already short, so a setup that splits certificates by subdomain and also keeps both RSA and ECDSA certificates can hit rate limits quite easily.&lt;/p&gt;

&lt;h2&gt;
  
  
  That is why you should use an ARI-capable client
&lt;/h2&gt;

&lt;p&gt;This is where ARI becomes important. ARI stands for ACME Renewal Information: a mechanism that lets the CA tell the ACME client when it should renew a certificate.&lt;/p&gt;

&lt;p&gt;With Let's Encrypt, renewals that use ARI are exempt from all rate limits. Since short-lived certificates assume renewal every 3 days, this exemption makes a big difference. It matters even more if you want both RSA and ECDSA certificates.&lt;/p&gt;

&lt;p&gt;If you want to use short-lived certificates, you should use an ARI-capable client.&lt;/p&gt;

&lt;h2&gt;
  
  
  ARI is not inside the certificate
&lt;/h2&gt;

&lt;p&gt;ARI is not a certificate extension. Even if you inspect a certificate with &lt;code&gt;openssl x509 -text&lt;/code&gt;, you cannot tell whether it was renewed with ARI.&lt;/p&gt;

&lt;p&gt;ARI works through ACME &lt;code&gt;renewalInfo&lt;/code&gt;, so it is part of the ACME protocol, not part of the certificate itself. To know whether ARI is being used, you need to look at the client and the CA interaction, not only at the certificate.&lt;/p&gt;
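&lt;p&gt;One way to see this from the outside is the ACME directory: a CA that supports ARI advertises a &lt;code&gt;renewalInfo&lt;/code&gt; endpoint there. A minimal check might look like this (the JSON below is an abbreviated, illustrative excerpt, not a verbatim directory document):&lt;/p&gt;

```shell
# In a real check you would fetch the directory first, e.g.:
#   curl -s https://acme-v02.api.letsencrypt.org/directory
# Abbreviated, illustrative excerpt of a directory document:
directory_json='{"newOrder":"https://ca.example/acme/new-order","renewalInfo":"https://ca.example/acme/renewal-info"}'

# An ARI-capable CA lists a renewalInfo endpoint in its directory:
if printf '%s' "$directory_json" | grep -q '"renewalInfo"'; then
  echo "CA advertises ARI"
else
  echo "no renewalInfo endpoint in directory"
fi
```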

&lt;h2&gt;
  
  
  lego is very useful for this
&lt;/h2&gt;

&lt;p&gt;If you want to use short-lived certificates, you need both &lt;code&gt;shortlived&lt;/code&gt; profile support and ARI support. lego supports both, so it is very useful for this use case.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/go-acme/lego" rel="noopener noreferrer"&gt;https://github.com/go-acme/lego&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Its usage is also simple. Use &lt;code&gt;run&lt;/code&gt; for the first issuance, then use &lt;code&gt;renew&lt;/code&gt; for later renewals. For short-lived certificates, &lt;code&gt;--profile shortlived&lt;/code&gt; is the important option, and &lt;code&gt;renew --dynamic&lt;/code&gt; is a simple way to run renewals. If you change &lt;code&gt;--server&lt;/code&gt; to the staging URL, you can test the same setup in staging.&lt;/p&gt;

&lt;p&gt;For example, the first issuance looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lego &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--accept-tos&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--email&lt;/span&gt; you@example.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server&lt;/span&gt; https://acme-v02.api.letsencrypt.org/directory &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dns&lt;/span&gt; route53 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--domains&lt;/span&gt; example.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--domains&lt;/span&gt; www.example.com &lt;span class="se"&gt;\&lt;/span&gt;
  run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profile&lt;/span&gt; shortlived
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Renewal looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lego &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--accept-tos&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--email&lt;/span&gt; you@example.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server&lt;/span&gt; https://acme-v02.api.letsencrypt.org/directory &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dns&lt;/span&gt; route53 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--domains&lt;/span&gt; example.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--domains&lt;/span&gt; www.example.com &lt;span class="se"&gt;\&lt;/span&gt;
  renew &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dynamic&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profile&lt;/span&gt; shortlived
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to test in staging, change &lt;code&gt;--server&lt;/code&gt; to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nt"&gt;--server&lt;/span&gt; https://acme-staging-v02.api.letsencrypt.org/directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  I still do not know whether it is ready for real production use
&lt;/h2&gt;

&lt;p&gt;I still do not know how practical short-lived certificates are for real production services.&lt;/p&gt;

&lt;p&gt;At least for now, I am using them on personal domains, and that already showed several problems.&lt;/p&gt;

&lt;p&gt;First, short-lived certificates increase the number of renewals, so CA logs also increase. There is simply too much to monitor.&lt;/p&gt;

&lt;p&gt;Also, the maximum validity is only 160 hours, so common certificate monitoring services tend to stay in a critical state all the time: a typical expiry-alert threshold such as "warn when fewer than two weeks remain" is longer than the certificate's entire lifetime.&lt;/p&gt;

&lt;p&gt;And as described above, these certificates are also more likely to hit rate limits.&lt;/p&gt;

&lt;p&gt;Because of these characteristics, my own site started breaking more often after I switched to short-lived certificates. The feature is interesting, but I think stable operations are still hard if you want to use it widely in real services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With Let's Encrypt short-lived certificates, it is easier to hit certificate issuance limits.&lt;/p&gt;

&lt;p&gt;It gets even harder if you want both RSA and ECDSA certificates, because the number of certificates simply doubles. Since short-lived certificates also renew more often, a setup with multiple subdomains can get close to rate limits much faster than expected.&lt;/p&gt;

&lt;p&gt;To make this easier to operate, it is very important to test in staging first and to use an ARI-capable client.&lt;/p&gt;

</description>
      <category>letsencrypt</category>
      <category>tls</category>
      <category>acme</category>
      <category>security</category>
    </item>
    <item>
      <title>Designing and Optimizing Image Delivery with a CDN</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sat, 11 Apr 2026 06:21:53 +0000</pubDate>
      <link>https://dev.to/catatsuy/designing-and-optimizing-image-delivery-with-a-cdn-35ia</link>
      <guid>https://dev.to/catatsuy/designing-and-optimizing-image-delivery-with-a-cdn-35ia</guid>
      <description>&lt;p&gt;In modern web services, it is no longer enough to handle images the way we did in the past.&lt;/p&gt;

&lt;p&gt;A long time ago, it was common to generate a few fixed thumbnail sizes when a user uploaded an image, then serve those files as they were. That approach was often good enough. But today, high-density smartphone displays are common, and the same image may need different sizes and formats depending on the client. On top of that, CDNs and image optimization services can now generate image variants dynamically and cache them efficiently.&lt;/p&gt;

&lt;p&gt;Because of this, image delivery is no longer just a frontend concern. It is part of system design.&lt;/p&gt;

&lt;p&gt;What matters most to me is this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;image size matters more than quality&lt;/li&gt;
&lt;li&gt;image URL design should be controlled by the backend, not the frontend&lt;/li&gt;
&lt;li&gt;the backend should return multiple candidate URLs for the same image, and the browser should choose one&lt;/li&gt;
&lt;li&gt;modern formats and image optimization services are useful, but they need to be used with a good understanding of their behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article, I will explain how I think image delivery should be designed for modern web services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Image size matters more than quality
&lt;/h2&gt;

&lt;p&gt;For JPEG-like images, there are two major parameters: image size and quality. But if your goal is visual quality, image size is more important than quality.&lt;/p&gt;

&lt;p&gt;When an image is too small, people notice it immediately. A difference in quality is often more subtle. Because of that, the first step should not be fine-tuning quality. The first step should be making sure the image is large enough.&lt;/p&gt;

&lt;p&gt;Today, devices like iPhones with Retina displays are normal. On smartphones, an image often needs to be 2x larger than its display size, and sometimes 3x larger, to look good. The visual quality of thumbnail images directly affects user experience. In some services, it can even affect click-through rate or sales.&lt;/p&gt;

&lt;p&gt;Of course, larger images increase transfer size. That affects CDN cost, response time, and mobile data usage. But even with that trade-off, the right order is still important: first make sure the image is large enough, then tune quality and format.&lt;/p&gt;

&lt;p&gt;For photo-like images, this is especially important. If you want the same visual result, using a larger image with slightly lower quality can sometimes be more efficient than trying to compensate for an image that is too small by increasing quality. If the image is too small, raising quality does not solve the real problem.&lt;/p&gt;
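&lt;p&gt;The density math above can be sketched directly (the 320px display width is a made-up example):&lt;/p&gt;

```shell
# An image displayed at 320 CSS pixels needs a proportionally
# larger source image on high-density displays:
css_width=320
for density in 1 2 3; do
  echo "${density}x display: source should be $(( css_width * density ))px wide"
done
```

So a thumbnail that looks fine on a 1x desktop monitor can be badly undersized on a 2x or 3x smartphone screen.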

&lt;h2&gt;
  
  
  The backend should control image URL design
&lt;/h2&gt;

&lt;p&gt;A CDN or image optimization service can generate different sizes and formats from the original image. That is very useful. But it does not mean the frontend should build image URLs freely.&lt;/p&gt;

&lt;p&gt;If the frontend builds URLs on its own, long-term operation becomes harder.&lt;/p&gt;

&lt;p&gt;For example, later you may want to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;change parameter names&lt;/li&gt;
&lt;li&gt;change how quality is handled&lt;/li&gt;
&lt;li&gt;introduce signed URLs&lt;/li&gt;
&lt;li&gt;move to another image optimization service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the client owns the URL format, backward compatibility becomes painful. This is especially true for native apps, because old versions remain in use for a long time.&lt;/p&gt;

&lt;p&gt;This also matters for CDN cache efficiency. Image transformation works best when the same request conditions lead to the same URL. If the frontend calculates width and quality freely, many slightly different URLs will appear, and the cache will fragment. Even changing query parameter order can create a different URL. That lowers cache hit rate, increases transformation work, and adds more origin load and cost.&lt;/p&gt;

&lt;p&gt;Because of that, image URL design should be controlled by the backend. The backend should define the allowed sizes, quality policy, and format policy, then return valid candidate URLs. The frontend should use those candidates.&lt;/p&gt;
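&lt;p&gt;A minimal sketch of that idea (the host, parameter names, and allowed widths are all hypothetical): the backend only emits URLs from an allow-list of sizes, always with the same parameter order, so the same request conditions always produce the same cache key.&lt;/p&gt;

```shell
# Hypothetical backend-side URL builder: allow-listed widths only,
# fixed parameter order so the CDN cache key never fragments.
build_image_url() {
  path=$1
  width=$2
  case "$width" in
    320|640|1280) ;;                                   # allowed sizes only
    *) echo "unsupported width: ${width}" >&2; return 1 ;;
  esac
  # width first, then quality: never vary the parameter order
  echo "https://img.example.com${path}?w=${width}&q=80"
}

build_image_url /products/42.jpg 640
# -> https://img.example.com/products/42.jpg?w=640&q=80
```

Rejecting arbitrary widths also prevents clients from generating unbounded URL variants against the transformation service.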

&lt;h2&gt;
  
  
  Return multiple candidate URLs for the same image, then let the browser choose
&lt;/h2&gt;

&lt;p&gt;Even if the backend controls image URLs, that does not mean the application has to decide the exact image for every device by itself.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt;, browsers can choose images based on display width and pixel density. Those are standard browser features and they are already good enough for this job.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/picture" rel="noopener noreferrer"&gt;https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/picture&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Guides/Responsive_images" rel="noopener noreferrer"&gt;https://developer.mozilla.org/en-US/docs/Web/HTML/Guides/Responsive_images&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The important design is this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;for the same image, the backend returns multiple candidate URLs&lt;/li&gt;
&lt;li&gt;the browser chooses the best one&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, for one product image, the backend can prepare several URLs for different sizes. Then &lt;code&gt;picture&lt;/code&gt; or &lt;code&gt;srcset&lt;/code&gt; can let the browser choose the right one for the current layout and display density.&lt;/p&gt;

&lt;p&gt;This keeps URL design under backend control while still using the browser’s built-in image selection features.&lt;/p&gt;

&lt;h2&gt;
  
  
  A low-effort starting point: use the &lt;code&gt;Accept&lt;/code&gt; header
&lt;/h2&gt;

&lt;p&gt;If you want to introduce modern image formats, you do not need to start with full &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt; support everywhere.&lt;/p&gt;

&lt;p&gt;A lower-effort starting point is to use the &lt;code&gt;Accept&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Accept&lt;/code&gt; header tells the server what content types the client can receive. If your CDN or image optimization service supports it, the image delivery side can look at that header and return AVIF or WebP for supported clients, and JPEG or PNG for others, while keeping the same image URL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept" rel="noopener noreferrer"&gt;https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept&lt;/a&gt;&lt;/p&gt;
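&lt;p&gt;The decision itself is simple. A sketch of the logic (the &lt;code&gt;Accept&lt;/code&gt; value is an example of what a modern browser sends; in practice this runs inside the CDN or transformation layer, not in shell):&lt;/p&gt;

```shell
# Example Accept header from a modern browser image request:
accept="image/avif,image/webp,image/apng,image/*,*/*;q=0.8"

# Prefer AVIF, then WebP, with JPEG/PNG as the fallback:
case "$accept" in
  *image/avif*) echo "serve AVIF" ;;
  *image/webp*) echo "serve WebP" ;;
  *)            echo "serve JPEG/PNG fallback" ;;
esac
```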

&lt;p&gt;This is practical because it does not require large template changes.&lt;/p&gt;

&lt;p&gt;But this method has an important limitation: it mainly solves format negotiation, not size selection.&lt;/p&gt;

&lt;p&gt;It also depends on where the logic runs. This approach works when the CDN or image transformation service that actually serves the image supports &lt;code&gt;Accept&lt;/code&gt;-based negotiation. It is difficult to complete this logic only in the API that returns image URLs, because that API does not directly handle the final &lt;code&gt;Accept&lt;/code&gt; header of the browser’s image request.&lt;/p&gt;

&lt;p&gt;Many image optimization services already support this kind of format switching, so it is often a very practical first step.&lt;/p&gt;

&lt;h2&gt;
  
  
  A stronger approach: use &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;If you want to optimize not only format but also size, &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt; are much more powerful.&lt;/p&gt;

&lt;p&gt;Here is a simple example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;picture&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;source&lt;/span&gt;
    &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"image/avif"&lt;/span&gt;
    &lt;span class="na"&gt;srcset=&lt;/span&gt;&lt;span class="s"&gt;"
      /images/example-640.avif 640w,
      /images/example-1280.avif 1280w
    "&lt;/span&gt;
    &lt;span class="na"&gt;sizes=&lt;/span&gt;&lt;span class="s"&gt;"(max-width: 640px) 100vw, 640px"&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt;
    &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"/images/example-1280.jpg"&lt;/span&gt;
    &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"example"&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/picture&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets the browser choose both by format and by size.&lt;/p&gt;

&lt;p&gt;If you only want a low-effort improvement, &lt;code&gt;Accept&lt;/code&gt;-based format switching is a good start. If you want a more complete solution that also handles responsive size selection, &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt; are the right tools.&lt;/p&gt;

&lt;p&gt;They do require template changes, so the cost is higher. But for thumbnails, product images, and other places where image quality and transfer size matter a lot, the benefit is worth that cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  WebP is very useful, but it is not magic
&lt;/h2&gt;

&lt;p&gt;WebP is still a very useful format. For photo-like images, it can often provide smaller files than JPEG while keeping good visual quality.&lt;/p&gt;

&lt;p&gt;A key advantage of WebP is that it tends to avoid some of the visible artifacts that appear when JPEG quality is pushed too low. Because of that, WebP can often use lower quality values than JPEG.&lt;/p&gt;

&lt;p&gt;But that does not mean you should convert everything to WebP without thinking.&lt;/p&gt;

&lt;p&gt;A common mistake is to focus too much on quality and forget size. Even with WebP, the better approach is still to secure enough image size first, then lower quality as needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Some images need extra care with WebP
&lt;/h2&gt;

&lt;p&gt;WebP is not equally good for every kind of image.&lt;/p&gt;

&lt;p&gt;Text images and pixel-art-like images can be tricky. The default settings of &lt;code&gt;cwebp&lt;/code&gt; are tuned for photo compression, and that can produce a slightly blurred look. For photos, this is often acceptable. For text and pixel art, it can look like visible degradation.&lt;/p&gt;

&lt;p&gt;In some cases, &lt;code&gt;cwebp&lt;/code&gt; presets such as &lt;code&gt;text&lt;/code&gt; or &lt;code&gt;icon&lt;/code&gt; can improve the result. But in a real service, it is not always easy to choose different conversion parameters for every image type.&lt;/p&gt;

&lt;p&gt;Because of that, it is often reasonable not to force WebP conversion for images that are already very small, such as text-heavy images or simple pixel art. WebP should be used where it has clear value.&lt;/p&gt;

&lt;p&gt;Also, WebP has no exact equivalent of JPEG progressive rendering or PNG/GIF interlacing. If your service depends on those loading behaviors, you should verify the difference before rollout.&lt;/p&gt;

&lt;h2&gt;
  
  
  AVIF is a strong choice for new support
&lt;/h2&gt;

&lt;p&gt;AVIF is another important modern image format. Today, browser support for AVIF is already broad, and the support gap between AVIF and WebP is much smaller than it used to be.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://caniuse.com/avif" rel="noopener noreferrer"&gt;https://caniuse.com/avif&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In practice, there are image types where WebP needs extra care, but AVIF tends to show fewer obvious issues. AVIF also works naturally with &lt;code&gt;Accept&lt;/code&gt;-based content negotiation.&lt;/p&gt;

&lt;p&gt;AVIF is often said to have heavier encoding cost, but when you use a CDN or image optimization service, you usually do not have to bear that cost directly. What matters more is whether the format works naturally in real browser environments and whether it behaves well for your images.&lt;/p&gt;

&lt;p&gt;Because of that, for a new implementation, it is often realistic to keep JPEG or another traditional format as the fallback and support only AVIF as the modern format.&lt;/p&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;return AVIF for clients that support it&lt;/li&gt;
&lt;li&gt;return JPEG or another fallback for the rest&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Today, in many cases, it is no longer necessary to support WebP as well just because it is a modern image format.&lt;/p&gt;

&lt;p&gt;That said, quality numbers are not directly comparable between JPEG, WebP, and AVIF. You still need to tune them by looking at real output.&lt;/p&gt;

&lt;h2&gt;
  
  
  The main way to detect WebP support is the &lt;code&gt;Accept&lt;/code&gt; header
&lt;/h2&gt;

&lt;p&gt;If you want to return WebP only to supported clients, the main method is to look at the &lt;code&gt;Accept&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;Browsers include supported content types in that header. If &lt;code&gt;image/webp&lt;/code&gt; is present, the image delivery side can treat the client as WebP-capable.&lt;/p&gt;

&lt;p&gt;The important point is that this logic should not be forced into the API that returns image URLs. API requests often use &lt;code&gt;Accept&lt;/code&gt; values for JSON and do not directly reflect the final image request. That is why this method works naturally at the image-serving CDN or transformation layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  JavaScript detection is not the main path, but it is useful for stricter checks
&lt;/h2&gt;

&lt;p&gt;In some cases, header-based detection is not enough.&lt;/p&gt;

&lt;p&gt;JavaScript can detect support more strictly by loading a small test image. This is useful if you want to check not only general WebP support but also features like lossless, alpha, or animation.&lt;/p&gt;

&lt;p&gt;Google’s WebP FAQ shows code like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developers.google.com/speed/webp/faq" rel="noopener noreferrer"&gt;https://developers.google.com/speed/webp/faq&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;check_webp_feature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;kTestImages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;lossy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;UklGRiIAAABXRUJQVlA4IBYAAAAwAQCdASoBAAEADsD+JaQAA3AAAAAA&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;lossless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;UklGRhoAAABXRUJQVlA4TA0AAAAvAAAAEAcQERGIiP4HAA==&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;UklGRkoAAABXRUJQVlA4WAoAAAAQAAAAAAAAAAAAQUxQSAwAAAARBxAR/Q9ERP8DAABWUDggGAAAABQBAJ0BKgEAAQAAAP4AAA3AAP7mtQAAAA==&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;animation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;UklGRlIAAABXRUJQVlA4WAoAAAASAAAAAAAAAAAAQU5JTQYAAAD/////AABBTk1GJgAAAAAAAAAAAAAAAAAAAGQAAABWUDhMDQAAAC8AAAAQBxAREYiI/gcA&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;height&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nf"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onerror&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;src&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;data:image/webp;base64,&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;kTestImages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is helpful when you need very strict compatibility checks. But it is not the main path for normal WebP rollout. In most cases, &lt;code&gt;Accept&lt;/code&gt;-based handling plus &lt;code&gt;picture&lt;/code&gt; when needed is enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  There used to be compatibility issues that affected delivery strategy
&lt;/h2&gt;

&lt;p&gt;In the past, image delivery strategy was affected by compatibility issues such as Safari not supporting WebP and Android 4.3 and below being difficult to support reliably. On Android 4.3 and below, even if surface-level detection suggested WebP support, real rendering could still break.&lt;/p&gt;

&lt;p&gt;That situation has changed a lot. Safari 14 added WebP support, and the impact of very old Android environments has become much smaller as more services dropped support for TLS versions below 1.2.&lt;/p&gt;

&lt;p&gt;Today, this is more of a historical compatibility concern than a core design issue.&lt;/p&gt;

&lt;h2&gt;
  
  
  You should monitor whether large images are being served by mistake
&lt;/h2&gt;

&lt;p&gt;When people talk about image optimization, they often focus on formats and transformation logic. But in real operation, it is also important to monitor whether oversized images are being served by mistake.&lt;/p&gt;

&lt;p&gt;Even with a good design, mistakes happen. Configuration issues or unexpected input images can result in large objects being delivered. That directly affects bandwidth, response time, and CDN cost.&lt;/p&gt;

&lt;p&gt;Because of that, you should use metrics from your CDN or monitoring system to observe object size ranges and transfer volume continuously.&lt;/p&gt;

&lt;p&gt;In practice, bugs often appear in edge cases. For example, a very tall or very wide image with an unusual aspect ratio can trigger an unexpected resize result and cause a much larger image to be served than intended. You will not catch that by thinking only about format choice. You need to observe the actual object sizes being delivered.&lt;/p&gt;

&lt;h2&gt;
  
  
  Converting images is easy, but operating an image transformation service is not
&lt;/h2&gt;

&lt;p&gt;If you only want to convert JPEG to WebP on your own machine, that is easy. If you use Homebrew on macOS, you can install the tools like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;webp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cwebp&lt;/code&gt; converts images to WebP, and &lt;code&gt;dwebp&lt;/code&gt; converts WebP images to other formats.&lt;/p&gt;

&lt;p&gt;But that does not mean building and operating your own image transformation service is easy.&lt;/p&gt;

&lt;p&gt;In real services, you will constantly receive user-uploaded images that are technically broken but still need to be handled. For example, JPEG files with corrupted ICC profiles still show up in real systems. Even then, you need to transform them while keeping color shifts as small as possible.&lt;/p&gt;

&lt;p&gt;That is why image optimization services and image CDNs are valuable. The hard part is not running a conversion command. The hard part is operating the full system safely with broken files, strange inputs, and edge cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If I were designing image delivery for a modern web service, I would think about it like this.&lt;/p&gt;

&lt;p&gt;First, image size matters more than quality. You should deliver images that are large enough for the real display environment, especially now that high-density displays are normal.&lt;/p&gt;

&lt;p&gt;Second, image URL design should be controlled by the backend. The backend should return grouped candidate URLs for the same image, and the browser should choose one.&lt;/p&gt;

&lt;p&gt;Third, if you want a low-effort starting point, use &lt;code&gt;Accept&lt;/code&gt;-based format negotiation. If you want a more complete solution, use &lt;code&gt;picture&lt;/code&gt; and &lt;code&gt;srcset&lt;/code&gt; so that size selection is also handled properly.&lt;/p&gt;

&lt;p&gt;Fourth, modern formats are useful, but they are not interchangeable in every case. WebP is still valuable, but some image types need extra care. AVIF is now broadly supported and can be a very practical choice when paired with a traditional fallback format.&lt;/p&gt;

&lt;p&gt;Finally, do not stop at conversion logic. You also need monitoring, and you need to think about real operational bugs such as oversized outputs caused by unusual aspect ratios.&lt;/p&gt;

&lt;p&gt;Understanding these points and using a CDN or image optimization service well is one of the most practical ways to improve image delivery and user experience today.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>performance</category>
      <category>cdn</category>
      <category>images</category>
    </item>
    <item>
      <title>Safely Updating Existing Files in Go</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 05 Apr 2026 07:30:24 +0000</pubDate>
      <link>https://dev.to/catatsuy/safely-updating-existing-files-in-go-1hlc</link>
      <guid>https://dev.to/catatsuy/safely-updating-existing-files-in-go-1hlc</guid>
      <description>&lt;p&gt;If you want to safely update an existing file in Go, the basic rule is simple: &lt;strong&gt;do not write to the original file directly&lt;/strong&gt;.&lt;br&gt;
Instead, write the new content to a temporary file first, and replace the original file with &lt;code&gt;rename&lt;/code&gt; only after the write is fully complete.&lt;/p&gt;

&lt;p&gt;In this article, I focus on Linux and explain three points:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;why direct overwrite is dangerous&lt;/li&gt;
&lt;li&gt;why using &lt;code&gt;os.CreateTemp("", ...)&lt;/code&gt; can cause problems&lt;/li&gt;
&lt;li&gt;what to watch out for when systemd &lt;code&gt;PrivateTmp&lt;/code&gt; is enabled&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Why you should not overwrite a file directly
&lt;/h2&gt;

&lt;p&gt;When you build a CLI tool that updates an existing file, such as a formatter like &lt;code&gt;gofmt -w&lt;/code&gt; or a tool that regenerates a cache file, it is better not to open the original file and overwrite it in place.&lt;/p&gt;

&lt;p&gt;There are three reasons.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. If the process crashes halfway, the original file may be left incomplete
&lt;/h3&gt;

&lt;p&gt;If an error happens during writing, or the process crashes before it finishes, the original file may be left in a partially written state.&lt;/p&gt;

&lt;p&gt;That means even a small update can destroy the whole file.&lt;/p&gt;

&lt;p&gt;So the final replacement must happen only after all data has already been written successfully.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Other processes may read the file at the same time
&lt;/h3&gt;

&lt;p&gt;Another process may already be reading that file.&lt;/p&gt;

&lt;p&gt;If you overwrite the file directly, that process may observe a partially written file.&lt;/p&gt;

&lt;p&gt;A safer pattern is to write the full content to a separate file first, and replace the destination only after the new file is complete.&lt;/p&gt;

&lt;p&gt;On Linux, &lt;code&gt;rename&lt;/code&gt; is atomic when the source and destination are on the same filesystem.&lt;br&gt;
Because of that, readers opening the destination path will typically see either the old file or the new file, but not a half-written replacement.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Concurrent writes become easier to reason about
&lt;/h3&gt;

&lt;p&gt;If two or more processes try to overwrite the same file at the same time, the result may become unclear or corrupted.&lt;/p&gt;

&lt;p&gt;If each process writes to its own temporary file and only calls &lt;code&gt;rename&lt;/code&gt; after completion, the behavior becomes easier to reason about.&lt;/p&gt;

&lt;p&gt;In that case, whichever process renames its file last wins.&lt;/p&gt;

&lt;p&gt;So the basic pattern is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;write everything to another file&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rename&lt;/code&gt; it to the final path&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  A common trap with &lt;code&gt;os.CreateTemp&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;In Go, a common way to create a temporary file is this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateTemp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tmp-*"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the first argument is an empty string, Go creates the file under &lt;code&gt;os.TempDir()&lt;/code&gt;.&lt;br&gt;
On Linux, that is usually &lt;code&gt;/tmp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The problem is that &lt;code&gt;/tmp&lt;/code&gt; may be on a different filesystem from the directory where you want to place the final file.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/tmp&lt;/code&gt; may be a &lt;code&gt;tmpfs&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;your application data may be under &lt;code&gt;/var/lib/...&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;or your target file may be under &lt;code&gt;/home/...&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In that case, &lt;code&gt;rename&lt;/code&gt; from &lt;code&gt;/tmp&lt;/code&gt; to the destination file fails with this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;invalid cross-device link
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Linux, &lt;code&gt;rename&lt;/code&gt; works only within the same filesystem.&lt;/p&gt;

&lt;p&gt;So if you already know the final destination, you should create the temporary file in the destination directory from the beginning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create the temporary file in the destination directory
&lt;/h2&gt;

&lt;p&gt;Instead of creating a temporary file under &lt;code&gt;/tmp&lt;/code&gt;, do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;dst&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"/var/lib/myapp/cache.json"&lt;/span&gt;

&lt;span class="n"&gt;dir&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;filepath&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateTemp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;".tmp-*"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c"&gt;// cleanup if something fails&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way, &lt;code&gt;tmp&lt;/code&gt; and &lt;code&gt;dst&lt;/code&gt; are in the same directory, so they are also on the same filesystem.&lt;/p&gt;

&lt;p&gt;Then you can replace the destination with &lt;code&gt;os.Rename&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important order is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;write all data to the temporary file&lt;/li&gt;
&lt;li&gt;close it&lt;/li&gt;
&lt;li&gt;rename it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Also note that the default permission of the temporary file is &lt;code&gt;0600&lt;/code&gt;, so if you need different permissions, you should change them explicitly.&lt;/p&gt;

&lt;h2&gt;
  
  
  A pitfall with systemd &lt;code&gt;PrivateTmp&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This is not specific to Go. It is a Linux and systemd topic.&lt;/p&gt;

&lt;p&gt;This matters because some programs assume that a file created under &lt;code&gt;/tmp&lt;/code&gt; is visible to other processes.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;PrivateTmp&lt;/code&gt; is enabled for a service, the process still sees a directory called &lt;code&gt;/tmp&lt;/code&gt;, but that directory is isolated from &lt;code&gt;/tmp&lt;/code&gt; used by other processes.&lt;/p&gt;

&lt;p&gt;That can cause problems like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Process A creates a temporary file in &lt;code&gt;/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Process B tries to read the same path under &lt;code&gt;/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;but Process B cannot see it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So if your design assumes that a temporary file in &lt;code&gt;/tmp&lt;/code&gt; is shared with other processes, that assumption breaks when &lt;code&gt;PrivateTmp&lt;/code&gt; is enabled.&lt;/p&gt;

&lt;p&gt;In that case, you need one of these approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;disable &lt;code&gt;PrivateTmp&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;this is useful if you want to keep using the normal &lt;code&gt;/tmp&lt;/code&gt; cleanup behavior&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;create and use a shared directory that all related processes can access&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;but make sure the directory exists, or file creation will fail&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;If you use the pattern from this article — creating the temporary file in the final destination directory — you usually avoid this problem from the start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Think about concurrent execution
&lt;/h2&gt;

&lt;p&gt;You should also think about what happens when multiple processes update the same file at the same time.&lt;/p&gt;

&lt;p&gt;If your requirement is simply that the last completed update wins, the approach above is usually enough.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;os.CreateTemp&lt;/code&gt; generates unique file names, so multiple processes can create temporary files in the same directory without colliding on file names.&lt;/p&gt;

&lt;p&gt;But if you need stricter control, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;do not allow overwriting with stale data&lt;/li&gt;
&lt;li&gt;only write if the file generation is still based on the latest state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;then you need additional coordination such as a lock file or &lt;code&gt;flock&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Do not overwrite the original file directly&lt;/li&gt;
&lt;li&gt;Always write the full content to another file first, then replace it with &lt;code&gt;rename&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If you want the final replacement to be atomic, create the temporary file in the same directory as the destination file&lt;/li&gt;
&lt;li&gt;If the temporary file is on a different filesystem, &lt;code&gt;rename&lt;/code&gt; fails with &lt;code&gt;invalid cross-device link&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If systemd &lt;code&gt;PrivateTmp&lt;/code&gt; is enabled, &lt;code&gt;/tmp&lt;/code&gt; is not shared in the way you may expect, so either disable it or use a shared directory&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>linux</category>
      <category>files</category>
      <category>systemd</category>
    </item>
    <item>
      <title>I Built lls, a Go CLI to List 33.12 Million Files</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 29 Mar 2026 06:56:46 +0000</pubDate>
      <link>https://dev.to/catatsuy/i-built-lls-a-go-cli-to-list-3312-million-files-4bm9</link>
      <guid>https://dev.to/catatsuy/i-built-lls-a-go-cli-to-list-3312-million-files-4bm9</guid>
      <description>&lt;p&gt;Sometimes a problem looks simple at first.&lt;/p&gt;

&lt;p&gt;In my case, I needed a complete file list from a huge directory on storage mounted over NFS from an application server. At first, this sounded like something existing tools should be able to handle. But once the number of files became extremely large, that assumption stopped being true.&lt;/p&gt;

&lt;p&gt;I eventually built a Go CLI called &lt;code&gt;lls&lt;/code&gt; to solve this problem.&lt;/p&gt;

&lt;p&gt;This was not a toy project. I built &lt;code&gt;lls&lt;/code&gt; to solve a real production problem, and in the end it was able to list &lt;strong&gt;33.12 million files&lt;/strong&gt; from a single directory on NFS.&lt;/p&gt;

&lt;p&gt;Repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/lls" rel="noopener noreferrer"&gt;https://github.com/catatsuy/lls&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I will explain what failed, why I decided to use the Linux &lt;code&gt;getdents64&lt;/code&gt; system call directly, how the implementation works, and how &lt;code&gt;lls&lt;/code&gt; finally solved the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;The directory I had to deal with was on storage mounted over NFS, and it contained an extremely large number of files.&lt;/p&gt;

&lt;p&gt;If a directory is small, &lt;code&gt;ls&lt;/code&gt; and &lt;code&gt;find&lt;/code&gt; are usually enough. But once the number of files becomes too large, even getting a complete file list becomes difficult. And when the directory is on NFS instead of local storage, the situation can become even worse.&lt;/p&gt;

&lt;p&gt;What I needed was simple in theory: get the full list of files and finish successfully. In practice, that turned out to be the hard part.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;ls -U1&lt;/code&gt; and &lt;code&gt;find&lt;/code&gt; could not finish
&lt;/h2&gt;

&lt;p&gt;The first thing I tried was &lt;code&gt;ls -U1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I disabled sorting because sorting is one of the well-known reasons &lt;code&gt;ls&lt;/code&gt; becomes painful on huge directories. But even with &lt;code&gt;ls -U1&lt;/code&gt;, it still could not finish. The number varied from run to run, but at best it stopped after outputting about &lt;strong&gt;6 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I did not fully investigate why it stopped, but I suspected the storage server might have stopped responding.&lt;/p&gt;

&lt;p&gt;Next, I tried &lt;code&gt;find&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I thought &lt;code&gt;find&lt;/code&gt; might handle more entries than &lt;code&gt;ls&lt;/code&gt;, but it also failed. The result also varied, but at best it output about &lt;strong&gt;12 million lines&lt;/strong&gt; before it stopped responding.&lt;/p&gt;

&lt;p&gt;At that point, I was almost ready to give up. I started thinking I might have to output part of the file list, delete files in multiple rounds, and somehow work around the problem manually.&lt;/p&gt;

&lt;p&gt;But I wanted a real solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built &lt;code&gt;lls&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Around that time, I found an article describing how someone listed a directory containing 8 million files by calling &lt;code&gt;getdents&lt;/code&gt; directly with a large buffer. That was the key idea I needed. The article showed the C approach, but not a ready-to-use implementation, so I decided to build my own tool in Go.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html" rel="noopener noreferrer"&gt;http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Go, I could call Linux-specific system calls through the &lt;code&gt;syscall&lt;/code&gt; package. That meant I could stay in Go, avoid cgo, and still work directly with the kernel interface I needed.&lt;/p&gt;

&lt;p&gt;That was how &lt;code&gt;lls&lt;/code&gt; started.&lt;/p&gt;

&lt;p&gt;The point of &lt;code&gt;lls&lt;/code&gt; was not to replace &lt;code&gt;ls&lt;/code&gt; in general. It was a narrow tool for one difficult job: keep reading directory entries from a huge directory until the end.&lt;/p&gt;

&lt;h2&gt;
  
  
  System calls and &lt;code&gt;getdents64&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;On Linux, userland programs ask the kernel to do work through system calls. Directory reading is no exception.&lt;/p&gt;

&lt;p&gt;For this problem, the important system call was &lt;code&gt;getdents64&lt;/code&gt;. The older &lt;code&gt;getdents&lt;/code&gt; exists, but &lt;code&gt;getdents64&lt;/code&gt; was added because the original interface did not handle large filesystems and large file offsets well. In Go, the function exposed as &lt;code&gt;syscall.Getdents&lt;/code&gt; uses &lt;code&gt;getdents64&lt;/code&gt;, which was exactly what I needed here.&lt;/p&gt;

&lt;p&gt;The returned data is not a high-level file list. It is raw directory-entry data packed into a byte buffer.&lt;/p&gt;

&lt;p&gt;Conceptually, the data corresponds to a structure with fields such as inode number, offset, record length, type, and a null-terminated file name.&lt;/p&gt;

&lt;p&gt;That detail matters, because if you use &lt;code&gt;getdents64&lt;/code&gt; directly, you have to parse the buffer yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  My first idea: one large buffer
&lt;/h2&gt;

&lt;p&gt;My first idea was simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;allocate a buffer based on the directory size&lt;/li&gt;
&lt;li&gt;call &lt;code&gt;getdents64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;print each file name to standard output&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The directory size here was the same value you can see with &lt;code&gt;ls -dl&lt;/code&gt;. The idea was that this value should be large enough to hold the full result. If the buffer was smaller than what was really needed, the output would be incomplete.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;lls&lt;/code&gt; also had a &lt;code&gt;-buf-size&lt;/code&gt; option so I could adjust the size manually, and a &lt;code&gt;-debug&lt;/code&gt; option to show how much of the buffer was actually used.&lt;/p&gt;

&lt;p&gt;However, on the real directory, this did not work as expected.&lt;/p&gt;

&lt;p&gt;The directory size reported by &lt;code&gt;ls -dl&lt;/code&gt; was over 2 GB, and running &lt;code&gt;lls&lt;/code&gt; with that default buffer size produced &lt;code&gt;EINVAL&lt;/code&gt;. After trying different values, I found that &lt;code&gt;2147483647&lt;/code&gt; worked but &lt;code&gt;2147483648&lt;/code&gt; did not, which suggests the buffer size had to fit in a signed 32-bit integer.&lt;/p&gt;

&lt;p&gt;Even after increasing the buffer size as much as possible, that approach still was not the real solution. The important point was not “make one call with a bigger buffer.” The real solution was to change the design.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real fix: call &lt;code&gt;getdents64&lt;/code&gt; repeatedly
&lt;/h2&gt;

&lt;p&gt;The real fix was to stop thinking in terms of one huge call.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;getdents64&lt;/code&gt; can be called repeatedly. If you keep calling it until it returns &lt;code&gt;0&lt;/code&gt;, you can continue reading the remaining directory entries.&lt;/p&gt;

&lt;p&gt;This became the key change in &lt;code&gt;lls&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of relying on a single enormous buffer, &lt;code&gt;lls&lt;/code&gt; now uses a reasonable buffer and keeps calling &lt;code&gt;syscall.Getdents&lt;/code&gt; until the directory is fully consumed. That change made it possible to list &lt;strong&gt;all 33.12 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That was the point where &lt;code&gt;lls&lt;/code&gt; became a practical tool for extremely large directories rather than a one-shot experiment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core implementation
&lt;/h2&gt;

&lt;p&gt;The implementation in &lt;code&gt;lls&lt;/code&gt; is built around &lt;code&gt;syscall.Dirent&lt;/code&gt;, which Go defines on Linux with fields like inode number, offset, record length, type, and a fixed-size name field.&lt;/p&gt;

&lt;p&gt;The core loop is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;allocate a buffer&lt;/li&gt;
&lt;li&gt;call &lt;code&gt;syscall.Getdents(int(f.Fd()), buf)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;if the return value is &lt;code&gt;0&lt;/code&gt;, stop&lt;/li&gt;
&lt;li&gt;otherwise parse the returned bytes entry by entry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most important part is how the parsing works.&lt;/p&gt;

&lt;p&gt;The returned buffer contains multiple directory entries. Each entry has a variable size, so the code cannot move by a fixed structure size. Instead, it casts the current position in the buffer to &lt;code&gt;*syscall.Dirent&lt;/code&gt;, reads &lt;code&gt;Reclen&lt;/code&gt;, and moves forward by that many bytes.&lt;/p&gt;

&lt;p&gt;That is how it walks through the buffer correctly.&lt;/p&gt;

&lt;p&gt;The code also checks &lt;code&gt;Ino&lt;/code&gt;. If the inode number is &lt;code&gt;0&lt;/code&gt;, the entry is skipped, because that means the file no longer exists.&lt;/p&gt;

&lt;p&gt;For file names, the implementation uses the &lt;code&gt;Name&lt;/code&gt; field from &lt;code&gt;syscall.Dirent&lt;/code&gt;. In Go this is &lt;code&gt;[256]int8&lt;/code&gt;, so the code first treats it as bytes and then converts the bytes before the terminating null into a string.&lt;/p&gt;

&lt;p&gt;In other words, the implementation stays intentionally close to the kernel interface:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;call &lt;code&gt;getdents64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;interpret the returned bytes as directory entries&lt;/li&gt;
&lt;li&gt;move forward using &lt;code&gt;Reclen&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;extract the file name&lt;/li&gt;
&lt;li&gt;repeat until &lt;code&gt;getdents64&lt;/code&gt; returns &lt;code&gt;0&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this worked
&lt;/h2&gt;

&lt;p&gt;One especially useful detail is that libc's &lt;code&gt;readdir&lt;/code&gt; implementation often uses a fixed internal buffer. In one of the articles I read, the example used a 2048-byte buffer internally. If your directory is huge, that means a large number of system calls just to read through it.&lt;/p&gt;

&lt;p&gt;You cannot easily change that buffer size from outside, which is why directly calling &lt;code&gt;getdents64&lt;/code&gt; yourself can make sense in an extreme case like this.&lt;/p&gt;

&lt;p&gt;That does not mean low-level code is always better. It only means this particular problem was narrow enough, and extreme enough, that the lower-level interface matched the problem better than a general-purpose tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  The result
&lt;/h2&gt;

&lt;p&gt;In the end, the comparison looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ls -U1&lt;/code&gt;: about &lt;strong&gt;6 million files&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;find&lt;/code&gt;: about &lt;strong&gt;12 million lines&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lls&lt;/code&gt;: &lt;strong&gt;33.12 million files&lt;/strong&gt; listed successfully&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That was the result that mattered.&lt;/p&gt;

&lt;p&gt;This was not just an experiment in system programming. It solved a real production problem on NFS. I needed the full file list, and &lt;code&gt;lls&lt;/code&gt; made that possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I built &lt;code&gt;lls&lt;/code&gt; because standard tools could not finish the job on a huge NFS-mounted directory.&lt;/p&gt;

&lt;p&gt;The important ideas were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use the Linux &lt;code&gt;getdents64&lt;/code&gt; interface through Go's &lt;code&gt;syscall.Getdents&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;parse the returned directory entries directly&lt;/li&gt;
&lt;li&gt;advance through the buffer using &lt;code&gt;Reclen&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;keep calling the system call until it returns &lt;code&gt;0&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That approach finally made it possible to list &lt;strong&gt;33.12 million files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you want to look at the code, here is the repository again:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/lls" rel="noopener noreferrer"&gt;https://github.com/catatsuy/lls&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Useful references:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://man7.org/linux/man-pages/man2/getdents.2.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man2/getdents.2.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pkg.go.dev/syscall" rel="noopener noreferrer"&gt;https://pkg.go.dev/syscall&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>linux</category>
      <category>nfs</category>
      <category>cli</category>
    </item>
    <item>
      <title>Designing a File Tampering Detection Tool for a Legacy PHP Application</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Fri, 20 Mar 2026 05:06:17 +0000</pubDate>
      <link>https://dev.to/catatsuy/designing-a-file-tampering-detection-tool-for-a-legacy-php-application-21p6</link>
      <guid>https://dev.to/catatsuy/designing-a-file-tampering-detection-tool-for-a-legacy-php-application-21p6</guid>
      <description>&lt;p&gt;I work on a legacy PHP application that runs on AWS EC2. The application is deployed from a deploy server with &lt;code&gt;rsync&lt;/code&gt;. In this environment, I needed a practical way to detect file tampering on application servers.&lt;/p&gt;

&lt;p&gt;Existing tools did not fit this deployment model well, so I built a small Go tool called &lt;code&gt;kekkai&lt;/code&gt; and open-sourced it. In this post, I want to explain not only the design choices, but also the implementation and operational details that mattered in practice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;https://github.com/catatsuy/kekkai&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The environment
&lt;/h2&gt;

&lt;p&gt;This application has these characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it runs on AWS EC2&lt;/li&gt;
&lt;li&gt;it is a legacy PHP application&lt;/li&gt;
&lt;li&gt;dependencies are installed on a deploy server&lt;/li&gt;
&lt;li&gt;the application is deployed with &lt;code&gt;rsync&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a common setup for older PHP applications. I wanted a solution that fits this environment instead of assuming container images or immutable deployments.&lt;/p&gt;

&lt;h2&gt;
  
  
  The basic model
&lt;/h2&gt;

&lt;p&gt;The model is simple.&lt;/p&gt;

&lt;p&gt;First, the deploy server calculates hashes for files and creates a manifest. The manifest can be stored either in S3 or in a local file. Then the application server verifies local files against that manifest.&lt;/p&gt;

&lt;p&gt;The tool has two main commands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;generate&lt;/code&gt;: create a manifest from the current files&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;verify&lt;/code&gt;: compare current files with the manifest&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted the data flow to stay easy to understand. The deploy server creates the trusted data, and the application server only reads it and verifies local files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manifest structure
&lt;/h2&gt;

&lt;p&gt;The manifest contains these values for each file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;path&lt;/li&gt;
&lt;li&gt;SHA-256 hash&lt;/li&gt;
&lt;li&gt;file size&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also contains the exclude rules used at generation time.&lt;/p&gt;

&lt;p&gt;I wanted the manifest itself to describe what should be checked. I did not want verification behavior to depend on extra local configuration on the application server.&lt;/p&gt;

&lt;p&gt;This is also why &lt;code&gt;verify&lt;/code&gt; does not accept additional exclude rules. If the application server is compromised, I do not want it to be able to silently skip more files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I only hash file contents
&lt;/h2&gt;

&lt;p&gt;I only check file contents. I do not check timestamps or other metadata.&lt;/p&gt;

&lt;p&gt;The reason is simple: metadata changes too easily. Normal operational work can change timestamps even when the file contents are still the same. If a tool alerts on that, it creates noisy alerts, and eventually people stop trusting the alerts.&lt;/p&gt;

&lt;p&gt;This is also why I did not want an approach that archives the whole source tree into a tar file and hashes that tar file. A tar file can change for reasons that do not mean the application code was tampered with. I wanted the tool to fail only when the content of an actual file changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  How &lt;code&gt;generate&lt;/code&gt; works
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;generate&lt;/code&gt; command walks the target directory and creates manifest entries one by one.&lt;/p&gt;

&lt;p&gt;For regular files, it reads the file, calculates a SHA-256 hash, and stores the path, hash, and file size in the manifest.&lt;/p&gt;

&lt;p&gt;Exclude rules are applied at this stage. I made this choice on purpose. The deploy server is the side that creates trusted data, so exclude handling should be fixed there.&lt;/p&gt;

&lt;p&gt;After all entries are collected, the manifest is written either to a local file or to S3.&lt;/p&gt;

&lt;p&gt;I also made &lt;code&gt;generate&lt;/code&gt; flexible enough to work even when some excluded directories do not exist on the deploy server. That helps in real deployment environments where some paths only exist on application servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  How &lt;code&gt;verify&lt;/code&gt; works
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;verify&lt;/code&gt; command loads the manifest first. Then it walks the target directory and compares each current file with the manifest entry.&lt;/p&gt;

&lt;p&gt;It checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;whether the path exists in the manifest&lt;/li&gt;
&lt;li&gt;whether the file type matches&lt;/li&gt;
&lt;li&gt;whether the file size matches&lt;/li&gt;
&lt;li&gt;whether the calculated hash matches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also detects files that exist in the manifest but are missing on disk.&lt;/p&gt;

&lt;p&gt;When verification fails, the command exits with a non-zero status. It also writes error details to standard error, including the path that failed. This makes the tool easy to integrate with monitoring systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  How symlinks are handled in Go
&lt;/h2&gt;

&lt;p&gt;Symlinks needed special handling.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; does not follow symlinks. Instead, it verifies the symlink itself.&lt;/p&gt;

&lt;p&gt;The implementation is roughly like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;os.Lstat&lt;/code&gt; to check whether the entry is a symlink&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;os.Readlink&lt;/code&gt; to read the target path string&lt;/li&gt;
&lt;li&gt;add a &lt;code&gt;symlink:&lt;/code&gt; prefix to that string&lt;/li&gt;
&lt;li&gt;calculate the SHA-256 hash of that prefixed string&lt;/li&gt;
&lt;li&gt;during verification, check both the file type and the stored hash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This lets the tool detect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a changed symlink target path&lt;/li&gt;
&lt;li&gt;a type change between a regular file and a symlink&lt;/li&gt;
&lt;li&gt;added or removed symlinks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This design is intentional. If the symlink target path stays the same but the target file contents change, that is outside the scope of this check. I accepted that trade-off because I wanted predictable behavior and simple logic.&lt;/p&gt;

&lt;p&gt;I also do not cache symlink verification results. The hashed input is only a short string, so the cost is small.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I support both S3 and local files
&lt;/h2&gt;

&lt;p&gt;The manifest itself must be protected.&lt;/p&gt;

&lt;p&gt;If an attacker can modify both the application files and the manifest, verification becomes meaningless. That is why the main production model stores the manifest in S3 instead of next to the application files.&lt;/p&gt;

&lt;p&gt;At the same time, I also wanted local file output. Without that, even simple tests would require AWS credentials. So the tool supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 for production&lt;/li&gt;
&lt;li&gt;local files for testing and development&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also recommend being careful with local manifest output. If you deploy that manifest into the same target directory, &lt;code&gt;verify&lt;/code&gt; can fail because the manifest itself appears as an unexpected file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Protecting the manifest with S3 and IAM
&lt;/h2&gt;

&lt;p&gt;Using S3 also makes it easier to separate permissions.&lt;/p&gt;

&lt;p&gt;The application side only needs &lt;code&gt;GetObject&lt;/code&gt;.&lt;br&gt;
The deploy side only needs &lt;code&gt;PutObject&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That separation is useful because the deploy server and the application servers have different roles. If needed, S3 features such as versioning can also help protect the manifest further.&lt;/p&gt;

&lt;p&gt;I also recommend keeping &lt;code&gt;base-path&lt;/code&gt; fixed in production and managing it explicitly. Since &lt;code&gt;base-path&lt;/code&gt; and &lt;code&gt;app-name&lt;/code&gt; become part of the S3 path, this helps avoid accidentally overwriting production data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why I chose SHA-256
&lt;/h2&gt;

&lt;p&gt;For this kind of verification, I needed a hash function with the right security properties. I did not want a fast but cryptographically weak hash that would make it easier to craft a different file whose hash matches the stored value.&lt;/p&gt;

&lt;p&gt;In security terms, the important property here is second-preimage resistance.&lt;/p&gt;

&lt;p&gt;I considered SHA-256 and SHA-512. I chose SHA-256 because it is standard, well known, and easy to justify. I also did not see a meaningful advantage from SHA-512 for source-code-sized files in this use case.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I reduced production load
&lt;/h2&gt;

&lt;p&gt;Performance was the hardest practical problem.&lt;/p&gt;

&lt;p&gt;Hashing a large codebase uses CPU, memory, and I/O. If the verification tool itself harms production stability, that defeats the purpose. Because of that, I added several controls.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;GOMAXPROCS&lt;/code&gt; and workers
&lt;/h3&gt;

&lt;p&gt;First, I rely on normal Go controls such as &lt;code&gt;GOMAXPROCS&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; also has a &lt;code&gt;--workers&lt;/code&gt; option to control how many files are hashed in parallel. By default, it uses the same value as &lt;code&gt;GOMAXPROCS&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This helps, but it is not enough. Even with one worker, the process can still keep one CPU core busy when many files are processed.&lt;/p&gt;
&lt;h3&gt;
  
  
  I/O rate limiting with &lt;code&gt;golang.org/x/time/rate&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To make the tool safer in production, I added I/O rate limiting with &lt;code&gt;golang.org/x/time/rate&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of only limiting concurrency, I also limit how fast the tool reads file data. This makes it possible to slow verification down on purpose and reduce the production impact.&lt;/p&gt;

&lt;p&gt;The core idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create a limiter&lt;/li&gt;
&lt;li&gt;read file data in chunks&lt;/li&gt;
&lt;li&gt;wait on the limiter before each chunk&lt;/li&gt;
&lt;li&gt;write the chunk into the hasher&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach gave me the most flexible control. In practice, this mattered more than worker limits alone.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kekkai&lt;/code&gt; exposes this through the &lt;code&gt;--rate-limit&lt;/code&gt; option. Of course, if the value is too small, verification will become very slow, so this needs to be tuned carefully.&lt;/p&gt;
&lt;h3&gt;
  
  
  Cache
&lt;/h3&gt;

&lt;p&gt;I also added a local cache to make repeated verification faster.&lt;/p&gt;

&lt;p&gt;The cache stores file metadata and can skip hash calculation when &lt;code&gt;mtime&lt;/code&gt;, &lt;code&gt;ctime&lt;/code&gt;, and file size have not changed. Here, &lt;code&gt;ctime&lt;/code&gt; means file change time, not creation time.&lt;/p&gt;

&lt;p&gt;I know that metadata-based skipping is not a perfect security check by itself. That is why the cache is only an optimization layer.&lt;/p&gt;

&lt;p&gt;There is also some risk that the cache file itself could be tampered with. Because of that, the default behavior is to recalculate hashes with a 10% probability even when the cache says the file is unchanged. This probability can be changed with &lt;code&gt;--verify-probability&lt;/code&gt;. If it is set to &lt;code&gt;0&lt;/code&gt;, hash recalculation is skipped as long as the cache metadata still matches.&lt;/p&gt;

&lt;p&gt;The cache file also embeds a hash of its own contents, so tampering with the cache can be detected; when it is, the cache is disabled. In addition, files under &lt;code&gt;/tmp&lt;/code&gt; may eventually be deleted, so the cache gets rebuilt naturally over time.&lt;/p&gt;
&lt;h2&gt;
  
  
  Go implementation notes
&lt;/h2&gt;

&lt;p&gt;I also made a few implementation choices to reduce overhead in Go itself.&lt;/p&gt;

&lt;p&gt;When hashing many files, I do not want to allocate a new hasher for every file if I can avoid it. So I reuse &lt;code&gt;hash.Hash&lt;/code&gt; with &lt;code&gt;Reset()&lt;/code&gt; instead of calling &lt;code&gt;sha256.New()&lt;/code&gt; every time.&lt;/p&gt;

&lt;p&gt;The same idea applies to buffers. I reuse the &lt;code&gt;io.CopyBuffer&lt;/code&gt; buffer for each worker, instead of allocating a new buffer per file.&lt;/p&gt;

&lt;p&gt;This matters because &lt;code&gt;sha256.New()&lt;/code&gt; is not free, and repeated allocations across many files and workers increase GC cost and cache misses.&lt;/p&gt;

&lt;p&gt;One important detail is that &lt;code&gt;hash.Hash&lt;/code&gt; is not goroutine-safe. So if hashing is done in parallel, each worker needs its own hasher and buffer.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I run it on the deploy server
&lt;/h2&gt;

&lt;p&gt;In production, the deploy process is implemented with shell scripts. The deploy server installs dependencies first and then runs &lt;code&gt;rsync&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Because of that, I run &lt;code&gt;kekkai generate&lt;/code&gt; at the end of the deploy script.&lt;/p&gt;

&lt;p&gt;A typical command looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kekkai generate &lt;span class="nt"&gt;--target&lt;/span&gt; /var/www/app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-bucket&lt;/span&gt; &lt;span class="s1"&gt;'kekkai-test'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--base-path&lt;/span&gt; production &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--app-name&lt;/span&gt; kekkai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--exclude&lt;/span&gt; &lt;span class="s2"&gt;".git/**"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This stores the manifest as &lt;code&gt;production/kekkai/manifest.json&lt;/code&gt; in the specified S3 bucket.&lt;/p&gt;

&lt;p&gt;At this stage, it is important to list every directory that must be ignored, such as log directories or NFS mount points. Since exclude rules are stored in the manifest, mistakes here will affect later verification.&lt;/p&gt;
&lt;h2&gt;
  
  
  How I run it on application servers
&lt;/h2&gt;

&lt;p&gt;The minimum command on an application server is simple:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kekkai verify &lt;span class="nt"&gt;--target&lt;/span&gt; /var/www/app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-bucket&lt;/span&gt; &lt;span class="s1"&gt;'kekkai-test'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--base-path&lt;/span&gt; production &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--app-name&lt;/span&gt; kekkai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;In real production, I also care about these points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the application server must not be able to write to S3&lt;/li&gt;
&lt;li&gt;I want alerts on failure&lt;/li&gt;
&lt;li&gt;I want to limit load on EC2&lt;/li&gt;
&lt;li&gt;I want to use the cache to reduce execution time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the application side only gets &lt;code&gt;s3:GetObject&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Monitoring and alerts
&lt;/h2&gt;

&lt;p&gt;I run verification as a periodic check from our monitoring system.&lt;/p&gt;

&lt;p&gt;If I alert on a single failure, I may get alerts during deployment. That would create false positives and reduce trust in alerts. So I only alert after repeated failures.&lt;/p&gt;

&lt;p&gt;Timeout is also important. Full verification can take several minutes, so the monitoring side needs a longer timeout than a normal health check.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why I use &lt;code&gt;systemd-run&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;I also use &lt;code&gt;systemd-run&lt;/code&gt; when running &lt;code&gt;kekkai verify&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The reason is simple: I do not want this check to run with strong privileges or compete too aggressively with the main application.&lt;/p&gt;

&lt;p&gt;A real example looks like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd-run &lt;span class="nt"&gt;--quiet&lt;/span&gt; &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--pipe&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;oneshot &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;CPUQuota&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;25% &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;CPUWeight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;50 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;PrivateTmp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;User&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nobody &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/bash &lt;span class="nt"&gt;-lc&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s1"&gt;'nice -n 10 ionice -c2 -n7 /usr/local/bin/kekkai verify --s3-bucket kekkai-test --app-name app --base-path production --target /var/www/app --use-cache --rate-limit 10485760 2&amp;gt;&amp;amp;1'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;There are several reasons for this setup.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;User=nobody&lt;/code&gt; makes the command run as a low-privilege user&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nice&lt;/code&gt; and &lt;code&gt;ionice&lt;/code&gt; reduce CPU and I/O priority&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CPUQuota&lt;/code&gt; and &lt;code&gt;CPUWeight&lt;/code&gt; reduce CPU usage further through cgroup control&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PrivateTmp=no&lt;/code&gt; is necessary if I want to use the cache in &lt;code&gt;/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is easy to miss. If &lt;code&gt;PrivateTmp=no&lt;/code&gt; is not set, the process gets a different private &lt;code&gt;/tmp&lt;/code&gt;, and the cache file cannot be reused.&lt;/p&gt;

&lt;p&gt;The Go version matters here as well. Before Go 1.25, &lt;code&gt;GOMAXPROCS&lt;/code&gt; could still reflect the host machine's CPU count even when cgroup limits were applied. Go 1.25 made &lt;code&gt;GOMAXPROCS&lt;/code&gt; cgroup-aware by default, so &lt;code&gt;kekkai&lt;/code&gt; targets Go 1.25 or later.&lt;/p&gt;
&lt;h2&gt;
  
  
  Alert contents
&lt;/h2&gt;

&lt;p&gt;When verification fails, &lt;code&gt;kekkai&lt;/code&gt; writes the error to standard error, including the affected path.&lt;/p&gt;

&lt;p&gt;Some monitoring systems include standard output in notifications, so I redirect standard error to standard output when needed. That way, a notification to Slack or another channel can include the actual file path that failed verification.&lt;/p&gt;

&lt;p&gt;This makes investigation much faster.&lt;/p&gt;
&lt;h2&gt;
  
  
  Production results
&lt;/h2&gt;

&lt;p&gt;In production, the application has about 17,000 files including dependencies.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;manifest generation takes a few seconds&lt;/li&gt;
&lt;li&gt;verification takes about 4 to 5 minutes with &lt;code&gt;--rate-limit 10485760&lt;/code&gt; (10 MB/s)&lt;/li&gt;
&lt;li&gt;with &lt;code&gt;--use-cache&lt;/code&gt;, a cache hit can reduce that to about 25 seconds&lt;/li&gt;
&lt;li&gt;verification runs once per hour on application servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This difference is intentional. I want &lt;code&gt;generate&lt;/code&gt; to finish quickly as part of deployment, but I want &lt;code&gt;verify&lt;/code&gt; to run slowly and safely on production servers. Even if verification takes about five minutes, running it once per hour is enough for this use case.&lt;/p&gt;
&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;I did not want to build a large security platform. I wanted a small tool that fits a specific real-world environment: a legacy PHP application on EC2, deployed with &lt;code&gt;rsync&lt;/code&gt;, with a deploy server and application servers playing different roles.&lt;/p&gt;

&lt;p&gt;That focus shaped both the design and the implementation: content-only hashing, strict exclude rules, explicit symlink handling, S3 and IAM for manifest protection, local cache with probabilistic re-verification, rate limiting, and safe execution with &lt;code&gt;systemd-run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you work with a similar deployment model, this approach may be useful for you too. I have open-sourced the tool on GitHub as &lt;code&gt;catatsuy/kekkai&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/catatsuy" rel="noopener noreferrer"&gt;
        catatsuy
      &lt;/a&gt; / &lt;a href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;
        kekkai
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A lightweight Go tool for detecting file tampering by comparing content-based hashes stored securely in S3.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Kekkai&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;A simple and fast Go tool for file integrity monitoring. Detects unauthorized file modifications caused by OS command injection and other attacks by recording file hashes during deployment and verifying them periodically.&lt;/p&gt;

&lt;p&gt;The name "Kekkai" comes from the Japanese word 結界 (kekkai), meaning "barrier" - a protective boundary that keeps unwanted things out, perfectly representing this tool's purpose of protecting your files from tampering.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Takumi, the AI offensive security engineer&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://flatt.tech/oss/gmo/trampoline" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3a15e3e01ccd3cd6a238fdf4a7d1b23953ed233640439974047f80f7aaafe0fe/68747470733a2f2f666c6174742e746563682f6173736574732f696d616765732f6261646765732f676d6f2d6f73732e737667" height="24px"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Design Philosophy&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;Kekkai was designed to solve specific challenges in production server environments:&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Why Kekkai?&lt;/h3&gt;

&lt;/div&gt;

&lt;p&gt;Traditional tools like &lt;code&gt;tar&lt;/code&gt; or file sync utilities (e.g., &lt;code&gt;rsync&lt;/code&gt;) include metadata like timestamps in their comparisons, causing false positives when only timestamps change. In environments with heavy NFS usage or dynamic log directories, existing tools become difficult to configure and maintain.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Core Principles&lt;/h3&gt;

&lt;/div&gt;


&lt;ol&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Content-Only Hashing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hashes only file contents, ignoring timestamps and metadata&lt;/li&gt;
&lt;li&gt;Detects actual content changes, not superficial modifications&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Immutable Exclude&lt;/strong&gt;…&lt;/p&gt;


&lt;/li&gt;

&lt;/ol&gt;
&lt;/div&gt;
&lt;br&gt;
  &lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/catatsuy/kekkai" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;





</description>
      <category>go</category>
      <category>php</category>
      <category>security</category>
      <category>aws</category>
    </item>
    <item>
      <title>Why I, as Someone Who Likes MySQL, Now Want to Recommend PostgreSQL</title>
      <dc:creator>catatsuy</dc:creator>
      <pubDate>Sun, 15 Mar 2026 05:48:21 +0000</pubDate>
      <link>https://dev.to/catatsuy/why-i-as-someone-who-likes-mysql-now-want-to-recommend-postgresql-2a8i</link>
      <guid>https://dev.to/catatsuy/why-i-as-someone-who-likes-mysql-now-want-to-recommend-postgresql-2a8i</guid>
      <description>&lt;p&gt;I like MySQL. I have used it for a long time, and I have also operated it in on-premises environments.&lt;/p&gt;

&lt;p&gt;However, since joining my current company, I have had more opportunities to use PostgreSQL. At first, I honestly felt a lot of resistance to it. I had used MySQL for so long, so part of it was just habit, and I think I was also more wary of PostgreSQL than I needed to be.&lt;/p&gt;

&lt;p&gt;But as I actually used it, I gradually started to see what was good about PostgreSQL. These days, if someone asks me which one I would choose for a new project, I have come to feel that I would want to choose PostgreSQL.&lt;/p&gt;

&lt;p&gt;Because I have used MySQL for a long time, I also know the rough edges that older MySQL had. At the same time, I think it is inaccurate to talk about MySQL today based only on old impressions. If you configure &lt;code&gt;sql_mode&lt;/code&gt; properly, you can avoid many dangerous behaviors, and MySQL 8 added a large number of features.&lt;/p&gt;

&lt;p&gt;Also, this time I want to compare current MySQL and PostgreSQL on the assumption that they will run in the cloud, rather than based on impressions from the on-premises era. Some of the things that used to be described as disadvantages of PostgreSQL are no longer very important issues now.&lt;/p&gt;

&lt;p&gt;This is not a story about “MySQL is bad.” It is also not a story like “the philosophy of PostgreSQL is beautiful.”&lt;/p&gt;

&lt;p&gt;If I write only the conclusion, it is these two points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Things that used to be considered disadvantages of PostgreSQL have become much less important. The feature gap has narrowed a lot, and under the assumption of managed services, there are more things you do not need to worry about.&lt;/li&gt;
&lt;li&gt;On the other hand, from the perspective of application implementation, there are still points where PostgreSQL is clearly better.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article, I will organize the discussion from that perspective.&lt;/p&gt;

&lt;h2&gt;
  
  
  The things that used to be considered disadvantages of PostgreSQL have become much less significant
&lt;/h2&gt;

&lt;p&gt;In older comparisons, PostgreSQL’s weaknesses were often said to be heavier operations and awkwardness around DDL.&lt;/p&gt;

&lt;p&gt;But I think bringing those points up as-is today is a bit outdated.&lt;/p&gt;

&lt;p&gt;MySQL has become very strong at online DDL, but at least for everyday tasks like adding columns, I no longer see a clear difference between MySQL and PostgreSQL. Partitioning is also no longer as big an issue as it once felt.&lt;/p&gt;

&lt;p&gt;Also, operational topics specific to PostgreSQL, such as &lt;code&gt;VACUUM&lt;/code&gt;, come up much less often when you assume managed services, because users have far fewer situations where they need to handle them directly. I do not think it is very fair to bring comparisons from the old on-premises era, where you had to manage everything yourself, directly into the current cloud era.&lt;/p&gt;

&lt;p&gt;The differences around replication have also become less visible recently, because managed services have become the mainstream, and there are more parts that users do not directly touch. I feel there are fewer situations than before where I strongly notice an advantage on the MySQL side.&lt;/p&gt;

&lt;p&gt;In other words, some of the things that used to be valid reasons not to recommend PostgreSQL have now become much weaker.&lt;/p&gt;

&lt;h2&gt;
  
  
  Even so, PostgreSQL is stronger for application implementation
&lt;/h2&gt;

&lt;p&gt;This is the main point.&lt;/p&gt;

&lt;p&gt;MySQL 8 closed a lot of the gap. Even so, when I look at things from the standpoint of someone actually writing applications, there are still reasons why PostgreSQL is easier to recommend.&lt;/p&gt;

&lt;h2&gt;
  
  
  First, the things MySQL 8 added and narrowed the gap on
&lt;/h2&gt;

&lt;p&gt;I want to make this clear first. The following are things that used to be described as PostgreSQL advantages, but are no longer decisive because MySQL 8 added them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CHECK&lt;/code&gt; constraints&lt;/li&gt;
&lt;li&gt;Window functions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SKIP LOCKED&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point, it is not fair to talk about these as strengths that only PostgreSQL has.&lt;/p&gt;

&lt;p&gt;However, as I will explain later, “window functions themselves were added in MySQL 8” and “being able to naturally bring window functions into update processing” are different things. I still think PostgreSQL is much easier to work with for the latter.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; is not a replacement for &lt;code&gt;INSERT IGNORE&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This is a feature MySQL did not originally have, and it is one of the fairly big reasons why I recommend PostgreSQL.&lt;/p&gt;

&lt;p&gt;MySQL has &lt;code&gt;INSERT IGNORE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;However, this is hard to treat as a replacement for &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;PostgreSQL’s &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; is, basically, a feature that explicitly says: “Do not insert only when a unique constraint conflict occurs.” What you want to do becomes SQL exactly as it is.&lt;/p&gt;

&lt;p&gt;By contrast, MySQL’s &lt;code&gt;INSERT IGNORE&lt;/code&gt; is not a dedicated feature only for ignoring duplicates. It is a feature that turns errors into warnings and continues processing, so it is too broad for the use case of “I only want to ignore duplicates.”&lt;/p&gt;

&lt;p&gt;This difference may look small, but in practice it is quite large.&lt;/p&gt;

&lt;p&gt;It makes the behavior easier to read during review, and it makes it less likely that unintended invalid input will be silently accepted.&lt;/p&gt;
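&lt;p&gt;For example, assuming a unique constraint on &lt;code&gt;email&lt;/code&gt;, inserting a user only when the address is not already registered reads exactly like the intent:&lt;/p&gt;

```sql
INSERT INTO users(name, email)
VALUES('catatsuy', 'catatsuy@example.com')
ON CONFLICT (email) DO NOTHING;
```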

&lt;h2&gt;
  
  
  &lt;code&gt;RETURNING&lt;/code&gt; is very powerful
&lt;/h2&gt;

&lt;p&gt;This is also a feature MySQL did not originally have, and it is another fairly big reason why I recommend PostgreSQL.&lt;/p&gt;

&lt;p&gt;In PostgreSQL, you can use &lt;code&gt;INSERT/UPDATE/DELETE ... RETURNING&lt;/code&gt;. Because you can return the changed result right there, you can naturally complete “get the result of the change” in a single statement.&lt;/p&gt;

&lt;p&gt;For example, you can do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'catatsuy'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'catatsuy@example.com'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;RETURNING&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you have this, the following become very natural:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;receive the inserted ID directly&lt;/li&gt;
&lt;li&gt;receive default values or stored values directly&lt;/li&gt;
&lt;li&gt;return the updated row as-is and use it as the API response&lt;/li&gt;
&lt;li&gt;receive the result of an upsert directly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To be honest, the &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt;-based style that is common in MySQL is quite limiting.&lt;/p&gt;

&lt;p&gt;The information you can get back is narrow, and basically centered on getting a numeric &lt;code&gt;AUTO_INCREMENT&lt;/code&gt; ID.&lt;/p&gt;

&lt;p&gt;You cannot naturally receive arbitrary columns from the inserted result, and you also cannot return the completed row as-is including default values and generated columns.&lt;/p&gt;

&lt;p&gt;For example, what you may want is something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you use UUIDs as primary keys, so you want to return them directly&lt;/li&gt;
&lt;li&gt;you want to return the completed row including generated columns and default values&lt;/li&gt;
&lt;li&gt;you want to pass the entire inserted row directly to the next step&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You cannot do this with &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In addition, even when multiple rows are inserted in a single statement, &lt;code&gt;LAST_INSERT_ID()&lt;/code&gt; does not return the inserted result as-is. What you can get is only the first &lt;code&gt;AUTO_INCREMENT&lt;/code&gt; value.&lt;/p&gt;

&lt;p&gt;So if you want to handle the result of a multi-row &lt;code&gt;INSERT&lt;/code&gt; directly in the application, it is inconvenient. With PostgreSQL’s &lt;code&gt;RETURNING&lt;/code&gt;, you can return the inserted rows directly, and this difference is very large.&lt;/p&gt;
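&lt;p&gt;For example, a multi-row &lt;code&gt;INSERT&lt;/code&gt; in PostgreSQL can return every inserted row at once (the rows here are made up for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;INSERT INTO users(name, email)
VALUES
  ('alice', 'alice@example.com'),
  ('bob', 'bob@example.com')
RETURNING id, name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;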

&lt;p&gt;“Being able to return the changed result directly” is not just a convenience feature. It affects the very way you structure application implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;VALUES&lt;/code&gt; helps in real implementation
&lt;/h2&gt;

&lt;p&gt;This is not a feature that MySQL entirely lacks. Rather, I think PostgreSQL lets you use it much more naturally.&lt;/p&gt;

&lt;p&gt;It is easy to create a small constant table on the spot, join with it, and connect it directly to update processing. When you have this, you do not need to push half-baked temporary-table-like processing out to the application side.&lt;/p&gt;

&lt;p&gt;For example, when you want to join a small number of values received from the application and update based on them, in PostgreSQL you can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pro'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'free'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'team'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of processing comes up very often in real implementation.&lt;/p&gt;

&lt;p&gt;For example, you may want to pass a small set of master-like values on the spot and update with them, or send a group of values received from an API directly into SQL.&lt;/p&gt;

&lt;p&gt;As another example, it is also natural to treat a group of values received from the application as a join target:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is not that MySQL cannot do similar things at all. Since MySQL 8.0.19, it has had the &lt;code&gt;VALUES&lt;/code&gt; statement, and it can be treated as a table value constructor.&lt;/p&gt;

&lt;p&gt;However, in MySQL you need to write it with &lt;code&gt;ROW(...)&lt;/code&gt;, and if you do not alias the columns, you get default names like &lt;code&gt;column_0&lt;/code&gt; and &lt;code&gt;column_1&lt;/code&gt;. It feels a bit different from PostgreSQL, where you create a small constant table on the spot, give it natural column names, and flow it directly into a &lt;code&gt;JOIN&lt;/code&gt; or &lt;code&gt;UPDATE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For example, in MySQL the same idea would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column_0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is not a flashy feature, but this kind of thing affects the ease of everyday implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Being able to bring window functions into update processing is powerful
&lt;/h2&gt;

&lt;p&gt;This part is important.&lt;/p&gt;

&lt;p&gt;Window functions themselves were added in MySQL 8.&lt;/p&gt;

&lt;p&gt;So it is wrong to talk about window functions themselves as a PostgreSQL-only strength.&lt;/p&gt;

&lt;p&gt;However, in PostgreSQL, by combining them with &lt;code&gt;WITH&lt;/code&gt; and &lt;code&gt;UPDATE ... FROM&lt;/code&gt;, it is easy to bring the result of window functions naturally into update processing. I think there is still a difference here.&lt;/p&gt;

&lt;p&gt;For example, if you want to set a flag only on the latest row for each user, you can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rn&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;is_latest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, window functions themselves do exist in MySQL 8. However, PostgreSQL is much more natural when it comes to connecting them to this kind of update logic.&lt;/p&gt;

&lt;p&gt;This is not just a convenience feature for analytics. It works as a weapon for application implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Partial indexes are a clear feature difference
&lt;/h2&gt;

&lt;p&gt;This is something I can clearly describe as a feature difference.&lt;/p&gt;

&lt;p&gt;MySQL has never had them, and it still does not have them today.&lt;/p&gt;

&lt;p&gt;PostgreSQL has partial indexes, and you can create an index only on some rows, such as with &lt;code&gt;WHERE deleted_at IS NULL&lt;/code&gt;. This fits very well with soft delete patterns, and it is also useful for managing records by status.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_users_active_email&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, if you have a table using soft deletes and you only want to speed up “search by email address among active users,” you can write that directly.&lt;/p&gt;

&lt;p&gt;In MySQL, you can do something similar using generated columns or functional indexes. However, that is not a replacement for partial indexes.&lt;/p&gt;

&lt;p&gt;PostgreSQL partial indexes put only the rows that satisfy a condition like &lt;code&gt;WHERE deleted_at IS NULL&lt;/code&gt; into the index. In other words, unnecessary rows are excluded from the index from the beginning.&lt;/p&gt;

&lt;p&gt;By contrast, MySQL generated columns and functional indexes basically evaluate an expression for all rows and then index that result. If you design the expression well, you can use them for similar purposes, but they do not directly express “an index that physically stays small by containing only some rows.”&lt;/p&gt;
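&lt;p&gt;To make the contrast concrete, here is one possible MySQL-side workaround (the column and index names are hypothetical): a stored generated column that is &lt;code&gt;NULL&lt;/code&gt; for soft-deleted rows, with an ordinary index on it. Note that every row still gets an index entry; only the indexed value becomes &lt;code&gt;NULL&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;ALTER TABLE users
  ADD COLUMN active_email varchar(255)
    GENERATED ALWAYS AS (IF(deleted_at IS NULL, email, NULL)) STORED,
  ADD INDEX idx_users_active_email (active_email);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;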

&lt;p&gt;So in terms of size, update cost, and clarity of intent, PostgreSQL’s partial indexes are more straightforward. The MySQL side can be used as a workaround, but it is hard to say it has the same feature.&lt;/p&gt;

&lt;p&gt;This is not just a difference in how the SQL feels to write. It is a real feature difference, and a reason to recommend PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Foreign keys are much better in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;This is my personal impression, but many people in the MySQL world seem to consider foreign keys unnecessary, while many people in the PostgreSQL world consider them essential.&lt;/p&gt;

&lt;p&gt;I think this comes not so much from a difference in philosophy as from differences in how easy foreign keys are to test with, how easy they are to operate, and how hard they make it for bugs to slip in.&lt;/p&gt;

&lt;h3&gt;
  
  
  PostgreSQL supports deferred constraints
&lt;/h3&gt;

&lt;p&gt;In PostgreSQL, foreign keys can be made &lt;code&gt;DEFERRABLE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is extremely important.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;author_id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;books_author_fk&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;DEFERRABLE&lt;/span&gt; &lt;span class="k"&gt;INITIALLY&lt;/span&gt; &lt;span class="k"&gt;DEFERRED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because of this, you can delay constraint checks until the end of the transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This SQL works in PostgreSQL.&lt;/p&gt;

&lt;p&gt;The parent row does not exist partway through the transaction, but that is fine as long as consistency is satisfied at commit time.&lt;/p&gt;

&lt;p&gt;What is this useful for?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;loading data that includes circular references&lt;/li&gt;
&lt;li&gt;creating complex test data&lt;/li&gt;
&lt;li&gt;migration processes where the order is temporarily reversed&lt;/li&gt;
&lt;li&gt;bulk inserts and replacement operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These kinds of processes come up normally in real work.&lt;/p&gt;

&lt;p&gt;And whether or not you can write them naturally is a very big deal.&lt;/p&gt;

&lt;h3&gt;
  
  
  MySQL forces strict ordering
&lt;/h3&gt;

&lt;p&gt;MySQL does not have this mechanism.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NO ACTION&lt;/code&gt; is effectively &lt;code&gt;RESTRICT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In other words, you cannot structure processing in the form of “it is okay as long as it is consistent in the end.” You are always constrained by ordering rules where the parent must come first and the child must come after.&lt;/p&gt;

&lt;p&gt;This may look like a small issue, but it makes loading test data and writing migration processes much harder.&lt;/p&gt;

&lt;p&gt;For example, in test code, if you want to roughly load fixtures spanning multiple tables, in PostgreSQL you can write it so that consistency is satisfied by the end of the transaction. In MySQL you cannot do that, so you always need to manage fixture insertion order strictly.&lt;/p&gt;

&lt;p&gt;If using foreign keys makes testing more troublesome, I think it is natural that a culture emerges where people say, “Then let’s stop using foreign keys.”&lt;/p&gt;

&lt;h3&gt;
  
  
  MySQL makes it easy to escape by disabling foreign keys
&lt;/h3&gt;

&lt;p&gt;In MySQL, you can disable constraints with &lt;code&gt;foreign_key_checks=0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This looks convenient, but it is quite dangerous.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;foreign_key_checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;foreign_key_checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this, inconsistent data inserted while constraints are disabled can remain.&lt;/p&gt;

&lt;p&gt;Even if you turn the checks back on, MySQL does not go back and re-verify the data that was inserted while they were off.&lt;/p&gt;

&lt;p&gt;With this behavior, it becomes easy for accidents to happen where constraints are turned off for testing or migration convenience, and inconsistent data is brought in as-is.&lt;/p&gt;

&lt;p&gt;PostgreSQL has a tool that says, “Keep the constraints, but delay the check timing.”&lt;/p&gt;

&lt;p&gt;MySQL tends to go in the direction of “turn off the constraints themselves.”&lt;/p&gt;

&lt;p&gt;This difference is quite large.&lt;/p&gt;

&lt;h3&gt;
  
  
  You can see the difference even from a foreign key example alone
&lt;/h3&gt;

&lt;p&gt;For example, consider an ordinary pair of tables with a parent-child relationship.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;orders_user_fk&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;DEFERRABLE&lt;/span&gt; &lt;span class="k"&gt;INITIALLY&lt;/span&gt; &lt;span class="k"&gt;DEFERRED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In PostgreSQL, even during tests, you can write something like inserting into &lt;code&gt;orders&lt;/code&gt; first and then inserting into &lt;code&gt;users&lt;/code&gt; later, as long as it is within a transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This flexibility helps a lot with fixture creation, data migration, and simplifying test code.&lt;/p&gt;

&lt;p&gt;MySQL does not have this.&lt;/p&gt;

&lt;p&gt;So in MySQL it is easier for people to drift toward thinking, “Foreign keys are in the way, so turn them off,” or “Guarantee it in the application,” while in PostgreSQL it is easier to drift toward thinking, “Let’s use foreign keys properly.”&lt;/p&gt;

&lt;p&gt;The reason PostgreSQL foreign keys are better is not simply that they have more features.&lt;/p&gt;

&lt;p&gt;A big part of it is that they make it easier to write tests, migrations, and data loading while keeping the constraints intact, and as a result, it becomes easier to actually use foreign keys in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  In MySQL, you cannot do vector operations
&lt;/h2&gt;

&lt;p&gt;Recently, I think this is probably the reason most often mentioned for adopting PostgreSQL.&lt;/p&gt;

&lt;p&gt;PostgreSQL has &lt;code&gt;pgvector&lt;/code&gt;, which not only allows you to store vectors, but also lets you use distance operations and similarity search directly from applications. It also has indexes for nearest-neighbor search, so it is easy to use directly in implementation.&lt;/p&gt;
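&lt;p&gt;As a minimal sketch of what that looks like (the table and the vector dimension are made up for illustration): store an embedding, add a nearest-neighbor index, and search by distance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
  id bigint PRIMARY KEY,
  embedding vector(3)
);

-- HNSW index for approximate nearest-neighbor search (pgvector 0.5.0+)
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);

-- nearest neighbors by L2 distance
SELECT id
FROM items
ORDER BY embedding &amp;lt;-&amp;gt; '[1, 2, 3]'
LIMIT 5;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;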

&lt;p&gt;By contrast, looking at the OSS edition: MySQL added a &lt;code&gt;VECTOR&lt;/code&gt; type in MySQL 9.0, which has been released as an Innovation Release, while no LTS version includes it yet. However, the distance functions are provided only in MySQL HeatWave on OCI and MySQL AI, and are not included in MySQL Commercial or Community. In other words, the OSS edition cannot do vector operations, so it is not really usable for this. This is a clear difference from PostgreSQL + &lt;code&gt;pgvector&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Character sets and collations are still more complicated in MySQL
&lt;/h2&gt;

&lt;p&gt;This is also very important.&lt;/p&gt;

&lt;p&gt;I still think character sets and collations are more likely to cause trouble in MySQL than in PostgreSQL.&lt;/p&gt;

&lt;p&gt;However, this is not only a problem with MySQL itself. It includes frameworks, connectors, and default settings as well.&lt;/p&gt;

&lt;p&gt;There are well-known examples in Japan such as the so-called “Haha-Papa problem” and “Sushi-Beer problem.”&lt;/p&gt;

&lt;p&gt;Both of these are not so much character encoding problems as collation problems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Haha-Papa problem is when strings that look different are treated as the same because of collation rules.&lt;/li&gt;
&lt;li&gt;The Sushi-Beer problem is when comparisons involving emoji and similar characters do not behave the way you intuitively expect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What makes these problems troublesome is that “we changed it to utf8mb4, so we are done” is not enough. In reality, you need to understand both the character set and the collation.&lt;/p&gt;
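&lt;p&gt;As a hedged illustration (assuming a utf8mb4 connection), both problems can be reproduced with a one-line comparison; with these collations, both comparisons are treated as equal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- utf8mb4_unicode_ci ignores dakuten/handakuten at its comparison level
SELECT 'ハハ' = 'パパ' COLLATE utf8mb4_unicode_ci;

-- utf8mb4_general_ci gives all supplementary-plane characters the same weight
SELECT '🍣' = '🍺' COLLATE utf8mb4_general_ci;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;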

&lt;p&gt;And MySQL 8, rather than making this simpler, actually gave us even more things to think about. New collations were added, and they coexist with the older ones, so “the old style,” “the post-MySQL-8 style,” and “framework defaults” do not always line up.&lt;/p&gt;

&lt;p&gt;In other words, MySQL 8 certainly improved some things, but because old and new styles now coexist as a result of those improvements, the overall situation has in some ways become even more chaotic.&lt;/p&gt;

&lt;p&gt;I think this is not so much because MySQL itself is bad, but because it has a long history and has evolved while carrying compatibility with it.&lt;/p&gt;

&lt;p&gt;Still, from the point of view of an application developer, that complexity directly becomes an entry point for accidents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In the past, PostgreSQL had some clear weaknesses too.&lt;/p&gt;

&lt;p&gt;But now, many of them have become much less significant. The feature gap has narrowed, and under the assumption of managed services, there are more things users do not need to think about directly.&lt;/p&gt;

&lt;p&gt;On the other hand, from the point of view of application implementation, there are still reasons why PostgreSQL is easier to recommend.&lt;/p&gt;

&lt;p&gt;The especially big ones are these.&lt;/p&gt;

&lt;h3&gt;
  
  
  Things MySQL 8 added and narrowed the gap on
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CHECK&lt;/code&gt; constraints&lt;/li&gt;
&lt;li&gt;Window functions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SKIP LOCKED&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Clear reasons to recommend PostgreSQL
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;RETURNING&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;VALUES&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;being able to bring window functions into update processing&lt;/li&gt;
&lt;li&gt;partial indexes&lt;/li&gt;
&lt;li&gt;the maturity of foreign keys&lt;/li&gt;
&lt;li&gt;vector operations through &lt;code&gt;pgvector&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;being less likely to cause trouble around character sets and collations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I like MySQL.&lt;/p&gt;

&lt;p&gt;I have used it for a long time, and I still think it is a good database that is easy to get good performance from.&lt;/p&gt;

&lt;p&gt;Even so, if the question is which one I would adopt for a new project today, the one I would recommend is PostgreSQL.&lt;/p&gt;

&lt;p&gt;That is not because MySQL is bad.&lt;/p&gt;

&lt;p&gt;It is because even now, after many of its old weaknesses have been filled in, I still think PostgreSQL has an advantage when it comes to ease of application implementation.&lt;/p&gt;

</description>
      <category>mysql</category>
      <category>postgres</category>
      <category>database</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
