<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrzej Górski</title>
    <description>The latest articles on DEV Community by Andrzej Górski (@andrzej3393).</description>
    <link>https://dev.to/andrzej3393</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F32726%2Fb743e684-782f-4bd2-8d44-5e5b3349b229.png</url>
      <title>DEV Community: Andrzej Górski</title>
      <link>https://dev.to/andrzej3393</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andrzej3393"/>
    <language>en</language>
    <item>
      <title>How I (almost) rescued data from a failed ZFS pool</title>
      <dc:creator>Andrzej Górski</dc:creator>
      <pubDate>Sat, 22 Feb 2025 00:00:00 +0000</pubDate>
      <link>https://dev.to/andrzej3393/how-i-almost-rescued-data-from-a-failed-zfs-pool-2on8</link>
      <guid>https://dev.to/andrzej3393/how-i-almost-rescued-data-from-a-failed-zfs-pool-2on8</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Recently, the HDD storage pool in my homelab started acting “funny”. The funkiness showed up as the pool clogging whenever heavy traffic hit it. I lived with that for a month or so, restarting the machine occasionally, being busy with life.&lt;/p&gt;

&lt;p&gt;But one day I thought that maybe it was time to back up the data on that pool. Oh boy, was I wrong and right at the same time. Making backups was definitely the right thing to do, just not at that specific moment.&lt;/p&gt;

&lt;p&gt;After I started the backup process, the pool began to clog again. “No problem, another reboot and we’re alive”, I thought. But this time it was different: the pool wouldn’t import anymore.&lt;/p&gt;

&lt;p&gt;All disks in the pool had a clean SMART status before the malfunction, so I started looking into the &lt;code&gt;zpool import&lt;/code&gt; output. And it was not good:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# zpool import hdd
              cannot import 'hdd': I/O error
        Destroy and re-create the pool from
        a backup source.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One of the disks in the pool was dead. So dead, in fact, that it interfered with the controller’s operation. Moreover, the last reboot not only didn’t help, it (I guess) forcefully exported the pool and broke the metadata.&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;p&gt;First, I tried to import the pool with the &lt;code&gt;-f&lt;/code&gt; flag, then with &lt;code&gt;-f -F&lt;/code&gt;, but neither helped. The pool was definitely dead.&lt;/p&gt;

&lt;p&gt;Normally, I would just swap the broken disk for a new one, replace it in ZFS and let it resilver. But the metadata was corrupted, a pool with corrupted metadata can’t be imported, and without importing the pool I can’t replace the disk.&lt;/p&gt;
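
&lt;p&gt;For reference, in a healthy situation that routine replacement would look something like this (the pool name is mine, the device paths are made up):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# replace the failed device with a new one; a resilver starts automatically
zpool replace hdd /dev/disk/by-id/ata-OLD_DISK /dev/disk/by-id/ata-NEW_DISK

# watch the resilver progress
zpool status hdd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;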

&lt;p&gt;After some Googling and more-or-less proper troubleshooting, I found a way to import a pool with broken metadata. These commands did the trick:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;echo 1 &amp;gt; /sys/module/zfs/parameters/zfs_max_missing_tvds
echo 0 &amp;gt; /sys/module/zfs/parameters/spa_load_verify_metadata
echo 0 &amp;gt; /sys/module/zfs/parameters/spa_load_verify_data

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first one makes it possible to import a pool with a missing disk. The other two disable metadata and data verification during import. &lt;strong&gt;It’s not recommended to use these settings in production&lt;/strong&gt;. But in my case, I had nothing to lose.&lt;/p&gt;

&lt;p&gt;After that, wiser with the knowledge gathered while Googling, I imported the pool in read-only mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;zpool import -f -o readonly=on hdd

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And it worked! The pool imported in read-only mode. I checked the data, and it was there. Mostly. Some files were corrupted, but most were intact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data recovery
&lt;/h2&gt;

&lt;p&gt;After the pool was imported, I tried to copy the data to another disk, first with plain &lt;code&gt;cp&lt;/code&gt;, then with &lt;code&gt;rsync&lt;/code&gt;. Both hung on the broken files.&lt;/p&gt;

&lt;p&gt;Some more research later, I found a tool called &lt;code&gt;cpio&lt;/code&gt;. It’s a tool for copying files to and from archives, but with the right flags and a bit of pipe magic, I managed to use it to copy the data file by file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;find /hdd -depth -print0 | cpio -pdmv0 /target

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command walks &lt;code&gt;/hdd&lt;/code&gt; and copies every file to &lt;code&gt;/target&lt;/code&gt;: &lt;code&gt;-p&lt;/code&gt; is pass-through (copy) mode, &lt;code&gt;-d&lt;/code&gt; creates directories as needed, &lt;code&gt;-m&lt;/code&gt; preserves modification times, &lt;code&gt;-v&lt;/code&gt; is verbose, and &lt;code&gt;-0&lt;/code&gt; expects the NUL-separated file list produced by &lt;code&gt;find -print0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;After a few hours, the data was copied. I managed to rescue about 70% of the files intact. The rest were corrupted. But it was better than nothing :)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backups, people!
&lt;/h3&gt;

&lt;p&gt;Shame on me, because it was the second time I lost data to a broken disk, and the second time I didn’t have a proper, automated backup. I’ve learned my lesson and started backing up my data, and I recommend you do the same.&lt;/p&gt;

&lt;p&gt;The good news is, I had an old (manual) backup of the very same pool, so I managed to backfill some of the lost files from it.&lt;/p&gt;

&lt;h3&gt;
  
  
  RAID (or ZRAID) is not a backup
&lt;/h3&gt;

&lt;p&gt;I knew that, but I didn’t act on that knowledge. RAID is not a backup; it’s redundancy. It’s good to have, but it’s not enough. You should have a backup of your data.&lt;/p&gt;

&lt;h3&gt;
  
  
  SMART monitoring won’t help you sometimes
&lt;/h3&gt;

&lt;p&gt;All disks in the pool had a clean SMART status before the malfunction. It just happens that a disk can die without any SMART errors, so don’t rely on SMART monitoring alone.&lt;/p&gt;

&lt;p&gt;But it’s still better to monitor them than not. I’ve had a few disks that started showing SMART errors, and I was fast enough to replace them before they died.&lt;/p&gt;

</description>
      <category>zfs</category>
      <category>proxmox</category>
    </item>
    <item>
      <title>Backblaze B2 as a Terraform remote state storage</title>
      <dc:creator>Andrzej Górski</dc:creator>
      <pubDate>Fri, 21 Feb 2025 00:00:00 +0000</pubDate>
      <link>https://dev.to/andrzej3393/backblaze-b2-as-a-terraform-remote-state-storage-1ohp</link>
      <guid>https://dev.to/andrzej3393/backblaze-b2-as-a-terraform-remote-state-storage-1ohp</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Recently I started The Big Migration™ from Ansible to Terraform in my homelab. But as soon as I began writing my first Terraform manifests, I also started thinking about remote state storage.&lt;/p&gt;

&lt;p&gt;First, I thought about storing the state file on a self-hosted MinIO instance. The problem is that the MinIO instance would be managed by the very same Terraform manifests.&lt;/p&gt;

&lt;p&gt;Later on, I started researching what else I could manage with Terraform. It turns out that Backblaze B2, where I keep my backups anyway, is manageable with Terraform! That’s when it clicked: after all, B2 is S3-compatible, so I can use it as remote state storage for Terraform!&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Backblaze B2 account, with:

&lt;ol&gt;
&lt;li&gt;Bucket in which you want to store the state file&lt;/li&gt;
&lt;li&gt;Application key with permissions to manage this bucket&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;Terraform installed on your machine&lt;/li&gt;

&lt;/ol&gt;

&lt;h2&gt;
  
  
  Terraform configuration
&lt;/h2&gt;

&lt;p&gt;The basic config for storing the state file in S3 would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
  &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"s3"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"my-terraform-state-bucket"&lt;/span&gt;
    &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform.tfstate"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One might think it’s enough to change the &lt;code&gt;region&lt;/code&gt; to its B2 equivalent, add the Backblaze &lt;code&gt;endpoint&lt;/code&gt;, and we’re good to go. But it’s not that simple: B2 is &lt;strong&gt;almost&lt;/strong&gt; S3-compatible, so we have to take some extra steps.&lt;/p&gt;

&lt;p&gt;What we need to do is skip some checks and validations that Backblaze B2 doesn’t support. And there are actually quite a lot of them.&lt;/p&gt;

&lt;p&gt;A fully working example of Terraform configuration for storing the state file in Backblaze B2 would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"s3"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"my-terraform-state-bucket"&lt;/span&gt;
    &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform.tfstate"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-west-004"&lt;/span&gt;
    &lt;span class="nx"&gt;endpoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://s3.us-west-004.backblazeb2.com"&lt;/span&gt;

    &lt;span class="nx"&gt;skip_credentials_validation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;skip_region_validation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;skip_metadata_api_check&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;skip_requesting_account_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;skip_s3_checksum&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course, you’ll have to replace the &lt;code&gt;bucket&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;, and &lt;code&gt;endpoint&lt;/code&gt; values with the proper ones from your Backblaze B2 config.&lt;/p&gt;

&lt;p&gt;As you can see, there’s no &lt;code&gt;access_key&lt;/code&gt; or &lt;code&gt;secret_key&lt;/code&gt; provided. That’s because I supply them through environment variables (and you should too!). The B2 application key goes into the &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt; env var, and the key ID into &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;.&lt;/p&gt;
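
&lt;p&gt;For example (the values below are placeholders, not a real key pair):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# key ID and application key from your B2 account
export AWS_ACCESS_KEY_ID="your-b2-key-id"
export AWS_SECRET_ACCESS_KEY="your-b2-application-key"

# the backend picks them up automatically
terraform init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;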

&lt;h2&gt;
  
  
  Some security considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  State bucket
&lt;/h3&gt;

&lt;p&gt;Keep it private. It’s not a good idea to make your state file publicly available, as it might contain secrets.&lt;/p&gt;

&lt;p&gt;You might also want to enable versioning on this bucket. With versioning, you can easily revert to a previous state if something goes wrong. I’ve seen Terraform go bananas a few times, so it’s a good idea to have this feature enabled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Application key
&lt;/h3&gt;

&lt;p&gt;Don’t use your master key for this. Create a separate application key with permissions to manage only this bucket. It’s good practice to have separate keys for different tasks.&lt;/p&gt;

&lt;p&gt;Don’t put your credentials in the Terraform code (or any code, really), especially if you’re ever going to publish that code, e.g. on GitHub. One “oops” too many and your keys are leaked. Use environment variables to provide them to Terraform. I personally load them into env vars from 1Password with the &lt;code&gt;op&lt;/code&gt; CLI tool.&lt;/p&gt;
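
&lt;p&gt;A sketch of how that can look with the 1Password CLI (the vault, item, and field names here are made up; adjust the &lt;code&gt;op://&lt;/code&gt; paths to your own setup):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# pull the secrets straight from 1Password into the environment
export AWS_ACCESS_KEY_ID="$(op read "op://Homelab/b2-terraform-state/key-id")"
export AWS_SECRET_ACCESS_KEY="$(op read "op://Homelab/b2-terraform-state/application-key")"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;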

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;It seems that Backblaze B2 is S3-compatible enough to be used as remote state storage for Terraform. It’s good practice to keep your state file in remote storage anyway, versioned and not tied to your local machine. And if you already use B2 for backups, why not use it for the Terraform state file as well?&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>backblaze</category>
      <category>devops</category>
    </item>
    <item>
      <title>Compose Key</title>
      <dc:creator>Andrzej Górski</dc:creator>
      <pubDate>Tue, 13 Jun 2023 00:00:00 +0000</pubDate>
      <link>https://dev.to/andrzej3393/compose-key-2g8k</link>
      <guid>https://dev.to/andrzej3393/compose-key-2g8k</guid>
      <description>&lt;h2&gt;
  
  
  What is a Compose Key?
&lt;/h2&gt;

&lt;p&gt;The Compose Key is a special key on your keyboard that allows you to type special characters like &lt;code&gt;→&lt;/code&gt;, &lt;code&gt;°&lt;/code&gt;, &lt;code&gt;€&lt;/code&gt;, &lt;code&gt;ä&lt;/code&gt;, &lt;code&gt;™&lt;/code&gt;, &lt;code&gt;®&lt;/code&gt;, &lt;code&gt;¿&lt;/code&gt; and many more. It’s very useful if you write in a foreign language that uses special characters and don’t want to fiddle with your keyboard settings all the time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Examples
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;-&lt;/code&gt; + &lt;code&gt;&amp;gt;&lt;/code&gt; will produce a &lt;code&gt;→&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;o&lt;/code&gt; + &lt;code&gt;o&lt;/code&gt; will produce a &lt;code&gt;°&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;c&lt;/code&gt; + &lt;code&gt;=&lt;/code&gt; will produce a &lt;code&gt;€&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;a&lt;/code&gt; + &lt;code&gt;"&lt;/code&gt; will produce a &lt;code&gt;ä&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;t&lt;/code&gt; + &lt;code&gt;m&lt;/code&gt; will produce a &lt;code&gt;™&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;r&lt;/code&gt; + &lt;code&gt;o&lt;/code&gt; will produce a &lt;code&gt;®&lt;/code&gt; character&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Compose&lt;/code&gt; + &lt;code&gt;?&lt;/code&gt; + &lt;code&gt;?&lt;/code&gt; will produce a &lt;code&gt;¿&lt;/code&gt; character&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you can see in the examples above, the combinations are well thought out and easy to remember. Most of the time you don’t need to know them by heart; you just have to think about which two (or more) keys from the standard key set, combined, would give you the character you want.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to enable Compose Key?
&lt;/h2&gt;

&lt;p&gt;As I use GNOME 3, I’ll describe how to set it up in that environment. But don’t worry, it can be enabled in pretty much every WM/DE.&lt;/p&gt;

&lt;p&gt;First, go to &lt;code&gt;Settings&lt;/code&gt; and then &lt;code&gt;Keyboard&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SMDBEmGv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://andrzejgor.ski/posts/compose_key/gnome_keyboard_settings.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SMDBEmGv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://andrzejgor.ski/posts/compose_key/gnome_keyboard_settings.png" alt="Keyboard Settings screenshot" width="601" height="751"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then click the &lt;code&gt;Compose Key&lt;/code&gt; option in the &lt;code&gt;Special Character Entry&lt;/code&gt; section. There you can turn it on and choose which key to use as the Compose Key.&lt;/p&gt;

&lt;p&gt;As you can see in the screenshot, I use the &lt;code&gt;Caps Lock&lt;/code&gt; key, but choose whatever you like.&lt;/p&gt;
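
&lt;p&gt;Outside of GNOME, on X11 you can usually go one step further and define your own sequences in &lt;code&gt;~/.XCompose&lt;/code&gt;. A minimal sketch (the custom mapping is just an example of mine, not a standard one):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# start from the system-wide compose table for the current locale
include "%L"

# Compose + p + l produces "zł"
&amp;lt;Multi_key&amp;gt; &amp;lt;p&amp;gt; &amp;lt;l&amp;gt; : "zł"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;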

&lt;h2&gt;
  
  
  Lists of possible combinations
&lt;/h2&gt;

&lt;p&gt;These two lists are the best I’ve found on the internet:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://help.ubuntu.com/community/GtkComposeTable"&gt;GTKComposeTable&lt;/a&gt; - rather short, very readable list of most useful combinations.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre"&gt;libX11 documentation&lt;/a&gt; - very long, hard to read but very complete list of all possible combinations.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>linux</category>
      <category>gnome</category>
      <category>keyboard</category>
    </item>
    <item>
      <title>Big PostgreSQL Problems: ID exhaustion</title>
      <dc:creator>Andrzej Górski</dc:creator>
      <pubDate>Tue, 13 Jun 2023 00:00:00 +0000</pubDate>
      <link>https://dev.to/andrzej3393/big-postgresql-problems-id-exhaustion-2a43</link>
      <guid>https://dev.to/andrzej3393/big-postgresql-problems-id-exhaustion-2a43</guid>
      <description>&lt;p&gt;If you’re a PostgreSQL user, you may have encountered some issues with ID exhaustion already. This is a common problem, especially for databases that handle a large amount of data and/or have a high volume of insertions and deletions.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does this happen?
&lt;/h2&gt;

&lt;p&gt;In my case, it was a web app backend. There was a heavily used many-to-many relationship between two tables, and the application created and removed the associations between them quite often.&lt;/p&gt;

&lt;p&gt;One day I found that the app had crashed. In production! After a quick investigation, I found the cause of the problem: the IDs in the associative table had been exhausted. It turned out that the heavy associating and disassociating had consumed all the available IDs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other causes of ID exhaustion that I met
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Simply, tons of data.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;INSERT ... ON CONFLICT ...&lt;/code&gt; - so, upserts most of the time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Simply put, each insert, successful or not, bumps the ID sequence. You can end up with a very small number of rows in the table, like a hundred or so, and yet have the IDs exhausted.&lt;/p&gt;
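
&lt;p&gt;A sketch of how upserts burn through IDs (the table and column names are made up):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE TABLE counters (id serial PRIMARY KEY, name text UNIQUE);

INSERT INTO counters (name) VALUES ('a');                         -- gets id 1
INSERT INTO counters (name) VALUES ('a') ON CONFLICT DO NOTHING;  -- no row inserted,
                                                                  -- but the sequence advanced
INSERT INTO counters (name) VALUES ('b');                         -- gets id 3, not 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;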

&lt;h2&gt;
  
  
  But they said in the docs that the number of rows in the table is unlimited!
&lt;/h2&gt;

&lt;p&gt;As I mentioned in the &lt;a href="https://andrzejgor.ski/posts/big_postgresql_problems/introduction/" rel="noopener noreferrer"&gt;first post of this series&lt;/a&gt;, the maximum number of rows in a single PostgreSQL table is unlimited. So why did we end up with a broken app in the middle of the night?&lt;/p&gt;

&lt;h2&gt;
  
  
  Types!
&lt;/h2&gt;

&lt;p&gt;To be exact, the type of the primary key column. The “default” way of creating a new table, seen in many tutorials, looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;serial&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see in the highlighted line, the primary key column (&lt;code&gt;id&lt;/code&gt;) is going to be of type &lt;code&gt;serial&lt;/code&gt;. Also, as far as I know, most frameworks/ORMs choose &lt;code&gt;serial&lt;/code&gt; as the default type for the primary key column. But what is &lt;code&gt;serial&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;In reality, it is just a four-byte signed integer. Or rather, its positive part. So a table with a primary key column of type &lt;code&gt;serial&lt;/code&gt; can hold up to 2^31 - 1 (a bit more than two billion) rows.&lt;/p&gt;
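
&lt;p&gt;Under the hood, &lt;code&gt;serial&lt;/code&gt; is just shorthand. Per the PostgreSQL docs, a column declared as &lt;code&gt;id serial&lt;/code&gt; expands to roughly this (using a hypothetical table &lt;code&gt;tbl&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE SEQUENCE tbl_id_seq AS integer;

CREATE TABLE tbl (
  id integer NOT NULL DEFAULT nextval('tbl_id_seq')
);

-- the sequence is dropped together with the column
ALTER SEQUENCE tbl_id_seq OWNED BY tbl.id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;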

&lt;h2&gt;
  
  
  How to fix this?
&lt;/h2&gt;

&lt;p&gt;The easiest fix is to migrate &lt;strong&gt;both&lt;/strong&gt; the primary key column &lt;strong&gt;and&lt;/strong&gt; the sequence for that column to another, bigger type. If any tables have a relation to this column, they need to be migrated too. The most commonly used types bigger than &lt;code&gt;serial&lt;/code&gt; for primary key columns are &lt;code&gt;bigserial&lt;/code&gt; and &lt;code&gt;UUID&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bigserial
&lt;/h3&gt;

&lt;p&gt;As you probably guessed, the &lt;code&gt;bigserial&lt;/code&gt; type is in reality a &lt;code&gt;bigint&lt;/code&gt;: an eight-byte signed integer. A table with a primary key of that type can hold up to 2^63 - 1 rows, which is more than four billion &lt;strong&gt;times&lt;/strong&gt; more than &lt;code&gt;serial&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Migrating both the column and the sequence from &lt;code&gt;serial&lt;/code&gt; to &lt;code&gt;bigserial&lt;/code&gt; is rather easy and fast. But what if it’s still not enough?&lt;/p&gt;
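
&lt;p&gt;A minimal sketch of such a migration, assuming a table called &lt;code&gt;tbl&lt;/code&gt; with a &lt;code&gt;serial&lt;/code&gt; primary key &lt;code&gt;id&lt;/code&gt; (note that on a big table the &lt;code&gt;ALTER TABLE&lt;/code&gt; rewrites the table and takes a heavy lock, so plan a maintenance window):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- widen the column itself
ALTER TABLE tbl ALTER COLUMN id TYPE bigint;

-- and the sequence feeding it (PostgreSQL 10+)
ALTER SEQUENCE tbl_id_seq AS bigint;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;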

&lt;h3&gt;
  
  
  UUID
&lt;/h3&gt;

&lt;p&gt;UUID is an even bigger type: it’s 128 bits long! But it’s not numeric like the previous ones, which has certain consequences.&lt;/p&gt;

&lt;p&gt;The most important one, in this case, is that migrating the primary key column from &lt;code&gt;serial&lt;/code&gt; or &lt;code&gt;bigserial&lt;/code&gt; to &lt;code&gt;UUID&lt;/code&gt; is not that simple. The crux of the problem is that UUIDs aren’t generated in series, but randomly, and their representation also differs from the numeric types. Altogether, this means you’ll need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rewrite all the existing primary keys, both in the problematic table and in the tables that relate to it. This isn’t a cheap operation (in terms of execution time) at all.&lt;/li&gt;
&lt;li&gt;Change the way they are generated. As they’re no longer serial, you’ll need to generate them either on the app side or on the DB side. On the DB side, the UUID-OSSP module is recommended.&lt;/li&gt;
&lt;li&gt;Pay attention to collisions. As UUIDs are randomly generated, the more rows you have, the more likely collisions become.&lt;/li&gt;
&lt;/ul&gt;
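
&lt;p&gt;For DB-side generation, a sketch of a UUID-keyed table (the &lt;code&gt;uuid-ossp&lt;/code&gt; extension provides &lt;code&gt;uuid_generate_v4()&lt;/code&gt;; on PostgreSQL 13+ the built-in &lt;code&gt;gen_random_uuid()&lt;/code&gt; works without any extension):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE tbl (
  id uuid PRIMARY KEY DEFAULT uuid_generate_v4()
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;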

&lt;p&gt;But besides those downsides, there are some good sides to using UUID as the primary key:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Obviously, its size.&lt;/li&gt;
&lt;li&gt;As UUIDs aren’t serial, you get an extra layer of “security by obscurity” for free: nobody can simply guess the ID of the next or previous row in the table anymore.&lt;/li&gt;
&lt;li&gt;As they can be generated on the app side, it enables the development of the app in a more DDD way, as you can have a known entity ID even before it’s stored in DB and push it down through the layers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  But hey, they said in the docs that the number of rows in the table is unlimited!
&lt;/h2&gt;

&lt;p&gt;And obviously, that’s true, who would lie in the documentation? :) But as long as you have a primary key in the table, you’re limited by it. If you want to get rid of that limit, you’ll have to get rid of the primary key. That, too, is a fix for ID exhaustion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exercise!
&lt;/h2&gt;

&lt;p&gt;If you want to try it on your own, below you can find a very simple showcase of ID exhaustion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;tbl&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;smallserial&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;tbl&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32767&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;tbl&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In lines 1-3, you can see the preparation of the table we’ll test on. As you can see, I used a type even smaller than &lt;code&gt;serial&lt;/code&gt;: &lt;code&gt;smallserial&lt;/code&gt;, a two-byte signed integer.&lt;/p&gt;

&lt;p&gt;Then in line 5, the table is filled to the brim.&lt;/p&gt;

&lt;p&gt;And finally, in line 7, the last insert causes the following error: &lt;code&gt;error: nextval: reached maximum value of sequence "tbl_id_seq" (32767)&lt;/code&gt;, which means that the IDs were exhausted and this row (and any following rows) will not be inserted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extra links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.postgresql.org/docs/current/datatype-numeric.html" rel="noopener noreferrer"&gt;More about numeric data types (including &lt;code&gt;serial&lt;/code&gt; and &lt;code&gt;bigserial&lt;/code&gt;) in PostgreSQL docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.postgresql.org/docs/current/datatype-uuid.html" rel="noopener noreferrer"&gt;Short writeup of &lt;code&gt;UUID&lt;/code&gt; type in PostgreSQL docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.postgresql.org/docs/current/uuid-ossp.html" rel="noopener noreferrer"&gt;UUID-OSSP module documentation in PostgreSQL docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dba.stackexchange.com/questions/214083/what-happens-when-an-automatically-generated-uuid-primary-key-collides-in-postgr" rel="noopener noreferrer"&gt;“What happens when an automatically generated UUID primary key collides in Postgres?” on DBA StackExchange&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  That’s it!
&lt;/h2&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>postgres</category>
    </item>
    <item>
      <title>Big PostgreSQL Problems: Introduction</title>
      <dc:creator>Andrzej Górski</dc:creator>
      <pubDate>Tue, 20 Sep 2022 00:00:00 +0000</pubDate>
      <link>https://dev.to/andrzej3393/big-postgresql-problems-introduction-3nd2</link>
      <guid>https://dev.to/andrzej3393/big-postgresql-problems-introduction-3nd2</guid>
      <description>&lt;p&gt;This is the first post in a series about problems that I encountered while working with rather big PostgreSQL databases. I will describe in it what are my assumptions about the database size and also some facts from PostgreSQL documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  How big does a table or database have to be to be considered really “big”?
&lt;/h2&gt;

&lt;p&gt;There’s no hard definition here, so these are only my assumptions.&lt;/p&gt;

&lt;p&gt;For a single table, I’d say it can be considered big when it approaches 100 million rows. &lt;strong&gt;But&lt;/strong&gt; of course, it depends on the row size itself. Things will be completely different for a table with two or three simple integer columns than for one with, let’s say, twenty text columns loaded with heavy data.&lt;/p&gt;

&lt;p&gt;On the database side, in my opinion, 100 GB is a quite large database. &lt;strong&gt;But&lt;/strong&gt;, again, it depends on how many tables there are and how heavy single rows are.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are the PostgreSQL limits?
&lt;/h2&gt;

&lt;p&gt;Here we’re in a much better situation, as the limits are documented in the &lt;a href="https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F" rel="noopener noreferrer"&gt;official PostgreSQL FAQ&lt;/a&gt;. So, per the FAQ:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Maximum size for a database?&lt;/strong&gt; Unlimited (32 TB databases exist)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum size for a table?&lt;/strong&gt; 32 TB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum size for a row?&lt;/strong&gt; 400 GB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum size for a field?&lt;/strong&gt; 1 GB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum number of rows in a table?&lt;/strong&gt; unlimited&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum number of columns in a table?&lt;/strong&gt; 250-1600, depending on column types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum number of indexes on a table?&lt;/strong&gt; unlimited&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  That’s all for today
&lt;/h2&gt;

&lt;p&gt;It wasn’t a very long post, was it? But no worries, the next articles in the series will be more substantial (at least I hope so 🙂). This article is more of a common point of reference for subsequent posts.&lt;/p&gt;

</description>
      <category>postgres</category>
    </item>
  </channel>
</rss>
