<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mike Knights</title>
    <description>The latest articles on DEV Community by Mike Knights (@datatoolkit).</description>
    <link>https://dev.to/datatoolkit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3919717%2Faa51825f-a1a3-40db-83d9-d3c7027734a4.jpg</url>
      <title>DEV Community: Mike Knights</title>
      <link>https://dev.to/datatoolkit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/datatoolkit"/>
    <language>en</language>
    <item>
      <title>The padlock doesn't mean what you think it means</title>
      <dc:creator>Mike Knights</dc:creator>
      <pubDate>Wed, 27 May 2026 09:22:01 +0000</pubDate>
      <link>https://dev.to/datatoolkit/the-padlock-doesnt-mean-what-you-think-it-means-11e7</link>
      <guid>https://dev.to/datatoolkit/the-padlock-doesnt-mean-what-you-think-it-means-11e7</guid>
      <description>&lt;p&gt;Everyone knows the padlock in the browser address bar means the site is "secure". But secure in what sense? Most people assume it means the site is trustworthy. It doesn't. It means the connection is encrypted. Those are very different things.&lt;/p&gt;

&lt;p&gt;A phishing site can have a padlock. A site stealing your credit card details can have a padlock. The padlock tells you nobody can read or alter the traffic between your browser and the server - it says nothing about what the server itself does with your data.&lt;/p&gt;

&lt;h2&gt;So what does HTTPS actually do?&lt;/h2&gt;

&lt;p&gt;It wraps HTTP traffic in TLS (Transport Layer Security), encrypting everything in both directions. Your request, the response, the cookies, the headers - all of it. Without HTTPS, anyone on the same network can read it in plain text.&lt;/p&gt;

&lt;p&gt;Three things HTTPS gives you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Encryption&lt;/strong&gt; - nobody in the middle can read the traffic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrity&lt;/strong&gt; - nobody in the middle can tamper with it without detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication&lt;/strong&gt; - the certificate verifies you're talking to the actual server, not an impersonator&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;The handshake&lt;/h2&gt;

&lt;p&gt;Before any HTTP data flows, the browser and server do a TLS handshake. In TLS 1.3 (the current version) this takes one round trip:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Browser sends supported cipher suites, a random value and its Diffie-Hellman key share&lt;/li&gt;
&lt;li&gt;Server picks a cipher suite and sends its own key share and its certificate&lt;/li&gt;
&lt;li&gt;Browser verifies the certificate against a trusted Certificate Authority&lt;/li&gt;
&lt;li&gt;Both sides derive a shared session key using Diffie-Hellman - without ever sending the key itself across the network&lt;/li&gt;
&lt;li&gt;Encrypted traffic begins&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The clever bit is step 4. Asymmetric cryptography (slow, uses public/private key pairs) is only used to agree on the shared secret and to authenticate the server. After that, symmetric encryption (fast, single shared key) handles the actual data. You get the security of asymmetric with the speed of symmetric.&lt;/p&gt;
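
&lt;p&gt;To make step 4 concrete, here is a toy Diffie-Hellman exchange in Python. It is a sketch only: the prime (2^127 - 1), the generator 3 and the plain SHA-256 key derivation are assumptions chosen for readability and are far too weak for real use. TLS 1.3 uses standardised groups and HKDF instead.&lt;/p&gt;

```python
import hashlib
import secrets

# Toy public parameters (illustrative assumption, NOT what TLS uses).
P = 2**127 - 1  # a Mersenne prime
G = 3


def make_keypair():
    """Return (private, public) for one side of the exchange."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P-2]
    public = pow(G, private, P)             # only this value crosses the network
    return private, public


def derive_session_key(own_private, peer_public):
    """Combine our private value with the peer's public value."""
    shared_secret = pow(peer_public, own_private, P)
    return hashlib.sha256(shared_secret.to_bytes(16, "big")).digest()


browser_private, browser_public = make_keypair()
server_private, server_public = make_keypair()

browser_key = derive_session_key(browser_private, server_public)
server_key = derive_session_key(server_private, browser_public)
print(browser_key == server_key)  # True: same key, never transmitted
```

&lt;p&gt;An eavesdropper sees only the two public values, and recovering the private values from those is the hard problem the scheme relies on.&lt;/p&gt;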

&lt;h2&gt;What certificates actually prove&lt;/h2&gt;

&lt;p&gt;A TLS certificate contains the server's public key and a signature from a Certificate Authority (CA) - an organisation your browser already trusts. Before signing, the CA checks that whoever requested the certificate controls the domain.&lt;/p&gt;

&lt;p&gt;When your browser sees the certificate, it checks the CA signature. Valid signature from a trusted CA = the certificate is legitimate. This is why self-signed certificates get a browser warning - there's no trusted third party vouching for them.&lt;/p&gt;
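
&lt;p&gt;The same check can be seen from code. The sketch below uses Python's standard &lt;code&gt;ssl&lt;/code&gt; module; the helper names &lt;code&gt;make_verifying_context&lt;/code&gt; and &lt;code&gt;fetch_certificate&lt;/code&gt; are made up for this example. The default client context already requires a certificate chain that ends at a trusted CA and a matching hostname.&lt;/p&gt;

```python
import socket
import ssl


def make_verifying_context():
    """Client context that checks the CA chain and the hostname."""
    return ssl.create_default_context()


def fetch_certificate(hostname, port=443):
    """Connect, let the handshake verify the certificate, return its fields."""
    context = make_verifying_context()
    with socket.create_connection((hostname, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


context = make_verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

&lt;p&gt;Calling &lt;code&gt;fetch_certificate&lt;/code&gt; needs network access. Against a server with a self-signed certificate the handshake should fail with &lt;code&gt;ssl.SSLCertVerificationError&lt;/code&gt;, which is the code-level equivalent of the browser warning.&lt;/p&gt;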

&lt;h2&gt;SSL is dead, TLS is what you're using&lt;/h2&gt;

&lt;p&gt;When people say "SSL certificate" they mean TLS certificate. SSL 2.0 and 3.0 are deprecated and broken. TLS 1.0 and 1.1 are disabled in modern browsers. TLS 1.2 is still widely deployed and acceptable with modern cipher suites, but TLS 1.3 is what you want - it dropped several legacy features that had caused vulnerabilities in older versions.&lt;/p&gt;
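
&lt;p&gt;If you control the client, you can refuse the old protocol versions explicitly. A minimal sketch with Python's &lt;code&gt;ssl&lt;/code&gt; module; the TLS 1.2 floor is an assumption, raise it to TLS 1.3 if every server you talk to supports it.&lt;/p&gt;

```python
import ssl


def make_modern_context():
    """Client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_modern_context()
print(context.minimum_version.name)  # TLSv1_2
```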

&lt;h2&gt;What HTTPS doesn't protect&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The domain name&lt;/strong&gt; - the hostname you're connecting to is visible in DNS lookups and in the SNI field of the TLS handshake, even over HTTPS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The server itself&lt;/strong&gt; - once data arrives at the server, HTTPS is done. If the server stores passwords in plain text, HTTPS didn't help&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your device&lt;/strong&gt; - malware that intercepts traffic before it's encrypted bypasses HTTPS entirely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The padlock is meaningful. It's just not the whole story.&lt;/p&gt;

&lt;p&gt;For a more detailed breakdown of the handshake steps and the SSL/TLS version history, &lt;a href="https://www.datatoolkit.net/learn/how-does-https-work" rel="noopener noreferrer"&gt;datatoolkit.net/learn/how-does-https-work&lt;/a&gt; covers it. SHA-256 - the hash algorithm most commonly used in TLS certificate signatures - can be tested at &lt;a href="https://www.datatoolkit.net/sha256" rel="noopener noreferrer"&gt;datatoolkit.net/sha256&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>security</category>
      <category>programming</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Base64 is not encryption - here's what it actually does</title>
      <dc:creator>Mike Knights</dc:creator>
      <pubDate>Thu, 14 May 2026 08:49:20 +0000</pubDate>
      <link>https://dev.to/datatoolkit/base64-is-not-encryption-heres-what-it-actually-does-3b9d</link>
      <guid>https://dev.to/datatoolkit/base64-is-not-encryption-heres-what-it-actually-does-3b9d</guid>
      <description>&lt;p&gt;Base64 comes up constantly - in JWTs, email attachments, data URIs, API payloads. Most developers have used it dozens of times. But a surprising number have a slightly wrong mental model, and that leads to misuse.&lt;/p&gt;

&lt;p&gt;The biggest mistake: treating it as a form of obfuscation or lightweight encryption. It isn't.&lt;/p&gt;

&lt;h2&gt;What it actually does&lt;/h2&gt;

&lt;p&gt;Base64 takes binary data and converts it into a string drawn from an alphabet of 64 printable ASCII characters (A-Z, a-z, 0-9, +, /), with every 3 bytes of input becoming 4 characters of output. The original data is completely recoverable with no key required. Anyone who sees the output can decode it in seconds.&lt;/p&gt;

&lt;p&gt;The reason it exists has nothing to do with security. Many systems that transport text - email, HTTP headers, JSON, HTML attributes - were never designed to handle arbitrary binary data. If you embed raw binary in those systems, you get corruption or parsing errors. Base64 gives binary a text-safe form for the journey.&lt;/p&gt;
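
&lt;p&gt;A quick round trip with Python's standard &lt;code&gt;base64&lt;/code&gt; module shows the point: no key goes in, and the original bytes come straight back out. The sample inputs are arbitrary.&lt;/p&gt;

```python
import base64

# Text example: encoding is fully reversible without any secret.
encoded_text = base64.b64encode(b"hello").decode("ascii")
print(encoded_text)                    # aGVsbG8=
print(base64.b64decode(encoded_text))  # b'hello'

# Binary example: bytes that are not valid text survive the trip too.
raw = bytes([0x00, 0xFF, 0x10, 0x80])
encoded_raw = base64.b64encode(raw).decode("ascii")
restored = base64.b64decode(encoded_raw)
print(restored == raw)                 # True
```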

&lt;h2&gt;Where you run into it&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;JWTs.&lt;/strong&gt; A JSON Web Token is three Base64url-encoded sections separated by dots. The header and payload are just encoded JSON - paste either into a Base64 decoder and you can read them in plain text. The only security comes from the signature at the end, not the encoding. Don't put sensitive data in a JWT payload unless you're encrypting it separately.&lt;/p&gt;
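
&lt;p&gt;Here is how little it takes to read a JWT payload. The token is a hypothetical one built inside the example, with a dummy signature, and &lt;code&gt;read_claims&lt;/code&gt; is an illustrative helper that deliberately skips signature verification.&lt;/p&gt;

```python
import base64
import json


def b64url_encode(data):
    """Base64url without padding, the form JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# Hypothetical token for illustration; the signature part is a placeholder.
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "user-123", "role": "admin"}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(payload).encode("utf-8")),
    "dummy-signature",
])


def read_claims(jwt):
    """Decode the payload WITHOUT verifying the signature or knowing any key."""
    _header, body, _signature = jwt.split(".")
    return json.loads(b64url_decode(body))


print(read_claims(token))  # {'sub': 'user-123', 'role': 'admin'}
```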

&lt;p&gt;&lt;strong&gt;HTTP Basic Auth.&lt;/strong&gt; The Authorization: Basic header is username:password Base64-encoded. It looks obscure, but it isn't. HTTPS is what protects it in transit.&lt;/p&gt;
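
&lt;p&gt;A short sketch of both directions, using the well-known sample credentials from the Basic Auth specification. The helper names are made up for this example.&lt;/p&gt;

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization: Basic header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def read_basic_auth(header_value):
    """Recover the credentials, as anyone who sees the header can."""
    _scheme, _, encoded = header_value.partition(" ")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header_value = basic_auth_header("Aladdin", "open sesame")
print(header_value)                   # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(read_basic_auth(header_value))  # ('Aladdin', 'open sesame')
```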

&lt;p&gt;&lt;strong&gt;Data URIs.&lt;/strong&gt; data:image/png;base64,... in HTML or CSS is the image file encoded inline. Fine for small icons, bad idea for anything large.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Email attachments.&lt;/strong&gt; SMTP was built for plain text. Attachments are Base64-encoded inside the message body so binary files survive.&lt;/p&gt;

&lt;h2&gt;The padding thing&lt;/h2&gt;

&lt;p&gt;The = characters at the end carry no data - they're padding to make the output length a multiple of 4. Base64url (used in JWTs and URLs) replaces + with - and / with _ so the output is URL-safe, and usually drops the padding (JWTs always do). Same data, different alphabet.&lt;/p&gt;
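
&lt;p&gt;The padding and the two alphabets are easy to see side by side. Note that Python's &lt;code&gt;urlsafe_b64encode&lt;/code&gt; keeps the padding, so the sketch strips it by hand and restores it before decoding; &lt;code&gt;decode_unpadded&lt;/code&gt; is an illustrative helper.&lt;/p&gt;

```python
import base64

# 1, 2 and 3 input bytes give 2, 1 and 0 padding characters.
for raw in (b"a", b"ab", b"abc"):
    print(raw, base64.b64encode(raw).decode("ascii"))
# b'a' YQ==
# b'ab' YWI=
# b'abc' YWJj

# Bytes chosen so the standard alphabet produces both + and /.
tricky = bytes([0xFB, 0xFF])
standard = base64.b64encode(tricky).decode("ascii")          # +/8=
url_safe = base64.urlsafe_b64encode(tricky).decode("ascii")  # -_8=
unpadded = url_safe.rstrip("=")                               # -_8


def decode_unpadded(segment):
    """Re-add the padding that Base64url producers often strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


print(decode_unpadded(unpadded) == tricky)  # True
```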

&lt;h2&gt;When to use it&lt;/h2&gt;

&lt;p&gt;When you need to move binary data through a system that only speaks text. That's it. Not for hiding data, not for security, not as a step in encryption (unless you're encoding the output of actual encryption).&lt;/p&gt;

&lt;p&gt;If you want to encode or decode something quickly in the browser - nothing leaves your device - &lt;a href="https://www.datatoolkit.net/base64" rel="noopener noreferrer"&gt;datatoolkit.net/base64&lt;/a&gt; does it client-side. There's also a &lt;a href="https://www.datatoolkit.net/learn/what-is-base64" rel="noopener noreferrer"&gt;longer explanation&lt;/a&gt; of how the encoding works mechanically if you want the full picture.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>security</category>
      <category>javascript</category>
    </item>
    <item>
      <title>MD5 is broken - here is what to use instead</title>
      <dc:creator>Mike Knights</dc:creator>
      <pubDate>Fri, 08 May 2026 09:34:09 +0000</pubDate>
      <link>https://dev.to/datatoolkit/md5-is-broken-here-is-what-to-use-instead-m79</link>
      <guid>https://dev.to/datatoolkit/md5-is-broken-here-is-what-to-use-instead-m79</guid>
      <description>&lt;p&gt;MD5 is everywhere. It is in legacy codebases, old tutorials, and still used by developers who have not stopped to check whether it is still appropriate. In most cases it is not.&lt;/p&gt;

&lt;p&gt;Here is a clear breakdown of what hash functions are, why MD5 is broken, and what you should use instead.&lt;/p&gt;

&lt;h2&gt;What is a hash function?&lt;/h2&gt;

&lt;p&gt;A hash function takes any input - a word, a file, an entire database dump - and produces a fixed-length string called a &lt;strong&gt;hash&lt;/strong&gt; or &lt;strong&gt;digest&lt;/strong&gt;. The same input always produces the same hash. Change even one character and the output changes completely.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHA-256("hello")  = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
SHA-256("hello!") = ce06092fb948d9ffac7d1a376e404b26b7575bcc11ee05a4615fef4fec3a308b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This "tiny change in input, wildly different output" property is the &lt;strong&gt;avalanche effect&lt;/strong&gt; and is fundamental to why hash functions are useful.&lt;/p&gt;

&lt;p&gt;Hash functions are &lt;strong&gt;one-way&lt;/strong&gt;. You cannot reverse a hash back to the original input. This makes them useful for verifying data without storing the data itself.&lt;/p&gt;
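
&lt;p&gt;You can reproduce the digests above and measure the avalanche effect with Python's &lt;code&gt;hashlib&lt;/code&gt;. The helper names are illustrative; the bit count simply compares the two 256-bit digests position by position.&lt;/p&gt;

```python
import hashlib


def sha256_hex(text):
    """SHA-256 digest of a string, as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Number of bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("hello")
second = sha256_hex("hello!")

print(first)   # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
print(second)
print(differing_bits(first, second), "of", len(first) * 4, "bits differ")
```

&lt;p&gt;For a well-behaved hash function, roughly half of the bits are expected to flip for any change to the input, however small.&lt;/p&gt;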

&lt;h2&gt;The algorithms compared&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Use today?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MD5&lt;/td&gt;
&lt;td&gt;128-bit / 32 chars&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Broken&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Checksums only - never security&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;160-bit / 40 chars&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Broken&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legacy only - avoid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SHA-256&lt;/td&gt;
&lt;td&gt;256-bit / 64 chars&lt;/td&gt;
&lt;td&gt;Secure&lt;/td&gt;
&lt;td&gt;Yes - general purpose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SHA-512&lt;/td&gt;
&lt;td&gt;512-bit / 128 chars&lt;/td&gt;
&lt;td&gt;Secure&lt;/td&gt;
&lt;td&gt;Yes - where extra strength is needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;Why is MD5 broken?&lt;/h2&gt;

&lt;p&gt;A hash function is considered broken when researchers can produce a &lt;strong&gt;collision&lt;/strong&gt; - two different inputs that produce the same hash output.&lt;/p&gt;

&lt;p&gt;MD5 collisions can be generated in seconds on consumer hardware. Researchers have demonstrated collision attacks that produce two entirely different files with the same MD5 hash. SHA-1 followed in 2017, when the &lt;a href="https://shattered.io/" rel="noopener noreferrer"&gt;SHAttered attack&lt;/a&gt; produced the first public collision.&lt;/p&gt;

&lt;p&gt;SHA-256 has no known practical collisions. Even a brute-force collision search needs around 2^128 hash evaluations, so with all the computing power on Earth it would still take longer than the age of the universe.&lt;/p&gt;

&lt;h2&gt;What is MD5 still acceptable for?&lt;/h2&gt;

&lt;p&gt;MD5 is still fine when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need a fast, non-cryptographic checksum to detect accidental corruption (not adversarial tampering)&lt;/li&gt;
&lt;li&gt;You are checking whether a cached resource has changed&lt;/li&gt;
&lt;li&gt;The context has no security implications whatsoever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MD5 is &lt;strong&gt;not&lt;/strong&gt; acceptable for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Password hashing (use bcrypt or Argon2 - not any raw hash function)&lt;/li&gt;
&lt;li&gt;File integrity verification where tampering is a concern&lt;/li&gt;
&lt;li&gt;Digital signatures&lt;/li&gt;
&lt;li&gt;Any security-sensitive context&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What should you use?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For file integrity / checksums:&lt;/strong&gt; SHA-256. It is fast enough for almost all use cases and is the standard for software distribution (you see it on download pages as the verification hash).&lt;/p&gt;
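
&lt;p&gt;A sketch of checking a download against its published SHA-256. The function names are made up, and a temporary file stands in for the real download so the example runs on its own.&lt;/p&gt;

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare against the hash shown on the download page."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo: a throwaway file plays the part of the downloaded release.
with tempfile.NamedTemporaryFile(delete=False) as download:
    download.write(b"example release contents")

published = hashlib.sha256(b"example release contents").hexdigest()
ok = matches_published_hash(download.name, published)
tampered = matches_published_hash(download.name, "0" * 64)
os.remove(download.name)

print(ok)        # True
print(tampered)  # False
```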

&lt;p&gt;&lt;strong&gt;For password storage:&lt;/strong&gt; Never use a raw hash function. Use &lt;strong&gt;bcrypt&lt;/strong&gt;, &lt;strong&gt;Argon2id&lt;/strong&gt;, or &lt;strong&gt;scrypt&lt;/strong&gt;. These are purpose-built password hashing algorithms that are deliberately slow and include salting. See &lt;a href="https://www.datatoolkit.net/learn/how-passwords-are-hashed" rel="noopener noreferrer"&gt;How Passwords Are Hashed&lt;/a&gt; for the full explanation.&lt;/p&gt;
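
&lt;p&gt;bcrypt and Argon2 need third-party packages in Python, so this sketch uses scrypt, which ships in the standard library's &lt;code&gt;hashlib&lt;/code&gt; when Python is built against a recent OpenSSL. The cost parameters and helper names are illustrative assumptions, not a tuned recommendation.&lt;/p&gt;

```python
import hashlib
import hmac
import secrets

# Illustrative cost settings (assumption); tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password):
    """Return (salt, digest); store both next to the user record."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected_digest)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```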

&lt;p&gt;&lt;strong&gt;For HMAC / API signing:&lt;/strong&gt; SHA-256 via HMAC (&lt;code&gt;HMAC-SHA256&lt;/code&gt;).&lt;/p&gt;
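
&lt;p&gt;A minimal HMAC-SHA256 signing sketch with Python's &lt;code&gt;hmac&lt;/code&gt; module. The key and message are placeholders, and the helper names are invented for the example.&lt;/p&gt;

```python
import hashlib
import hmac


def sign(secret_key, message):
    """HMAC-SHA256 of the message, as hex."""
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def is_valid(secret_key, message, signature_hex):
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, message), signature_hex)


key = b"placeholder-shared-secret"       # hypothetical key
body = b'{"amount": 100, "currency": "GBP"}'
signature = sign(key, body)

print(is_valid(key, body, signature))                    # True
print(is_valid(key, b'{"amount": 999999}', signature))   # False
```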

&lt;p&gt;&lt;strong&gt;For digital signatures:&lt;/strong&gt; SHA-256 (part of RSA-SHA256 and ECDSA-SHA256).&lt;/p&gt;

&lt;h2&gt;Generate hashes online&lt;/h2&gt;

&lt;p&gt;You can generate MD5, SHA-1, SHA-256, and SHA-512 hashes instantly in your browser - nothing is sent to any server - at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.datatoolkit.net/sha256" rel="noopener noreferrer"&gt;datatoolkit.net/sha256&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datatoolkit.net/md5" rel="noopener noreferrer"&gt;datatoolkit.net/md5&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a broader explanation of how hash functions work, see &lt;a href="https://www.datatoolkit.net/learn/what-is-a-hash-function" rel="noopener noreferrer"&gt;What Is a Hash Function?&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;TL;DR&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;MD5 and SHA-1 are cryptographically broken - collisions can be generated deliberately&lt;/li&gt;
&lt;li&gt;Use SHA-256 for checksums, file integrity, HMAC, and signatures&lt;/li&gt;
&lt;li&gt;Never use any raw hash for passwords - use bcrypt or Argon2id&lt;/li&gt;
&lt;li&gt;SHA-256 has no known practical collisions and is the current standard&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>security</category>
      <category>cryptography</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Stop using UUID v4 as your database primary key</title>
      <dc:creator>Mike Knights</dc:creator>
      <pubDate>Fri, 08 May 2026 09:31:07 +0000</pubDate>
      <link>https://dev.to/datatoolkit/uuid-v7-is-here-why-you-should-stop-using-v4-for-database-primary-keys-l0a</link>
      <guid>https://dev.to/datatoolkit/uuid-v7-is-here-why-you-should-stop-using-v4-for-database-primary-keys-l0a</guid>
      <description>&lt;p&gt;I spent a while wondering why inserts on a particular table were getting slower as it grew. The table had a UUID v4 primary key and a few indexes. The data wasn't huge - a few million rows - but write performance was noticeably degrading.&lt;/p&gt;

&lt;p&gt;The problem wasn't the query. It was the UUID.&lt;/p&gt;

&lt;h2&gt;What's actually happening&lt;/h2&gt;

&lt;p&gt;UUID v4 is random by design. Every new ID lands at a completely unpredictable position in the B-tree index. So every insert causes the database to find that random position, potentially split a page to make room, and rebalance. Do this millions of times and you end up with a fragmented index, lots of wasted space, and slower writes.&lt;/p&gt;

&lt;p&gt;With an auto-incrementing integer, every new row goes at the end. No splits. No rebalancing. The index stays tight.&lt;/p&gt;

&lt;p&gt;UUID v4 throws all of that away.&lt;/p&gt;
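
&lt;p&gt;A rough way to see the difference is to count how many inserts land at the very end of a sorted structure. In this sketch a sorted Python list stands in for the B-tree, which is a big simplification, and &lt;code&gt;append_fraction&lt;/code&gt; is a made-up helper.&lt;/p&gt;

```python
import bisect
import uuid


def append_fraction(keys):
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    appended = 0
    for key in keys:
        position = bisect.bisect_right(index, key)
        if position == len(index):
            appended += 1           # cheap case: new key goes at the end
        index.insert(position, key)  # otherwise it lands somewhere in the middle
    return appended / len(keys)


sequential_keys = list(range(5000))
random_keys = [uuid.uuid4().int for _ in range(5000)]

print(append_fraction(sequential_keys))  # 1.0
print(append_fraction(random_keys))      # a tiny fraction; varies per run
```

&lt;p&gt;Every mid-index insert in the second case is a candidate for a page split in a real database.&lt;/p&gt;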

&lt;h2&gt;UUID v7 fixes it&lt;/h2&gt;

&lt;p&gt;UUID v7 was standardised in RFC 9562 (May 2024). The important bit: the first 48 bits are a Unix millisecond timestamp. Because time only moves forward, v7 UUIDs sort chronologically - new ones are greater than old ones (ordering between IDs created in the same millisecond depends on the generator).&lt;/p&gt;

&lt;p&gt;The database sees the same sequential insertion pattern it gets from an integer primary key. No fragmentation. You keep all the benefits of a UUID (globally unique, no central registry, works across distributed systems) without the index penalty.&lt;/p&gt;
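
&lt;p&gt;To show where the ordering comes from, here is a hand-rolled v7 generator in Python. It is a sketch of the RFC 9562 bit layout only: &lt;code&gt;uuid7_sketch&lt;/code&gt; is a made-up name, and it does nothing to keep IDs monotonic within the same millisecond. In production, use your platform's library instead.&lt;/p&gt;

```python
import secrets
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a version 7 UUID by hand: timestamp first, random bits after."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand_a = secrets.randbits(12)
    rand_b = secrets.randbits(62)
    value = (
        (unix_ms % 2**48) * 2**80  # 48-bit Unix millisecond timestamp
        + 7 * 2**76                # 4-bit version field
        + rand_a * 2**64           # 12 random bits
        + 2 * 2**62                # 2-bit variant field (binary 10)
        + rand_b                   # 62 random bits
    )
    return uuid.UUID(int=value)


older = uuid7_sketch(unix_ms=1_700_000_000_000)
newer = uuid7_sketch(unix_ms=1_700_000_000_001)

print(older.version, newer.version)              # 7 7
print(sorted([newer, older]) == [older, newer])  # True: sorts by creation time
```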

&lt;p&gt;It has the same format as v4 - only the version digit at the start of the third group differs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;018f6e3a-2b4c-7d8e-9f0a-1b2c3d4e5f6a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop-in replacement. Same format, same length.&lt;/p&gt;

&lt;h2&gt;When v4 is still the right call&lt;/h2&gt;

&lt;p&gt;If the ID is exposed publicly and you don't want to leak when a record was created - use v4. The timestamp in v7 is extractable, which can be a privacy concern for things like user account IDs.&lt;/p&gt;
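
&lt;p&gt;Extracting that timestamp takes a few lines. This sketch reads the first 48 bits of the example UUID shown earlier; &lt;code&gt;created_at&lt;/code&gt; is an illustrative helper name.&lt;/p&gt;

```python
import datetime
import uuid


def created_at(uuid_text):
    """Read the 48-bit millisecond timestamp from a v7 UUID."""
    value = uuid.UUID(uuid_text)
    if value.version != 7:
        raise ValueError("not a version 7 UUID")
    unix_ms = value.int // 2**80  # keep only the top 48 bits
    return datetime.datetime.fromtimestamp(unix_ms / 1000, tz=datetime.timezone.utc)


leaked = created_at("018f6e3a-2b4c-7d8e-9f0a-1b2c3d4e5f6a")
print(leaked.isoformat())  # the creation time, readable by anyone with the ID
```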

&lt;p&gt;For internal records, order IDs, event logs, anything where sequential ordering is fine - v7.&lt;/p&gt;

&lt;h2&gt;Language support&lt;/h2&gt;

&lt;p&gt;Most stacks have it now. The Node.js &lt;code&gt;uuid&lt;/code&gt; package has &lt;code&gt;uuid.v7()&lt;/code&gt; in current releases. Python 3.14 has it in stdlib, or use the &lt;code&gt;uuid7&lt;/code&gt; package. PHP has &lt;code&gt;ramsey/uuid&lt;/code&gt;, Go has &lt;code&gt;google/uuid&lt;/code&gt;, and Prisma offers &lt;code&gt;uuid(7)&lt;/code&gt; as a default option. PostgreSQL has the &lt;code&gt;pg_uuidv7&lt;/code&gt; extension.&lt;/p&gt;

&lt;p&gt;If you're starting a new project, just use v7. If you have an existing table with v4, you don't need to migrate - new rows can switch immediately and the index tightens up gradually as old pages get rewritten.&lt;/p&gt;

&lt;p&gt;For generating both versions with various formatting options, &lt;a href="https://www.datatoolkit.net/uuid" rel="noopener noreferrer"&gt;datatoolkit.net/uuid&lt;/a&gt; does the job. There's also a more detailed writeup on the performance implications at &lt;a href="https://www.datatoolkit.net/learn/uuid-v4-vs-v7" rel="noopener noreferrer"&gt;datatoolkit.net/learn/uuid-v4-vs-v7&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>database</category>
      <category>uuid</category>
      <category>postgres</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
