<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: IrinaBezgina</title>
    <description>The latest articles on DEV Community by IrinaBezgina (@irinabezgina).</description>
    <link>https://dev.to/irinabezgina</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1043873%2F782e4cdc-ca9f-466a-9f02-c179fd10f6e2.jpeg</url>
      <title>DEV Community: IrinaBezgina</title>
      <link>https://dev.to/irinabezgina</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/irinabezgina"/>
    <language>en</language>
    <item>
      <title>Handling NULL in the DBs</title>
      <dc:creator>IrinaBezgina</dc:creator>
      <pubDate>Wed, 03 May 2023 21:00:52 +0000</pubDate>
      <link>https://dev.to/irinabezgina/handling-null-in-the-dbs-14jl</link>
      <guid>https://dev.to/irinabezgina/handling-null-in-the-dbs-14jl</guid>
      <description>&lt;p&gt;&lt;em&gt;NULL&lt;/em&gt; represents an unknown value or its absence. But an absence of value doesn't mean an absence of problems with it.&lt;/p&gt;

&lt;p&gt;First of all different DBMSs may behave differently when handling &lt;em&gt;NULL&lt;/em&gt; values (depending on SQL dialect or architecture). But I'll concentrate on common patterns in this topic. Let's have a look at the typical hurdles and ways to deal with them based on the two main situations (data modeling and querying the data).&lt;/p&gt;




&lt;h3&gt;
  
  
  Data Modelling
&lt;/h3&gt;

&lt;p&gt;Any data storage begins with data modeling. And it would be useful to understand how can &lt;em&gt;NULL&lt;/em&gt; values affect your database design.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Primary key&lt;/strong&gt; uniquely identifies each record in a table, and cannot contain &lt;em&gt;NULL&lt;/em&gt; values.&lt;/li&gt;
&lt;li&gt;Avoid situations when &lt;strong&gt;foreign key&lt;/strong&gt; is a &lt;em&gt;NULL&lt;/em&gt; value. It might lead to broken relationships between tables. You could avoid it setting &lt;em&gt;NOT NULL&lt;/em&gt; constraint.&lt;/li&gt;
&lt;li&gt;Not all DBMSs treat &lt;em&gt;NULL&lt;/em&gt; as a unique value. So it's better to avoid &lt;strong&gt;unique constraints&lt;/strong&gt; in the columns which possibly could have &lt;em&gt;NULL&lt;/em&gt; values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partitioning&lt;/strong&gt; by the column which potentially has &lt;em&gt;NULL&lt;/em&gt; values could lead to uneven data distribution and hence reduce the benefits of partitioning. You may encounter a similar problem in the &lt;strong&gt;bucketing&lt;/strong&gt;/&lt;strong&gt;clustering&lt;/strong&gt; process.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Querying the data
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Generally any &lt;strong&gt;arithmetic operation&lt;/strong&gt; involving a &lt;em&gt;NULL&lt;/em&gt; will result in &lt;em&gt;NULL&lt;/em&gt;. For example, 120 - &lt;em&gt;NULL&lt;/em&gt; = &lt;em&gt;NULL&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Be careful when &lt;strong&gt;comparing values&lt;/strong&gt; potentially containing &lt;em&gt;NULLs&lt;/em&gt;. &lt;em&gt;NULL = NULL&lt;/em&gt; is not true, and neither is &lt;em&gt;NULL &amp;lt;&amp;gt; NULL&lt;/em&gt;. Instead, you should use the &lt;em&gt;IS NULL&lt;/em&gt; or &lt;em&gt;IS NOT NULL&lt;/em&gt; operators to check for &lt;em&gt;NULL&lt;/em&gt; values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aggregate functions&lt;/strong&gt; usually ignore &lt;em&gt;NULL&lt;/em&gt; values. However, the &lt;em&gt;COUNT&lt;/em&gt; function counts &lt;em&gt;NULL&lt;/em&gt; values when using the &lt;em&gt;COUNT(*)&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logical operations&lt;/strong&gt; involving &lt;em&gt;NULL&lt;/em&gt; values could result in &lt;em&gt;NULL&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;When you &lt;strong&gt;filter&lt;/strong&gt; the values in the column containing &lt;em&gt;NULLs&lt;/em&gt; you may lose some important data. For example, you have a table with columns Name and City, and you want to select all the names from all the cities except New York and Washington. Using &lt;em&gt;City IS NOT IN ('New York', 'Washington')&lt;/em&gt; will result in rows with the &lt;em&gt;NULL&lt;/em&gt; city being lost.&lt;/li&gt;
&lt;li&gt;When you &lt;strong&gt;INNER JOIN&lt;/strong&gt; tables if either of the join columns contains a NULL value, the row will not be included in the result set.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Classic handling approaches
&lt;/h3&gt;

&lt;p&gt;The general approach to handling &lt;em&gt;NULL&lt;/em&gt; values includes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use &lt;strong&gt;COALESCE&lt;/strong&gt; (or analogs depending on DBMS) for the columns which potentially can have a &lt;em&gt;NULL&lt;/em&gt; value.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;IS NULL&lt;/strong&gt; or &lt;strong&gt;IS NOT NULL&lt;/strong&gt; for comparison operations.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;OUTER JOIN&lt;/strong&gt; or handle the JOIN clause using COALESCE when comparing columns possibly containing &lt;em&gt;NULL&lt;/em&gt; values.&lt;/li&gt;
&lt;li&gt;Create &lt;strong&gt;Unit tests&lt;/strong&gt; and &lt;strong&gt;Quality metrics&lt;/strong&gt; for your data.&lt;/li&gt;
&lt;li&gt;Experiment more. Knowledge comes with practice.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>database</category>
      <category>dataengineering</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
