DEV Community

Cover image for Why 0 is better than NULL sometimes
Alina Ramfu
Alina Ramfu

Posted on

Why 0 is better than NULL sometimes

Working with big data requires attention to details, testing, retesting, comparing data returned by the stored procedures, views, table valued functions etc. with the expected result. So understanding the core concepts of SQL is a must.

I am currently working on a project for an insurance company and the amount of data they have is huge. One of their main functionality is having reports where they can interpret the data as they need.

While testing something on one of their reports, I found a bug in the stored procedure that fed the report with data. They were trying to JOIN two sets of data, the left hand side contained NULL on a particular column, and the right side contained 0 on the same particular column.

While you may think that representing NULL on the right hand side was the fix to go to, this wouldn't have worked. Instead, this was the point where I thought representing NULLs with 0s would work. I'll demonstrate below why NULL wasn't working and how representing it with 0 fixed this.

Let's consider the following example:
LEFT hand side:

VALUES (1, null), (2, 1), (3, 2)) leftHandSide(Id, fk)
Enter fullscreen mode Exit fullscreen mode

RIGHT hand side:

VALUES (null, '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk)
Enter fullscreen mode Exit fullscreen mode

I'm performing a JOIN between them:

SELECT * FROM 
(VALUES (1, null), (2, 1), (3, 2)) leftHandSide(Id, fk)
INNER JOIN 
(VALUES (null, '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk)
ON rightHandSide.Id = leftHandSide.fk
Enter fullscreen mode Exit fullscreen mode

And I want my result set to look like this, it won't work.

Id  fk  Id  fk
1 NULL NULL NULL
2   1   1   1
3   2   2   2
Enter fullscreen mode Exit fullscreen mode

Instead I will only get the last 2 rows

Id  fk  Id  fk
2   1   1   1
3   2   2   2
Enter fullscreen mode Exit fullscreen mode

When you have NULL values in a column, these will never be matched to other NULL values. This is because NULL signifies the absence of any value, and cannot be compared as they will never equal anything.

So, what I did was to represent NULL to 0 so that my result set contains the 3 rows:

Id  fk  Id  fk
1   0   0   0
2   1   1   1
3   2   2   2
Enter fullscreen mode Exit fullscreen mode

by simply doing

SELECT * FROM
(VALUES (1, ISNULL(NULL, 0)), (2, 1), (3, 2)) leftHandSide(Id, fk)
INNER JOIN 
(VALUES (ISNULL(NULL, 0), '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk)
ON rightHandSide.Id = leftHandSide.fk
Enter fullscreen mode Exit fullscreen mode

Lesson learned: whenever you have to deal with NULL, and you don't understand why your result set is missing some rows that you expect, think about representing NULLs to 0s.

Top comments (2)

Collapse
 
rodmatola profile image
Rodrigo Matola

I'd be careful just replacing NULL to 0, because the actual data could be 0 and you would be entering false data. For example, air temperature or other meteorological data.

Suppose there was a thermometer failure and some temperatures were not registered. If you simply put 0 to these you are introducing false data that can lead to wrong forecasting.

It's common to replace missing meteorological data with -999 or -9999, because they aren't possible values.

So, instead replacing NULL to 0, I'd replace it for an impossible value in your context.

Collapse
 
alinaramfu profile image
Alina Ramfu

Hey, thank you for the comment. For this particular case you explained would indeed be better to have an impossible value. For my case though, 0 could never have been a possible one, reason why I replaced NULL to 0. It's really depending on the context. :) Also, thank you for reading my first article.