DEV Community

Cover image for Think twice before using AGE in PotgreSQL
Jan Bajena
Jan Bajena

Posted on

 

Think twice before using AGE in PotgreSQL

Recently my friend at Productboard noticed an interesting bug in one of our services. For some reason our code responsible for calculating how many days our customers' features spend in certain states (Idea, Discovery, Delivery, etc) in some cases would give us wrong results.

For example: even though you can tell by looking at the calendar that the days difference between 2021-02-28 00:00:00 and 2022-05-03 00:00:00 is 429 days, our code would return 428.25 days for that timespan.

After some research it turned out that the problem was caused by the fact that we were using PostgreSQL's AGE function for calculating days difference.

According to PostgreSQL's docs AGE function calculates “symbolic” result that uses years and months, rather than just days.
It wasn't super clear to me what "symbolic" means, so I started digging a bit and I realised that the purpose of AGE is not to calculate a precise time difference, but rather calculate age the way we, humans do it. Instead of subtracting UNIX timestamps like computers do, we subtract each component of the date and then adjust the negative values.

So, in the previous example (2021-02-28 - 2022-05-03):

  • Years difference is 1
  • Months difference is 3
  • Days difference is -25, so we subtract 1 month, check how many days are left in February (0) and then add the days from May (3). Eventually we end up with 3 days.

We end up with 1 year, 2 months and 3 days. Now, why does Postgres return 428.25 here?

It's because:

  • The number of days returned by AGE is 365.25 - it's an average number of days in a year when we take leap years into consideration.
  • Postgres uses 30 as number of days in each month

So, now it all makes sense - 1 year, 2 months and 3 days leave us with 365.25 + 2 * 30 + 3 = 428.25 days 🤓.

Fortunately the solution to our problem was extremely simple - we just had to replace the AGE function with the subtraction operator.
In order to show you the difference, here's a query that I ran in an online postgres query tool:

SELECT 
    EXTRACT(epoch FROM ('2022-05-03 00:00:00'::timestamp - '2021-02-28 00:00:00'::timestamp)) / (3600 * 24) as subtraction_days, 
    EXTRACT(epoch FROM AGE('2022-05-03 00:00:00'::timestamp, '2021-02-28 00:00:00'::timestamp)) / (3600 * 24) as age_days 
FROM "current_schema"()
Enter fullscreen mode Exit fullscreen mode

image

The moral of this story is that time calculations are an extremely sensitive matter that can be approached by computers and humans differently. Fortunately PostgreSQL has all of the possible approaches covered. We, as developers, just have to understand our use case, read the documentation and think twice in order to pick an appropriate one.

Top comments (0)