Skip to content

Add a datediff implementation for postgres #187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 25, 2020

Conversation

mjumbewu
Copy link
Contributor

This PR adds a datediff implementation for PostgreSQL, and updates the integration tests to exercise the various types of date parts the implementation covers.

@clrcrl clrcrl self-requested a review February 10, 2020 22:47
@clrcrl
Copy link
Contributor

clrcrl commented Feb 10, 2020

Hi, so sorry about the delay here, kicking off the tests now! I think they might actually fail as we have two concurrent test suits running, but I'll re-trigger the tests if they do fail

@clrcrl
Copy link
Contributor

clrcrl commented Feb 10, 2020


Database Error in model test_datediff (models/cross_db_utils/test_datediff.sql)
  002151 (22023): SQL compilation error: ['DECADE'] is not a valid date/time component for function DATEDIFF.
  compiled SQL at target/run/dbt_utils_integration_tests/cross_db_utils/test_datediff.sql

The tests almost passed!

I'd be happy to not have a decade implementation here, I feel like it would rarely get used. I suspect 'century', 'millennium' will also fail

I'm going to take a look at the code now as well :)

Copy link
Contributor

@clrcrl clrcrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Snowflake doesn't support decade and I suspect century and millenium, let's remove them from this implementation and the tests. Other than that, it's LGTM!

{% elif datepart == 'day' %}
(({{second_date}})::date - ({{first_date}})::date)
{% elif datepart == 'week' %}
({{ dbt_utils.datediff(first_date, second_date, 'day') }} / 7 + case when date_part('dow', ({{second_date}})::timestamp) < date_part('dow', ({{first_date}})::timestamp) then 1 else 0 end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've spent far longer than I'd like to admit deciphering this line (and am still left scratching my head). Fortunately, I've tested it pretty thoroughly and I'm convinced it works 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so I found a bit of weirdness. If you do a datediff between:

  • First date: Tuesday 4th Feb 2020
  • Second date: Monday 3rd Feb 2020 (yes a prior day)
    Then the result is 1, whereas I think the result should be 0.

This is certainly a strange edge case, so I'm not super concerned here. But wanted to flag it just in case. I'm happy to merge it as is though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, you're right. Thanks for catching that. Turns out postgres rounds toward 0 in integer division instead of always down (but casting from double to integer rounds down; I expected the opposite behavior).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm so glad you know what's happening here ha. I truly had no idea!

@philippe-lavoie
Copy link

This is my version of the fix

{% macro postgres__datediff_year(first_date, second_date) %}
    DATE_PART('year', ({{second_date}})::timestamp) - DATE_PART('year', ({{first_date}})::timestamp) 
{% endmacro %}

{% macro postgres__datediff_month(first_date, second_date) %}
    {{ postgres__datediff_year(first_date, second_date) }} * 12 + (DATE_PART('month', ({{second_date}})::timestamp) - DATE_PART('month', ({{first_date}})::timestamp))
{% endmacro %}

{% macro postgres__datediff_day(first_date, second_date) %}
    DATE_PART('day', ({{second_date}})::timestamp - ({{first_date}})::timestamp) 
{% endmacro %}

{% macro postgres__datediff_week(first_date, second_date) %}
    TRUNC(DATE_PART('day', ({{second_date}})::timestamp - ({{first_date}})::timestamp)/7) 
{% endmacro %}

{% macro postgres__datediff_hour(first_date, second_date) %}
    {{ postgres__datediff_day(first_date, second_date) }} * 24 + DATE_PART('hour', ({{second_date}})::timestamp - ({{first_date}})::timestamp )  
{% endmacro %}

{% macro postgres__datediff_minute(first_date, second_date) %}
    {{ postgres__datediff_hour(first_date, second_date) }} * 60 + DATE_PART('minute', ({{second_date}})::timestamp - ({{first_date}})::timestamp ) 
{% endmacro %}

{% macro postgres__datediff_second(first_date, second_date) %}
    {{ postgres__datediff_minute(first_date, second_date) }} * 60 + DATE_PART('minute', ({{second_date}})::timestamp - ({{first_date}})::timestamp ) 
{% endmacro %}

{% macro postgres__datediff(first_date, second_date, datepart) %}

    {% if datepart == 'second' %}
        {{  postgres__datediff_second(first_date, second_date)}}
    {% elif datepart == 'minute' %}
        {{  postgres__datediff_minute(first_date, second_date)}}
    {% elif datepart == 'hour' %}
        {{  postgres__datediff_hour(first_date, second_date)}}
    {% elif datepart == 'week' %}
        {{  postgres__datediff_week(first_date, second_date)}}
    {% elif datepart == 'day' %}
        {{  postgres__datediff_day(first_date, second_date)}}
    {% elif datepart == 'month' %}
        {{  postgres__datediff_month(first_date, second_date)}}
    {% elif datepart == 'year' %}
        {{  postgres__datediff_year(first_date, second_date)}}
    {% else %}
        {{ exceptions.raise_compiler_error("macro datediff for PostgreSQL does not support '" + datepart +"'") }}
    {% endif %}

{% endmacro %}

Since postgres rounds integer division toward 0, we should treat positive and negative date differences distinctly
@mjumbewu mjumbewu requested a review from clrcrl February 24, 2020 18:55
Week calculation tests are more about correctness than parsing, so belong in the seed data.
@clrcrl
Copy link
Contributor

clrcrl commented Feb 24, 2020

I've kicked off the tests but won't have time to review this properly until later this week 😅

first_date and second_date need to be compared in sql instead of jinja, as their values are going to come from the sql data.
@mjumbewu
Copy link
Contributor Author

mjumbewu commented Feb 24, 2020

@clrcrl No worries. The tests picked up something I missed anyway. Thanks for kicking those off!

@clrcrl
Copy link
Contributor

clrcrl commented Feb 25, 2020

:shipit:

@clrcrl clrcrl merged commit fcb2de5 into dbt-labs:master Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants