Skip to content

Source Google Ads: make streams incremental #10315

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 16, 2022

Conversation

augan-rymkhan
Copy link
Contributor

@augan-rymkhan augan-rymkhan commented Feb 14, 2022

What

Resolves #103
In the logs provided here I see that when reading ad_group_ads stream page token has expired, because currently filtering by segments.date is not implemented for this stream.

How

Make streams: ad_group_ads, ad_groups, accounts, campaigns support incremental.

One thing to note, when filtering by segments.date records returned from stream ad_group_ad might be duplicated, this is discussed in the resources below:
link1
link2

Recommended reading order

  1. source_google_ads/streams.py
  2. source_google_ads/source.py

@github-actions github-actions bot added the area/connectors Connector related issues label Feb 14, 2022
@augan-rymkhan augan-rymkhan changed the title make streams as incremental Source Google Ads: make streams incremental Feb 14, 2022
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 14, 2022 12:20 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 14, 2022 12:20 Inactive
@augan-rymkhan
Copy link
Contributor Author

augan-rymkhan commented Feb 14, 2022

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1840989553
❌ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1840989553
🐛 https://gradle.com/s/ez25y55z3bmcm
Python short test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestDiscovery::test_defined_cursors_exist_in_schema[inputs0]
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
============= 3 failed, 18 passed, 1 skipped in 760.61s (0:12:40) ==============

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets February 14, 2022 12:24 Inactive
@augan-rymkhan
Copy link
Contributor Author

augan-rymkhan commented Feb 14, 2022

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1841573498
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1841573498
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/tests/test_core.py              275    106    61%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/tests/test_incremental.py        69     38    45%
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
------------------------------------------------------------------------
TOTAL                                                  876    259    70%
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
source_google_ads/__init__.py                  2      0   100%
source_google_ads/custom_query_stream.py      75     50    33%
source_google_ads/google_ads.py               68     10    85%
source_google_ads/source.py                   73     24    67%
source_google_ads/streams.py                 129     25    81%
--------------------------------------------------------------
TOTAL                                        347    109    69%
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
source_google_ads/__init__.py                  2      0   100%
source_google_ads/custom_query_stream.py      75      6    92%
source_google_ads/google_ads.py               68      7    90%
source_google_ads/source.py                   73     19    74%
source_google_ads/streams.py                 129     11    91%
--------------------------------------------------------------
TOTAL                                        347     43    88%

Python short test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
================== 21 passed, 1 skipped in 1382.83s (0:23:02) ==================

@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 14, 2022 14:24 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 14, 2022 14:24 Inactive
@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets February 14, 2022 14:26 Inactive
@augan-rymkhan augan-rymkhan marked this pull request as ready for review February 14, 2022 16:15
@keu
Copy link
Contributor

keu commented Feb 15, 2022

@augan-rymkhan the links you posted don't explain why duplication will happen. They explain why this is not a duplicate, this is expected behavior (it is not filtering but grouping) and we see different records this means all records carry useful information for us.
Actually, this exposes another issue, GoogleAds implements incremental sync incorrectly and may lose data in case of failure because of a premature advancing of the cursor.

Unfortunately, the fact that this problem and similar in other connectors went unnoticed so far shows how badly we test this with SAT.

Also the PK definition is wrong in all streams (how destination is OK with this??? @sherifnada)
PK is date + id, not just id.
The streams show metrics for entities for each day, so it is MxN not N

@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 06:05 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 06:05 Inactive
@codecov
Copy link

codecov bot commented Feb 16, 2022

Codecov Report

❗ No coverage uploaded for pull request base (master@9360508). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 407a3cc differs from pull request most recent head 9bf1c0e. Consider uploading reports for the commit 9bf1c0e to get more accurate results

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #10315   +/-   ##
=========================================
  Coverage          ?   69.82%           
=========================================
  Files             ?        5           
  Lines             ?      348           
  Branches          ?        0           
=========================================
  Hits              ?      243           
  Misses            ?      105           
  Partials          ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9360508...9bf1c0e. Read the comment docs.

"""
Accounts stream: https://developers.google.com/google-ads/api/fields/v8/customer
"""

primary_key = "customer.id"
primary_key = ["customer.id", "segments.date"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all incremental streams must be updated

"source_defined_primary_key": [["ad_group_ad.ad.id"]]
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"source_defined_primary_key": [["ad_group_ad.ad.id"]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here as well

Copy link
Contributor

@keu keu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm after my comments addressed

@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 07:14 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 07:14 Inactive
@augan-rymkhan
Copy link
Contributor Author

augan-rymkhan commented Feb 16, 2022

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1851590321
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1851590321
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/tests/test_core.py              275    106    61%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/tests/test_incremental.py        69     38    45%
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
------------------------------------------------------------------------
TOTAL                                                  876    259    70%
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
source_google_ads/__init__.py                  2      0   100%
source_google_ads/custom_query_stream.py      75     50    33%
source_google_ads/google_ads.py               68     10    85%
source_google_ads/source.py                   73     24    67%
source_google_ads/streams.py                 130     24    82%
--------------------------------------------------------------
TOTAL                                        348    108    69%
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
source_google_ads/__init__.py                  2      0   100%
source_google_ads/custom_query_stream.py      75      6    92%
source_google_ads/google_ads.py               68      7    90%
source_google_ads/source.py                   73     19    74%
source_google_ads/streams.py                 130     11    92%
--------------------------------------------------------------
TOTAL                                        348     43    88%

Python short test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
================== 21 passed, 1 skipped in 1464.81s (0:24:24) ==================

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets February 16, 2022 08:05 Inactive
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Feb 16, 2022
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 14:22 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 14:22 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 14:41 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 14:41 Inactive
@augan-rymkhan
Copy link
Contributor Author

augan-rymkhan commented Feb 16, 2022

/publish connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1853423371
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1853423371

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets February 16, 2022 14:46 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 15:26 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets February 16, 2022 15:26 Inactive
@augan-rymkhan augan-rymkhan merged commit 741e672 into master Feb 16, 2022
@augan-rymkhan augan-rymkhan deleted the arymkhan/google-ads-enable-incremental-stream branch February 16, 2022 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants