Skip to content

🎉 🎉 Source FB Marketing: performance and reliability fixes #9805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 73 commits into from
Feb 17, 2022
Merged
Show file tree
Hide file tree
Changes from 70 commits
Commits
Show all changes
73 commits
Select commit Hold shift + click to select a range
57b47cb
Facebook Marketing performance improvement
Dec 1, 2021
69b2958
add comments and little refactoring
Dec 3, 2021
440fc6e
fix after performance test
Dec 6, 2021
dff2421
Check job status in advance, some other changes.
Dec 10, 2021
4112279
Bump rc version
Dec 10, 2021
939d646
fix usage of AsyncJob methods, correct batch job retry
eugene-kulak Dec 22, 2021
0f8439d
bump dev version
eugene-kulak Dec 22, 2021
86ce7db
revert extra diff with master
eugene-kulak Dec 22, 2021
fa650b8
Merge remote-tracking branch 'origin/master' into drezchykov/fb-perf
eugene-kulak Dec 22, 2021
d8f57d6
fix bad merge
eugene-kulak Dec 22, 2021
5c7e18f
isort
eugene-kulak Dec 22, 2021
c9deade
fix flake
eugene-kulak Dec 22, 2021
07acea5
fix unit tests
eugene-kulak Dec 22, 2021
213ac36
fix integration tests with the new config
eugene-kulak Dec 23, 2021
93a03e1
improve job status handling, limit concurrency to 10
eugene-kulak Dec 28, 2021
47ad837
WIP
eugene-kulak Jan 5, 2022
483f5cd
fix campaign jobs, refactor manager
eugene-kulak Jan 13, 2022
4faabcb
big refactoring of async jobs, support random order of slices
eugene-kulak Jan 20, 2022
ea1ac91
update source _read_incremental to hook new state logic
eugene-kulak Jan 20, 2022
c7ca0e3
fix issues with timeout
eugene-kulak Jan 20, 2022
f66ce53
remove debugging and clean up, improve retry logic
eugene-kulak Jan 21, 2022
cc5baea
Merge remote-tracking branch 'origin/master' into drezchykov/fb-perf
eugene-kulak Jan 21, 2022
218db5c
merge changes from #8234
eugene-kulak Jan 21, 2022
2e56c22
fix call super _read_increment
eugene-kulak Jan 22, 2022
8d560d8
generalize batch execution, add use_batch flag
eugene-kulak Jan 22, 2022
3c0754d
fixing unittests
eugene-kulak Jan 23, 2022
76b529c
fix integration tests
eugene-kulak Jan 23, 2022
ca42a4e
fix tests for AsyncJob
eugene-kulak Jan 23, 2022
15e73f1
fix unit_tests and batch yield
eugene-kulak Jan 23, 2022
d09994f
fix SAT
eugene-kulak Jan 23, 2022
06a75ef
fix flake
eugene-kulak Jan 23, 2022
ae503c3
format
eugene-kulak Jan 23, 2022
8df71c9
refactor file layout and fix formatting
eugene-kulak Jan 23, 2022
01f8bd4
format
eugene-kulak Jan 23, 2022
3cbd8fc
format
eugene-kulak Jan 23, 2022
a969afa
improve coverage, do some refactoring of spec
eugene-kulak Jan 24, 2022
67841e0
small fix
eugene-kulak Jan 24, 2022
da19125
format
eugene-kulak Jan 26, 2022
3f038c7
Merge remote-tracking branch 'origin/master' into keu/source-fb-perfo…
eugene-kulak Jan 26, 2022
af9e988
WIP
eugene-kulak Jan 27, 2022
b728348
WIP
eugene-kulak Jan 28, 2022
780596f
update test, remove overrides of source
eugene-kulak Jan 31, 2022
9aa7307
format
eugene-kulak Jan 31, 2022
c2c6833
Merge remote-tracking branch 'origin/master' into keu/source-fb-perfo…
eugene-kulak Jan 31, 2022
2a4bf87
fix code smell
eugene-kulak Feb 3, 2022
58b21c2
address comments from @sherifnada
eugene-kulak Feb 3, 2022
918720d
address comments from @sherifnada
eugene-kulak Feb 3, 2022
8812763
fix after small refactoring
eugene-kulak Feb 4, 2022
16b7dc1
fix tests
eugene-kulak Feb 4, 2022
8782919
add split by AdSet
eugene-kulak Feb 4, 2022
d623abd
add smaller insights
eugene-kulak Feb 4, 2022
529f773
address comments from @sherifnada
eugene-kulak Feb 8, 2022
a192588
address comments from @sherifnada
eugene-kulak Feb 8, 2022
0f2bcbc
format
eugene-kulak Feb 8, 2022
cb5bc4b
fix batch loop
eugene-kulak Feb 9, 2022
780ee72
fix tests
eugene-kulak Feb 9, 2022
f6bcaba
cover last bit of async_job
eugene-kulak Feb 9, 2022
2f1453b
fix end_date < start_date case
eugene-kulak Feb 10, 2022
dea05a8
format
eugene-kulak Feb 10, 2022
d3c2f92
add account_id to PK
eugene-kulak Feb 10, 2022
52b000f
address comment from @sherifnada
eugene-kulak Feb 11, 2022
9544c1a
fix tests
eugene-kulak Feb 11, 2022
f88cf21
Merge remote-tracking branch 'origin/master' into keu/source-fb-perfo…
eugene-kulak Feb 12, 2022
195e8ac
fix streams after merge
eugene-kulak Feb 12, 2022
9edaf8f
fix tests
eugene-kulak Feb 12, 2022
1b5431f
add notes
eugene-kulak Feb 12, 2022
63ea958
fix new streams
eugene-kulak Feb 12, 2022
c4e8ce2
format
eugene-kulak Feb 12, 2022
2d04abd
fix reversed incremental stream
eugene-kulak Feb 12, 2022
5101e85
update spec.json for SAT
eugene-kulak Feb 12, 2022
3d92326
upgrade CDK and bump version
eugene-kulak Feb 16, 2022
2e119a8
bump version
eugene-kulak Feb 17, 2022
6965ef4
Merge remote-tracking branch 'origin/master' into keu/source-fb-perfo…
eugene-kulak Feb 17, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ From the Airbyte repository root, run:
**If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/facebook-marketing)
to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_facebook_marketing/spec.json` file.
Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
See `sample_files/sample_config.json` for a sample config file.
See `integration_tests/sample_config.json` for a sample config file.

**If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source facebook-marketing test creds`
and place them into `secrets/config.json`.
Expand Down Expand Up @@ -73,7 +73,7 @@ Then run any of the connector commands as follows:
docker run --rm airbyte/source-facebook-marketing:dev spec
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-facebook-marketing:dev check --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-facebook-marketing:dev discover --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files airbyte/source-facebook-marketing:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-facebook-marketing:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
```
## Testing
Make sure to familiarize yourself with [pytest test discovery](https://docs.pytest.org/en/latest/goodpractices.html#test-discovery) to know how your test files and methods should be named.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,23 +13,10 @@ tests:
- config_path: "secrets/config.json"
basic_read:
- config_path: "secrets/config.json"
configured_catalog_path: "integration_tests/configured_catalog.json"
timeout_seconds: 600
empty_streams: ["videos"]
incremental:
- config_path: "secrets/config.json"
configured_catalog_path: "integration_tests/configured_catalog_without_insights.json"
future_state_path: "integration_tests/future_state.json"
full_refresh:
- config_path: "secrets/config.json"
configured_catalog_path: "integration_tests/configured_catalog.json"
# Ad Insights API has estimated metrics in response, which is calculated based on another metrics.
# Sometimes API doesn't return estimated metrics. E.g, cost_per_estimated_ad_recallers is calculated
# as total amount spent divided by estimated ad recall lift rate. When second metric is equal to zero
# API may or may not return value. Such behavior causes sequential reads test failing.
# Because one read response contains this metric, and other doesn't.
# Therefore, it's needed to ignore fields like this in API responses.
ignored_fields:
"ads_insights_age_and_gender": ["cost_per_estimated_ad_recallers"]
"ad_creatives": ["thumbnail_url"]
"ad_account": ["age"]
"images": ["permalink_url"]
Original file line number Diff line number Diff line change
Expand Up @@ -64,61 +64,29 @@
"stream": {
"name": "ads_insights",
"json_schema": {},
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["date_start"],
"source_defined_primary_key": null,
"namespace": null
},
"sync_mode": "incremental",
"cursor_field": ["date_start"],
"destination_sync_mode": "append",
"primary_key": null
},
{
"stream": {
"name": "ads_insights_age_and_gender",
"json_schema": {},
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["date_start"],
"source_defined_primary_key": null,
"namespace": null
"source_defined_primary_key": []
},
"sync_mode": "incremental",
"primary_key": [],
"cursor_field": ["date_start"],
"destination_sync_mode": "append",
"primary_key": null
"destination_sync_mode": "append"
},
{
"stream": {
"name": "images",
"name": "ads_insights_platform_and_device",
"json_schema": {},
"default_cursor_field": ["date_start"],
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["updated_time"],
"source_defined_primary_key": [["id"]],
"namespace": null
"source_defined_primary_key": []
},
"sync_mode": "incremental",
"cursor_field": null,
"destination_sync_mode": "append",
"primary_key": null
},
{
"stream": {
"name": "ad_account",
"json_schema": {},
"supported_sync_modes": ["full_refresh"],
"source_defined_cursor": null,
"default_cursor_field": null,
"source_defined_primary_key": [["id"]],
"namespace": null
},
"sync_mode": "full_refresh",
"cursor_field": null,
"destination_sync_mode": "append",
"primary_key": null
"primary_key": [],
"cursor_field": ["date_start"],
"destination_sync_mode": "append"
}
]
}
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,7 @@
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["updated_time"],
"source_defined_primary_key": [["id"]],
"namespace": null
"source_defined_primary_key": [["id"]]
},
"sync_mode": "incremental",
"cursor_field": null,
Expand All @@ -22,13 +21,11 @@
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["updated_time"],
"source_defined_primary_key": [["id"]],
"namespace": null
"source_defined_primary_key": [["id"]]
},
"sync_mode": "incremental",
"cursor_field": null,
"destination_sync_mode": "append",
"primary_key": null
"destination_sync_mode": "append"
},
{
"stream": {
Expand All @@ -37,8 +34,7 @@
"supported_sync_modes": ["full_refresh", "incremental"],
"source_defined_cursor": true,
"default_cursor_field": ["updated_time"],
"source_defined_primary_key": [["id"]],
"namespace": null
"source_defined_primary_key": [["id"]]
},
"sync_mode": "incremental",
"cursor_field": null,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,4 +26,7 @@ def config_with_wrong_account_fixture(config):

@pytest.fixture(scope="session", name="config_with_include_deleted")
def config_with_include_deleted_fixture(config):
return {**config, "include_deleted": True}
new_config = {**config, "include_deleted": True}
new_config.pop("_limit", None)
new_config.pop("end_date", None)
return new_config
Loading