Skip to content

Google Sheets Destination flushes on every state message and flush when buffer gets too large #14751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 15, 2022

Conversation

evantahler
Copy link
Contributor

@evantahler evantahler commented Jul 15, 2022

Closes #14748

This PR flushes any pending RECORD writes to the Google Sheet whenever the destination sees a STATE message. This fixes a broken spec implementation and should lead to lower memory consumption as well.

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Jul 15, 2022
@evantahler evantahler temporarily deployed to more-secrets July 15, 2022 17:17 Inactive
@evantahler evantahler temporarily deployed to more-secrets July 15, 2022 17:21 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Jul 15, 2022

/test connector=connectors/destination-google_sheets

🕑 connectors/destination-google_sheets https://github.com/airbytehq/airbyte/actions/runs/2678366594
❌ connectors/destination-google_sheets https://github.com/airbytehq/airbyte/actions/runs/2678366594
🐛

Build Failed

Test summary info:

Could not find result summary

@evantahler
Copy link
Contributor Author

evantahler commented Jul 15, 2022

/test connector=connectors/destination-google-sheets

🕑 connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2678367697
❌ connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2678367697
🐛 https://gradle.com/s/64etrvp45rvyu

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 ERROR integration_tests/test_client.py - google.auth.exceptions.RefreshError:...
	 ERROR integration_tests/test_destination.py - google.auth.exceptions.RefreshE...
	 ERROR integration_tests/test_helpers.py - google.auth.exceptions.RefreshError...
	 ERROR integration_tests/test_spreadsheet.py - google.auth.exceptions.RefreshE...
	 ERROR integration_tests/test_writer.py - google.auth.exceptions.RefreshError:...
	 !!!!!!!!!!!!!!!!!!! Interrupted: 5 errors during collection !!!!!!!!!!!!!!!!!!!!
	 �[31m======================== �[33m3 warnings�[0m, �[31m�[1m5 errors�[0m�[31m in 2.21s�[0m�[31m =========================�[0m

@evantahler
Copy link
Contributor Author

Note for the future - the directory structure doesn't match the gradle names #14751 (comment)

@evantahler
Copy link
Contributor Author

The test failures are realy about permissions:

E   google.auth.exceptions.RefreshError: ('invalid_grant: Bad Request', ***'error': 'invalid_grant', 'error_description': 'Bad Request'***)

@sherifnada or @bazarnov: any tips to running these tests?

@evantahler evantahler temporarily deployed to more-secrets July 15, 2022 18:43 Inactive
@bazarnov
Copy link
Collaborator

Note for the future - the directory structure doesn't match the gradle names #14751 (comment)

What exactly should match?

@bazarnov bazarnov temporarily deployed to more-secrets July 15, 2022 19:23 Inactive
@bazarnov
Copy link
Collaborator

bazarnov commented Jul 15, 2022

@sherifnada or @bazarnov: any tips to running these tests?

Looks like it didn't recognise the secret, all tests are passed locally just now see the screenshot:

Screenshot 2022-07-15 at 22 26 18

@bazarnov
Copy link
Collaborator

bazarnov commented Jul 15, 2022

/test connector=connectors/destination-google-sheets

🕑 connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2678921580
✅ connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2678921580
Python tests coverage:

Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
destination_google_sheets/buffer.py           27      0   100%
destination_google_sheets/__init__.py          2      0   100%
destination_google_sheets/writer.py           41      1    98%
destination_google_sheets/client.py           25      1    96%
destination_google_sheets/spreadsheet.py      46      2    96%
destination_google_sheets/helpers.py          47      4    91%
destination_google_sheets/destination.py      47      9    81%
--------------------------------------------------------------
TOTAL                                        235     17    93%

Build Passed

Test summary info:

All Passed

@evantahler
Copy link
Contributor Author

Thanks for fixing the test @bazarnov, but I don't quite understand your logical change

@bazarnov bazarnov temporarily deployed to more-secrets July 15, 2022 19:54 Inactive
@bazarnov
Copy link
Collaborator

Thanks for fixing the test @bazarnov, but I don't quite understand your logical change

There was no change), I've simply updated the GSM secret to the one I've got locally, probably I forgot to update it after the successful /publish the initial version. (There was another link for the google sheet I believe, which was not accessible for anyone)

@evantahler evantahler marked this pull request as ready for review July 15, 2022 20:03
@bazarnov bazarnov temporarily deployed to more-secrets July 15, 2022 20:32 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Jul 15, 2022

/test connector=connectors/destination-google-sheets

🕑 connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2679211386
✅ connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2679211386
Python tests coverage:

Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
destination_google_sheets/buffer.py           27      0   100%
destination_google_sheets/__init__.py          2      0   100%
destination_google_sheets/writer.py           41      1    98%
destination_google_sheets/client.py           25      1    96%
destination_google_sheets/spreadsheet.py      46      2    96%
destination_google_sheets/helpers.py          47      4    91%
destination_google_sheets/destination.py      47      9    81%
--------------------------------------------------------------
TOTAL                                        235     17    93%

Build Passed

Test summary info:

All Passed

@bazarnov bazarnov temporarily deployed to more-secrets July 15, 2022 21:03 Inactive
@bazarnov
Copy link
Collaborator

bazarnov commented Jul 15, 2022

@sherifnada @evantahler
I've added some simple check for the size of the records_buffer(target_stream) in addition to flush_interval check, before we write records to the destination. Now, when the size of the buffer for particular stream being processed is more than flush_interval_size_in_kb we will flush the records to the destination and clean the buffer. Please adjust the value of flush_interval_size_in_kb to reflect your best practices to handle OOM issues once needed.

@bazarnov
Copy link
Collaborator

bazarnov commented Jul 15, 2022

/test connector=connectors/destination-google-sheets

🕑 connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2679364329
✅ connectors/destination-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2679364329
Python tests coverage:

Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
destination_google_sheets/buffer.py           28      0   100%
destination_google_sheets/__init__.py          2      0   100%
destination_google_sheets/writer.py           42      1    98%
destination_google_sheets/client.py           25      1    96%
destination_google_sheets/spreadsheet.py      46      2    96%
destination_google_sheets/helpers.py          47      4    91%
destination_google_sheets/destination.py      47      9    81%
--------------------------------------------------------------
TOTAL                                        237     17    93%

Build Passed

Test summary info:

All Passed

@evantahler
Copy link
Contributor Author

That's great @bazarnov! If you review this PR, I'll get it published.

@evantahler evantahler changed the title Google Sheets Destination flushes on every state message Google Sheets Destination flushes on every state message and flush when buffer gets too large Jul 15, 2022
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks to you both!

…ination_google_sheets/writer.py

Co-authored-by: Sherif A. Nada <[email protected]>
@evantahler evantahler temporarily deployed to more-secrets July 15, 2022 21:48 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Jul 15, 2022

/publish connector=connectors/destination-google-sheets

🕑 Publishing the following connectors:
connectors/destination-google-sheets
https://github.com/airbytehq/airbyte/actions/runs/2679610819


Connector Did it publish? Were definitions generated?
connectors/destination-google-sheets

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets July 15, 2022 22:25 Inactive
@evantahler evantahler merged commit b0f559d into master Jul 15, 2022
@evantahler evantahler deleted the evan/google-sheet-destination-flush-better branch July 15, 2022 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/destination/google-sheets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Google Sheets Destination does not emit state when records are persisted.
4 participants