mysql-source:fix tinyint unsigned handling #18619

Merged
merged 7 commits into from
Nov 2, 2022

Conversation

subodh1810
Contributor

Issue : #17510

When we upgraded Debezium from 1.4.2 to 1.9, the default TinyIntOneToBooleanConverter provided by Debezium started converting TINYINT UNSIGNED values to booleans as well. The old version (1.4) converted only TINYINT(1) values.
We recognise TINYINT UNSIGNED as INTEGER in the catalog, so normalisation started failing when it tried to map the boolean values true/false to the INTEGER type.
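
To illustrate the mismatch described above, here is a minimal sketch of a type check between an emitted value and the type declared in the catalog. It is not the normalisation code itself; the function name `matches_declared_type` and the sample values are hypothetical and only show why a boolean arriving in an INTEGER field is a problem.

```python
def matches_declared_type(value, declared_type):
    """Return True if value is acceptable for the declared catalog type.

    Hypothetical helper for illustration only.
    """
    if declared_type == "integer":
        # bool is a subclass of int in Python, so it must be excluded explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if declared_type == "boolean":
        return isinstance(value, bool)
    raise ValueError(f"unsupported declared type: {declared_type}")


# Hypothetical values for a TINYINT UNSIGNED column declared as integer.
value_as_number = 200    # what the catalog expects
value_as_boolean = True  # what a boolean converter would emit

print(matches_declared_type(value_as_number, "integer"))
print(matches_declared_type(value_as_boolean, "integer"))
```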

We cannot identify a TINYINT UNSIGNED column as boolean because it lacks the length information: MySQL 8 doesn't show the length of the tinyint unsigned type.
For instance, take a look at the following table structure.
I used the following query to create the table: create table blah_3 (id int primary key, col_1 tinyint(1), col_2 tinyint(1) unsigned, col_3 tinyint(3), col_4 tinyint(3) unsigned);

If you look at the information schema details of the columns listed below and compare col_1, which is tinyint(1), with col_2, which is tinyint(1) unsigned, you will see that the COLUMN_TYPE of the two columns differs. One makes clear that it's a tinyint(1), while the other doesn't.

mysql> select * FROM INFORMATION_SCHEMA.COLUMNS where TABLE_SCHEMA = "models_schema" AND TABLE_NAME = "blah_3" \G;
*************************** 1. row ***************************
           TABLE_CATALOG: def
            TABLE_SCHEMA: models_schema
              TABLE_NAME: blah_3
             COLUMN_NAME: id
        ORDINAL_POSITION: 1
          COLUMN_DEFAULT: NULL
             IS_NULLABLE: NO
               DATA_TYPE: int
CHARACTER_MAXIMUM_LENGTH: NULL
  CHARACTER_OCTET_LENGTH: NULL
       NUMERIC_PRECISION: 10
           NUMERIC_SCALE: 0
      DATETIME_PRECISION: NULL
      CHARACTER_SET_NAME: NULL
          COLLATION_NAME: NULL
             COLUMN_TYPE: int
              COLUMN_KEY: PRI
                   EXTRA: 
              PRIVILEGES: select,insert,update,references
          COLUMN_COMMENT: 
   GENERATION_EXPRESSION: 
                  SRS_ID: NULL
*************************** 2. row ***************************
           TABLE_CATALOG: def
            TABLE_SCHEMA: models_schema
              TABLE_NAME: blah_3
             COLUMN_NAME: col_1
        ORDINAL_POSITION: 2
          COLUMN_DEFAULT: NULL
             IS_NULLABLE: YES
               DATA_TYPE: tinyint
CHARACTER_MAXIMUM_LENGTH: NULL
  CHARACTER_OCTET_LENGTH: NULL
       NUMERIC_PRECISION: 3
           NUMERIC_SCALE: 0
      DATETIME_PRECISION: NULL
      CHARACTER_SET_NAME: NULL
          COLLATION_NAME: NULL
             COLUMN_TYPE: tinyint(1)
              COLUMN_KEY: 
                   EXTRA: 
              PRIVILEGES: select,insert,update,references
          COLUMN_COMMENT: 
   GENERATION_EXPRESSION: 
                  SRS_ID: NULL
*************************** 3. row ***************************
           TABLE_CATALOG: def
            TABLE_SCHEMA: models_schema
              TABLE_NAME: blah_3
             COLUMN_NAME: col_2
        ORDINAL_POSITION: 3
          COLUMN_DEFAULT: NULL
             IS_NULLABLE: YES
               DATA_TYPE: tinyint
CHARACTER_MAXIMUM_LENGTH: NULL
  CHARACTER_OCTET_LENGTH: NULL
       NUMERIC_PRECISION: 3
           NUMERIC_SCALE: 0
      DATETIME_PRECISION: NULL
      CHARACTER_SET_NAME: NULL
          COLLATION_NAME: NULL
             COLUMN_TYPE: tinyint unsigned
              COLUMN_KEY: 
                   EXTRA: 
              PRIVILEGES: select,insert,update,references
          COLUMN_COMMENT: 
   GENERATION_EXPRESSION: 
                  SRS_ID: NULL
*************************** 4. row ***************************
           TABLE_CATALOG: def
            TABLE_SCHEMA: models_schema
              TABLE_NAME: blah_3
             COLUMN_NAME: col_3
        ORDINAL_POSITION: 4
          COLUMN_DEFAULT: NULL
             IS_NULLABLE: YES
               DATA_TYPE: tinyint
CHARACTER_MAXIMUM_LENGTH: NULL
  CHARACTER_OCTET_LENGTH: NULL
       NUMERIC_PRECISION: 3
           NUMERIC_SCALE: 0
      DATETIME_PRECISION: NULL
      CHARACTER_SET_NAME: NULL
          COLLATION_NAME: NULL
             COLUMN_TYPE: tinyint
              COLUMN_KEY: 
                   EXTRA: 
              PRIVILEGES: select,insert,update,references
          COLUMN_COMMENT: 
   GENERATION_EXPRESSION: 
                  SRS_ID: NULL
*************************** 5. row ***************************
           TABLE_CATALOG: def
            TABLE_SCHEMA: models_schema
              TABLE_NAME: blah_3
             COLUMN_NAME: col_4
        ORDINAL_POSITION: 5
          COLUMN_DEFAULT: NULL
             IS_NULLABLE: YES
               DATA_TYPE: tinyint
CHARACTER_MAXIMUM_LENGTH: NULL
  CHARACTER_OCTET_LENGTH: NULL
       NUMERIC_PRECISION: 3
           NUMERIC_SCALE: 0
      DATETIME_PRECISION: NULL
      CHARACTER_SET_NAME: NULL
          COLLATION_NAME: NULL
             COLUMN_TYPE: tinyint unsigned
              COLUMN_KEY: 
                   EXTRA: 
              PRIVILEGES: select,insert,update,references
          COLUMN_COMMENT: 
   GENERATION_EXPRESSION: 
                  SRS_ID: NULL
5 rows in set (0.10 sec)

mysql> desc blah_3;
+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| id    | int              | NO   | PRI | NULL    |       |
| col_1 | tinyint(1)       | YES  |     | NULL    |       |
| col_2 | tinyint unsigned | YES  |     | NULL    |       |
| col_3 | tinyint          | YES  |     | NULL    |       |
| col_4 | tinyint unsigned | YES  |     | NULL    |       |
+-------+------------------+------+-----+---------+-------+
5 rows in set (0.01 sec)
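
The output above can be summarised with a small sketch that parses the COLUMN_TYPE strings and extracts the display length when one is present. The helper `parse_tinyint` and its regular expression are assumptions made for illustration, not part of the connector or of Debezium.

```python
import re

# Matches "tinyint", "tinyint(N)", and either form followed by "unsigned".
_TINYINT = re.compile(r"^tinyint(?:\((\d+)\))?(\s+unsigned)?$", re.IGNORECASE)


def parse_tinyint(column_type):
    """Return (length, is_unsigned) for a tinyint COLUMN_TYPE, else None.

    length is None when MySQL does not report one.
    """
    match = _TINYINT.match(column_type.strip())
    if match is None:
        return None
    length = int(match.group(1)) if match.group(1) else None
    return length, match.group(2) is not None


# COLUMN_TYPE values copied from the information schema output above.
column_types = {
    "id": "int",
    "col_1": "tinyint(1)",
    "col_2": "tinyint unsigned",
    "col_3": "tinyint",
    "col_4": "tinyint unsigned",
}

for name, column_type in column_types.items():
    print(name, parse_tinyint(column_type))
```

Only col_1 yields a length of 1. col_2 and col_4 produce the same result, so the declared tinyint(1) unsigned cannot be told apart from tinyint(3) unsigned.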

This PR brings back the old behaviour and converts only TINYINT(1) columns to boolean. It also updates the documentation.
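
As a rough sketch of the restored behaviour, the snippet below converts a value to boolean only when its column type is reported as tinyint(1) and leaves every other column untouched. The function `convert_row`, the exact-match rule, and the sample row are hypothetical; the actual change lives in the connector's Java code and may differ in detail.

```python
def convert_row(row, column_types):
    """Convert values to bool only for columns reported as tinyint(1).

    Hypothetical illustration; NULL values (None) are passed through.
    """
    converted = {}
    for name, value in row.items():
        is_tinyint_one = column_types.get(name, "").lower() == "tinyint(1)"
        if is_tinyint_one and value is not None:
            converted[name] = bool(value)
        else:
            converted[name] = value
    return converted


# Column types as reported by MySQL 8 for the blah_3 table.
blah_3_types = {
    "id": "int",
    "col_1": "tinyint(1)",
    "col_2": "tinyint unsigned",
    "col_3": "tinyint",
    "col_4": "tinyint unsigned",
}

# Hypothetical sample row.
sample_row = {"id": 1, "col_1": 1, "col_2": 1, "col_3": 5, "col_4": 200}
print(convert_row(sample_row, blah_3_types))
```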

@subodh1810 subodh1810 requested a review from a team as a code owner October 28, 2022 18:23
@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Oct 28, 2022
@github-actions
Contributor

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to run corresponding integration tests:

  • source-mysql-strict-encrypt
  • source-mysql
  • source-postgres

@subodh1810
Contributor Author

subodh1810 commented Oct 28, 2022

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3347655569
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3347655569
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich
Contributor

Great find!

@danieldiamond
Contributor

@subodh1810 awesome work! send it 🚀

@subodh1810
Contributor Author

subodh1810 commented Oct 31, 2022

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3359566206
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3359566206
No Python unittests run

Build Passed

Test summary info:

All Passed

@github-actions
Contributor

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to run corresponding integration tests:

  • source-postgres
  • source-mysql-strict-encrypt
  • source-mysql

@github-actions
Contributor

github-actions bot commented Nov 2, 2022

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to run corresponding integration tests:

Sources (3)
  • source-mysql-strict-encrypt
  • source-postgres
  • source-mysql
Destinations (0)

@subodh1810
Contributor Author

subodh1810 commented Nov 2, 2022

/publish connector=connectors/source-mysql

🕑 Publishing the following connectors:
connectors/source-mysql
https://github.com/airbytehq/airbyte/actions/runs/3376510617


Connector Did it publish? Were definitions generated?
connectors/source-mysql

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@subodh1810
Contributor Author

subodh1810 commented Nov 2, 2022

/publish connector=connectors/source-mysql-strict-encrypt

🕑 Publishing the following connectors:
connectors/source-mysql-strict-encrypt
https://github.com/airbytehq/airbyte/actions/runs/3376510688


Connector Did it publish? Were definitions generated?
connectors/source-mysql-strict-encrypt

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets November 2, 2022 11:40 Inactive
@subodh1810 subodh1810 merged commit 0a37a8d into master Nov 2, 2022
@subodh1810 subodh1810 deleted the mysql-handle-tinyint-unsigned branch November 2, 2022 15:24
drewrasm pushed a commit to drewrasm/airbyte that referenced this pull request Nov 2, 2022
* mysql-source:fix tinyint unsigned handling

* update doc

* format

* upgrade version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>
natalyjazzviolin pushed a commit that referenced this pull request Nov 3, 2022
* mysql-source:fix tinyint unsigned handling

* update doc

* format

* upgrade version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>
arsenlosenko pushed a commit that referenced this pull request Nov 8, 2022
* solve conflicts

* solve conflict in json schema

* bump to version 0.1.8 for the changes of this pr

* change ad account id in the schemas

* query to include data plane attributes (#18531)

* query to include data plane attributes

* rename functions

* fix java build

* more renaming fix

* Fix unit tests in source relational db (#18789)

* Fix unit tests

* Add extra test case for record count > 1

* Store record count in variable

* ci: use custom test-reporter action to upload job results (#18004)

* ci: use custom action to upload job results

* Correct coinmarket spec (#18790)

* correct coinmarket spec

* remove duplicate support normalization from source spec

* rollback coinmarketcap version in source def seed

* update connector version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>

* Parameterize test_empty_streams and test_stream_with_1_airbyte_column by destination (#18197)

* Remove lines that always add Postgres to list of destinations
* Parameterize all tests in test_ephemeral by destination

* 🐛 Source Facebook Marketing: reduce request limit after specific error (#18734)

* 🐛 Source Facebook Marketing: reduce request limit after specific error

* 🐛 Source Facebook Marketing: bump version; update docs

* 🐛 Source Facebook Marketing: add test

* 🐛 Source Facebook Marketing: increase timeout

* [charts/airbyte-cron] Cleanup env vars (#18787)

* [charts/airbyte-cron] Cleanup env vars

* Remove unused env var

* Use equalsIgnoreCase (#18810)

* Bump helm chart version reference to 0.40.40 (#18815)

Co-authored-by: perangel <[email protected]>
Co-authored-by: Kyryl Skobylko <[email protected]>

* 🐛Destination Google Sheets: Fix empty headers list (#18729)

* Fix empty headers list

* Updated PR number

* Bumped version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>

* 🐛Source Exchange Rates: Fix handling error during check connection (#18726)

* Fix handling error during check connection

* Updated PR number

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>

* Add normalization changelog and bump normalization version in platform (#18813)

* Remove ConfigPersistence usage from SecretsMigrator (#18747)

* remove config persistence from seeding logic (#18749)

* Remove the bulk actions from ConfigPersistence (#18800)

* hide ConfigPersistence inside ConfigRepository to discourage use (#18803)

* ci: add job and run id to test reports (#18832)

* Bump Airbyte version from 0.40.17 to 0.40.18 (#18827)

Co-authored-by: grishick <[email protected]>

* 🪟🔧 Remove styled components (round 1) (#18766)

* refactor EditorHeader (untested)

* refactor BaseClearView

* delete unused Subtitle

* refactor ConfirmationModal

* refactor Arrow

* refactor BulkHeader

* refactor CatalogTreeSearch

* refactor StreamFieldTable

* refactor StreamHeader

* refactor ConnectorIcon

* refactor TreeRowWrapper

* refactor DeleteBlock

* refactor EmptyResourceBlock

* revert unintended element change

* fixed acceptance tests (#18699)

* 🪟🔧 Reactor Breadcrumbs component to use anchors (#18764)

* refactor breadcrumbs to use actual links

* PR comments on styles

* increase test timeout for some webapp tests to prevent flakes (#18807)

* Remove "Filters and Segments" from Google Analytics v4 (#18508)

Filters and Segments info was incorrectly added to the Google Analytics v4 connector instead of the Google Analytics (Universal Analytics) Connector.

* Add notes about EU OAUth (#18835)

EU OAuth is not fully tested so adding a note to account for that.

* 🪟🐛 Fix: visual regression in ConnectorIcon (#18849)

* fix visual regression

* remove unused prop

* Add links to demo page (#18828)

* mysql-source:fix tinyint unsigned handling (#18619)

* mysql-source:fix tinyint unsigned handling

* update doc

* format

* upgrade version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>

* 🪟 🎉 Allow environment specific sections in docs (#18829)

* Allow environment specific sections in docs

* Change syntax to lower case

* ci: replace GITHUB_OUTPUT with GITHUB_ENV on multiline variables (#18809)

* ci: replace GITHUB_OUTPUT with GITHUB_ENV on multiline variables

* ci: replace github set-ouput with new syntax in ./tools/bin/

* Add connection ID to span (#18852)

* edited connector docs (#18855)

* 🪟 🔧 Upgrade husky to 8.0.1 (#18719)

* Upgrade Husky

* Upgrade Husky

* Upgrade Husky

* Upgrade Husky

* Upgrade Husky

* ci: replace GITHUB_OUTPUT with GITHUB_ENV for multiline variables (#18857)

* Avoid NPE when adding connection ID to trace (#18856)

* Filter exit errors by operation name (#18850)

* add label

* auto-bump connector version

Co-authored-by: marcosmarxm <[email protected]>
Co-authored-by: Xiaohan Song <[email protected]>
Co-authored-by: Liren Tu <[email protected]>
Co-authored-by: Conor <[email protected]>
Co-authored-by: Marcos Marx <[email protected]>
Co-authored-by: Octavia Squidington III <[email protected]>
Co-authored-by: Greg Solovyev <[email protected]>
Co-authored-by: Artem Inzhyyants <[email protected]>
Co-authored-by: perangel <[email protected]>
Co-authored-by: Jonathan Pearlin <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: perangel <[email protected]>
Co-authored-by: Kyryl Skobylko <[email protected]>
Co-authored-by: Serhii Lazebnyi <[email protected]>
Co-authored-by: Charles <[email protected]>
Co-authored-by: Octavia Squidington III <[email protected]>
Co-authored-by: Joey Marshment-Howell <[email protected]>
Co-authored-by: darynaishchenko <[email protected]>
Co-authored-by: Michael Siega <[email protected]>
Co-authored-by: Tyler B <[email protected]>
Co-authored-by: Yowan Ramchoreeter <[email protected]>
Co-authored-by: Tim Roes <[email protected]>
Co-authored-by: Subodh Kant Chaturvedi <[email protected]>
Co-authored-by: Volodymyr Pochtar <[email protected]>
Co-authored-by: Amruta Ranade <[email protected]>
6 participants