Commit 5464b1c

🐛 Normalization: Fix sync from HubSpot to MySQL fails with "Row size too large" on create table (#10485)
* Update mysql normalization to cast string as text. Bump docker version. Update basic-normalization.md docs.
* Update docs PR reference
* Update mysql normalization to cast string as for is_timestamp_with_time_zone type
1 parent f66fc19 commit 5464b1c

4 files changed: 8 additions, 2 deletions


airbyte-integrations/bases/base-normalization/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -28,5 +28,5 @@ WORKDIR /airbyte
 ENV AIRBYTE_ENTRYPOINT "/airbyte/entrypoint.sh"
 ENTRYPOINT ["/airbyte/entrypoint.sh"]

-LABEL io.airbyte.version=0.1.67
+LABEL io.airbyte.version=0.1.68
 LABEL io.airbyte.name=airbyte/normalization

airbyte-integrations/bases/base-normalization/normalization/transform_catalog/stream_processor.py

Lines changed: 5 additions & 0 deletions
@@ -516,6 +516,8 @@ def cast_property_type(self, property_name: str, column_name: str, jinja_column:
                 return f"parseDateTime64BestEffortOrNull(trim(BOTH '\"' from {replace_operation})) as {column_name}"
             # in all other cases
             sql_type = jinja_call("type_timestamp_with_timezone()")
+            if self.destination_type == DestinationType.MYSQL:
+                sql_type = f"{sql_type}(1024)"
             return f"cast({replace_operation} as {sql_type}) as {column_name}"
         elif is_date(definition):
             if self.destination_type.value == DestinationType.MYSQL.value:
@@ -538,6 +540,9 @@ def cast_property_type(self, property_name: str, column_name: str, jinja_column:
                 trimmed_column_name = f"trim(BOTH '\"' from {column_name})"
                 sql_type = f"'{sql_type}'"
                 return f"nullif(accurateCastOrNull({trimmed_column_name}, {sql_type}), 'null') as {column_name}"
+            elif self.destination_type == DestinationType.MYSQL:
+                # Cast to `text` datatype. See https://github.com/airbytehq/airbyte/issues/7994
+                sql_type = f"{sql_type}(1024)"
         else:
             print(f"WARN: Unknown type {definition['type']} for column {property_name} at {self.current_json_path()}")
             return column_name
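
To make the effect of the two MySQL branches above easier to see, here is a minimal standalone sketch of the string manipulation they perform. It is not the code from `stream_processor.py`: the function `cast_string_column` and the trimmed-down `DestinationType` enum are hypothetical, and the sketch passes a plain type name such as `char` where the real code passes a Jinja macro call that is rendered later by dbt.

```python
from enum import Enum


class DestinationType(Enum):
    # Hypothetical subset of destinations, for illustration only.
    MYSQL = "mysql"
    POSTGRES = "postgres"


def cast_string_column(destination_type: DestinationType, sql_type: str, column_name: str) -> str:
    """Build a cast expression, appending an explicit length on MySQL."""
    if destination_type == DestinationType.MYSQL:
        # Same suffix the diff appends to the rendered type name.
        sql_type = f"{sql_type}(1024)"
    return f"cast({column_name} as {sql_type}) as {column_name}"


if __name__ == "__main__":
    # Assumption: the MySQL string type renders to `char`.
    print(cast_string_column(DestinationType.MYSQL, "char", "name"))
    print(cast_string_column(DestinationType.POSTGRES, "text", "name"))
```

Only the MySQL branch changes the type; every other destination keeps whatever type name was passed in.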

airbyte-workers/src/main/java/io/airbyte/workers/normalization/NormalizationRunnerFactory.java

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 public class NormalizationRunnerFactory {

   public static final String BASE_NORMALIZATION_IMAGE_NAME = "airbyte/normalization";
-  public static final String NORMALIZATION_VERSION = "0.1.67";
+  public static final String NORMALIZATION_VERSION = "0.1.68";

   static final Map<String, ImmutablePair<String, DefaultNormalizationRunner.DestinationType>> NORMALIZATION_MAPPING =
       ImmutableMap.<String, ImmutablePair<String, DefaultNormalizationRunner.DestinationType>>builder()

docs/understanding-airbyte/basic-normalization.md

Lines changed: 1 addition & 0 deletions
@@ -350,6 +350,7 @@ Therefore, in order to "upgrade" to the desired normalization version, you need

 | Airbyte Version | Normalization Version | Date | Pull Request | Subject |
 |:----------------| :--- | :--- | :--- | :--- |
+| 0.35.32-alpha | 0.1.68 | 2022-02-20 | [\#10485](https://github.com/airbytehq/airbyte/pull/10485) | Fix row size too large for table with numerous `string` fields |
 | | 0.1.66 | 2022-02-04 | [\#9341](https://github.com/airbytehq/airbyte/pull/9341) | Fix normalization for bigquery datasetId and tables |
 | 0.35.13-alpha | 0.1.65 | 2021-01-28 | [\#9846](https://github.com/airbytehq/airbyte/pull/9846) | Tweak dbt multi-thread parameter down |
 | 0.35.12-alpha | 0.1.64 | 2021-01-28 | [\#9793](https://github.com/airbytehq/airbyte/pull/9793) | Support PEM format for ssh-tunnel keys |
