WideRowHandling: spanner to source db IT #2252
base: main
Sync master
Sync main branch
Sync main branch
tests: Adding Forward Migration Tests (GoogleCloudPlatform#2001)
Sync main branch
Use [self-hosted, it] for prepare java cache workflow (GoogleCloudPlatform#2080)
Sync main branch
Sync main branch
Sync main branch
Sync main branch
Metadata config and pipeline options (GoogleCloudPlatform#2081)
fix: eager source row fetching logic (GoogleCloudPlatform#2071)
sync upstream/main
Sync main with upstream
Sync main branch
Support avro arrays for postgres insertion. (GoogleCloudPlatform#2154)
Refresh `SpannerToSourceDbCustomTransformationIT` tables for re-runs…
* sync upstream/main (#98) * Enhanced Retry Logic (GoogleCloudPlatform#2196) Co-authored-by: pawankashyapollion <[email protected]> * Adding support for Cassandra map (GoogleCloudPlatform#2209) * Adding test table for Map types * Adding support for cassandra map * changes (GoogleCloudPlatform#2212) * Fix inf issues in Datastream reader (GoogleCloudPlatform#2213) * add inf fix to ts fixing bug on timestamp type * Update FormatDatastreamRecordToJson.java use equals not == * Upgrade Beam version to 2.63.0 (GoogleCloudPlatform#2206) * Upgrade Beam version to 2.63.0 * add cache factory to local spanner io * remove cache pass to ReadChangeStreamPartitionDoFn * Add unimplemented/unused stubs to TestChangeStreamMutation * Use java Instant instead of threeten Instant * Fix low watermark setter call * fix testchangestreammutation * Recreate validation PR * fix import order * linux-env requirements files * fix v1 pom * SkipShade for Spanner common module (GoogleCloudPlatform#2194) * Add load test for cross db txn (GoogleCloudPlatform#2199) * Add load test for cross db txn * Change test timeout to 2 days * Revert spanner-pr.yml changes * Report Lineage for CsvToBigQuery template (GoogleCloudPlatform#2205) * Report Lineage for CsvToBigQuery template * Reply beampr-32662 to CsvConverters * Spanner Import/Export INTERLEAVE IN (GoogleCloudPlatform#2128) * Changes to write read interleave type from information schema, and write/read the type to/from avro. No change to tests yet. * Fix warnings, bug in InformationSchemaScanner, and bug in AvroSchemaToDdlConverter. Also properly default to IN PARENT when emitting ddl, in case the interleave type is not set (really only necessary for tests, since otherwise it will always be set. * Set interleaveType in InfoSchemaScanner * Style fixes, and only generate INTERLEAVE IN ddl for gsql. 
* another style fix - remove unused import * Make conditions more readable and add comments * Tests * Fix condition * Add interleave in table to ExportPipelineIT * Add SCRAM-SHA-512 authentication support to Kafka templates (GoogleCloudPlatform#2181) Added SCRAM-SHA-512 authentication support to Kafka to Kafka, Kafka to GCS, and Kafka to BigQuery templates. * Update the required Java version in the base doc, then regenerate docs (GoogleCloudPlatform#2218) * Update java requirement * Generate docs * Post 2.63.0 fixes (GoogleCloudPlatform#2216) * Uncomment kinesis * bump protoc to 4 in v2 * bug-fix: Use jdbc connection properties for reverse migration (GoogleCloudPlatform#2198) * changes * Changes * changes * changes * docs * Support partitioned reads for DateTime column type in JDBC to BigQuery template (GoogleCloudPlatform#2084) * Support partitioned reads for DateTime column type * minor changes * Support backward compatibility and timezone in lower/upper bounds * removed test cases for bounds in unit tests as derby does not support timezone * Fixed typo's in JdbcToBigQuery.java Corrected testcases expected output * Corrected unit test cases expected output. Removed DateTime integration test cases. 
* Corrected a typo * Added default value for partitionColumnType * spotless apply * Disabling flaky test to unblock dataflow release (GoogleCloudPlatform#2220) * disabling flaky test to unblock dataflow release * minor change * Add logic to skip runnerV2 for the ITs (GoogleCloudPlatform#2219) * skip runnerv2 tests in TemplateTestBase * add logic to skip use_runner_v2 experiment in the launchTemplate * Remove redundant property skipRunnerV2Test in pom and update logic in launchTemplate * Fix format violations using mvn spotless:apply * Add warning about caching with plugin (GoogleCloudPlatform#2221) * [DatastreamToSpanner] Spanner Exception handling (GoogleCloudPlatform#2185) * SpannerExceptionClassifier class and IT * Unit tests and Integration tests * Formatting changes * Correcting UT * Unit test for SpannerResourceManager * Creating Spanner Migration Exception * Formatting changes * Correcting tests * Addressing comments * Correcting a UT * Addressing comments * Changing SpannerMigrationException to extend RuntimeException * Remove Python version from `pom.xml` (GoogleCloudPlatform#2234) * Update pom.xml * Update pom.xml * Add SkipRunnerV2Test category to JmsToPubsubIT and PubSubCdcToBigQueryIT (GoogleCloudPlatform#2235) * Fix a bug in CSVToBigQuery where commas in fields are not handled correctly. (GoogleCloudPlatform#2229) * Attempt to fix csv bug where commas in fields are not handled correctly. * Replace ImmutableList with Iterable * Add tests to cover the scenario of commas within quotes. * Polish tests and add a test case to cover csv without headers. 
* Update Dockerfile-template-yaml (GoogleCloudPlatform#2222) * Update Dockerfile-template-yaml * Update Dockerfile-template-yaml * Update Dockerfile-template-yaml * Update Dockerfile-template-yaml * Adding All Datatypes IT for Cassandra Migration (GoogleCloudPlatform#2230) * Add IF NOT EXISTS clause for spanner ddls used in ITs (GoogleCloudPlatform#2237) * Enable DatastreamToSpannerIT with if not exists clause * Update Datastream to Spanner IT spanner schemas with if not exists * Update BULK IT spanner schemas with if not exists * Update reverse replications ITs spanner schemas with if not exists * Add space after if not exists * Using set of random buckets for spanner ITs (GoogleCloudPlatform#2223) * Using set of random buckets for spanner ITs * reverse replication test * checkstyle fix * adding more buckets * removed ignore for testing * spotless fix * fixing UT * skip the flaky test again * addressing comments * spotless fix * Add promote artifact method in release plugin (GoogleCloudPlatform#2227) * Add promote artifact method * address comments; also fixed stagingArtifactRegistry support us.gcr.io * Consolidate method * fixed default DLQ path (GoogleCloudPlatform#2241) * Fix stagingArtifactRegistry support raw us.gcr.io artifact registry (GoogleCloudPlatform#2243) * Print error response on wget call (GoogleCloudPlatform#2245) * This helps debugging e.g. 
permission issues * Moving local spanner io to a different namespace (GoogleCloudPlatform#2231) * Removed LocalSpannerIO * Moving all of SpannerIO into Teleport * Copying tests also * Updated tests and excluded SpannerIO changestream from coverage checks * spotless apply * Excluding coverage check * Adding retry settings which were overwritten in LocalSpannerAccessor * Changing deprecated retrysettings function calls * Added warning and TODO comments to remove the local copy * spotless * Adding Cassandra Type Options to IT test (GoogleCloudPlatform#2242) * Bump timeouts for tests involving FKs/interleaved dependenceis (GoogleCloudPlatform#2239) * Bump timeouts for datastream to spanner test * Update FK timeout for reverse template * Bump timeouts for old reverse repl template * Enable SpannerToSourceDbInterleaveMultiShardIT * Revert loadtest timeout * Load Tests - Cassandra Reverse Replication (GoogleCloudPlatform#2163) * * Addition of Load Tests in SpannerToSourceDB For Cassandra (#89) * Addition of Load Tests in SpannerToSourceDB For Cassandra * Address Merge conflict * Added LT Refectored (#92) * Added POM Dependecies * sync upstream/main (#98) * RR LOAD TEST FIXES (#101) * Resolved PR comments (#115) * Added Module Dependency Fixes * Added Copyrigh * Added missing commit * Enhanced Retry Logic (GoogleCloudPlatform#2196) Co-authored-by: pawankashyapollion <[email protected]> * Adding support for Cassandra map (GoogleCloudPlatform#2209) * Adding test table for Map types * Adding support for cassandra map * changes (GoogleCloudPlatform#2212) * Fix inf issues in Datastream reader (GoogleCloudPlatform#2213) * add inf fix to ts fixing bug on timestamp type * Update FormatDatastreamRecordToJson.java use equals not == * Upgrade Beam version to 2.63.0 (GoogleCloudPlatform#2206) * Upgrade Beam version to 2.63.0 * add cache factory to local spanner io * remove cache pass to ReadChangeStreamPartitionDoFn * Add unimplemented/unused stubs to TestChangeStreamMutation * Use 
java Instant instead of threeten Instant * Fix low watermark setter call * fix testchangestreammutation * Recreate validation PR * fix import order * linux-env requirements files * fix v1 pom * SkipShade for Spanner common module (GoogleCloudPlatform#2194) * Add load test for cross db txn (GoogleCloudPlatform#2199) * Add load test for cross db txn * Change test timeout to 2 days * Revert spanner-pr.yml changes * Report Lineage for CsvToBigQuery template (GoogleCloudPlatform#2205) * Report Lineage for CsvToBigQuery template * Reply beampr-32662 to CsvConverters * Spanner Import/Export INTERLEAVE IN (GoogleCloudPlatform#2128) * Changes to write read interleave type from information schema, and write/read the type to/from avro. No change to tests yet. * Fix warnings, bug in InformationSchemaScanner, and bug in AvroSchemaToDdlConverter. Also properly default to IN PARENT when emitting ddl, in case the interleave type is not set (really only necessary for tests, since otherwise it will always be set. * Set interleaveType in InfoSchemaScanner * Style fixes, and only generate INTERLEAVE IN ddl for gsql. * another style fix - remove unused import * Make conditions more readable and add comments * Tests * Fix condition * Add interleave in table to ExportPipelineIT * Add SCRAM-SHA-512 authentication support to Kafka templates (GoogleCloudPlatform#2181) Added SCRAM-SHA-512 authentication support to Kafka to Kafka, Kafka to GCS, and Kafka to BigQuery templates. 
* Update the required Java version in the base doc, then regenerate docs (GoogleCloudPlatform#2218) * Update java requirement * Generate docs * Post 2.63.0 fixes (GoogleCloudPlatform#2216) * Uncomment kinesis * bump protoc to 4 in v2 * bug-fix: Use jdbc connection properties for reverse migration (GoogleCloudPlatform#2198) * changes * Changes * changes * changes * docs * Support partitioned reads for DateTime column type in JDBC to BigQuery template (GoogleCloudPlatform#2084) * Support partitioned reads for DateTime column type * minor changes * Support backward compatibility and timezone in lower/upper bounds * removed test cases for bounds in unit tests as derby does not support timezone * Fixed typo's in JdbcToBigQuery.java Corrected testcases expected output * Corrected unit test cases expected output. Removed DateTime integration test cases. * Corrected a typo * Added default value for partitionColumnType * spotless apply * Disabling flaky test to unblock dataflow release (GoogleCloudPlatform#2220) * disabling flaky test to unblock dataflow release * minor change * Add logic to skip runnerV2 for the ITs (GoogleCloudPlatform#2219) * skip runnerv2 tests in TemplateTestBase * add logic to skip use_runner_v2 experiment in the launchTemplate * Remove redundant property skipRunnerV2Test in pom and update logic in launchTemplate * Fix format violations using mvn spotless:apply * Add warning about caching with plugin (GoogleCloudPlatform#2221) * [DatastreamToSpanner] Spanner Exception handling (GoogleCloudPlatform#2185) * SpannerExceptionClassifier class and IT * Unit tests and Integration tests * Formatting changes * Correcting UT * Unit test for SpannerResourceManager * Creating Spanner Migration Exception * Formatting changes * Correcting tests * Addressing comments * Correcting a UT * Addressing comments * Changing SpannerMigrationException to extend RuntimeException * Added Cassandra Resource Manager Refectoring and removed Generics * Added Keyspace Voilation 
fixes * minor changes * Create session for row check --------- Co-authored-by: taherkl <[email protected]> Co-authored-by: Akash Thawait <[email protected]> Co-authored-by: pawankashyapollion <[email protected]> Co-authored-by: Vardhan Vinay Thigle <[email protected]> Co-authored-by: Astha Mohta <[email protected]> Co-authored-by: Dylan Hercher <[email protected]> Co-authored-by: Jack McCluskey <[email protected]> Co-authored-by: Yi Hu <[email protected]> Co-authored-by: Deep1998 <[email protected]> Co-authored-by: jjfox15 <[email protected]> Co-authored-by: vgnanasekaran <[email protected]> Co-authored-by: Danny McCormick <[email protected]> Co-authored-by: Sharan Teja M <[email protected]> Co-authored-by: shreyakhajanchi <[email protected]> Co-authored-by: Rudra-Gujarathi <[email protected]> Co-authored-by: Derrick Williams <[email protected]> Co-authored-by: darshan-sj <[email protected]> * Cassandra wide row it (#140) --------- Co-authored-by: taherkl <[email protected]> Co-authored-by: Taher Lakdawala <[email protected]> Co-authored-by: pawankashyapollion <[email protected]> Co-authored-by: Vardhan Vinay Thigle <[email protected]> Co-authored-by: Astha Mohta <[email protected]> Co-authored-by: Dylan Hercher <[email protected]> Co-authored-by: Jack McCluskey <[email protected]> Co-authored-by: Yi Hu <[email protected]> Co-authored-by: Deep1998 <[email protected]> Co-authored-by: jjfox15 <[email protected]> Co-authored-by: vgnanasekaran <[email protected]> Co-authored-by: Danny McCormick <[email protected]> Co-authored-by: Sharan Teja M <[email protected]> Co-authored-by: shreyakhajanchi <[email protected]> Co-authored-by: Rudra-Gujarathi <[email protected]> Co-authored-by: Derrick Williams <[email protected]> Co-authored-by: darshan-sj <[email protected]> Co-authored-by: Svetak Sundhar <[email protected]> Co-authored-by: Shunping Huang <[email protected]> Co-authored-by: Andrej Galad <[email protected]>
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2252      +/-   ##
============================================
+ Coverage     49.21%   49.25%   +0.03%
- Complexity     4607     4952     +345
============================================
  Files           918      924       +6
  Lines         55722    55777      +55
  Branches       5985     5992       +7
============================================
+ Hits          27425    27473      +48
- Misses        26320    26329       +9
+ Partials       1977     1975       -2
* Added allowed Packet Size * Added and removed unwanted Boundary Check * Added Fixes for Max in Size IT * Added Fixes for max cols * Added 10mb * Added Fixes * Added Fixes * Spotless fixes * Added Fixes * Added Fixes * removed unwanted
}
}

/** Writes a row with 1,024 columns in Spanner and verifies replication to Cassandra. */
Fix this comment and state the actual approximate size of the row here.
@asthamohta Updated the comments
This PR introduces validation checks that enforce naming and size constraints on table and column creation, so that generated schemas stay within the database's limits.
Changes Implemented
Table Name Length Constraint (1 to 128 characters)
Maximum Columns per Table (1,024)
Column Name Length Constraint (1 to 128 characters)
String Cell Size Constraint (Max 2,621,440 Unicode characters)
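
The four limits listed above could be checked along the following lines. This is only a minimal sketch, not the PR's actual implementation: the class `SchemaLimitsSketch` and its methods are hypothetical names introduced for illustration, name length is assumed to be measured with `String.length()`, and the string cell size is assumed to be counted in Unicode code points.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; none of these names come from the PR. */
final class SchemaLimitsSketch {
  // Limits as listed in the PR description.
  static final int MAX_NAME_LENGTH = 128;
  static final int MAX_COLUMNS_PER_TABLE = 1_024;
  static final int MAX_STRING_CELL_CHARS = 2_621_440;

  private SchemaLimitsSketch() {}

  /** Returns true when a table or column name has 1 to 128 characters. */
  static boolean isValidName(String name) {
    // Assumption: length is measured in UTF-16 code units via String.length().
    return name != null && !name.isEmpty() && name.length() <= MAX_NAME_LENGTH;
  }

  /** Returns true when a string cell holds at most 2,621,440 Unicode characters. */
  static boolean isValidStringCell(String value) {
    // Assumption: code points are counted, so a surrogate pair is one character.
    return value == null
        || value.codePointCount(0, value.length()) <= MAX_STRING_CELL_CHARS;
  }

  /** Collects every violation for a table definition instead of stopping at the first. */
  static List<String> validateTable(String tableName, List<String> columnNames) {
    List<String> violations = new ArrayList<>();
    if (!isValidName(tableName)) {
      violations.add("table name must be 1 to " + MAX_NAME_LENGTH + " characters");
    }
    if (columnNames.size() > MAX_COLUMNS_PER_TABLE) {
      violations.add("table must have at most " + MAX_COLUMNS_PER_TABLE + " columns");
    }
    for (String column : columnNames) {
      if (!isValidName(column)) {
        violations.add(
            "column name must be 1 to " + MAX_NAME_LENGTH + " characters: " + column);
      }
    }
    return violations;
  }

  public static void main(String[] args) {
    List<String> columns = new ArrayList<>();
    for (int i = 0; i <= MAX_COLUMNS_PER_TABLE; i++) {
      columns.add("col_" + i); // 1,025 columns, one over the limit
    }
    validateTable("WideRow", columns).forEach(System.out::println);
  }
}
```

Returning a list of violations rather than throwing on the first one is a design choice of this sketch; it lets a test report every broken limit for a table in a single pass.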