feat: Make delegating BigQueryMetastore just a SparkCatalog (#520)
## Summary
- Stop using `DelegatingBigQueryMetastore` as the session catalog; register it as a plain custom catalog instead.
This changes the configuration set as follows.
From:
```bash
spark.sql.catalog.spark_catalog.warehouse: "gs://zipline-warehouse-etsy/data/tables/"
spark.sql.catalog.spark_catalog.gcp_location: "us"
spark.sql.catalog.spark_catalog.gcp_project: "etsy-zipline-dev"
spark.sql.catalog.spark_catalog.catalog-impl: org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog
spark.sql.catalog.spark_catalog: ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog
spark.sql.catalog.spark_catalog.io-impl: org.apache.iceberg.io.ResolvingFileIO
spark.sql.catalog.default_iceberg: ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog
spark.sql.catalog.default_iceberg.catalog-impl: org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog
spark.sql.catalog.default_iceberg.io-impl: org.apache.iceberg.io.ResolvingFileIO
spark.sql.catalog.default_iceberg.warehouse: "gs://zipline-warehouse-etsy/data/tables/"
spark.sql.catalog.default_iceberg.gcp_location: "us"
spark.sql.catalog.default_iceberg.gcp_project: "etsy-zipline-dev"
spark.sql.defaultUrlStreamHandlerFactory.enabled: "false"
spark.kryo.registrator: "ai.chronon.integrations.cloud_gcp.ChrononIcebergKryoRegistrator"
```
To:
```bash
spark.sql.defaultCatalog: "default_iceberg"
spark.sql.catalog.default_iceberg: "ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog"
spark.sql.catalog.default_iceberg.catalog-impl: "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog"
spark.sql.catalog.default_iceberg.io-impl: "org.apache.iceberg.io.ResolvingFileIO"
spark.sql.catalog.default_iceberg.warehouse: "gs://zipline-warehouse-etsy/data/tables/"
spark.sql.catalog.default_iceberg.gcp_location: "us"
spark.sql.catalog.default_iceberg.gcp_project: "etsy-zipline-dev"
spark.sql.defaultUrlStreamHandlerFactory.enabled: "false"
spark.kryo.registrator: "ai.chronon.integrations.cloud_gcp.ChrononIcebergKryoRegistrator"
spark.sql.catalog.default_bigquery: "ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog"
```
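The key difference in the new configuration is that no `spark.sql.catalog.spark_catalog.*` keys remain; the delegating catalog is only registered under the custom name `default_iceberg` (and `default_bigquery`), and `spark.sql.defaultCatalog` points at it. A small illustrative sketch (not project code; the helper name is hypothetical) of turning such a key/value set into `--conf` arguments for `spark-submit`:

```python
# Illustrative sketch: render the simplified catalog configuration from the
# PR as spark-submit `--conf key=value` pairs. Only a subset of the keys
# above is shown; the `to_submit_args` helper is hypothetical.
new_conf = {
    "spark.sql.defaultCatalog": "default_iceberg",
    "spark.sql.catalog.default_iceberg":
        "ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog",
    "spark.sql.catalog.default_iceberg.catalog-impl":
        "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog",
    "spark.sql.catalog.default_iceberg.io-impl":
        "org.apache.iceberg.io.ResolvingFileIO",
}

def to_submit_args(conf):
    """Flatten a config dict into a spark-submit argument list."""
    return [arg for k, v in sorted(conf.items()) for arg in ("--conf", f"{k}={v}")]

args = to_submit_args(new_conf)
```

Note that, unlike the old configuration, nothing here overrides the session catalog: no key starts with `spark.sql.catalog.spark_catalog`.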
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
- Improved internal table processing by restructuring class integrations
and enhancing error messaging when a table isn’t found.
- **Tests**
- Updated integration settings and adjusted reference parameters to
ensure validations remain aligned with the new catalog implementation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Thomas Chow <[email protected]>
> Review note: the name `spark_catalog` in `spark.sql.catalog.spark_catalog` refers to Spark's built-in session catalog. This is necessary for Spark to correctly choose which implementation to use.
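Spark treats the catalog name embedded in a `spark.sql.catalog.<name>` key specially when `<name>` is `spark_catalog`: that slot replaces the session catalog rather than adding a new one. A minimal sketch (illustrative only, not project code) of extracting catalog names from config keys and checking that the session catalog is untouched:

```python
# Illustrative sketch: pull catalog names out of `spark.sql.catalog.<name>`
# keys and confirm the reserved session catalog name is not overridden.
import re

SESSION_CATALOG = "spark_catalog"  # Spark's reserved session catalog name

def catalog_names(conf):
    """Return the set of catalog names declared via spark.sql.catalog.<name> keys."""
    names = set()
    for key in conf:
        m = re.fullmatch(r"spark\.sql\.catalog\.([^.]+)", key)
        if m:
            names.add(m.group(1))
    return names

conf = {
    "spark.sql.catalog.default_iceberg":
        "ai.chronon.integrations.cloud_gcp.DelegatingBigQueryMetastoreCatalog",
    "spark.sql.catalog.default_iceberg.io-impl":
        "org.apache.iceberg.io.ResolvingFileIO",
}
```

With the new configuration, `catalog_names(conf)` yields only custom catalogs, so Spark's default session catalog implementation is left in place.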