
[Fix] type issue in databricks_sql_table #4422

Open · wants to merge 7 commits into main
Conversation

@ian-norris-ncino ian-norris-ncino commented Jan 21, 2025

Changes

Updated the type check to compare only the first word of the type string.
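
For context, a minimal sketch of what that first-word comparison could look like in Go (the function names are illustrative assumptions, not the provider's actual identifiers):

```go
package main

import "strings"

// baseType takes only the first whitespace-separated word of a type string,
// e.g. "TIMESTAMP DEFAULT current_timestamp()" -> "TIMESTAMP".
func baseType(columnType string) string {
	fields := strings.Fields(columnType)
	if len(fields) == 0 {
		return ""
	}
	return fields[0]
}

// typesMatch compares the configured and API-returned types by their
// first word only, ignoring case.
func typesMatch(configured, actual string) bool {
	return strings.EqualFold(baseType(configured), baseType(actual))
}
```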

Resolves #4421

Tests

  • make test run locally
  • relevant change in docs/ folder
  • covered with integration tests in internal/acceptance
  • using Go SDK
  • using TF Plugin Framework

@ian-norris-ncino ian-norris-ncino requested review from a team as code owners January 21, 2025 19:34
@ian-norris-ncino ian-norris-ncino requested review from parthban-db and removed request for a team January 21, 2025 19:34
@alexott alexott changed the title Fix type issue in databricks_sql_table [Fix] type issue in databricks_sql_table Jan 22, 2025
@alexott
Contributor

alexott commented Jan 23, 2025

You also need to think about how to handle cases like a decimal type declared as DECIMAL(3, 2): it will be converted into "DECIMAL(3,", and this is not correct. So we'll need a somewhat more sophisticated approach.

@ian-norris-ncino
Author

ian-norris-ncino commented Jan 23, 2025

I used regex instead. I looked at this document and it seems all the types use parentheses to signify arguments.
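
A sketch of what that regex-based extraction might look like (the pattern here is my guess at the approach, not the PR's exact expression; complex types such as STRUCT<...> or multi-word INTERVAL types would still need special handling):

```go
package main

import "regexp"

// Keep the type name plus any parenthesized arguments, dropping trailing
// clauses such as DEFAULT or COMMENT:
//   "DECIMAL(3, 2) DEFAULT 1.0"             -> "DECIMAL(3, 2)"
//   "TIMESTAMP DEFAULT current_timestamp()" -> "TIMESTAMP"
var typeRe = regexp.MustCompile(`^\s*([a-zA-Z_]+(?:\s*\([^)]*\))?)`)

func baseType(columnType string) string {
	if m := typeRe.FindStringSubmatch(columnType); m != nil {
		return m[1]
	}
	return columnType
}
```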

@alexott
Contributor

alexott commented Jan 24, 2025

Integration tests have failed with the following error:

cannot create sql table: cannot execute CREATE TABLE `main`.`sheghbiihgiaj`.`fjdhiaajkejh` (`name` TIMESTAMP DEFAULT current_timestamp() COMMENT 'comment')
  | COMMENT 'this table is managed by terraform'
  | TBLPROPERTIES ('delta.minWriterVersion'='7', 'this'='that', 'delta.feature.columnMapping'='supported', 'delta.feature.invariants'='supported', 'something'='else', 'delta.columnMapping.mode'='name', 'delta.minReaderVersion'='3');: [WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value,
  | but the corresponding table feature was not enabled. Please retry the command again
  | after executing ALTER TABLE tableName SET
  | TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported'). SQLSTATE: 0AKDE

@ian-norris-ncino
Author

I updated the template generation to include the needed Delta flag.
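
Roughly, the generated test config's properties block would gain the flag the error message asks for (a sketch of the shape, not the actual template):

```hcl
properties = {
  "this"                              = "that"
  "something"                         = "else"
  "delta.feature.allowColumnDefaults" = "supported"
}
```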

@ian-norris-ncino
Author

@alexott Anything else I can do?

@alexott
Contributor

alexott commented Jan 28, 2025

It looks like there is a syntax error; integration tests are failing with:

Error: Missing key/value separator
  | 
  |   on terraform_plugin_test.tf line 23, in resource "databricks_sql_table" "this":
  |   20:     properties         = {
  |   21:         "this"                        = "that"
  |   22:         "something"                   = "else"
  |   23:         "delta.feature.allowColumnDefaults = "supported"
  | 
  | Expected an equals sign ("=") to mark the beginning of the attribute value.
  | If you intended to given an attribute name containing periods or spaces,
  | write the name in quotes to create a string literal.
  | 
  | Error: Invalid multi-line string
  | 
  |   on terraform_plugin_test.tf line 23, in resource "databricks_sql_table" "this":
  |   23:         "delta.feature.allowColumnDefaults = "supported"
  |   24:         "delta.feature.columnMapping" = "supported"
  | 
  | Quoted strings may not be split over multiple lines. To produce a multi-line
  | string, either use the \n escape to represent a newline character or use the
  | "heredoc" multi-line template syntax.
  | 
  | Error: Invalid multi-line string
  | 
  |   on terraform_plugin_test.tf line 24, in resource "databricks_sql_table" "this":
  |   24:         "delta.feature.columnMapping" = "supported"
  |   25:         "delta.feature.invariants"    = "supported"
  | 

@ian-norris-ncino
Author

🤦‍♂️ just updated.

Contributor

@alexott alexott left a comment

In general this looks good, but we need to think about how the change of the type may affect apply/read (I'm not sure it won't lead to configuration drift).

The integration test is passing, but we need to fix our build to make it green overall.

Comment on lines -546 to +551

```diff
- return caseInsensitiveColumnType
+ return normalizedColumnType
```
Contributor

I'm not sure about this change; we need to think about how changes like decimal(12, 2) -> decimal(12,2) will affect the plan/apply and subsequent read operations.
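
One plausible reading of what a normalizedColumnType helper could do, and why it raises the drift question (a sketch under that assumption; the PR's actual helper may differ): if the provider stores the normalized form in state while the user's config keeps the spaced form, both sides must be normalized before comparison, or every plan will show a cosmetic diff.

```go
package main

import "strings"

// Hypothetical normalization: lowercase and strip spaces so that
// "DECIMAL(12, 2)" and "decimal(12,2)" compare equal. Whatever form is
// written to state, the same normalization must be applied to both the
// configured and the API-returned type to avoid spurious drift.
func normalizeColumnType(t string) string {
	return strings.ToLower(strings.ReplaceAll(t, " ", ""))
}
```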

Author

Yeah, let me know what you think the best approach is, and I'm happy to implement it.

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

  • PR number: 4422
  • Commit SHA: 0ee9ac0e2adb646c86d5af70af0308ff0bac4d64

Checks will be approved automatically on success.

@alexott alexott temporarily deployed to test-trigger-is January 29, 2025 15:25 — with GitHub Actions Inactive
@ian-norris-ncino
Author

@alexott Any thoughts on how we can proceed?

@mgyucht
Contributor

mgyucht commented Feb 26, 2025

Hi @ian-norris-ncino, thank you for this contribution.

I've gone over this change carefully with @alexott. While I recognize it solves your immediate issue with being able to specify a timestamp column's default value, I am concerned about merging the change as it is.

We need to be able to accurately compare the type of a column as returned by the Tables API with what is specified in a user's config. I have two concerns about this:

  1. Using regular expressions to parse the type provided by users has certain downsides. For instance, there are many column attributes that can be expressed in the type: how can we use regexes to properly identify all of them?
  2. Currently, this PR actually ignores the other attributes specified by the user. This makes the diff checking incorrect: the provider only checks whether the base type matches the column's type, but it doesn't verify that the other annotations are set properly. For example, if the default expression were removed or changed to some other value, the provider would not detect that.

The current implementation is far from ideal, and I'll be the first to admit that, but I don't think this is the right direction to start addressing it. To better support this, I would propose the following pathway:

  1. Define a struct that captures a column's definition, including all attributes. It's OK if this struct doesn't include everything from the start, but it needs to be easily extensible for new attributes.
  2. Implement methods that deserialize the TF config and the Tables API response into this type.
  3. Define a struct that captures the difference between the existing and configured type.
  4. Define a method to compute this difference given the existing and configured type.
  5. Implement a method to convert a given difference into a set of SQL statements that alter the column to achieve the configured type.

I don't know how easy this will be, as I don't know whether there is a complete reference for type_json. https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-syntax-aux-describe-table is a starting point, but I think there are many fields that are not documented there (such as a metadata field which includes the default values).
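
As a rough illustration of steps 1, 3, and 4 in that pathway (every name and field below is an assumption for discussion, not an agreed design):

```go
package main

// ColumnDefinition is a sketch of the struct step 1 describes: a common
// shape that both the TF config and the Tables API response (step 2)
// would be parsed into.
type ColumnDefinition struct {
	Name        string
	BaseType    string // e.g. "decimal", "timestamp"
	Precision   *int   // set for decimal types
	Scale       *int   // set for decimal types
	Nullable    bool
	Comment     string
	DefaultExpr string // e.g. "current_timestamp()"
	// Extensible: generated columns, collation, masks, ...
}

// ColumnDiff (step 3) records how the configured definition differs from
// the existing one; step 5 would turn a non-empty diff into
// ALTER TABLE ... ALTER COLUMN statements.
type ColumnDiff struct {
	TypeChanged    bool
	DefaultChanged bool
	CommentChanged bool
}

// computeDiff is step 4: a field-by-field comparison. A real version
// would also compare precision, scale, and nullability.
func computeDiff(existing, configured ColumnDefinition) ColumnDiff {
	return ColumnDiff{
		TypeChanged:    existing.BaseType != configured.BaseType,
		DefaultChanged: existing.DefaultExpr != configured.DefaultExpr,
		CommentChanged: existing.Comment != configured.Comment,
	}
}
```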

@ian-norris-ncino
Author

Thanks @mgyucht and @alexott for thinking deeply about this. I've started getting deeper into an implementation, and it comes with many complexities, mainly around parsing the user-provided type string into a structure shared with the Get Table API response. I wonder if it makes sense to add additional keys to the resource for default values, precision, scale, and interval types so as to match the Get API? An example would then look like:

```hcl
column {
  name     = "updated_at"
  type     = "timestamp"
  default  = "current_timestamp()"
  comment  = ""
  nullable = false
}

column {
  name           = "value"
  type           = "decimal"
  type_precision = 10
  type_scale     = 0
  comment        = ""
  nullable       = false
}
```

There would be some considerations around parsing the type_json metadata, but this would make handling the user input much easier. What do you think?
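
If it helps the discussion, here's a minimal sketch of how such keys could be folded back into the type string the SQL layer needs (the field names mirror the example above and are part of the proposal, not the provider's current schema):

```go
package main

import "fmt"

// columnSpec mirrors the proposed HCL keys above.
type columnSpec struct {
	Type          string
	TypePrecision *int
	TypeScale     *int
}

// typeString folds the separate keys back into a single type string,
// e.g. {"decimal", 10, 0} -> "decimal(10,0)".
func (c columnSpec) typeString() string {
	if c.TypePrecision != nil && c.TypeScale != nil {
		return fmt.Sprintf("%s(%d,%d)", c.Type, *c.TypePrecision, *c.TypeScale)
	}
	return c.Type
}
```

The rendered string could then be compared against the type returned by the API, while the structured fields map more naturally onto the type_json metadata.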
