-
Notifications
You must be signed in to change notification settings - Fork 424
Fix for #47: notebook content state func is called at the wrong time #48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for #47: notebook content state func is called at the wrong time #48
Conversation
Hey @sdebruyn the goal of the statefunc is to not bog down the statefile. Its goal was to store a CRC checksum rather than the whole base64 of the content https://www.terraform.io/docs/extend/schemas/schema-behaviors.html#statefunc. If we remove the statefunc it will bloat the statefile if there alot of notebooks with the base64 string. There will also be a size limit on how big the base64 that terraform state can store. This was a way to, rather than storage the text, it will storage a checksum instead. Let me see if I can recreate the issue using template files instead. I am working on integration tests for notebooks to cover #41 & #42 so before we merge this let me take a quick look at the issue. |
I am not able to recreate the behavior that you described from master branch. |
Hi @stikkireddy, I created a minimal repro at https://github.com/datarootsio/repro-terraform-databricks-template |
Codecov Report
@@ Coverage Diff @@
## master #48 +/- ##
==========================================
- Coverage 48.86% 48.85% -0.01%
==========================================
Files 44 44
Lines 5910 5911 +1
==========================================
Hits 2888 2888
- Misses 2971 2972 +1
Partials 51 51
|
hey @sdebruyn i have some time today to take a look at this sorry for getting back to you rather late on this topic |
@sdebruyn unfortunately when I run this: resource "databricks_notebook" "spark_setup" {
content = base64encode(templatefile("${path.module}/spark_setup.scala", { blob_host = "testhost" }))
language = "SCALA"
path = "/Shared/spark_setup.scala"
overwrite = false
mkdirs = true
format = "SOURCE"
} I get a proper plan:
I am also able to apply appropriately as well. The actual problem is the interpolation, when the statefunc is called during the plan phase for some reason and if there is an error it panics rather than return the "zero" value of the datatype. The easiest fix for this is to make the state function look like the following: "content": &schema.Schema{
Type: schema.TypeString,
Required: true,
ForceNew: true,
StateFunc: func(i interface{}) string {
base64String := i.(string)
base64, err := convertBase64ToCheckSum(base64String)
if err != nil {
return ""
}
return base64
},
}, Please let me know if the above resolves this issue in your fork. I would love to avoid having to store the base64 string in the statefile if that can be avoided by a crc checksum to determine state differences instead. This makes the state more scalable to larger workspaces and avoids the rpc error when terraform state calls are over 4 MB which notebooks base64 strings can potentially be over (Especially DBC files). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see if my comment is appropriate and sufficiently resolves your issue.
Before I make any changes, I want to make sure that we're seeing the same thing. I just pulled the latest master. I did
Then I did the following on the repro:
This is the full output of my commands:
Before we proceed, we should make sure that you can repro the crash 😯 |
I communicated very poorly in my last comment. I was able to recreate the issue as well. As stated in my previous content it only works without the panic if the content is not interpolated such as:
In your scenario you are interpolating it so the following scenario throws an error on my end as well:
The error is due to the panic and only works when interpolated fields are entered into the content which are not yet evaluated/computed. The best way to fix it without side effects seem to be just to return a empty string for the state function. Please let me know if this adds some clarity to my previous statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey this looks good to me @sdebruyn please just read over my comment and let me know if its good for you and I will merge this in. And thank you very much again for the contribution!
Hi @stikkireddy thank you for your clarification! LGTM :) |
The state func in the content argument for the notebook resource was called during the plan phase. Because I was using a templatefile with a dependency, the computed value was not available yet.
The state func got a guid instead of a base64 string so it fails to parse it (probably because the content is not available yet?)
The validation func should correctly verify if it's a valid base64 string and seems to handle the edge cases of when the value is not computed yet.
This fixes #47
Maybe could you let me know why the state func was there in the first place? Thanks!