Skip to content

airbyte-api: add jackson model annotations to remove null values from responses #13370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
3 changes: 2 additions & 1 deletion airbyte-api/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ task generateApiServer(type: GenerateTask) {
configOptions = [
dateLibrary : "java8",
generatePom : "false",
interfaceOnly: "true"
interfaceOnly: "true",
additionalModelTypeAnnotations: "\[email protected](com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL)"
Copy link
Contributor

@cgardens cgardens Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @JsonInclude annotation contains below two values:
Include.NON_NULL: Indicates that only properties with not null values will be included in JSON.
Include.NON_EMPTY: Indicates that only properties that are not empty will be included in JSON.

How did you decide between NON_NULL and NON_EMPTY? @jdpgrailsdev any opinion on which would make more sense here? My mental model is that we want the following:

User:
  type: object
  properties:
    firstName:
      type: string
    lastName:
      type: string

I think we want this:

// object in memory => serialized object
{ firstName: "charles" } => { firstName: "charles" }
{ firstName: "charles", lastName: null } => { firstName: "charles", lastName: null }

I think the change you are making does this:

{ firstName: "charles" } => { firstName: "charles" }
{ firstName: "charles", lastName: null } => { firstName: "charles" }

The behavior before you change is this, right? Which we do not like.

{ firstName: "charles" } => { firstName: "charles", lastName: null }
{ firstName: "charles", lastName: null } => { firstName: "charles", lastName: null }

Basically, my sense is fields that are not set should not be serialized, but fields that are set to null values should be serialized, but my intuition could be off on this one. I can see an argument for the way you have done it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is possible with these annotations, but it seems like the ideal behavior would be that fields with nullable: true are included in the serialized JSON if they are explicitly set to something (either null or a real value), and they are not included if they are not set. But fields that do not have nullable: true should only ever be included in the serialized JSON if they are set to a non-null value, and they should not be included in the serialized JSON if they are set to null.

This behavior would always keep the openAPI spec consistent with the behavior of the API, while still allowing us to explicitly set fields to null when needed (if they are nullable)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

derp. agreed with what lake said. if we need it to be nullable then we should set it as such. okay. i think the implementation in this PR make sense.

Copy link
Contributor Author

@alafanechere alafanechere Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cgardens for the example! They are correct.

I think we want this:

// object in memory => serialized object
{ firstName: "charles" } => { firstName: "charles" }
{ firstName: "charles", lastName: null } => { firstName: "charles", lastName: null }

This is ideally what we want but I don't think it's achievable. When we create a model instance (from generated classes) the model attribute default to null in JAX-RS. It means that if we don't set the attribute on initialization the model instance will still have a null value for non-set attributes. So the default behavior will be:

// init params -> model instance in memory -> serialized json
{ firstName: "charles" } => { firstName: "Charles", lastName: "null" } =>  { firstName: "Charles", lastName: "null" } 
{ firstName: "charles", lastName: "null" } => { firstName: "Charles", lastName: "null" } =>  { firstName: "Charles", lastName: "null" } 

The change I made with the annotation leads to the following:

// init params -> model instance in memory -> serialized json
{ firstName: "charles" } => { firstName: "Charles", lastName: "null" } =>  { firstName: "Charles" } 
{ firstName: "charles", lastName: "null" } => { firstName: "Charles", lastName: "null" } =>  { firstName: "Charles"} 

Jackson cleans the object but without extra logic coming from the OpenAPI spec. It means that even if we set nullable = true on the lastName attribute we won't end up with a null value in the payload.
It answers @lmossman's question:

I'm not sure if it is possible with these annotations, but it seems like the ideal behavior would be that fields with nullable: true are included in the serialized JSON if they are explicitly set to something.

This is not possible with this implementation.

@cgardens @lmossman @jdpgrailsdev I would like to know if you are ok with this and want you to be aware that with this change the nullable = true won't be of any effect in the Open API spec. I am under the impression it is an acceptable compromise at the moment as we are not using this property but we might want to find a more flexible solution in the future. I don't know what this solution could be but it might be changing the generator from jaxrs-spec to another like spring 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, it sounds like we can't get the ideal behavior we want with the current libraries we are using then. In that case, especially since it seems we aren't currently using nullable fields, I think your current implementation is acceptable.

@alafanechere I think it would be helpful to add a comment for this line that summarizes what we discussed here, to help explain why we are setting it to NOT_NULL and what that means for nullable fields.

]
}
compileJava.dependsOn tasks.generateApiServer
Expand Down
11 changes: 0 additions & 11 deletions airbyte-api/src/main/openapi/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2323,7 +2323,6 @@ components:
type: string
CustomerioNotificationConfiguration:
type: object
nullable: true
NotificationType:
type: string
enum:
Expand Down Expand Up @@ -3141,7 +3140,6 @@ components:
sourceCatalogId:
type: string
format: uuid
nullable: true
WebBackendConnectionCreate:
type: object
required:
Expand Down Expand Up @@ -3222,7 +3220,6 @@ components:
sourceCatalogId:
type: string
format: uuid
nullable: true
WebBackendConnectionUpdate:
type: object
required:
Expand Down Expand Up @@ -3309,7 +3306,6 @@ components:
sourceCatalogId:
type: string
format: uuid
nullable: true
ConnectionSearch:
type: object
properties:
Expand Down Expand Up @@ -3387,7 +3383,6 @@ components:
ConnectionSchedule:
description: if null, then no schedule is set.
type: object
nullable: true
required:
- units
- timeUnit
Expand Down Expand Up @@ -3519,7 +3514,6 @@ components:
#- unnesting
OperatorDbt:
type: object
nullable: true
required:
- gitRepoUrl
properties:
Expand Down Expand Up @@ -3600,7 +3594,6 @@ components:
sourceDefinedCursor:
description: If the source defines the cursor field, then any other cursor field inputs will be ignored. If it does not, either the user_provided one is used, or the default one is used as a backup.
type: boolean
nullable: true
defaultCursorField:
description: Path to the field that will be used to determine if a record is new or modified since the last sync. If not provided by the source, the end user will have to specify the comparable themselves.
type: array
Expand All @@ -3615,7 +3608,6 @@ components:
type: string
namespace:
type: string
nullable: true
description: Optional Source-defined namespace. Airbyte streams from the same sources should have the same namespace. Currently only used by JDBC destinations to determine what schema to write to.
StreamJsonSchema:
type: object
Expand Down Expand Up @@ -3796,7 +3788,6 @@ components:
$ref: "#/components/schemas/AttemptFailureSummary"
AttemptStats:
type: object
nullable: true
properties:
recordsEmitted:
type: integer
Expand All @@ -3822,7 +3813,6 @@ components:
$ref: "#/components/schemas/AttemptStats"
AttemptFailureSummary:
type: object
nullable: true
required:
- failures
properties:
Expand Down Expand Up @@ -4036,7 +4026,6 @@ components:
ResourceRequirements:
description: optional resource requirements to run workers (blank for unbounded allocations)
type: object
nullable: true
properties:
cpu_request:
type: string
Expand Down