[Feature Request]: Update the default behaviour in BigQueryIO.Write.Method.DEFAULT to STORAGE_API_AT_LEAST_ONCE #31827

@borjavb

What would you like to happen?

The default behaviour of BigQueryIO.Write.Method for unbounded collections is to use STREAMING_INSERTS, which is now categorised as legacy.

Two newer methods, STORAGE_API_AT_LEAST_ONCE and STORAGE_WRITE_API, are available. Of the two, STORAGE_API_AT_LEAST_ONCE is the closest in underlying semantics to STREAMING_INSERTS (best-effort deduplication, but no exactly-once guarantee). Using the Storage API is also 50% cheaper than the legacy streaming inserts, with the first 2 TB free.

Should the default method point to STORAGE_API_AT_LEAST_ONCE instead of continuing to use STREAMING_INSERTS?
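
To make the request concrete, here is a minimal sketch of the resolution rule being discussed. It is a standalone illustration and not Beam's actual implementation: the class `WriteMethodDefaults` and the helpers `resolveCurrent` and `resolveProposed` are hypothetical names, and the enum only mirrors the constant names of `BigQueryIO.Write.Method`. It assumes the documented behaviour that DEFAULT resolves to FILE_LOADS for bounded input and STREAMING_INSERTS for unbounded input.

```java
// Illustrative sketch only: NOT Beam's implementation. All names here are hypothetical.
final class WriteMethodDefaults {

    // Mirrors the constant names of BigQueryIO.Write.Method for readability.
    enum Method {
        DEFAULT, FILE_LOADS, STREAMING_INSERTS, STORAGE_WRITE_API, STORAGE_API_AT_LEAST_ONCE
    }

    // Current documented rule: DEFAULT means FILE_LOADS for bounded input
    // and STREAMING_INSERTS for unbounded input.
    static Method resolveCurrent(Method requested, boolean unbounded) {
        if (requested != Method.DEFAULT) {
            return requested; // an explicit choice is always respected
        }
        return unbounded ? Method.STREAMING_INSERTS : Method.FILE_LOADS;
    }

    // Proposed rule from this issue: only the unbounded default changes.
    static Method resolveProposed(Method requested, boolean unbounded) {
        if (requested != Method.DEFAULT) {
            return requested; // an explicit choice is always respected
        }
        return unbounded ? Method.STORAGE_API_AT_LEAST_ONCE : Method.FILE_LOADS;
    }

    public static void main(String[] args) {
        // Print what DEFAULT resolves to for an unbounded collection under each rule.
        System.out.println("current:  " + resolveCurrent(Method.DEFAULT, true));
        System.out.println("proposed: " + resolveProposed(Method.DEFAULT, true));
    }
}
```

Under this sketch, pipelines that set a method explicitly are unaffected; only pipelines relying on DEFAULT with an unbounded collection would change. Until the default changes, a pipeline can already opt in with `.withMethod(BigQueryIO.Write.Method.STORAGE_API_AT_LEAST_ONCE)` on the write transform.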

Issue Priority

Priority: 3 (nice-to-have improvement)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
