Commit 3b57817

feat: allow users to set Apache Avro output format options through avro_serialization_options param in TableReadOptions message (#284)
* feat: allow users to set Apache Avro output format options through avro_serialization_options param in TableReadOptions message

  Through AvroSerializationOptions, users can set enable_display_name_attribute, which populates displayName for every avro field with the original column name
  Improved documentation for selected_fields, added example for clarity.

  PiperOrigin-RevId: 468290142
  Source-Link: googleapis/googleapis@62ae1af
  Source-Link: googleapis/googleapis-gen@732b7f9
  Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNzMyYjdmOTIyNDc3ZDI1MzI4YjkyMzU5ZjA2NjdmZTk1ZGU1MmZhMiJ9

* 🦉 Updates from OwlBot post-processor

  See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
1 parent 7bd13e7 commit 3b57817

5 files changed

+402
-8
lines changed

packages/google-cloud-bigquery-storage/protos/google/cloud/bigquery/storage/v1/avro.proto

+15
@@ -39,3 +39,18 @@ message AvroRows {
   // Please use the format-independent ReadRowsResponse.row_count instead.
   int64 row_count = 2 [deprecated = true];
 }
+
+// Contains options specific to Avro Serialization.
+message AvroSerializationOptions {
+  // Enable displayName attribute in Avro schema.
+  //
+  // The Avro specification requires field names to be alphanumeric. By
+  // default, in cases when column names do not conform to these requirements
+  // (e.g. non-ascii unicode codepoints) and Avro is requested as an output
+  // format, the CreateReadSession call will fail.
+  //
+  // Setting this field to true, populates avro field names with a placeholder
+  // value and populates a "displayName" attribute for every avro field with the
+  // original column name.
+  bool enable_display_name_attribute = 1;
+}
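
Review note: the comment above says that, with `enable_display_name_attribute` set, Avro field names become placeholders and the original column name moves to a `displayName` attribute. A reader of the session's Avro schema then has to map names back. The sketch below shows one way to do that in plain TypeScript. The schema literal and the placeholder names (`field_0`, `field_1`) are invented for illustration, since the commit does not say what placeholder values the service generates, and `originalColumnNames` is a hypothetical helper, not part of the client library.

```typescript
// Minimal shape of an Avro record schema; only the attributes used here.
interface AvroField {
  name: string;
  type: unknown;
  displayName?: string; // expected when enable_display_name_attribute is true
}

interface AvroRecordSchema {
  type: 'record';
  name: string;
  fields: AvroField[];
}

// Hypothetical schema JSON. The placeholder names are assumptions made for
// this example, not values documented by the commit.
const schemaJson = JSON.stringify({
  type: 'record',
  name: 'Root',
  fields: [
    {name: 'field_0', type: ['null', 'string'], displayName: 'prénom'},
    {name: 'field_1', type: ['null', 'long'], displayName: '年齢'},
  ],
});

// Build a lookup from Avro field name to original column name. Falls back to
// the Avro name when no displayName is present (option not enabled).
function originalColumnNames(schema: AvroRecordSchema): Map<string, string> {
  const names = new Map<string, string>();
  for (const field of schema.fields) {
    names.set(field.name, field.displayName ?? field.name);
  }
  return names;
}

const columnNames = originalColumnNames(
  JSON.parse(schemaJson) as AvroRecordSchema
);
```

Decoded rows would still be keyed by the placeholder names, so a lookup like `columnNames.get('field_0')` is what recovers the column name for display.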

packages/google-cloud-bigquery-storage/protos/google/cloud/bigquery/storage/v1/stream.proto

+50 -4
@@ -59,10 +59,53 @@ message ReadSession {
 
   // Options dictating how we read a table.
   message TableReadOptions {
-    // Names of the fields in the table that should be read. If empty, all
-    // fields will be read. If the specified field is a nested field, all
-    // the sub-fields in the field will be selected. The output field order is
-    // unrelated to the order of fields in selected_fields.
+    // Optional. The names of the fields in the table to be returned. If no
+    // field names are specified, then all fields in the table are returned.
+    //
+    // Nested fields -- the child elements of a STRUCT field -- can be selected
+    // individually using their fully-qualified names, and will be returned as
+    // record fields containing only the selected nested fields. If a STRUCT
+    // field is specified in the selected fields list, all of the child elements
+    // will be returned.
+    //
+    // As an example, consider a table with the following schema:
+    //
+    //   {
+    //       "name": "struct_field",
+    //       "type": "RECORD",
+    //       "mode": "NULLABLE",
+    //       "fields": [
+    //           {
+    //               "name": "string_field1",
+    //               "type": "STRING",
+    // .             "mode": "NULLABLE"
+    //           },
+    //           {
+    //               "name": "string_field2",
+    //               "type": "STRING",
+    //               "mode": "NULLABLE"
+    //           }
+    //       ]
+    //   }
+    //
+    // Specifying "struct_field" in the selected fields list will result in a
+    // read session schema with the following logical structure:
+    //
+    //   struct_field {
+    //       string_field1
+    //       string_field2
+    //   }
+    //
+    // Specifying "struct_field.string_field1" in the selected fields list will
+    // result in a read session schema with the following logical structure:
+    //
+    //   struct_field {
+    //       string_field1
+    //   }
+    //
+    // The order of the fields in the read session schema is derived from the
+    // table schema and does not correspond to the order in which the fields are
+    // specified in this list.
     repeated string selected_fields = 1;
 
     // SQL text filtering statement, similar to a WHERE clause in a query.
@@ -80,6 +123,9 @@ message ReadSession {
     oneof output_format_serialization_options {
       // Optional. Options specific to the Apache Arrow output format.
       ArrowSerializationOptions arrow_serialization_options = 3 [(google.api.field_behavior) = OPTIONAL];
+
+      // Optional. Options specific to the Apache Avro output format
+      AvroSerializationOptions avro_serialization_options = 4 [(google.api.field_behavior) = OPTIONAL];
     }
   }
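
Review note: the new `selected_fields` comment describes three rules: an empty list returns everything, a STRUCT name returns all of its children, and a fully-qualified nested name returns the parent with only the selected children, always in table-schema order. The sketch below models those rules on a plain schema object so they can be checked locally. It is an illustration of the documented behavior, not the service's implementation, and `TableField` and `projectFields` are hypothetical names.

```typescript
// Simplified table schema node, mirroring the JSON example in the comment.
interface TableField {
  name: string;
  type: string;
  mode?: string;
  fields?: TableField[];
}

// Return the fields that a selected_fields list would keep, preserving the
// order of the table schema rather than the order of `selected`.
function projectFields(
  fields: TableField[],
  selected: string[],
  prefix = ''
): TableField[] {
  if (selected.length === 0) {
    return fields; // no names specified: all fields are returned
  }
  const result: TableField[] = [];
  for (const field of fields) {
    const path = prefix + field.name;
    if (selected.some(s => s === path)) {
      result.push(field); // whole field, including every child element
    } else if (field.fields && selected.some(s => s.startsWith(path + '.'))) {
      // Parent kept as a record containing only the selected children.
      result.push({
        ...field,
        fields: projectFields(field.fields, selected, path + '.'),
      });
    }
  }
  return result;
}

// The schema from the proto comment.
const tableSchema: TableField[] = [
  {
    name: 'struct_field',
    type: 'RECORD',
    mode: 'NULLABLE',
    fields: [
      {name: 'string_field1', type: 'STRING', mode: 'NULLABLE'},
      {name: 'string_field2', type: 'STRING', mode: 'NULLABLE'},
    ],
  },
];

const whole = projectFields(tableSchema, ['struct_field']);
const partial = projectFields(tableSchema, ['struct_field.string_field1']);
const reordered = projectFields(tableSchema, [
  'struct_field.string_field2',
  'struct_field.string_field1',
]);
```

Here `whole` keeps both children, `partial` keeps only `string_field1`, and `reordered` still lists `string_field1` before `string_field2` because the order comes from the table schema.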

packages/google-cloud-bigquery-storage/protos/protos.d.ts

+97 -1
Generated file; diff not rendered.
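
Review note: since the generated typings are not rendered here, the sketch below uses local interfaces to show roughly what a caller would write to opt in to the new option. The camelCase property names assume the usual proto-to-JavaScript field name conversion and are not copied from `protos.d.ts`. `checkSerializationOneof` is a hypothetical local guard illustrating that `arrow_serialization_options` and `avro_serialization_options` share one `oneof`, so only one of them should be set.

```typescript
// Local stand-ins for the generated types (assumed shapes, camelCase names).
interface AvroSerializationOptions {
  enableDisplayNameAttribute?: boolean;
}

interface TableReadOptions {
  selectedFields?: string[];
  // Arrow options are left opaque; this commit does not touch them.
  arrowSerializationOptions?: Record<string, unknown>;
  avroSerializationOptions?: AvroSerializationOptions;
}

// Both fields belong to oneof output_format_serialization_options, so at
// most one may be present. This guard is a local convention for the sketch.
function checkSerializationOneof(options: TableReadOptions): void {
  if (
    options.arrowSerializationOptions !== undefined &&
    options.avroSerializationOptions !== undefined
  ) {
    throw new Error(
      'Set only one of arrowSerializationOptions or avroSerializationOptions'
    );
  }
}

// Read options requesting displayName attributes in the Avro schema.
const readOptions: TableReadOptions = {
  selectedFields: ['struct_field.string_field1'],
  avroSerializationOptions: {enableDisplayNameAttribute: true},
};

checkSerializationOneof(readOptions);
```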
