+ * This needs to be set to ‘true’ explicitly and audio_channel_count > 1
+ * to get each channel recognized separately. The recognition result will
+ * contain a channel_tag field to state which channel that result belongs to.
+ * If this is not ‘true’, we will only recognize the first channel.
+ * NOTE: The request is also billed cumulatively for all channels recognized:
+ * (audio_channel_count times the audio length)
+ *
  * @property {string} languageCode
  * *Required* The language of the supplied audio as a
  * [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag.
  * Example: "en-US".
  * See [Language Support](https://cloud.google.com/speech/docs/languages)
  * for a list of the currently supported language codes.
  *
+ * @property {string[]} alternativeLanguageCodes
+ * *Optional* A list of up to 3 additional
+ * [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tags,
+ * listing possible alternative languages of the supplied audio.
+ * See [Language Support](https://cloud.google.com/speech/docs/languages)
+ * for a list of the currently supported language codes.
+ * If alternative languages are listed, recognition result will contain
+ * recognition in the most likely language detected including the main
+ * language_code. The recognition result will include the language tag
+ * of the language detected in the audio.
+ * NOTE: This feature is only supported for Voice Command and Voice Search
+ * use cases and performance may vary for other use cases (e.g., phone call
+ * transcription).
+ *
  * @property {number} maxAlternatives
  * *Optional* Maximum number of recognition hypotheses to be returned.
  * Specifically, the maximum number of `SpeechRecognitionAlternative` messages
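The constraints described in this hunk (a required `languageCode`, at most 3 `alternativeLanguageCodes`, and billing that scales with the channel count) can be sketched as plain-object checks. This is a minimal sketch, not part of the client library: `checkConfig` and `billedAudioSeconds` are hypothetical helpers, and the camelCase names `audioChannelCount` and `enableSeparateRecognitionPerChannel` are assumed spellings of the fields the comment calls audio_channel_count and the per-channel flag.

```javascript
// Sketch only: works on plain objects and makes no client library calls.
// `audioChannelCount` and `enableSeparateRecognitionPerChannel` are assumed
// field names; the hunk above only shows their descriptions.
const MAX_ALTERNATIVE_LANGUAGES = 3;

// Returns a list of human-readable problems; an empty list means none found.
function checkConfig(config) {
  const problems = [];
  if (!config.languageCode) {
    problems.push('languageCode is required');
  }
  const alternatives = config.alternativeLanguageCodes || [];
  if (alternatives.length > MAX_ALTERNATIVE_LANGUAGES) {
    problems.push('at most 3 alternativeLanguageCodes are allowed');
  }
  if (config.enableSeparateRecognitionPerChannel === true &&
      !(config.audioChannelCount > 1)) {
    problems.push('per-channel recognition needs audioChannelCount > 1');
  }
  return problems;
}

// Estimates the billed audio length from the NOTE in the comment:
// audio_channel_count times the audio length when every channel is recognized.
// Assumption: when only the first channel is recognized, one length is billed.
function billedAudioSeconds(config, audioSeconds) {
  const perChannel = config.enableSeparateRecognitionPerChannel === true &&
    config.audioChannelCount > 1;
  return perChannel ? config.audioChannelCount * audioSeconds : audioSeconds;
}
```

Under these assumptions, a 30-second stereo file with per-channel recognition enabled would count as 60 billed seconds, while the same file without the flag would count as 30.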
@@ -181,6 +214,11 @@ var StreamingRecognitionConfig = {
  * `false`, no word-level time offset information is returned. The default is
  * `false`.
  *
+ * @property {boolean} enableWordConfidence
+ * *Optional* If `true`, the top result includes a list of words and the
+ * confidence for those words. If `false`, no word-level confidence
+ * information is returned. The default is `false`.
+ *
  * @property {boolean} enableAutomaticPunctuation
  * *Optional* If 'true', adds punctuation to recognition result hypotheses.
  * This feature is only available in select languages. Setting this for
@@ -190,6 +228,21 @@ var StreamingRecognitionConfig = {
  * to all users. In the future this may be exclusively available as a
  * premium feature."
  *
+ * @property {boolean} enableSpeakerDiarization
+ * *Optional* If 'true', enables speaker detection for each recognized word in
+ * the top alternative of the recognition result using a speaker_tag provided
+ * in the WordInfo.
+ * Note: When this is true, we send all the words from the beginning of the
+ * audio for the top alternative in every consecutive response.
+ * This is done in order to improve our speaker tags as our models learn to
+ * identify the speakers in the conversation over time.
+ *
+ * @property {number} diarizationSpeakerCount
+ * *Optional*
+ * If set, specifies the estimated number of speakers in the conversation.
+ * If not set, defaults to '2'.
+ * Ignored unless enable_speaker_diarization is set to true.
+ *
  * @property {Object} metadata
  * *Optional* Metadata regarding this request.
  *
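Two consequences of the diarization comments may be easy to miss: the speaker count falls back to 2 only when diarization is enabled, and each response repeats all words from the start of the audio, so a consumer would normally keep the latest word list instead of concatenating them. A minimal sketch of both follows; `effectiveSpeakerCount` and `latestDiarizedWords` are hypothetical helpers, and the input to the second one (one array of word objects per response) is an assumed simplification of the real response shape.

```javascript
// Sketch only: plain-object helpers, not part of the client library.
const DEFAULT_SPEAKER_COUNT = 2;

// Resolves the speaker count as the comment describes it:
// ignored unless diarization is enabled, defaulting to 2 when unset.
function effectiveSpeakerCount(config) {
  if (config.enableSpeakerDiarization !== true) {
    return null; // diarizationSpeakerCount is ignored in this case
  }
  return config.diarizationSpeakerCount === undefined
    ? DEFAULT_SPEAKER_COUNT
    : config.diarizationSpeakerCount;
}

// Assumed input: one array of word objects per response, in arrival order.
// Because every response resends the words from the beginning of the audio,
// the most recent non-empty list supersedes the earlier ones.
function latestDiarizedWords(wordListsPerResponse) {
  for (let i = wordListsPerResponse.length - 1; i >= 0; i--) {
    if (wordListsPerResponse[i].length > 0) {
      return wordListsPerResponse[i];
    }
  }
  return [];
}
```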
@@ -797,6 +850,17 @@ var StreamingRecognizeResponse = {
  * This field is only provided for interim results (`is_final=false`).
  * The default of 0.0 is a sentinel value indicating `stability` was not set.
  *
+ * @property {number} channelTag
+ * For multi-channel audio, this is the channel number corresponding to the
+ * recognized result for the audio from that channel.
+ * For audio_channel_count = N, its output values can range from '1' to 'N'.
+ *
+ * @property {string} languageCode
+ * Output only. The
+ * [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag of the
+ * language in this result. This language code was detected to have the most
+ * likelihood of being spoken in the audio.
+ *
  * @typedef StreamingRecognitionResult
  * @memberof google.cloud.speech.v1p1beta1
  * @see [google.cloud.speech.v1p1beta1.StreamingRecognitionResult definition in proto format]{@link https://github.com/googleapis/googleapis/blob/master/google/cloud/speech/v1p1beta1/cloud_speech.proto}
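Since `channelTag` ranges from 1 to N for audio_channel_count = N, results from multi-channel audio can be separated per channel on the client side. The sketch below assumes results arrive as plain objects carrying the `channelTag` property documented above; `isValidChannelTag` and `groupByChannel` are hypothetical helpers, not library functions.

```javascript
// Sketch only: operates on plain result objects with a `channelTag` property.

// True when the tag lies in the documented range 1..audioChannelCount.
function isValidChannelTag(channelTag, audioChannelCount) {
  return Number.isInteger(channelTag) &&
    channelTag >= 1 &&
    channelTag <= audioChannelCount;
}

// Groups results by channel, preserving arrival order within each channel.
function groupByChannel(results) {
  const byChannel = new Map();
  for (const result of results) {
    if (!byChannel.has(result.channelTag)) {
      byChannel.set(result.channelTag, []);
    }
    byChannel.get(result.channelTag).push(result);
  }
  return byChannel;
}
```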
@@ -816,6 +880,17 @@ var StreamingRecognitionResult = {
  *
  * This object should have the same structure as [SpeechRecognitionAlternative]{@link google.cloud.speech.v1p1beta1.SpeechRecognitionAlternative}
  *
+ * @property {number} channelTag
+ * For multi-channel audio, this is the channel number corresponding to the
+ * recognized result for the audio from that channel.
+ * For audio_channel_count = N, its output values can range from '1' to 'N'.
+ *
+ * @property {string} languageCode
+ * Output only. The
+ * [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag of the
+ * language in this result. This language code was detected to have the most
+ * likelihood of being spoken in the audio.
+ *
  * @typedef SpeechRecognitionResult
  * @memberof google.cloud.speech.v1p1beta1
  * @see [google.cloud.speech.v1p1beta1.SpeechRecognitionResult definition in proto format]{@link https://github.com/googleapis/googleapis/blob/master/google/cloud/speech/v1p1beta1/cloud_speech.proto}
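The per-result `languageCode` reports which of the candidate languages was detected, so a caller may want to tally it or confirm it is one of the languages that were requested. A minimal sketch under stated assumptions: `countByLanguage` and `isExpectedLanguage` are hypothetical helpers, and the comparison is case-insensitive because BCP-47 tags are case-insensitive and the casing returned by the service is not stated in the comment.

```javascript
// Sketch only: plain-object helpers, not part of the client library.

// Tallies results per detected language tag (lower-cased for consistency).
function countByLanguage(results) {
  const counts = {};
  for (const result of results) {
    const tag = String(result.languageCode).toLowerCase();
    counts[tag] = (counts[tag] || 0) + 1;
  }
  return counts;
}

// True when the detected tag is the main language or one of the alternatives.
function isExpectedLanguage(result, config) {
  const candidates = [config.languageCode]
    .concat(config.alternativeLanguageCodes || [])
    .map((tag) => String(tag).toLowerCase());
  return candidates.includes(String(result.languageCode).toLowerCase());
}
```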
@@ -880,6 +955,22 @@ var SpeechRecognitionAlternative = {
  * @property {string} word
  * Output only. The word corresponding to this set of information.
  *
+ * @property {number} confidence
+ * Output only. The confidence estimate between 0.0 and 1.0. A higher number
+ * indicates an estimated greater likelihood that the recognized words are
+ * correct. This field is set only for the top alternative of a non-streaming
+ * result, or of a streaming result where `is_final=true`.
+ * This field is not guaranteed to be accurate and users should not rely on it
+ * to be always provided.
+ * The default of 0.0 is a sentinel value indicating `confidence` was not set.
+ *
+ * @property {number} speakerTag
+ * Output only. A distinct integer value is assigned for every speaker within
+ * the audio. This field specifies which one of those speakers was detected to
+ * have spoken this word. Value ranges from '1' to diarization_speaker_count.
+ * speaker_tag is set if enable_speaker_diarization = 'true' and only in the
+ * top alternative.
+ *
  * @typedef WordInfo
  * @memberof google.cloud.speech.v1p1beta1
  * @see [google.cloud.speech.v1p1beta1.WordInfo definition in proto format]{@link https://github.com/googleapis/googleapis/blob/master/google/cloud/speech/v1p1beta1/cloud_speech.proto}
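The two new `WordInfo` fields lend themselves to simple post-processing: a `confidence` of 0.0 means "not set" and should not be read as zero confidence, and consecutive words with the same `speakerTag` can be merged into speaker turns. The following is a minimal sketch on plain word objects; `wordConfidence` and `speakerTurns` are hypothetical helpers, not library functions.

```javascript
// Sketch only: operates on plain objects shaped like the documented WordInfo
// ({ word, confidence, speakerTag }).

// Returns the confidence, or null when the field is missing or holds the
// 0.0 sentinel that the comment says means "not set".
function wordConfidence(wordInfo) {
  return wordInfo.confidence ? wordInfo.confidence : null;
}

// Merges consecutive words spoken by the same speaker into one turn.
function speakerTurns(words) {
  const turns = [];
  for (const info of words) {
    const last = turns[turns.length - 1];
    if (last && last.speakerTag === info.speakerTag) {
      last.words.push(info.word);
    } else {
      turns.push({speakerTag: info.speakerTag, words: [info.word]});
    }
  }
  return turns;
}
```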