Speech api changes #267

Merged
merged 32 commits into from
Jul 12, 2016
89f445c
renamed from recognize to streaming recognize
puneithk Jul 3, 2016
1408b15
added v1beta1 proto
puneithk Jul 3, 2016
6ac06c5
added long running operation
puneithk Jul 4, 2016
34f300c
made changes to the script
puneithk Jul 4, 2016
259ed6a
renamed to sync recognize client
puneithk Jul 4, 2016
4415993
added sync changes and test
puneithk Jul 4, 2016
3892470
added sync changes and test
puneithk Jul 4, 2016
e1516d2
made changes to script and tests
puneithk Jul 4, 2016
ed629b6
deleted
puneithk Jul 4, 2016
20e6b88
made sync work
puneithk Jul 4, 2016
6f513ca
added async
puneithk Jul 4, 2016
f221210
added async samples
puneithk Jul 5, 2016
f9ac582
added async response unpack
puneithk Jul 5, 2016
2b949f6
fixed formatting
puneithk Jul 5, 2016
cecb3b9
removed function
puneithk Jul 11, 2016
f0e8755
removed stub and performed google java formatter
puneithk Jul 11, 2016
03b5c76
modified comment
puneithk Jul 11, 2016
c3b811e
applied DIP for channel
puneithk Jul 11, 2016
e81d249
DIP for channel
puneithk Jul 11, 2016
3976813
removed header comments
puneithk Jul 11, 2016
e56ffc9
ran google java format
puneithk Jul 11, 2016
e47d295
renamed to examples
puneithk Jul 11, 2016
e84a63f
moved to com.examples.cloud.speech
puneithk Jul 11, 2016
a2daba0
renamed to com.examples.cloud.speech
puneithk Jul 11, 2016
9e0563c
fixed path
puneithk Jul 11, 2016
5ba3e19
moved to examples for test
puneithk Jul 11, 2016
e957cd3
moved to examples
puneithk Jul 11, 2016
115bbf8
fixed path
puneithk Jul 11, 2016
e7fd065
added license
puneithk Jul 11, 2016
01a7fa8
moved proto
puneithk Jul 12, 2016
b5bacda
added proto source dir
puneithk Jul 12, 2016
5a6d2ca
added that sample in beta
puneithk Jul 12, 2016
31 changes: 23 additions & 8 deletions speech/grpc/README.md
@@ -1,8 +1,8 @@
# Cloud Speech API gRPC samples for Java

This is a sample repo for accessing the [Google Cloud Speech API](http://cloud.google.com/speech) with
[gRPC](http://www.grpc.io/) client library.

[gRPC](http://www.grpc.io/) client library. Note that these samples are for `advanced users` and are in
BETA. Please see [Google Cloud Platform Launch Stages](https://cloud.google.com/terms/launch-stages).

## Prerequisites

@@ -73,20 +73,35 @@ note that the audio file must be in RAW format. You can use `sox`
(available, e.g. via [http://sox.sourceforge.net/](http://sox.sourceforge.net/)
Contributor

README should mention that this is BETA software w/ a link to what that means: https://cloud.google.com/terms/launch-stages

It should also mention that this sample is for Advanced Users.

or [homebrew](http://brew.sh/)) to convert audio files to raw format.

### Run the non-streaming client
### Run the sync client
Contributor

How does it work on Windows?

Contributor Author

Don't know, is that a blocker?

Contributor Author

@lesv we don't have these instructions anywhere else, and it isn't safe to assume that users have read everything before using this sample. So we need them, IMHO.

Contributor

Please add an issue for this.

Contributor Author

An issue for Windows, or for the fact that we don't have instructions anywhere else?


You can run the batch client like this:
You can run the sync client like this:

```sh
$ bin/speech-sample-nonstreaming.sh --host=speech.googleapis.com --port=443 \
--file=<audio file path> --sampling=<sample rate>
$ bin/speech-sample-sync.sh --host=speech.googleapis.com --port=443 \
--uri=<audio file uri> --sampling=<sample rate>
```

Try a sampling rate of 16000 and the included sample audio file, as follows:

```sh
$ bin/speech-sample-nonstreaming.sh --host=speech.googleapis.com --port=443 \
--file=resources/audio.raw --sampling=16000
$ bin/speech-sample-sync.sh --host=speech.googleapis.com --port=443 \
--uri=resources/audio.raw --sampling=16000
```

### Run the async client

You can run the async client like this:

```sh
$ bin/speech-sample-async.sh --host=speech.googleapis.com --port=443 \
--uri=<audio file uri> --sampling=<sample rate>
```

Try a sampling rate of 16000 and the included sample audio file, as follows:

```sh
$ bin/speech-sample-async.sh --host=speech.googleapis.com --port=443 \
--uri=resources/audio.raw --sampling=16000
```
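
The `--uri` value in the commands above is passed to `URI.create` in the client's `main`. As a minimal sketch of what `java.net.URI` makes of the two forms that appear in this change (the relative `resources/audio.raw` and the `file:///foo/baz.raw` form from the usage message), the following uses a hypothetical `UriForms` class that is not part of the samples. How `RecognitionAudioFactory` treats a URI without a scheme is not shown in this change and is not assumed here.

```java
import java.net.URI;

/** Hypothetical illustration class, not part of the samples in this change. */
final class UriForms {

  /** Returns the scheme of the given --uri value, or "(none)" when it has no scheme. */
  static String schemeOf(String value) {
    String scheme = URI.create(value).getScheme();
    return scheme == null ? "(none)" : scheme;
  }

  public static void main(String[] args) {
    // The two forms that appear in the README and in the client's usage message.
    String[] values = {"resources/audio.raw", "file:///foo/baz.raw"};
    for (String value : values) {
      URI uri = URI.create(value);
      System.out.println(value + " -> scheme=" + schemeOf(value) + ", path=" + uri.getPath());
    }
  }
}
```

A relative path parses as a URI with no scheme, so code that branches on the scheme needs to handle the `null` case explicitly.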

### Run the streaming client
@@ -15,4 +15,4 @@

SRC_DIR=$(cd "$(dirname "$0")/.."; pwd)
java -cp ${SRC_DIR}/target/grpc-sample-1.0-jar-with-dependencies.jar \
Contributor

Is `SRC_DIR` necessary?

Perhaps `java -cp target/grpc-sample-1.0-jar-with-dependencies.jar \`?

Or even:
`mvn exec:java -PAsyncRecognizeClient`

Contributor Author

This allows you to run the code from anywhere. Otherwise, you have to be in the correct directory to run the code. This is better!

com.google.cloud.speech.grpc.demos.NonStreamingRecognizeClient "$@"
com.examples.cloud.speech.AsyncRecognizeClient "$@"
2 changes: 1 addition & 1 deletion speech/grpc/bin/speech-sample-streaming.sh
@@ -15,4 +15,4 @@

SRC_DIR=$(cd "$(dirname "$0")/.."; pwd)
java -cp ${SRC_DIR}/target/grpc-sample-1.0-jar-with-dependencies.jar \
com.google.cloud.speech.grpc.demos.RecognizeClient "$@"
com.examples.cloud.speech.StreamingRecognizeClient "$@"
18 changes: 18 additions & 0 deletions speech/grpc/bin/speech-sample-sync.sh
@@ -0,0 +1,18 @@
#!/bin/bash
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

SRC_DIR=$(cd "$(dirname "$0")/.."; pwd)
java -cp ${SRC_DIR}/target/grpc-sample-1.0-jar-with-dependencies.jar \
com.examples.cloud.speech.SyncRecognizeClient "$@"
16 changes: 9 additions & 7 deletions speech/grpc/pom.xml
@@ -25,6 +25,14 @@ limitations under the License.
<url>https://cloud.google.com/speech/</url>
<inceptionYear>2016</inceptionYear>

<!-- Parent defines plugins for checkstyle and unit testing. -->
<parent>
<groupId>com.google.cloud</groupId>
<artifactId>shared-configuration</artifactId>
<version>1.0.0</version>
<relativePath>../../java-repo-tools</relativePath>
</parent>

<licenses>
<license>
<name>Apache 2</name>
@@ -38,13 +46,6 @@ limitations under the License.
<url>http://www.google.com</url>
</organization>

<parent>
<groupId>com.google.cloud</groupId>
<artifactId>doc-samples</artifactId>
<version>1.0.0</version>
<relativePath>../..</relativePath>
</parent>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
@@ -195,6 +196,7 @@ limitations under the License.
-->
<protocArtifact>com.google.protobuf:protoc:3.0.0-beta-2:exe:${os.detected.classifier}</protocArtifact>
<pluginId>grpc-java</pluginId>
<protoSourceRoot>${basedir}/src/main/java/third_party</protoSourceRoot>
<pluginArtifact>io.grpc:protoc-gen-grpc-java:0.13.2:exe:${os.detected.classifier}</pluginArtifact>
</configuration>
<executions>
@@ -0,0 +1,239 @@
/*
* Copyright 2016 Google Inc. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.examples.cloud.speech;

import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.speech.v1beta1.AsyncRecognizeRequest;
import com.google.cloud.speech.v1beta1.AsyncRecognizeResponse;
import com.google.cloud.speech.v1beta1.RecognitionAudio;
import com.google.cloud.speech.v1beta1.RecognitionConfig;
import com.google.cloud.speech.v1beta1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1beta1.SpeechGrpc;

import com.google.longrunning.GetOperationRequest;
import com.google.longrunning.Operation;
import com.google.longrunning.OperationsGrpc;

import io.grpc.ManagedChannel;
import io.grpc.StatusRuntimeException;
import io.grpc.auth.ClientAuthInterceptor;
import io.grpc.netty.NegotiationType;
import io.grpc.netty.NettyChannelBuilder;

import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.DefaultParser;
import org.apache.commons.cli.OptionBuilder;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

import java.io.IOException;
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
* Client that sends audio to Speech.AsyncRecognize and logs the transcript.
*/
public class AsyncRecognizeClient {

private static final Logger logger = Logger.getLogger(AsyncRecognizeClient.class.getName());

private static final List<String> OAUTH2_SCOPES =
Arrays.asList("https://www.googleapis.com/auth/cloud-platform");

private final URI input;
private final int samplingRate;

private final ManagedChannel channel;
private final SpeechGrpc.SpeechBlockingStub speechClient;
private final OperationsGrpc.OperationsBlockingStub statusClient;

/**
* Constructs a client that talks to the Cloud Speech server over the given {@code channel}.
*/
public AsyncRecognizeClient(ManagedChannel channel, URI input, int samplingRate)
throws IOException {
this.input = input;
this.samplingRate = samplingRate;
this.channel = channel;

speechClient = SpeechGrpc.newBlockingStub(channel);
statusClient = OperationsGrpc.newBlockingStub(channel);
}

public void shutdown() throws InterruptedException {
channel.shutdown().awaitTermination(5, TimeUnit.SECONDS);
}

public static ManagedChannel createChannel(String host, int port) throws IOException {
GoogleCredentials creds = GoogleCredentials.getApplicationDefault();
creds = creds.createScoped(OAUTH2_SCOPES);
ManagedChannel channel =
NettyChannelBuilder.forAddress(host, port)
.negotiationType(NegotiationType.TLS)
.intercept(new ClientAuthInterceptor(creds, Executors.newSingleThreadExecutor()))
.build();

return channel;
}

/**
* Sends an AsyncRecognize request to the Speech API, polls the returned Operation until it is
* done, and logs the response.
*/
public void recognize() {
RecognitionAudio audio;
try {
audio = RecognitionAudioFactory.createRecognitionAudio(this.input);
} catch (IOException e) {
logger.log(Level.WARNING, "Failed to read audio uri input: " + input);
return;
}
logger.info("Sending " + audio.getContent().size() + " bytes from audio uri input: " + input);
RecognitionConfig config =
RecognitionConfig.newBuilder()
.setEncoding(AudioEncoding.LINEAR16)
.setSampleRate(samplingRate)
.build();
AsyncRecognizeRequest request =
AsyncRecognizeRequest.newBuilder().setConfig(config).setAudio(audio).build();

Operation operation;
Operation status;
try {
operation = speechClient.asyncRecognize(request);

// Print the long running operation handle
logger.log(
Level.INFO,
String.format("Operation handle: %s, URI: %s", operation.getName(), input.toString()));
} catch (StatusRuntimeException e) {
logger.log(Level.WARNING, "RPC failed: {0}", e.getStatus());
return;
}

while (true) {
try {
logger.log(Level.INFO, "Waiting 2s for operation, {0} processing...", operation.getName());
Thread.sleep(2000);
GetOperationRequest operationReq =
GetOperationRequest.newBuilder().setName(operation.getName()).build();
status = statusClient.getOperation(operationReq);

if (status.getDone()) {
break;
}
} catch (Exception ex) {
logger.log(Level.WARNING, ex.getMessage());
}
}

try {
AsyncRecognizeResponse asyncRes = status.getResponse().unpack(AsyncRecognizeResponse.class);

logger.info("Received response: " + asyncRes);
} catch (com.google.protobuf.InvalidProtocolBufferException ex) {
logger.log(Level.WARNING, "Unpack error, {0}", ex.getMessage());
}
}

public static void main(String[] args) throws Exception {

String audioFile = "";
String host = "speech.googleapis.com";
Integer port = 443;
Integer sampling = 16000;

CommandLineParser parser = new DefaultParser();

Options options = new Options();
options.addOption(
OptionBuilder.withLongOpt("uri")
.withDescription("URI of the audio file")
.hasArg()
.withArgName("FILE_PATH")
.create());
options.addOption(
OptionBuilder.withLongOpt("host")
.withDescription("endpoint for api, e.g. speech.googleapis.com")
.hasArg()
.withArgName("ENDPOINT")
.create());
options.addOption(
OptionBuilder.withLongOpt("port")
.withDescription("SSL port, usually 443")
.hasArg()
.withArgName("PORT")
.create());
options.addOption(
OptionBuilder.withLongOpt("sampling")
.withDescription("Sampling rate, e.g. 16000")
.hasArg()
.withArgName("RATE")
.create());

try {
CommandLine line = parser.parse(options, args);
if (line.hasOption("uri")) {
audioFile = line.getOptionValue("uri");
} else {
System.err.println("An audio URI must be specified (e.g. file:///foo/baz.raw).");
System.exit(1);
}

if (line.hasOption("host")) {
host = line.getOptionValue("host");
} else {
System.err.println("An API endpoint must be specified (typically speech.googleapis.com).");
System.exit(1);
}

if (line.hasOption("port")) {
port = Integer.parseInt(line.getOptionValue("port"));
} else {
System.err.println("An SSL port must be specified (typically 443).");
System.exit(1);
}

if (line.hasOption("sampling")) {
sampling = Integer.parseInt(line.getOptionValue("sampling"));
} else {
System.err.println("An audio sampling rate must be specified.");
System.exit(1);
}
} catch (ParseException exp) {
System.err.println("Unexpected exception: " + exp.getMessage());
System.exit(1);
}

ManagedChannel channel = AsyncRecognizeClient.createChannel(host, port);

AsyncRecognizeClient client =
new AsyncRecognizeClient(channel, URI.create(audioFile), sampling);
try {
client.recognize();
} finally {
client.shutdown();
}
}
}
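
The polling loop in `recognize()` runs until the operation reports done, and it catches every `Exception`, including the `InterruptedException` that `Thread.sleep` can throw. One possible alternative, sketched here under assumptions, bounds the number of attempts and restores the interrupt flag. `BoundedPoller`, `poll`, and the `Supplier` that stands in for `statusClient.getOperation(...)` are hypothetical names used for illustration; they are not part of this PR or of the gRPC API.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper, not part of this PR: polls until a result is ready or attempts run out. */
final class BoundedPoller {

  /**
   * Calls {@code check} up to {@code maxAttempts} times, sleeping {@code delayMillis} between
   * attempts. Returns the first non-empty result, or empty if none arrived in time or the
   * thread was interrupted while waiting.
   */
  static <T> Optional<T> poll(Supplier<Optional<T>> check, int maxAttempts, long delayMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        Optional<T> result = check.get();
        if (result.isPresent()) {
          return result;
        }
      } catch (RuntimeException e) {
        // Assumption: a failed status check is treated as "not ready yet" and retried.
        System.err.println("Attempt " + attempt + " failed: " + e.getMessage());
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag so callers can see it, then stop polling.
          Thread.currentThread().interrupt();
          return Optional.empty();
        }
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Stand-in for the status call: reports a result on the third check.
    int[] calls = {0};
    Optional<String> result =
        BoundedPoller.<String>poll(
            () -> ++calls[0] >= 3 ? Optional.of("done") : Optional.<String>empty(), 5, 10L);
    System.out.println(result.orElse("timed out") + " after " + calls[0] + " calls");
  }
}
```

In the client, the supplier would wrap the `getOperation` call and return a non-empty `Optional` once `getDone()` is true; that wiring is left out so the sketch runs without gRPC on the classpath.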