fix race in mock publisher #1560

pongad · 2017-01-24T06:28:33Z

FakePublisherServiceImpl::publish has a race.
If the call to publish happens before a response has been placed,
the server tries to respond with a null object.

The fix is to either

- always set the response before calling publish, or
- make publish wait for the response.

To be safe, this commit does both.
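
The second option can be sketched with a blocking queue. This is only an illustration: `PublishRaceSketch`, `FakeService`, `addResponse`, the string-typed messages, and the 5-second wait are all assumptions made for the example, not the actual FakePublisherServiceImpl code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PublishRaceSketch {
  /** Hypothetical stand-in for the fake service; not the real FakePublisherServiceImpl. */
  static final class FakeService {
    // Canned responses supplied by the test, consumed in FIFO order.
    private final BlockingQueue<String> responses = new LinkedBlockingQueue<>();

    /** Called by the test to queue a canned response. */
    void addResponse(String response) {
      responses.add(response);
    }

    /** Waits for a canned response instead of replying with null. */
    String publish(String request) {
      try {
        // Assumed timeout: fail loudly if the test never supplies a response.
        String response = responses.poll(5, TimeUnit.SECONDS);
        if (response == null) {
          throw new IllegalStateException("no canned response for " + request);
        }
        return response;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for a response", e);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    FakeService service = new FakeService();
    // The call starts before any response has been queued.
    Thread caller = new Thread(() -> System.out.println(service.publish("msg-1")));
    caller.start();
    Thread.sleep(100);
    // The late response is still delivered because publish waits for it.
    service.addResponse("ack-1");
    caller.join();
  }
}
```

With a plain queue and a non-blocking `poll()`, the same ordering would hand `null` back to the caller, which is the failure described above.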

This fix reveals another flake.
Publisher uses exponential backoff with jitter.
The jitter randomly picks a number between 0 and a maximum.
If it picks low values too many times,
the publisher retries too often and the server runs out of canned
transient errors to respond with.
The test still passed since it expected any Throwable.

This commit fixes the test to expect FakeException
and uses a fake "random" number generator to ensure
we don't run out of return values.
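
A rough sketch of why the jitter matters, and of how an injected "random" source makes the retry count deterministic. The `Jitter` interface, `retriesWithin`, and all numeric settings are hypothetical names and values chosen for the example. They are not the gax retry API.

```java
import java.util.concurrent.ThreadLocalRandom;

class BackoffSketch {
  /** Hypothetical seam: picks a delay between 0 and maxDelayMs. */
  interface Jitter {
    long pick(long maxDelayMs);
  }

  /** Counts how many retries fit before the total deadline is exceeded. */
  static int retriesWithin(
      Jitter jitter, long initialMs, double multiplier, long capMs, long deadlineMs) {
    int retries = 0;
    long elapsed = 0;
    double maxDelay = initialMs;
    while (true) {
      // Clamp to at least 1 ms so the loop always makes progress.
      long delay = Math.max(1, jitter.pick((long) maxDelay));
      if (elapsed + delay > deadlineMs) {
        return retries;
      }
      elapsed += delay;
      retries++;
      // Exponential growth of the upper bound, limited by the cap.
      maxDelay = Math.min(maxDelay * multiplier, capMs);
    }
  }

  public static void main(String[] args) {
    // Production-like jitter: uniform in [0, max]; the retry count varies per run.
    Jitter random = max -> ThreadLocalRandom.current().nextLong(max + 1);
    // Fake jitter for tests: always the maximum, so the retry count is fixed.
    Jitter alwaysMax = max -> max;
    // Worst case for the test: the smallest delay every time.
    Jitter alwaysMin = max -> 1;

    System.out.println("random:    " + retriesWithin(random, 100, 2.0, 1000, 1000));
    System.out.println("alwaysMax: " + retriesWithin(alwaysMax, 100, 2.0, 1000, 1000));
    System.out.println("alwaysMin: " + retriesWithin(alwaysMin, 100, 2.0, 1000, 1000));
  }
}
```

With the fake jitter the test can queue exactly as many canned errors as there will be retries. With real randomness the count has no useful upper bound short of the deadline divided by the smallest delay.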

Retrying can still cause random test failures,
independently of the above changes.
If a request fails due to DEADLINE_EXCEEDED,
the future is completed with a corresponding error.
However, the last RPC might not have been successfully cancelled.
When a new test starts, it gives canned responses to the server.
The server might use some of these responses to respond to
RPCs of previous tests.
Consequently, a misbehaving test can fail every test that comes
after it.
This commit changes the test setup code so that it
creates a new fake server for every test to avoid this problem.
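
The leak between tests, and the effect of a fresh server, can be illustrated without any RPC machinery. `FreshServerSketch` and `FakeServer` are hypothetical names, and the "straggler" call stands in for an RPC from an earlier test that was never cancelled.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class FreshServerSketch {
  /** Hypothetical fake server that hands out canned responses in order. */
  static final class FakeServer {
    private final Queue<String> canned = new ArrayDeque<>();

    synchronized void addResponse(String response) {
      canned.add(response);
    }

    /** Returns the next canned response, or null when none is left. */
    synchronized String respond() {
      return canned.poll();
    }
  }

  public static void main(String[] args) {
    // Shared server: a straggler RPC from test 1 consumes test 2's response.
    FakeServer shared = new FakeServer();
    shared.addResponse("test2-response");
    String straggler = shared.respond();
    String forTest2 = shared.respond();
    System.out.println("shared: straggler got " + straggler + ", test 2 got " + forTest2);

    // One server per test: the straggler is still bound to the old server.
    FakeServer serverForTest1 = new FakeServer();
    FakeServer serverForTest2 = new FakeServer();
    serverForTest2.addResponse("test2-response");
    serverForTest1.respond(); // straggler finds nothing to steal
    System.out.println("fresh: test 2 got " + serverForTest2.respond());
  }
}
```

In a JUnit test class, the per-test server would typically be created in the setup method that runs before each test, which is what the change to the test setup code amounts to.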

pongad · 2017-01-24T06:30:04Z

cc @davidtorres, we still seem to have flakes in SubscriberImplTest. I'll take a look tomorrow.
@garrettjonesgoogle Do we want to make retry jitter in gax (max/2, max) instead of (0, max)? Please see the motivation in the paragraph below the bullets.

coveralls · 2017-01-24T06:36:46Z

Coverage remained the same at 83.189% when pulling 0707971 on pongad:sync-reply-pub into c68968b on GoogleCloudPlatform:pubsub-hp.

garrettjonesgoogle · 2017-01-24T19:07:13Z

Couldn't you also mock the jitter, or the random number provider?

pongad · 2017-01-25T07:25:35Z

@garrettjonesgoogle Good idea. PTAL

coveralls · 2017-01-25T07:42:50Z

Coverage decreased (-0.1%) to 83.081% when pulling 8661957 on pongad:sync-reply-pub into c68968b on GoogleCloudPlatform:pubsub-hp.

garrettjonesgoogle · 2017-01-25T19:25:22Z

LGTM, after you update the PR description to match the latest logic.

pongad · 2017-01-25T22:29:23Z

Description fixed in PR and commit

pongad assigned garrettjonesgoogle Jan 24, 2017

googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jan 24, 2017

PR comments

8661957

pongad merged commit 10bedac into googleapis:pubsub-hp Jan 25, 2017

pongad deleted the sync-reply-pub branch January 25, 2017 22:29

pongad mentioned this pull request Mar 13, 2017

What to do with experimental client? GoogleCloudPlatform/pubsub#54

Closed
