Skip to content

Commit 8fb0bf3

Browse files
authored
test: fix concurrency issue in testThrottlingBlocking (#2001)
At the end of the test, it verifies that the batcher waited more than throttled time by capturing the argument of `callContext.withOption()`. There were two problems: # "Zero interactions with this mock" problem ``` Error: Failures: Error: BatcherImplTest.testThrottlingBlocking:899 apiCallContext.withOption( <Capturing argument>, <Capturing argument> ); Wanted 1 time: -> at com.google.api.gax.batching.BatcherImplTest.testThrottlingBlocking(BatcherImplTest.java:899) But was 0 times. ``` The callContext.withOption method is called by `BatcherImpl.sendOutstanding()` method via `BatcherImpl$PushCurrentBatchRunnable` in **another thread**. https://togithub.com/googleapis/sdk-platform-java/blob/4c741077d614093d08665e9ddd83fb0e332b7881/gax-java/gax/src/main/java/com/google/api/gax/batching/BatcherImpl.java#L206 Technically, there's no guarantee that the thread calls the withOption method within a certain timeframe, especially when we run the tests concurrently in a machine with many CPU cores (the nightly test setup). That's why we occasionally saw "Zero interactions with this mock" problem. In my experiment (https://togithub.com/suztomo/gax-batcher-impl-test-reproducer/blob/main/README.md), increasing 100 ms timeout to 1000 ms just worked good to prevent false positives. The assertion is not about timing, but about BatcherImpl recording the throttled time correctly. After fixing this problem, then I saw another problem below. # "expected to be at least: 50 but was: 48" problem ``` [ERROR] com.google.api.gax.batching.BatcherImplTest.testThrottlingBlocking_gax_test06 -- Time elapsed: 5.138 s <<< FAILURE! expected to be at least: 50 but was : 48 at com.google.api.gax.batching.BatcherImplTest.testThrottlingBlocking_gax_test06(BatcherImplTest.java:926) ``` At the end of the test, it verifies that the batcher waited more than throttled time by capturing the argument of `callContext.withOption()`. In ideal, usual scenarios, 1. In thread A, the batcher blocks at `batcher.add` (because of flowController) the thread and starts stopwatch to measure throttled time. 2. In thread B (created by the BatcherImplTest) sleeps 50 ms (`throttledTime`) 3. In thread B, BatcherImplTest calls `flowController.release()`, which wakes up the thread A. 4. The batcher records it was blocked more than 50 ms. However, in rare cases, these events happens in this order: 1. Thread B starts sleeping 50 ms. 2. In thread A the batcher blocks at `batcher.add` (because of flowController) and starts stopwatch to measure throttled time. 3. In thread B, after 50 ms sleep, BatcherImplTest calls `flowController.release()`, which wakes up the thread A. 5. The batcher records it was blocked **less than** 50 ms. This makes the test assertion fail with "_expected to be at least: 50 but was: 48_" message. To prevent this rare case, BatcherImplTest needs to wait before sleeping 50 ms in thread B. In this pull request, I add a do-while loop to check the thread A (the thread of `batcher.add`) is in Thread.State.WAITING state before making the sleeping of 50 ms. - [x] Make sure to open an issue as a [bug/issue](https://togithub.com/googleapis/gapic-generator-java/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea - [x] Ensure the tests and linter pass - [x] Code coverage does not decrease (if any source code was changed) - [x] Appropriate docs were updated (if necessary) Fixes #1193 ☕️
1 parent 2ac1547 commit 8fb0bf3

File tree

1 file changed

+25
-5
lines changed

1 file changed

+25
-5
lines changed

gax-java/gax/src/test/java/com/google/api/gax/batching/BatcherImplTest.java

+25-5
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@
5151
import com.google.common.collect.ImmutableList;
5252
import com.google.common.collect.Queues;
5353
import java.util.ArrayList;
54+
import java.util.Collections;
5455
import java.util.List;
5556
import java.util.Queue;
5657
import java.util.concurrent.Callable;
@@ -830,7 +831,7 @@ public void testConstructors() throws InterruptedException {
830831
}
831832
}
832833

833-
@Test
834+
@Test(timeout = 60000)
834835
public void testThrottlingBlocking() throws Exception {
835836
BatchingSettings settings =
836837
BatchingSettings.newBuilder()
@@ -862,16 +863,28 @@ public void testThrottlingBlocking() throws Exception {
862863
flowController,
863864
callContext)) {
864865
flowController.reserve(1, 1);
866+
List<Thread> batcherAddThreadHolder = Collections.synchronizedList(new ArrayList<>());
865867
Future future =
866868
executor.submit(
867869
new Runnable() {
868870
@Override
869871
public void run() {
872+
batcherAddThreadHolder.add(Thread.currentThread());
870873
batcher.add(1);
871874
}
872875
});
873-
// Add a little delay ensuring that the next step starts after batcher.add(1)
874-
Thread.sleep(10);
876+
877+
// Wait until batcher.add blocks (Thread.State.WAITING) and the batcher starts the
878+
// stopwatch for total_throttled_time. Without this proper waiting, the
879+
// Thread.sleep(throttledTime) below may start before the stopwatch starts,
880+
// resulting in a shorter total_throttled_time at the verification of throttledTime
881+
// at the end of the test.
882+
// https://github.com/googleapis/sdk-platform-java/issues/1193
883+
do {
884+
Thread.sleep(10);
885+
} while (batcherAddThreadHolder.isEmpty()
886+
|| batcherAddThreadHolder.get(0).getState() != Thread.State.WAITING);
887+
875888
executor.submit(
876889
() -> {
877890
try {
@@ -895,11 +908,18 @@ public void run() {
895908
}
896909

897910
// Mockito recommends using verify() as the ONLY way to interact with Argument
898-
// captors - otherwise it may incur in unexpected behaviour
899-
Mockito.verify(callContext, Mockito.timeout(100)).withOption(key.capture(), value.capture());
911+
// captors - otherwise it may incur in unexpected behaviour.
912+
// The callContext.withOption method is called by BatcherImpl.sendOutstanding() method via
913+
// BatcherImpl$PushCurrentBatchRunnable in another thread. Technically, there's no guarantee
914+
// that the thread calls the withOption method within a certain timeframe. 1000 ms just works
915+
// fine to prevent false positives.
916+
// https://github.com/googleapis/sdk-platform-java/issues/1193
917+
Mockito.verify(callContext, Mockito.timeout(1000)).withOption(key.capture(), value.capture());
900918

901919
// Verify that throttled time is recorded in ApiCallContext
902920
assertThat(key.getValue()).isSameInstanceAs(Batcher.THROTTLED_TIME_KEY);
921+
// Because this test waited for throttledTime before flowController.release() method,
922+
// the recorded total_throttled_time should be higher than or equal to that.
903923
assertThat(value.getValue()).isAtLeast(throttledTime);
904924
} finally {
905925
executor.shutdownNow();

0 commit comments

Comments
 (0)