Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stork config leaks between instances in a continuous testing run #47178

Open
holly-cummins opened this issue Apr 4, 2025 · 12 comments · May be fixed by #47209
Open

Stork config leaks between instances in a continuous testing run #47178

holly-cummins opened this issue Apr 4, 2025 · 12 comments · May be fixed by #47209

Comments

@holly-cummins
Copy link
Contributor

holly-cummins commented Apr 4, 2025

Describe the bug

This reproduces with 3.20.0, so it's not caused by the test classloading rewrite. :)
It also reproduces on main, and #47180 exposes it for the mvn verify case, which is of course worse.

Steps to reproduce

  • check out superheroes

  • cd rest-fights

  • Edit HeroesVillainsNarrationWiremockServerResource and delete the quarkus.stork.hero-service.service-discovery.address-list config override, so that the application is always reading config from application.properties. I don't think this step is strictly necessary, but it makes reproducing and searching easier, because the 'wrong' port is a predictable one.

  • Now if you run quarkus test you'll get 4-12 failures. on 3.21.0 you will see 8 pre-existing failures, caused by Continous testing fails for superheroes rest-fights #47177, main will be a bit less red. But in either case there will be extra, expected, failures caused by the fact that you're trying to connect to an endpoint you haven't mocked. You can search the logs for 8083 for those.

  • Then edit application.properties and change 8083 to 1234. If you look at the logs, you'll get messages saying "I can't connect to port 1234" (expected), but some of the messages will still reference port 8083. Nothing should know about 8083, but something does.

To get a smaller reproducer, you can delete everything except HeroClientTest.

Stack traces

09:29:45 ERROR [io.qu.test] (Test runner thread) ==================== TEST REPORT #3 ==================== [Error Occurred After Shutdown]
09:29:45 ERROR [io.qu.test] (Test runner thread) Test HeroClientTests#helloHeroes() failed 
 [Error Occurred After Shutdown]: java.lang.AssertionError: Expected an item event but got a failure: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8083
        at io.smallrye.mutiny.helpers.test.UniAssertSubscriber.awaitItem(UniAssertSubscriber.java:165)
        at io.quarkus.sample.superheroes.fight.client.HeroClientTests.helloHeroes(HeroClientTests.java:187)


09:29:45 ERROR [io.qu.test] (Test runner thread) Test HeroClientTests#doesntRecoverFrom500() failed 
 [Error Occurred After Shutdown]: java.lang.AssertionError: 
Received a failure event, but expecting:
  <io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:1234>
to be an instance of:
  <class jakarta.ws.rs.WebApplicationException>
but was:
  <io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:1234
Caused by: java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)

Why do I think stork is involved?

If you stop the application and change

quarkus.rest-client.hero-client.url=stork://hero-service

to

quarkus.rest-client.hero-client.url=http://localhost:4321

and then run quarkus test, you get expected failures references 4321. If you then change 4321 to 8888 in application.properties and save, there are no more references to 4321 in the new logging.

Copy link

quarkus-bot bot commented Apr 4, 2025

/cc @aureamunoz (stork), @cescoffier (stork), @geoand (continuous-testing), @radcortez (config)

@cescoffier
Copy link
Member

Stork uses a static field for the singleton. We would need to check if the instance is re-initialized correctly.

@holly-cummins
Copy link
Contributor Author

holly-cummins commented Apr 4, 2025

Following @cescoffier's tip, I looked at StorkConfigProvider and got all excited, because it has a static list of config:

    private static final List<ServiceConfig> serviceConfigs = new ArrayList<>();

on initialisation, it adds to that list, without clearing it first, so the old and new config are both present after a continuous testing restart. That seems bad, but ... fixing that with serviceConfigs.clear() on init doesn't fix the attempts to connect using the old config, so it's something else that's causing what I'm seeing.

@cescoffier
Copy link
Member

I think we also keep the Stork instance: https://github.com/smallrye/smallrye-stork/blob/main/core/src/main/java/io/smallrye/stork/Stork.java#L264

@holly-cummins
Copy link
Contributor Author

I tried Stork.shutdown(); before the initialize in SmallRyeStorkRecorder, and no different. Very annoying, because all these statics look like exactly the problem. Maybe there's one more I/we haven't found yet.

@cescoffier
Copy link
Member

Do you have a reproducer? I can have a look this weekend.

@holly-cummins
Copy link
Contributor Author

I've pushed https://github.com/holly-cummins/quarkus-super-heroes/tree/47178-reproducer which saves doing the edits in the instructions above. It should be sufficient to run quarkus test in rest-fights, edit the quarkus.stork.hero-service.service-discovery.address-list in application.properties, and look at the port numbers in the error messages.

If you want to reproduce with mvn verify instead of quarkus test, uncomment the disabled config in HeroesVillainsNarrationWiremockServerResource, and also add a plain test file into the suite, something like this:

@QuarkusTest
public class DummyTest {
  @Test
  public void hi() {};
}

and rebuild quarkus to include #47180.

@cescoffier
Copy link
Member

Will try with the one mentioned in the description

@cescoffier
Copy link
Member

Hum, I wanted to look, but it downloaded too many images from Docker Hub, so I hit the quota (even if I'm logged in). I will need to give it another attempt in a few hours.

@cescoffier
Copy link
Member

Ok, I've a fix. Unfortunately, it requires a fix in Stork and one in Quarkus (Stork extension).

Ths first part is, and I believe @holly-cummins found it, a problem in the Quarkus StorkConfigProvider - the list is not cleared during the initialization of the method:

Subject: [PATCH] Clear the list of config during the initialization to avoid leaks during continuous testing
---
Index: extensions/smallrye-stork/runtime/src/main/java/io/quarkus/stork/StorkConfigProvider.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/extensions/smallrye-stork/runtime/src/main/java/io/quarkus/stork/StorkConfigProvider.java b/extensions/smallrye-stork/runtime/src/main/java/io/quarkus/stork/StorkConfigProvider.java
--- a/extensions/smallrye-stork/runtime/src/main/java/io/quarkus/stork/StorkConfigProvider.java	(revision 30f2f553c08a814467f938d4ea76c3c6d8ada261)
+++ b/extensions/smallrye-stork/runtime/src/main/java/io/quarkus/stork/StorkConfigProvider.java	(date 1743869879599)
@@ -11,6 +11,7 @@
     private static final List<ServiceConfig> serviceConfigs = new ArrayList<>();
 
     public static void init(List<ServiceConfig> configs) {
+        serviceConfigs.clear();
         serviceConfigs.addAll(configs);
     }

The second one is in Stork itself. For performance reason, Stork "static" service discovery (which is not limited to the stack-service-discovery, but any type of service discovery with a known set of service instances) also uses a static list that was not cleared.

Subject: [PATCH] Clear the in memory address backend during shutdown
---
Index: core/src/main/java/io/smallrye/stork/Stork.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/core/src/main/java/io/smallrye/stork/Stork.java b/core/src/main/java/io/smallrye/stork/Stork.java
--- a/core/src/main/java/io/smallrye/stork/Stork.java	(revision c58a67b70fdefe54b6a587f83fe551654c457555)
+++ b/core/src/main/java/io/smallrye/stork/Stork.java	(date 1743870673259)
@@ -38,6 +38,7 @@
 import io.smallrye.stork.spi.internal.LoadBalancerLoader;
 import io.smallrye.stork.spi.internal.ServiceDiscoveryLoader;
 import io.smallrye.stork.spi.internal.ServiceRegistrarLoader;
+import io.smallrye.stork.utils.InMemoryAddressesBackend;
 
 /**
  * The entrypoint for SmallRye Stork.
@@ -274,7 +275,16 @@
      * Closes the stork instance.
      */
     public static void shutdown() {
-        REFERENCE.set(null);
+        var previous = REFERENCE.getAndSet(null);
+        if (previous != null) {
+            previous.clear();
+        }
+        InMemoryAddressesBackend.clearAll();
+    }
+
+    private void clear() {
+        services.clear();
+        serviceRegistrars.clear();
     }
 
     /**

Openeing PRs right now.

@cescoffier
Copy link
Member

Stork pull request: smallrye/smallrye-stork#1023
Quarkus pull request: #47196

@holly-cummins
Copy link
Contributor Author

Thanks!

@aureamunoz aureamunoz linked a pull request Apr 7, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants