RiakClient shutdown() returns a future that never completes, JVM does not exit #706

Open
@philip-doctor

Context

I'm writing a simple ETL app, and when it finishes the application does not exit. I call shutdown() on my connection and then block on .get(), but it never returns.

I attached my debugger and stepped through the code; what I see happening is described under Steps to Reproduce.

Expected Behavior

shutdown() completes and the program terminates.

Actual Behavior

The program blocks forever on the future returned by shutdown().

Possible Fix

I think all you have to do is catch the IllegalStateException for each node and continue iterating over the node list, calling shutdown() on the remaining nodes.

Steps to Reproduce

This does not reproduce cleanly; I suspect something racy. I'm using Kotlin, but I'm doing nothing fancy here:

    import com.basho.riak.client.api.RiakClient
    import java.net.UnknownHostException

    // Connects to a single Riak node; `logger` and `config` come from the surrounding app.
    fun getRiak(riakPort: Int, riakHost: String): RiakClient =
        try {
            RiakClient.newClient(riakPort, riakHost)
        } catch (ex: UnknownHostException) {
            logger.error("Unknown Riak Host $riakHost $riakPort $ex")
            throw Exception("Unable to connect to Riak.")
        }

    val riakClient = getRiak(config.riakPort, config.riakHost)
    riakClient.shutdown().get() // This line hangs forever for me most of the time

Here is what I think is happening. Let's dive in:
https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakCluster.java#L630

Here we get a list of nodes. What states can those nodes be in?

https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakCluster.java#L408

stateCheck(State.CREATED, State.RUNNING, State.SHUTTING_DOWN, State.QUEUING);

Okay cool, so we get that list of nodes and then call shutdown on them over here:

https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakNode.java#L303

That does its own state check:
stateCheck(State.RUNNING, State.HEALTH_CHECKING);

So if the state is currently State.SHUTTING_DOWN, the node still ends up in the node list, but calling shutdown() on it throws an IllegalStateException, because SHUTTING_DOWN is not one of the states that check allows. What happens then? The cluster shutdown action runs on a thread in the ExecutorService pool, and the exception is never caught, so it just kills the thread.

As a result the countdown latch never reaches 0, since the thread that was supposed to shut everything down in the background got killed.

Because of that we block forever.
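
The mechanism can be reproduced in isolation. This is only a minimal sketch under stated assumptions, not the client's actual code: `FakeNode`, `shutdownAll`, and the single-thread executor are hypothetical stand-ins for the node, the cluster shutdown task, and the executor pool, and a timeout is added so the demo terminates instead of hanging.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownHangSketch {

    /** Hypothetical stand-in for a node; this is NOT the real RiakNode. */
    static class FakeNode {
        private final boolean alreadyShuttingDown;

        FakeNode(boolean alreadyShuttingDown) {
            this.alreadyShuttingDown = alreadyShuttingDown;
        }

        void shutdown() {
            // Models a state check that rejects a node which is already shutting down.
            if (alreadyShuttingDown) {
                throw new IllegalStateException("node is already SHUTTING_DOWN");
            }
        }
    }

    /** Returns true if the background task counted the latch down within timeoutMs. */
    static boolean shutdownAll(List<FakeNode> nodes, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            for (FakeNode node : nodes) {
                node.shutdown(); // an exception here aborts the whole task
            }
            latch.countDown();   // never reached if any node threw
        });
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // All nodes in an allowed state: the latch is released.
        System.out.println(shutdownAll(Arrays.asList(new FakeNode(false)), 500));
        // One node already shutting down: the latch is never released.
        System.out.println(shutdownAll(
                Arrays.asList(new FakeNode(false), new FakeNode(true)), 500));
    }
}
```

Without the timeout on `await`, the second call would block indefinitely, which matches the hang I see on `shutdown().get()`.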

So the fix is probably what I described under Possible Fix: catch the IllegalStateException for each node and continue iterating over the node list, calling shutdown() on the remaining nodes.
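
A rough sketch of that idea follows, again using hypothetical stand-ins (`FakeNode`, `shutdownAll`) rather than the real RiakCluster and RiakNode classes. The only change to the loop is a per-node try/catch, so one node in an unexpected state no longer aborts the task before the latch is counted down.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownFixSketch {

    /** Hypothetical stand-in for a node; this is NOT the real RiakNode. */
    static class FakeNode {
        private final boolean alreadyShuttingDown;
        boolean shutdownRequested = false;

        FakeNode(boolean alreadyShuttingDown) {
            this.alreadyShuttingDown = alreadyShuttingDown;
        }

        void shutdown() {
            // Models a state check that rejects a node which is already shutting down.
            if (alreadyShuttingDown) {
                throw new IllegalStateException("node is already SHUTTING_DOWN");
            }
            shutdownRequested = true;
        }
    }

    /** Returns true if the background task counted the latch down within timeoutMs. */
    static boolean shutdownAll(List<FakeNode> nodes, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            for (FakeNode node : nodes) {
                try {
                    node.shutdown();
                } catch (IllegalStateException e) {
                    // Node is already shutting down; skip it and keep iterating.
                }
            }
            latch.countDown(); // now reached even if some nodes threw
        });
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        FakeNode stuck = new FakeNode(true);
        FakeNode healthy = new FakeNode(false);
        System.out.println(shutdownAll(Arrays.asList(stuck, healthy), 500));
        System.out.println(healthy.shutdownRequested);
    }
}
```

Note that the node placed after the failing one still receives its shutdown() call, which is the point of continuing the iteration.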

Context

I want my program to terminate normally and not leak connections.
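
Until this is fixed in the client, a caller-side stopgap might be to bound the wait instead of blocking indefinitely. The sketch below assumes only that shutdown() returns a `java.util.concurrent.Future<Boolean>`; `awaitShutdown` is a hypothetical helper name, and a never-completed `CompletableFuture` stands in for the stuck shutdown future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedShutdownWait {

    /** Waits up to timeoutMs; returns false on timeout, failure, or interrupt. */
    static boolean awaitShutdown(Future<Boolean> shutdownFuture, long timeoutMs) {
        try {
            return Boolean.TRUE.equals(
                    shutdownFuture.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return false; // the future did not complete in time
        } catch (ExecutionException e) {
            return false; // the shutdown task itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a shutdown future that never completes.
        Future<Boolean> neverCompletes = new CompletableFuture<>();
        System.out.println(awaitShutdown(neverCompletes, 200));
        // Stand-in for a shutdown future that completed normally.
        System.out.println(awaitShutdown(CompletableFuture.completedFuture(true), 200));
    }
}
```

This only bounds the wait. It does not release whatever the stuck shutdown left behind, and if non-daemon threads are still alive afterwards the JVM may still not exit on its own.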

Your Environment

  • Riak Java Client version: 2.1.1
  • Java version: 1.8
  • Riak version: 1.5.2 TS
  • Operating System / Distribution & Version: Ubuntu 16.0.4-1 LTS
