Skip to content

Socket factory fails to connect to the database after maintenance in Enterprise Plus edition #2120

Open
@weiminyu

Description

@weiminyu

Bug Description

We recently upgraded our Postgresql_v17 databases to Enterprise Plus edition and since then, our apps cannot make connections to the databases after maintenance. Every connection attempt times out, with app-side stack trace showing SSL error messages: "Remote host terminated the handshake" and "SSL peer shut down incorrectly". The temporary fix is to restart the databases.

Note that Cloud SQL Proxy does not suffer from this problem. Therefore we suspect the problem is not with the server-side but the socket factory.

Example code (or command)

Stacktrace

message: "Encountered internal server error"
stacktrace: "
google.registry.persistence.transaction.DatabaseException: 	SQL Error: 0, SQLState: 08001, message: HikariPool-1 - Connection is not available, request timed out after 20000ms (total=0, active=0, idle=0, waiting=0).
	SQL Error: 0, SQLState: 08001, message: The connection attempt failed..
	SQL Error: 0, SQLState: 08001, message: The connection attempt failed..
	at google.registry.persistence.transaction.DatabaseException.throwIfSqlException(DatabaseException.java:69)
	at google.registry.persistence.transaction.JpaTransactionManagerImpl.transactNoRetry(JpaTransactionManagerImpl.java:288)
	at google.registry.persistence.transaction.JpaTransactionManagerImpl.transactNoRetry(JpaTransactionManagerImpl.java:214)
	at google.registry.persistence.transaction.JpaTransactionManagerImpl.lambda$reTransact$0(JpaTransactionManagerImpl.java:174)
	at google.registry.util.Retrier.callWithRetry(Retrier.java:153)
	at google.registry.util.Retrier.callWithRetry(Retrier.java:74)
	at google.registry.persistence.transaction.JpaTransactionManagerImpl.reTransact(JpaTransactionManagerImpl.java:173)
	at google.registry.model.tld.Tlds.lambda$createFreshCache$3(Tlds.java:61)
	at com.google.common.base.Suppliers$ExpiringMemoizingSupplier.get(Suppliers.java:317)
	at google.registry.model.tld.Tlds.getTlds(Tlds.java:82)
	at google.registry.model.tld.Tlds.findTldForName(Tlds.java:137)
	at google.registry.whois.WhoisReader.parseCommand(WhoisReader.java:198)
	at google.registry.whois.WhoisReader.readCommand(WhoisReader.java:99)
	at google.registry.whois.WhoisHttpAction.run(WhoisHttpAction.java:157)
	at google.registry.request.RequestHandler.handleRequest(RequestHandler.java:175)
	at google.registry.module.ServletBase.service(ServletBase.java:78)
	at google.registry.module.RegistryServlet.service(RegistryServlet.java:78)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:614)
	at org.eclipse.jetty.ee10.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1370)
	at org.eclipse.jetty.ee10.servlet.ServletHolder.handle(ServletHolder.java:736)
	at org.eclipse.jetty.ee10.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1619)
	at org.eclipse.jetty.ee10.servlet.ServletHandler$MappedServlet.handle(ServletHandler.java:1552)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.dispatch(ServletChannel.java:819)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:436)
	at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:469)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)
	at org.eclipse.jetty.ee10.servlet.SessionHandler.handle(SessionHandler.java:717)
	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:181)
	at org.eclipse.jetty.server.Server.handle(Server.java:182)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:662)
	at org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:418)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:480)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:443)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection [HikariPool-1 - Connection is not available, request timed out after 20000ms (total=0, active=0, idle=0, waiting=0)] [n/a]
	at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:51)
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:58)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:108)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:94)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:129)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:156)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:286)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:294)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:232)
	at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:83)
	at google.registry.persistence.transaction.JpaTransactionManagerImpl.transactNoRetry(JpaTransactionManagerImpl.java:262)
	... 42 more
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 20000ms (total=0, active=0, idle=0, waiting=0)
	at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:710)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:189)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:147)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99)
	at org.hibernate.hikaricp.internal.HikariCPConnectionProvider.getConnection(HikariCPConnectionProvider.java:76)
	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:46)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:126)
	... 48 more
Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:364)
	at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:54)
	at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:273)
	at org.postgresql.Driver.makeConnection(Driver.java:446)
	at org.postgresql.Driver.connect(Driver.java:298)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:139)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:367)
	at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:205)
	at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:484)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:748)
	at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:727)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	... 1 more
Caused by: javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake
	at java.base/sun.security.ssl.SSLSocketImpl.handleEOF(SSLSocketImpl.java:1714)
	at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1514)
	at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421)
	at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455)
	at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426)
	at com.google.cloud.sql.core.Connector.connect(Connector.java:130)
	at com.google.cloud.sql.core.InternalConnectorRegistry.connect(InternalConnectorRegistry.java:181)
	at com.google.cloud.sql.postgres.SocketFactory.createSocket(SocketFactory.java:89)
	at org.postgresql.core.PGStream.createSocket(PGStream.java:248)
	at org.postgresql.core.PGStream.<init>(PGStream.java:121)
	at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:140)
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:268)
	... 14 more
Caused by: java.io.EOFException: SSL peer shut down incorrectly
	at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:494)
	at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:483)
	at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:160)
	at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
	at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506)
	... 24 more
"

Steps to reproduce?

  1. Create a Postgresql database in the Enterprise edition
  2. Connect to the database using the socket factory, verifies that it works.
  3. Upgrade to Enterprise plus edition (this is similar to a maintenance since the DB is not shut down), afterwards no new connections can be made.
    ...

Environment

  1. OS type and version: Debian 6.10
  2. Java SDK version: Openjdk 21.0.3
  3. Cloud SQL Java Socket Factory version: 1.23.0

Additional Details

Cloud SQL Proxy does not have this problem.

Metadata

Metadata

Assignees

Labels

priority: p2Moderately-important priority. Fix may not be included in next release.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions