Compute Instance Not Reconnecting After Resource Maxed Out and Throwing 404 Error Despite Being Active and Running on GCE #510

skrishna375 · 2025-01-20T10:21:36Z

Jenkins and plugins versions report

Environment

Jenkins: 2.462.1
OS: Linux - 4.18.0-553.30.1.el8_10.x86_64
Java: 17.0.13 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
---
PrioritySorter:5.2.0
analysis-model-api:12.9.1
ansicolor:1.0.5
ant:511.v0a_a_1a_334f41b_
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.4-135.v4da_349961256
artifactory:4.0.8
asm-api:9.7.1-97.v4cc844130d97
authentication-tokens:1.119.v50285141b_7e1
authorize-project:1.8.1
block-queued-job:0.2.0
blueocean:1.27.16
blueocean-autofavorite:1.2.5
blueocean-bitbucket-pipeline:1.27.16
blueocean-commons:1.27.16
blueocean-config:1.27.16
blueocean-core-js:1.27.16
blueocean-dashboard:1.27.16
blueocean-display-url:2.4.3
blueocean-events:1.27.16
blueocean-git-pipeline:1.27.16
blueocean-github-pipeline:1.27.16
blueocean-i18n:1.27.16
blueocean-jira:1.27.16
blueocean-jwt:1.27.16
blueocean-personalization:1.27.16
blueocean-pipeline-api-impl:1.27.16
blueocean-pipeline-editor:1.27.16
blueocean-pipeline-scm-api:1.27.16
blueocean-rest:1.27.16
blueocean-rest-impl:1.27.16
blueocean-web:1.27.16
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.78.1-248.ve27176eb_46cb_
branch-api:2.1206.vd9f35001c95c
build-failure-analyzer:2.5.2
build-name-setter:2.4.3
build-timeout:1.33
build-with-parameters:76.v9382db_f78962
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.2.1
cloudbees-bitbucket-branch-source:895.v15dc41668f03
cloudbees-disk-usage-simple:232.v713eeed2e1f4
cloudbees-folder:6.955.v81e2a_35c08d3
command-launcher:116.vd85919c54a_d6
commons-compress-api:1.26.1-2
commons-httpclient3-api:3.1-3
commons-lang3-api:3.17.0-84.vb_b_938040b_078
commons-text-api:1.12.0-129.v99a_50df237f7
conditional-buildstep:1.4.3
config-driven-pipeline:1.3
config-file-provider:980.v88956a_a_5d6a_d
configuration-as-code:1873.vea_5814ca_9c93
configurationslicing:548.ve92d48e66b_f8
credentials:1380.va_435002fa_924
credentials-binding:681.vf91669a_32e45
cvs:469.v57a_96d4f6886
data-tables-api:2.1.8-1
datadog:9.0.1
description-setter:258.vcd25251271a_a_
disable-github-multibranch-status:1.2
disable-job-button:1.v9db_352414f90
display-url-api:2.204.vf6fddd8a_8b_e9
docker-build-step:2.12
docker-commons:445.v6b_646c962a_94
docker-custom-build-environment:1.7.3
docker-java-api:3.4.1-96.v77147a_de67f8
docker-workflow:580.vc0c340686b_54
doclinks:0.7
durable-task:581.v299a_5609d767
echarts-api:5.5.1-4
eddsa-api:0.3.0-4.v84c6f0f4969e
email-ext:1844.v3ea_a_b_842374a_
embeddable-build-status:487.va_0ef04c898a_2
envinject:2.919.v009a_a_1067cd0
envinject-api:1.199.v3ce31253ed13
extended-choice-parameter:382.v5697b_32134e8
extended-read-permission:61.vf24570ff3b_e9
external-monitor-job:215.v2e88e894db_f8
favorite:2.221.v19ca_666b_62f5
flatpickr-api:4.6.13-5.v534d8025a_a_59
font-awesome-api:6.6.0-2
forensics-api:2.7.0
fortify-on-demand-uploader:8.0.1
git:5.5.2
git-client:5.0.0
git-forensics:2.2.1
git-parameter:0.10.0
git-server:126.v0d945d8d2b_39
github:1.40.0
github-api:1.321-468.v6a_9f5f2d5a_7e
github-branch-source:1807.v50351eb_7dd13
github-checks:589.v845136f916cd
github-oauth:597.ve0c3480fcb_d0
github-pullrequest:0.7.2
github-scm-trait-notification-context:40.vfa_7f31a_b_d7f8
global-post-script:1.1.4
google-compute-engine:4.606.ve3308d41b_013
google-container-registry-auth:0.3
google-metadata-plugin:0.5
google-oauth-plugin:1.330.vf5e86021cb_ec
google-play-android-publisher:4.2
google-storage-plugin:1.360.v6ca_38618b_41f
gradle:2.14
groovy:457.v99900cb_85593
gson-api:2.11.0-85.v1f4e87273c33
h2-api:11.1.4.199-30.v1c64e772f3a_c
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
hashicorp-vault-plugin:371.v884a_4dd60fb_6
heavy-job:1.1
htmlpublisher:1.37
instance-identity:201.vd2a_b_5a_468a_a_6
ionicons-api:74.v93d5eb_813d5f
ivy:2.8
jackson2-api:2.17.0-379.v02de8ec9f64c
jacoco:3.3.7
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javadoc:280.v050b_5c849f69
javax-activation-api:1.2.0-7
javax-mail-api:1.6.2-10
jaxb:2.3.9-1
jdk-tool:80.v8a_dee33ed6f0
jenkins-design-language:1.27.16
jersey2-api:2.44-151.v6df377fff741
jira:3.13
jira-ext:114.v7b_8b_1d4274c6
jira-steps:2.0.165.v8846cf59f3db
jjwt-api:0.11.5-112.ve82dfb_224b_a_d
jnr-posix-api:3.1.20-125.vb_6ec4b_21b_15e
job-dsl:1.89
jobConfigHistory:1294.v961a_b_707546a_
jobcacher:573.v33fa_12644a_91
joda-time-api:2.13.0-93.v9934da_29b_a_e9
jquery:1.12.4-3
jquery3-api:3.7.1-2
jsch:0.2.16-86.v42e010d9484b_
json-api:20241224-119.va_dca_a_b_ea_7da_5
json-path-api:2.9.0-118.v7f23ed82a_8b_8
junit:1312.v1a_235a_b_94a_31
ldap:725.v3cb_b_711b_1a_ef
lighthouse-report:1.3.0
lockable-resources:1327.ved786b_a_197e0
logstash:2.5.0218.v0a_ff8fefc12b_
mailer:488.v0c9639c1a_eb_3
mapdb-api:1.0.9-40.v58107308b_7a_7
mask-passwords:173.v6a_077a_291eb_5
matrix-auth:3.2.3
matrix-project:839.vff91cd7e3a_b_2
maven-plugin:3.24
metrics:4.2.21-458.vcf496cb_839e4
mina-sshd-api-common:2.14.0-138.v6341ee58e1df
mina-sshd-api-core:2.14.0-138.v6341ee58e1df
monitoring:1.99.0
next-build-number:66.v4b_4762172d53
nodelabelparameter:1.13.0
oauth-credentials:0.653.v14cf2088e950
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
pagerduty:0.7.1
pam-auth:1.11
parameterized-scheduler:277.v61a_4b_a_49a_c5c
parameterized-trigger:806.vf6fff3e28c3e
people-view:1.2
pipeline-agent-build-history:90.vf089ff0feff9
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-github:2.8-159.09e4403bc62f
pipeline-githubnotify-step:49.vf37bf92d2bc8
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-groovy-lib:749.v70084559234a_
pipeline-input-step:508.v584c0e9a_2177
pipeline-maven:1469.ve15ca_a_b_90b_44
pipeline-maven-api:1469.ve15ca_a_b_90b_44
pipeline-milestone-step:119.vdfdc43fc3b_9a_
pipeline-model-api:2.2218.v56d0cda_37c72
pipeline-model-definition:2.2218.v56d0cda_37c72
pipeline-model-extensions:2.2218.v56d0cda_37c72
pipeline-multibranch-defaults:2.1
pipeline-npm:204.v4dc4c2202625
pipeline-rest-api:2.34
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2218.v56d0cda_37c72
pipeline-stage-view:2.34
pipeline-utility-steps:2.18.0
plain-credentials:183.va_de8f1dd5a_2b_
plugin-usage-plugin:4.8
plugin-util-api:5.1.0
popper-api:1.16.1-3
popper2-api:2.11.6-5
prism-api:1.29.0-18
prometheus:795.v995762102f28
promoted-builds:965.vcda_c6a_e0998f
pubsub-light:1.18
rebuild:332.va_1ee476d8f6d
repo:1.16.0
resource-disposer:0.24
run-condition:1.7
rundeck:3.6.11
saferestart:101.vc7fa_8ca_dd18b_
schedule-build:577.v0613c45b_9eef
scm-api:696.v778d637b_a_762
script-security:1369.v9b_98a_4e95b_2d
simple-theme-plugin:196.v96d9592f4efa_
slack:751.v2e44153c8fe1
snakeyaml-api:2.3-123.v13484c65210a_
sonar:2.17.2
sse-gateway:1.27
ssh-agent:376.v8933585c69d3
ssh-credentials:349.vb_8b_6b_9709f5b_
ssh-slaves:2.973.v0fa_8c0dea_f9f
ssh-steps:2.0.68.va_d21a_12a_6476
sshd:3.330.vc866a_8389b_58
structs:338.v848422169819
support-core:1523.v5486c8d6da_f3
swarm:3.47
tap:2.4.3
testng-plugin:835.v51ed3da_fcc35
text-finder:1.30
timestamper:1.27
token-macro:400.v35420b_922dcb_
translation:1.16
trilead-api:2.147.vb_73cc728a_32e
uno-choice:2.8.3
variant:60.v7290fc0eb_b_cd
versioncolumn:243.vda_c20eea_a_8a_f
warnings-ng:11.12.0
wavefront:1.1.4
workflow-aggregator:600.vb_57cdd26fdd7
workflow-api:1358.vfb_5780da_64cb_
workflow-basic-steps:1058.vcb_fc1e3a_21a_9
workflow-cps:4007.vd705fc76a_34e
workflow-durable-task-step:1378.v6a_3e903058a_3
workflow-job:1436.vfa_244484591f
workflow-multibranch:795.ve0cb_1f45ca_9a_
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:678.v3ee58b_469476
workflow-support:943.v8b_0d01a_7b_a_08
ws-cleanup:0.47
xcode-plugin:2.0.17-565.v1c48051d46ef

What Operating System are you using (both controller, and any agents involved in the problem)?

Controller: Rocky Linux 8.10
Agent: Rocky Linux 8.10

Reproduction steps

1. Trigger a job that exceeds the agent's CPU or memory limit.
2. Once the agent's resources are maxed out, the agent loses connectivity.
2.1 The running job loses contact and reports the error: Cannot contact jenkins-agent-*: java.lang.InterruptedException
2.2 The controller log shows a 404 operation error, but the agent log shows nothing specific.

c.g.c.g.p.p.client.ComputeClient#lambda$waitForOperationCompletion$11: Error retrieving operation.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
GET https://compute.googleapis.com/compute/v1/projects/**********/zones/us-west1-a/operations/operation-1737008510170-62bcccf3876cb-5079e4fc-9dfe054e
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "The resource 'projects/**********//zones/us-west1-a/operations/operation-1737008510170-62bcccf3876cb-5079e4fc-9dfe054e' was not found",
"reason" : "notFound"
} ],
"message" : "The resource 'projects/**********/zones/us-west1-a/operations/operation-1737008510170-62bcccf3876cb-5079e4fc-9dfe054e' was not found"
}
at PluginClassLoader for google-oauth-plugin//com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at PluginClassLoader for google-oauth-plugin//com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at PluginClassLoader for google-oauth-plugin//com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)

Expected Results

Agent should get reconnected (unaware if there is any retry options available at this moment)

Actual Results

The agent connection is lost and the agent is left as a zombie, although the instance is healthy on the GCP side. This is leading to issues while creating agents due to zombie nature and it has to be cleaned up manually.

Ideally, the agent should have reconnected so that jobs could continue to run on it.

Anything else?

We have multiple clouds configured with different instance types, and the issue affects agents configured with One shot set to True as well as agents with it set to False.

Are you interested in contributing a fix?

No response

gbhat618 · 2025-02-08T17:09:49Z

This is leading to issues while creating agents due to zombie nature and it has to be cleaned up manually

The latest plugin version, 4.681.v9020cf2b_7453, adds support for GCP's "limit VM runtime" feature via the maxRunDuration option, which applies to both Standard and Spot VMs. Upgrading from 4.606.ve3308d41b_013 involves no breaking changes.

if there is any retry options available at this moment

Did you mean something like the pipeline retry option? (If yes, you can put the relevant part of the pipeline within the retry scope. I guess it will create a new agent in GCP, so you would need to put all the steps that are to run on the GCP VM within the retry scope.)
Or did you mean making Jenkins retry connecting to the agent several times before marking it as a lost agent?

Using retry together with maxRunDuration would solve the problem, in my opinion. I will try to reproduce the issue.
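
A rough sketch of the retry idea above, as a scripted pipeline. It assumes the installed workflow-basic-steps and workflow-durable-task-step plugins are recent enough to support the conditions parameter of the retry step. The label 'gce' and the script name build.sh are placeholders, not taken from the reporter's setup, and whether the agent() condition actually matches this particular failure has not been verified here.

```groovy
// Sketch only: retry the whole node block so that each attempt
// allocates an agent again instead of reusing a lost one.
// count: 2 means at most two attempts in total.
retry(count: 2, conditions: [agent(), nonresumable()]) {
    // 'gce' is a placeholder label for the GCE cloud template.
    node('gce') {
        // Every step that must run on the GCE VM goes inside this block.
        // build.sh is a hypothetical script standing in for the real workload.
        sh './build.sh'
    }
}
```

Placing node inside retry, rather than retry inside node, is the point of the sketch: a retry nested inside node would rerun the steps on the same agent that was just lost. With the conditions listed, retry should only rerun the block for agent-related failures and leave ordinary build failures alone.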

(missed this notification, sry, now watching it)

gbhat618 · 2025-02-08T19:24:41Z

I tried to simulate this by putting the VM under memory stress.

Pipeline used:
(I am running a small VM, so 250M maxes out its memory and makes it unresponsive to SSH, Jenkins, and every other client.)

node ('gce') {
    sh """
        timeout <num-seconds>s stress --vm 2 --vm-bytes 250M
    """
}

For now I ran two tests (the above pipeline with 240s, then with 600s).

In both cases, while stress was running the machine was unresponsive, Jenkins lost the connection, and my terminal clients could not connect either.

However, the stress command terminated after the timeout, and Jenkins did reconnect and complete the pipeline.

We can see Cannot contact gce-10kfxb: java.lang.InterruptedException, but the Jenkins connection resumed after the stress command completed, and the failure of the command was printed.
In the Jenkins system logs, we can see hudson.slaves.ChannelPinger$1#onDead: Ping failed. Terminating the channel gce-10kfxb. After a long series of connection attempts, the agent reconnected at 2025-02-08 19:08:33.575+0000 [id=1105] INFO c.g.j.p.c.ComputeEngineComputer#onConnected: Instance gce-10kfxb is preemptive, setting up preemption listener (note that it was attempting to launch a new computer, but it eventually worked here).

Pipeline logs 240s

Started by user admin
[Pipeline] Start of Pipeline
[Pipeline] node
Running on gce-10kfxb in /tmp/workspace/p1
[Pipeline] {
[Pipeline] sh
Cannot contact gce-10kfxb: java.lang.InterruptedException
+ timeout 240s stress --vm 2 --vm-bytes 250M
stress: info: [1102] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
stress: FAIL: [1102] (425) <-- worker 1105 got signal 9
stress: WARN: [1102] (427) now reaping child worker processes
stress: FAIL: [1102] (431) kill error: No such process
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 124
Finished: FAILURE

Pipeline logs 600s

Started by user admin
[Pipeline] Start of Pipeline
[Pipeline] node
Running on gce-10kfxb in /tmp/workspace/p1
[Pipeline] {
[Pipeline] sh
Cannot contact gce-10kfxb: java.lang.InterruptedException
+ timeout 600s stress --vm 2 --vm-bytes 250M
stress: info: [1172] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
stress: FAIL: [1172] (425) <-- worker 1175 got signal 9
stress: WARN: [1172] (427) now reaping child worker processes
stress: FAIL: [1172] (431) kill error: No such process
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 124
Finished: FAILURE

System logs

2025-02-08 18:46:15.497+0000 [id=470]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Launching Jenkins agent via plugin SSH: java -jar /tmp/agent.jar
2025-02-08 18:46:20.908+0000 [id=470]	INFO	c.g.j.p.c.ComputeEngineComputer#onConnected: Instance gce-10kfxb is preemptive, setting up preemption listener
2025-02-08 18:46:23.376+0000 [id=848]	INFO	c.g.j.p.c.ComputeEngineCloud#lambda$getPlannedNodeFuture$0: 36531ms elapsed waiting for node gce-10kfxb to connect
2025-02-08 19:05:19.244+0000 [id=888]	INFO	hudson.slaves.ChannelPinger$1#onDead: Ping failed. Terminating the channel gce-10kfxb.
java.util.concurrent.TimeoutException: Ping started at 1739041279243 hasn't completed by 1739041519243
	at hudson.remoting.PingThread.ping(PingThread.java:135)
	at hudson.remoting.PingThread.run(PingThread.java:87)
2025-02-08 19:06:02.063+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineComputerLauncher#launch: Launch will wait 300000 for operation operation-1739040345903-62da5e21150fb-faf6017c-81048ffe to complete...
2025-02-08 19:06:08.095+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Launching instance: gce-10kfxb
2025-02-08 19:06:08.096+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: bootstrap
2025-02-08 19:06:08.096+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Getting keypair...
2025-02-08 19:06:08.096+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Using autogenerated ssh keypair
2025-02-08 19:06:08.096+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Authenticating as jenkins
2025-02-08 19:06:08.403+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:06:18.404+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:06:18.404+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:06:23.764+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:06:33.765+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:06:33.765+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:06:39.122+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:06:49.123+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:06:49.123+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:06:54.477+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:07:04.478+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:07:04.478+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:07:09.844+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:07:19.845+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:07:19.845+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:07:25.205+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:07:35.206+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:07:35.206+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:07:40.511+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:07:50.512+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:07:50.512+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:07:55.823+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:08:05.824+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:08:05.824+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:08:11.183+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:08:21.184+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
2025-02-08 19:08:21.184+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Waiting for SSH to come up. Sleeping 5.
2025-02-08 19:08:26.646+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connecting to 35.200.161.127 on port 22, with timeout 10000.
2025-02-08 19:08:27.164+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Connected via SSH.
2025-02-08 19:08:27.312+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Verifying: java -fullversion
2025-02-08 19:08:28.079+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Copying agent.jar to: /tmp
2025-02-08 19:08:28.399+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineCloud#log: Launching Jenkins agent via plugin SSH: java -jar /tmp/agent.jar
2025-02-08 19:08:33.575+0000 [id=1105]	INFO	c.g.j.p.c.ComputeEngineComputer#onConnected: Instance gce-10kfxb is preemptive, setting up preemption listener

gbhat618 · 2025-02-08T19:28:26Z

I wasn't able to reproduce the kind of logs shown in the description.

The log indicates an operation request sent to GCP (such as provisioning or deleting a machine).
This shouldn't happen if there is only an agent reconnection (which uses SSH, not the Google Java client library).

From the current stack trace I am unable to find which code path in this plugin triggered it; the stack trace only shows the internal library line numbers.

@skrishna375, can you please share the full stack trace? (Please redact any sensitive information, such as the VM name.)
