GC No longer Executes #21824

Open
mdavid01 opened this issue Apr 3, 2025 · 5 comments

mdavid01 commented Apr 3, 2025

Hi Team: we raised this issue at the 4/2 community meeting.
On March 6, GC ran as expected. On March 7, it stopped executing, in the same fashion as our other environment.
The GC log shows only:
{"errors":[{"code":"NOT_FOUND","message":"{"code":10010,"message":"object is not found","details":"log entity: b768f80f8781bf9ef30708f0"}"}]}

We get the same result on scheduled, manual, and dry runs of GC; only the log entity in the error message changes. Since we deploy with Helm, there's no easy way for us to trace the code.

We initially opened this issue as #21655 but received no actionable response. We've googled and tried multiple fixes. We don't know how to find the log entity -- or is it the log entity itself that's not found?

LM (LMT) leadership is quite frustrated with the lack of attention to this issue. We're happy to answer any questions about our environment, pod logs, etc.

Sorry, guys -- we really need help with this.
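
For reference, the GC history and the per-run log can also be pulled straight from the Harbor v2 API, which may show whether the log entity is missing server-side or only through the UI. A minimal sketch; the hostname, credentials, and <gc_id> are placeholders:

    # list recent GC runs (id, job status, start/end time)
    curl -s -u admin:<password> "https://<harbor-host>/api/v2.0/system/gc?page=1&page_size=5"

    # fetch the log of one run, using an id from the list above;
    # this appears to be the call behind the NOT_FOUND / "log entity" error
    curl -s -u admin:<password> "https://<harbor-host>/api/v2.0/system/gc/<gc_id>/log"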

@wy65701436 (Contributor)

  1. What was the status of the GC execution on March 6?
  2. Can you go to the Job Service dashboard and share the pending count of the GARBAGE_COLLECTION job queue?
  3. What version of Harbor are you running?
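
If it helps, the dashboard numbers can also be read over the API. A minimal sketch, assuming Harbor 2.9 or later (which, as far as I know, is when the Job Service dashboard endpoints under /api/v2.0/jobservice were added); host and credentials are placeholders:

    # per-queue pending count and latency (GARBAGE_COLLECTION, EXECUTION_SWEEP, ...)
    curl -s -u admin:<password> "https://<harbor-host>/api/v2.0/jobservice/queues"

    # worker pools and which workers are currently occupied
    curl -s -u admin:<password> "https://<harbor-host>/api/v2.0/jobservice/pools"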

wy65701436 commented Apr 7, 2025

Hi @mdavid01

To resolve this issue, follow these steps:

  1. Stop all running jobs and disable all scheduled tasks, such as GC, replication, tag retention, and scanning.
    
  2. Clear all job queues in the Job Service dashboard.
    
  3. Ensure that all workers in the Job Service dashboard are unoccupied.
    
  4. Flush the Job Service database in Redis (see the sketch after this list for selecting the correct database index).
     >kubectl exec -it <redis-pod> -- bash
     >redis-cli
     >flushdb
    
  5. Restart all Job Service pods.
    
  6. Manually execute GC and change the worker setting to 5 on the cleanup page, but avoid setting a schedule. Monitor the running GC closely: there is a dedicated log file for this job in the Job Service pod's /var/log/jobs/ directory. Remember the name of that file, because even if the GC shows an Error status, the GC goroutine is still running in the background as long as the log file keeps updating, and you can check the file to track the progress of the GC.
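
For step 4, a minimal sketch of flushing only the Job Service data. It assumes the chart's internal Redis, where the Job Service uses database index 1 by default (check jobserviceDatabaseIndex or the jobservice config if you have overridden it); namespace and pod name are examples:

    kubectl -n harbor exec -it <redis-pod> -- redis-cli

    # inside redis-cli: switch to the Job Service database before flushing,
    # so database 0 (used by core) is left untouched
    SELECT 1
    FLUSHDB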
    

mdavid01 commented Apr 8, 2025

  1. What was the status of the GC execution on March 6?
  2. Can you go to the Job Service dashboard and share the pending count of the GARBAGE_COLLECTION job queue?
  3. What version of Harbor are you running?

Thanks, Wang:

  1. AWS Gov: SUCCESS, 1718 blob(s) and 210 manifest(s) deleted, 25.06GiB space freed up, Mar 5, 2025, 7:00:00 PM to Mar 5, 2025, 10:49:59 PM. Note, however, that the log output for this and every other run looks like "{"errors":[{"code":"NOT_FOUND","message":"{"code":10010,"message":"object is not found","details":"log entity: 27f6668dedbf4fe4a5fbfcb9"}"}]}"
  2. Pending counts (GC is run daily @ 8pm ET):
    AWS Gov: EXECUTION_SWEEP, 1036 pending, 1164hrs 23min 16sec
    AWS Gov: GARBAGE_COLLECTION, 32 pending, 779hrs 23min 16sec
    AWS Commercial: GARBAGE_COLLECTION, 5 pending, 121hrs 43min 26sec
    AWS Commercial: no other pending queues (e.g., no EXECUTION_SWEEP)
  3. Both AWS Gov and Commercial: version v2.11.1-6b7ecba1
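
For reference, backlogs like these can also be stopped without the UI. A sketch against the Job Service dashboard API; the action payload is an assumption based on recent Harbor v2 swagger definitions (stop/pause/resume), so verify it against the 2.11.1 API docs first:

    # stop everything pending in the GARBAGE_COLLECTION queue
    curl -s -u admin:<password> -X PUT \
      -H "Content-Type: application/json" \
      -d '{"action":"stop"}' \
      "https://<harbor-host>/api/v2.0/jobservice/queues/GARBAGE_COLLECTION"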

mdavid01 commented Apr 8, 2025


Thanks, Wang. We will attempt this over the weekend, as these are critical production systems. At this time, none of the 5 Job Service pods shows any files under /var/log/jobs. I assume we need root access to view these files?
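
On the empty /var/log/jobs: a quick sketch for checking whether the file-based job logger is enabled at all. Namespace and pod name are examples, and the config path is an assumption based on where the Helm chart mounts the Job Service config:

    # list per-job log files (root should not be required just to read the directory)
    kubectl -n harbor exec <jobservice-pod> -- ls -l /var/log/jobs

    # inspect the job_loggers section; if only STD_OUTPUT is configured,
    # job logs go to the pod's stdout and no files will ever appear under /var/log/jobs
    kubectl -n harbor exec <jobservice-pod> -- cat /etc/jobservice/config.yml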

mdavid01 commented Apr 14, 2025

Hello Wang: we did not execute the steps recommended above, because I found what I believe to be confirmation that GC is running, but far too slowly to ever complete. Tracking the Artifact_Trash database table row count every 10 seconds, I found the record count decreasing at roughly 1 entry per minute during prime time. Our artifact trash table started at 57,000+ entries this morning. Below is a summary of GC performance based on the attached performance tracking file:

  • Estimated total hours to complete garbage collection: 844
  • Total hours for the test (2 run segments): 13.4
  • Maximum artifacts deleted during any 1-minute period: 3.5
  • Minimum artifacts deleted during any 1-minute period: -1.83
  • Average artifacts deleted during any 1-minute period: 1.1

At the current rate, GC will run for 844 hours if no new deletions are added; put another way, we can only process about 1400 GC artifacts per day.

  • Is there any other script you can provide to safely and quickly remove the "trash" blobs and manifests?
  • Will copying the active repos and artifacts to another database eliminate the trash (our least preferred option)?

Attached is the Excel file that captured the speed at which GC is removing Artifact_Trash records from the database. I assume that tracking Artifact_Trash activity serves as a valid proxy for GC performance.

Harbor Garbage Collection Timings.xlsx
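
For anyone who wants to reproduce the sampling behind the spreadsheet, a minimal sketch assuming direct access to the Harbor Postgres pod and the default registry database; namespace, pod name, and user are placeholders:

    # sample the artifact_trash row count every 10 seconds
    while true; do
      kubectl -n harbor exec <harbor-database-pod> -- \
        psql -U postgres -d registry -t -c "SELECT now(), count(*) FROM artifact_trash;"
      sleep 10
    done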

Thanks.
Michael
