Create rucio.cfg from inside Rucio objectstore #19863

Merged
merged 7 commits into from
Mar 21, 2025

SergeyYakubov
Contributor

The Rucio client library requires rucio.cfg to be present on the system. The objectstore now creates this file itself, so it is always there when the client needs it.
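A minimal sketch of what writing such a file could look like, using only the Python standard library. This is not the code from this PR: the function name `write_rucio_cfg` and its parameters are hypothetical, and the `[client]` keys shown (`rucio_host`, `auth_host`, `account`, `auth_type`, `username`, `password`) are assumed to match what the Rucio client expects for userpass authentication.

```python
import configparser
import os


def write_rucio_cfg(directory, host, auth_host, account, username, password):
    """Write a minimal rucio.cfg into `directory` and return its path.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    # interpolation=None so that a password containing '%' is stored verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config["client"] = {
        "rucio_host": host,
        "auth_host": auth_host,
        "account": account,
        "auth_type": "userpass",
        "username": username,
        "password": password,
    }
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "rucio.cfg")
    # Request owner-only permissions because the file holds a password.
    # The mode only takes effect when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        config.write(handle)
    return path
```

The returned path could then be exported through the `RUCIO_CONFIG` environment variable before the client is instantiated, assuming the installed Rucio version honours that variable.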

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

Member

@nsoranzo nsoranzo left a comment

Thanks for the fix! I think you can rely on the objectstore extra_dirs["temp"] as defined in lib/galaxy/objectstore/__init__.py; see suggestions.

@SergeyYakubov
Contributor Author

I'm not sure why Rucio tests are failing. It seems there is some race condition after recent changes (maybe in switching to Celery to run tests?). It only happens for the tools that have some file as an input. Maybe it does not get uploaded properly before the test runs?

I can remove those tests, unless someone has an idea how to fix it.

As for the failing security check: Rucio does need a password in the config file, so I'm not sure what can be done. I tried to hide it, but the analyser is too clever :)

@mvdbeek
Member

mvdbeek commented Mar 21, 2025

pebble.common.types.RemoteTraceback: Traceback (most recent call last):
  File "/home/runner/work/galaxy/galaxy/galaxy root/.venv/lib/python3.9/site-packages/pebble/common/process.py", line 65, in process_execute
    return Result(ResultStatus.SUCCESS, function(*args, **kwargs))
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/metadata/set_metadata.py", line 535, in set_metadata_portable
    action()
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/metadata/set_metadata.py", line 101, in push_if_necessary
    object_store.update_from_file(dataset.dataset, file_name=external_filename, create=True)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/objectstore/__init__.py", line 658, in update_from_file
    return self._invoke(
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/objectstore/__init__.py", line 492, in _invoke
    return self.__getattribute__(f"_{delegate}")(obj=obj, **kwargs)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/objectstore/rucio.py", line 564, in _update_from_file
    self.rucio_broker.upload(rel_path, source_file)
  File "/home/runner/work/galaxy/galaxy/galaxy root/lib/galaxy/objectstore/rucio.py", line 225, in upload
    self.get_rucio_upload_client().upload(items)
  File "/home/runner/work/galaxy/galaxy/galaxy root/.venv/lib/python3.9/site-packages/rucio/client/uploadclient.py", line 373, in upload
    raise NoFilesUploaded()
rucio.common.exception.NoFilesUploaded: None of the given files have been uploaded.

That's not a terribly helpful message.

@mvdbeek
Member

mvdbeek commented Mar 21, 2025

I've disabled the celery setup, and now I see:

root DEBUG 2025-03-21 09:06:52,190 [pN:main,p:24128,tN:LocalRunner.work_thread-2] wan domain is used for the upload
root DEBUG 2025-03-21 09:06:52,190 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Registering file
root DEBUG 2025-03-21 09:06:52,201 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Skipping dataset registration
charset_normalizer DEBUG 2025-03-21 09:06:52,215 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Encoding detection: ascii is most likely the one.
charset_normalizer DEBUG 2025-03-21 09:06:52,215 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Encoding detection: ascii is most likely the one.
charset_normalizer DEBUG 2025-03-21 09:06:52,215 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Encoding detection: ascii is most likely the one.
root DEBUG 2025-03-21 09:06:52,216 [pN:main,p:24128,tN:LocalRunner.work_thread-2] File DID does not exist
root INFO 2025-03-21 09:06:52,253 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Successfully added replica in Rucio catalogue at TEST
root INFO 2025-03-21 09:06:52,328 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Successfully added replication rule at TEST
root DEBUG 2025-03-21 09:06:52,333 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Checking if file://localhost/tmp/galaxy/b3/1e/18210f91b718d4a507967859cbb95bf7d45bdcd0ff9ab2f5b98a0cd1c6d2321e exists
root DEBUG 2025-03-21 09:06:52,334 [pN:main,p:24128,tN:LocalRunner.work_thread-2] [{'hostname': 'localhost', 'scheme': 'file', 'port': 0, 'prefix': '/tmp', 'impl': 'rucio.rse.protocols.posix.Default', 'domains': {'lan': {'read': 1, 'write': 1, 'delete': 1}, 'wan': {'read': 1, 'write': 1, 'delete': 1, 'third_party_copy_read': 0, 'third_party_copy_write': 0}}, 'extended_attributes': None}]
root INFO 2025-03-21 09:06:52,334 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Trying upload with file to TEST
root DEBUG 2025-03-21 09:06:52,334 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Processing upload with the domain: wan
root DEBUG 2025-03-21 09:06:52,334 [pN:main,p:24128,tN:LocalRunner.work_thread-2] The PFN created from the LFN: file://localhost/tmp/galaxy/b3/1e/18210f91b718d4a507967859cbb95bf7d45bdcd0ff9ab2f5b98a0cd1c6d2321e
root DEBUG 2025-03-21 09:06:52,334 [pN:main,p:24128,tN:LocalRunner.work_thread-2] put: Attempt 1
root INFO 2025-03-21 09:06:52,335 [pN:main,p:24128,tN:LocalRunner.work_thread-2] Successful upload of temporary file. file://localhost/tmp/galaxy/b3/1e/18210f91b718d4a507967859cbb95bf7d45bdcd0ff9ab2f5b98a0cd1c6d2321e.rucio.upload
root DEBUG 2025-03-21 09:06:52,335 [pN:main,p:24128,tN:LocalRunner.work_thread-2] skip_upload_stat=False
root DEBUG 2025-03-21 09:06:52,335 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: pfn=file://localhost/tmp/galaxy/b3/1e/18210f91b718d4a507967859cbb95bf7d45bdcd0ff9ab2f5b98a0cd1c6d2321e.rucio.upload
root DEBUG 2025-03-21 09:06:52,335 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: unexpected error=Filesize came back as 0. Potential storage race condition, need to retry.
root DEBUG 2025-03-21 09:06:52,335 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: unknown edge case, retrying in 1s
INFO:     127.0.0.1:57190 - "GET /api/jobs/adb5f5c93f827949 HTTP/1.1" 200 OK
INFO:     127.0.0.1:57193 - "GET /api/jobs/adb5f5c93f827949 HTTP/1.1" 200 OK
INFO:     127.0.0.1:57194 - "GET /api/jobs/adb5f5c93f827949 HTTP/1.1" 200 OK
INFO:     127.0.0.1:57195 - "GET /api/jobs/adb5f5c93f827949 HTTP/1.1" 200 OK
root DEBUG 2025-03-21 09:06:53,340 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: pfn=file://localhost/tmp/galaxy/b3/1e/18210f91b718d4a507967859cbb95bf7d45bdcd0ff9ab2f5b98a0cd1c6d2321e.rucio.upload
root DEBUG 2025-03-21 09:06:53,340 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: unexpected error=Filesize came back as 0. Potential storage race condition, need to retry.
root DEBUG 2025-03-21 09:06:53,341 [pN:main,p:24128,tN:LocalRunner.work_thread-2] stat: unknown edge case, retrying in 2s

We do create uploads of zero-size files; is that something Rucio doesn't support?

Either Celery or extended metadata (or both) is not compatible with Rucio. Maybe more than
one connection isn't permitted?
@mvdbeek
Member

mvdbeek commented Mar 21, 2025

The last commit disables Celery and extended metadata for the Rucio object store, and with that the tests pass. Both of these options are key parts we want to build on in the future, so it would be worth figuring out why they don't work. Is it possible that we can't connect to Rucio from more than one process? I couldn't make any progress even when I simply skipped the upload of empty files (though treating an empty file as an error seems fairly opinionated).
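For reference, skipping empty files before an upload could look roughly like the sketch below. It is only an illustration of the check described in this comment, not the code that was tried: `split_empty_files` is a hypothetical helper, and it makes no calls into the Rucio client.

```python
import os


def split_empty_files(paths):
    """Partition `paths` into (non_empty, empty) lists by on-disk size.

    Hypothetical helper: the caller would upload only the first list.
    """
    non_empty, empty = [], []
    for path in paths:
        if os.path.getsize(path) > 0:
            non_empty.append(path)
        else:
            empty.append(path)
    return non_empty, empty
```

As noted in this comment, a filter like this did not resolve the failures on its own.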

@mvdbeek
Member

mvdbeek commented Mar 21, 2025

So ... Rucio doesn't update a file whose contents are unchanged (I believe that's the source of rucio.common.exception.NoFilesUploaded: None of the given files have been uploaded.). That is strict, but I also noticed that in extended metadata mode we would push certain job outputs twice, among them the outputs of the new upload tool. I hope 39f304a won't break any of the existing tests. This is a nice performance fix if we can make it work.
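One way to avoid pushing identical content twice is to remember a digest of what was last pushed for each output. The sketch below illustrates that general idea only; it is not what 39f304a does, and `PushOnce`, `file_digest` and the `push` callback are hypothetical names.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class PushOnce:
    """Wrap a push callback so that unchanged content is not pushed again."""

    def __init__(self, push):
        self._push = push  # callable taking (key, path); assumed, not a Galaxy API
        self._last_digest = {}

    def push_if_changed(self, key, path):
        """Push `path` under `key` unless the same bytes were pushed before.

        Returns True if the callback was invoked, False if it was skipped.
        """
        digest = file_digest(path)
        if self._last_digest.get(key) == digest:
            return False
        self._push(key, path)
        self._last_digest[key] = digest
        return True
```

The in-memory dictionary only helps within a single process; with several workers the record of pushed digests would have to live somewhere shared.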

Member

@mvdbeek mvdbeek left a comment

The failing test is unrelated.

@mvdbeek mvdbeek merged commit 5e3ecaf into galaxyproject:dev Mar 21, 2025
52 of 55 checks passed
@galaxyproject galaxyproject deleted a comment from github-actions bot Mar 21, 2025
davelopez added a commit to SergeyYakubov/galaxy that referenced this pull request May 7, 2025
@davelopez davelopez mentioned this pull request May 7, 2025
4 tasks
3 participants