Add a script to reorganize tool data based on the new layout for genomic Data Managers #19728

Merged
merged 7 commits into from
May 9, 2025

natefoo
Member

@natefoo natefoo commented Mar 1, 2025

In #19013 I mentioned that I would write a script to move data from the old layout used by the most common genomic DMs to the new standardized layout - here it is.

Caveat: The __dbkeys__ and twobit tables do not distinguish between dbkey and value (variant); in these tables the dbkey is treated as the variant. So in the (rare) case that you have a variant in these tables, you will need to do some manual post-processing.
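
To find entries that might need the manual post-processing mentioned in the caveat above, one option is to scan the relevant .loc files for values that are not plain dbkeys. The sketch below is not part of this PR. It assumes the usual tab-separated .loc format with `#` comment lines and the value in the first column; the function name `find_possible_variants`, the `known_dbkeys` set, and the sample rows are all hypothetical.

```python
def find_possible_variants(loc_lines, known_dbkeys):
    """Return first-column values that are not in the set of known dbkeys.

    Hypothetical helper: `known_dbkeys` must be supplied by the admin,
    since the __dbkeys__ and twobit tables cannot tell a dbkey from a variant.
    """
    suspects = []
    for line in loc_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comments, as in ordinary .loc files.
        if not line.strip() or line.startswith("#"):
            continue
        value = line.split("\t")[0]
        if value not in known_dbkeys:
            suspects.append(value)
    return suspects


# Made-up sample rows in a (value, name, len_path) layout, for illustration only.
sample_loc = [
    "# value\tname\tlen_path\n",
    "hg38\tHuman (hg38)\t/data/hg38/len/hg38.len\n",
    "hg38_female\tHuman (hg38, female)\t/data/hg38_female/len/hg38_female.len\n",
    "\n",
]
suspects = find_possible_variants(sample_loc, known_dbkeys={"hg38", "mm10"})
print(suspects)
```

Anything this reports is only a candidate for review; whether it really is a variant still has to be decided by hand.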

DM layout changes were in galaxyproject/tools-iuc#6489

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. Install some data using old versions of data managers
    2. Run python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs ./config/tool_data_table_conf.xml
    3. Review the proposed changes and confirm they are what you expect
    4. Run python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs --commit ./config/tool_data_table_conf.xml
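
The preview-then-commit sequence in the steps above can be sketched as a small wrapper. This is only an illustration: it uses just the flags shown in the steps (`--tool-data-path`, `--prune-dirs`, `--commit`), and the helper name `build_reorganize_command` is hypothetical. Judging from steps 2 to 4, a run without `--commit` only reports proposed changes, but that reading is an assumption and not something the sketch verifies.

```python
import shlex


def build_reorganize_command(tool_data_path, table_conf, commit=False, prune_dirs=True):
    """Build the argument list for scripts/reorganize_tool_data.py.

    Hypothetical helper; only flags that appear in the testing steps are used.
    """
    cmd = [
        "python",
        "./scripts/reorganize_tool_data.py",
        "--tool-data-path",
        tool_data_path,
    ]
    if prune_dirs:
        cmd.append("--prune-dirs")
    if commit:
        cmd.append("--commit")
    # The tool data table config is passed last, as in the steps above.
    cmd.append(table_conf)
    return cmd


# Step 2: preview run (no --commit).
dry_run = build_reorganize_command("/path/to/data", "./config/tool_data_table_conf.xml")
# Step 4: the same invocation with --commit added.
commit_run = build_reorganize_command(
    "/path/to/data", "./config/tool_data_table_conf.xml", commit=True
)

print(shlex.join(dry_run))
print(shlex.join(commit_run))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the Galaxy root directory; keeping the two runs separate leaves room to inspect the preview output before committing.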

I just used this to reorganize the brc.galaxyproject.org CVMFS repo - results can be found at http://datacache.galaxyproject.org/brc/ once publishing is complete and changes propagate.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions github-actions bot added this to the 25.0 milestone Mar 1, 2025
@natefoo natefoo changed the title Reorganize tool data Add a script to reorganize tool data based on the new layout for genomic Data Managers Mar 1, 2025
@mvdbeek mvdbeek force-pushed the reorganize-tool-data branch from 67aad94 to 37c5e99 Compare May 9, 2025 08:54
@mvdbeek mvdbeek merged commit 2cf7990 into galaxyproject:dev May 9, 2025
50 of 54 checks passed

github-actions bot commented May 9, 2025

This PR was merged without a "kind/" label, please correct.
