diff --git a/.editorconfig b/.editorconfig index 72dda289..6d9b74cc 100644 --- a/.editorconfig +++ b/.editorconfig @@ -31,3 +31,7 @@ indent_size = unset # ignore python and markdown [*.{py,md}] indent_style = unset + +# ignore ro-crate metadata files +[**/ro-crate-metadata.json] +insert_final_newline = unset diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index fbf55836..a5434a32 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -1,4 +1,4 @@ -# nf-core/funcscan: Contributing Guidelines +# `nf-core/funcscan`: Contributing Guidelines Hi there! Many thanks for taking an interest in improving nf-core/funcscan. @@ -19,7 +19,7 @@ If you'd like to write some code for nf-core/funcscan, the standard workflow is 1. Check that there isn't already an issue about your idea in the [nf-core/funcscan issues](https://github.com/nf-core/funcscan/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this 2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/funcscan repository](https://github.com/nf-core/funcscan) to your GitHub account 3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions) -4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). +4. Use `nf-core pipelines schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). 5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/). @@ -40,7 +40,7 @@ There are typically two types of tests that run: ### Lint tests `nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to. -To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint <pipeline-directory>` command. +To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core pipelines lint <pipeline-directory>` command. If any failures or warnings are encountered, please follow the listed URL for more documentation. @@ -55,9 +55,9 @@ These tests are run both with the latest available version of `Nextflow` and als :warning: Only in the unlikely and regretful event of a release happening with a bug. -- On your own fork, make a new branch `patch` based on `upstream/master`. +- On your own fork, make a new branch `patch` based on `upstream/main` or `upstream/master`. - Fix the bug, and bump version (X.Y.Z+1). -- A PR should be made on `master` from patch to directly this particular bug. +- Open a pull-request from `patch` to `main`/`master` with the changes.
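For orientation, the patch steps above can be expressed as a short shell sequence. This is a minimal sketch that assumes the nf-core repository is configured on your fork as a remote named `upstream`:

```bash
# Branch off the latest released state (remote name "upstream" is an assumption)
git fetch upstream
git checkout -b patch upstream/main   # or upstream/master, depending on the repository

# ...fix the bug and bump the pipeline version (X.Y.Z+1)...

# Publish the branch, then open a pull-request from `patch` to `main`/`master`
git push --set-upstream origin patch
```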
## Getting help @@ -65,17 +65,17 @@ For further information/help, please consult the [nf-core/funcscan documentation ## Pipeline contribution conventions -To make the nf-core/funcscan code and processing logic more understandable for new contributors and to ensure quality, we semi-standardise the way the code and other contributions are written. +To make the `nf-core/funcscan` code and processing logic more understandable for new contributors and to ensure quality, we semi-standardise the way the code and other contributions are written. ### Adding a new step If you wish to contribute a new step, please use the following coding standards: -1. Define the corresponding input channel into your new process from the expected previous process channel +1. Define the corresponding input channel into your new process from the expected previous process channel. 2. Write the process block (see below). 3. Define the output channel if needed (see below). 4. Add any new parameters to `nextflow.config` with a default (see below). -5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core schema build` tool). +5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool). 6. Add sanity checks and validation for all relevant parameters. 7. Perform local tests to validate that the new code works as expected. 8. If applicable, add a new test command in `.github/workflow/ci.yml`. @@ -84,13 +84,13 @@ If you wish to contribute a new step, please use the following coding standards: ### Default values -Parameters should be initialised / defined with default values in `nextflow.config` under the `params` scope. +Parameters should be initialised / defined with default values within the `params` scope in `nextflow.config`. -Once there, use `nf-core schema build` to add to `nextflow_schema.json`. +Once there, use `nf-core pipelines schema build` to add to `nextflow_schema.json`. ### Default processes resource requirements -Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/master/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. +Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generically with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. An nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config), which has the default process as a single-core process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block.
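To make that last point concrete, here is a hedged sketch of a `script:` block body (which Nextflow ultimately runs as a shell command). The tool name and its flags are hypothetical placeholders, and `task.memory.toGiga()` is one common way to format the memory value for a CLI flag:

```bash
# Hypothetical tool invocation inside a process `script:` block; Nextflow
# interpolates ${task.cpus} and ${task.memory.toGiga()} before the shell runs.
sometool \
    --threads ${task.cpus} \
    --max-memory "${task.memory.toGiga()}g" \
    --input contigs.fasta
```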
@@ -103,7 +103,7 @@ Please use the following naming schemes, to make it easy to understand what is g ### Nextflow version bumping -If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core bump-version --nextflow . [min-nf-version]` +If you are using a new feature from core Nextflow, you may bump the minimum required version of Nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]` ### Images and figures diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml index 34300c5d..a16aefdf 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.yml +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -9,7 +9,6 @@ body: - [nf-core website: troubleshooting](https://nf-co.re/usage/troubleshooting) - [nf-core/funcscan pipeline documentation](https://nf-co.re/funcscan/usage) - - type: textarea id: description attributes: diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index fa844526..b6e5cebd 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -17,7 +17,7 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/func - [ ] If you've fixed a bug or added code that should be tested, add tests! - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/funcscan/tree/master/.github/CONTRIBUTING.md) - [ ] If necessary, also make a PR on the nf-core/funcscan _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. -- [ ] Make sure your code lints (`nf-core pipelines lint`). +- [ ] Make sure your code lints (`nf-core pipelines lint`). - [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`). - [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`). - [ ] Usage Documentation in `docs/usage.md` is updated. diff --git a/.github/workflows/awsfulltest.yml b/.github/workflows/awsfulltest.yml index 6652683e..a27616dc 100644 --- a/.github/workflows/awsfulltest.yml +++ b/.github/workflows/awsfulltest.yml @@ -1,18 +1,48 @@ name: nf-core AWS full size tests -# This workflow is triggered on published releases. +# This workflow is triggered on PRs opened against the main/master branch. # It can be additionally triggered manually with GitHub actions workflow dispatch button.
# It runs the -profile 'test_full' on AWS batch on: - release: - types: [published] + pull_request: + branches: + - main + - master workflow_dispatch: + pull_request_review: + types: [submitted] + jobs: run-platform: name: Run AWS full tests - if: github.repository == 'nf-core/funcscan' + # run only if the PR is approved by at least 2 reviewers and against the master branch or manually triggered + if: github.repository == 'nf-core/funcscan' && github.event.review.state == 'approved' && github.event.pull_request.base.ref == 'master' || github.event_name == 'workflow_dispatch' runs-on: ubuntu-latest steps: + - name: Get PR reviews + uses: octokit/request-action@v2.x + if: github.event_name != 'workflow_dispatch' + id: check_approvals + continue-on-error: true + with: + route: GET /repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/reviews?per_page=100 + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + + - name: Check for approvals + if: ${{ failure() && github.event_name != 'workflow_dispatch' }} + run: | + echo "No review approvals found. At least 2 approvals are required to run this action automatically." + exit 1 + + - name: Check for enough approvals (>=2) + id: test_variables + if: github.event_name != 'workflow_dispatch' + run: | + JSON_RESPONSE='${{ steps.check_approvals.outputs.data }}' + CURRENT_APPROVALS_COUNT=$(echo $JSON_RESPONSE | jq -c '[.[] | select(.state | contains("APPROVED")) ] | length') + test $CURRENT_APPROVALS_COUNT -ge 2 || exit 1 # At least 2 approvals are required + - name: Launch workflow via Seqera Platform uses: seqeralabs/action-tower-launch@v2 with: diff --git a/.github/workflows/branch.yml b/.github/workflows/branch.yml index 6e4495ad..4815e1eb 100644 --- a/.github/workflows/branch.yml +++ b/.github/workflows/branch.yml @@ -1,15 +1,17 @@ name: nf-core branch protection -# This workflow is triggered on PRs to master branch on the repository -# It fails when someone tries to make a PR against the nf-core `master` branch instead of `dev` +# This workflow is triggered on PRs to `main`/`master` branch on the repository +# It fails when someone tries to make a PR against the nf-core `main`/`master` branch instead of `dev` on: pull_request_target: - branches: [master] + branches: + - main + - master jobs: test: runs-on: ubuntu-latest steps: - # PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches + # PRs to the nf-core repo main/master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches - name: Check PRs if: github.repository == 'nf-core/funcscan' run: | @@ -22,7 +24,7 @@ jobs: uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2 with: message: | - ## This PR is against the `master` branch :x: + ## This PR is against the `${{github.event.pull_request.base.ref}}` branch :x: * Do not close this PR * Click _Edit_ and change the `base` to `dev` @@ -32,9 +34,9 @@ jobs: Hi @${{ github.event.pull_request.user.login }}, - It looks like this pull-request is has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `master` branch. - The `master` branch on nf-core repositories should always contain code from the latest release. - Because of this, PRs to `master` are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch. 
+ It looks like this pull-request has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) ${{github.event.pull_request.base.ref}} branch. + The ${{github.event.pull_request.base.ref}} branch on nf-core repositories should always contain code from the latest release. + Because of this, PRs to ${{github.event.pull_request.base.ref}} are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch. You do not need to close this PR, you can change the target branch to `dev` by clicking the _"Edit"_ button at the top of this page. Note that even after this, the test will continue to show as failing until you push a new commit. diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index dd8aa75b..be05a2ea 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -3,18 +3,20 @@ name: nf-core CI on: push: branches: - - "dev" + - dev pull_request: branches: - "dev" - "master" release: - types: - - "published" + types: [published] + workflow_dispatch: env: NXF_ANSI_LOG: false - NFTEST_VER: "0.8.4" + NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity + NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity + NFTEST_VER: "0.9.2" concurrency: group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }} @@ -36,16 +38,23 @@ jobs: fi test: - name: nf-test - needs: define_nxf_versions + name: "Run pipeline with test data (${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }})" + # Only run on push if this is the nf-core dev branch (merged PRs) + if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/funcscan') }}" runs-on: ubuntu-latest strategy: fail-fast: false matrix: - NXF_VER: ${{ fromJson(needs.define_nxf_versions.outputs.matrix) }} - tags: + NXF_VER: + - "24.10.4" + - "latest-everything" + profile: + - "conda" + - "docker" + - "singularity" + test_name: - "test" - - "test_nothing" + - "test_minimal" - "test_bakta" - "test_prokka" - "test_bgc_pyrodigal" @@ -56,12 +65,24 @@ jobs: - "test_taxonomy_prokka" - "test_preannotated" - "test_preannotated_bgc" - profile: - - "docker" - + isMaster: + - ${{ github.base_ref == 'master' }} + # Exclude conda and singularity on dev + exclude: + - isMaster: false + profile: "conda" + - isMaster: false + profile: "singularity" steps: - name: Check out pipeline code - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 + with: + fetch-depth: 0 + + - name: Set up Nextflow + uses: nf-core/setup-nextflow@v2 + with: + version: "${{ matrix.NXF_VER }}" - name: Check out test data uses: actions/checkout@v3 @@ -71,12 +92,32 @@ jobs: path: test-datasets/ fetch-depth: 1 - - name: Install Nextflow - uses: nf-core/setup-nextflow@v1 + - name: Set up Apptainer + if: matrix.profile == 'singularity' + uses: eWaterCycle/setup-apptainer@main + + - name: Set up Singularity + if: matrix.profile == 'singularity' + run: | + mkdir -p $NXF_SINGULARITY_CACHEDIR + mkdir -p $NXF_SINGULARITY_LIBRARYDIR + + - name: Set up Miniconda + if: matrix.profile == 'conda' + uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3 with: - version: "${{ matrix.NXF_VER }}" + miniconda-version: "latest" + auto-update-conda: true + conda-solver:
libmamba + channels: conda-forge,bioconda + + - name: Set up Conda + if: matrix.profile == 'conda' + run: | + echo $(realpath $CONDA)/condabin >> $GITHUB_PATH + echo $(realpath python) >> $GITHUB_PATH - - name: Disk space cleanup + - name: Clean up Disk space uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 - name: Install nf-test @@ -84,9 +125,9 @@ jobs: wget -qO- https://code.askimed.com/install/nf-test | bash -s $NFTEST_VER sudo mv nf-test /usr/local/bin/ - - name: Run nf-test + - name: "Run pipeline with test data ${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }}" run: | - nf-test test --tag ${{ matrix.tags }} --profile ${{ matrix.tags }},${{ matrix.profile }} --junitxml=test.xml + nf-test test --tag ${{ matrix.test_name }} --profile ${{ matrix.test_name }},${{ matrix.profile }} --junitxml=test.xml - name: Output log on failure if: failure() diff --git a/.github/workflows/download_pipeline.yml b/.github/workflows/download_pipeline.yml index 2d20d644..efdf8abe 100644 --- a/.github/workflows/download_pipeline.yml +++ b/.github/workflows/download_pipeline.yml @@ -1,33 +1,42 @@ -name: Test successful pipeline download with 'nf-core download' +name: Test successful pipeline download with 'nf-core pipelines download' # Run the workflow when: # - dispatched manually -# - when a PR is opened or reopened to master branch +# - when a PR is opened or reopened to main/master branch # - the head branch of the pull request is updated, i.e. if fixes for a release are pushed last minute to dev. on: workflow_dispatch: inputs: testbranch: - description: "The specific branch you wish to utilize for the test execution of nf-core download." + description: "The specific branch you wish to utilize for the test execution of nf-core pipelines download." 
required: true default: "dev" pull_request: - types: - - opened - - edited - - synchronize - branches: - - master - pull_request_target: branches: + - main - master env: NXF_ANSI_LOG: false jobs: + configure: + runs-on: ubuntu-latest + outputs: + REPO_LOWERCASE: ${{ steps.get_repo_properties.outputs.REPO_LOWERCASE }} + REPOTITLE_LOWERCASE: ${{ steps.get_repo_properties.outputs.REPOTITLE_LOWERCASE }} + REPO_BRANCH: ${{ steps.get_repo_properties.outputs.REPO_BRANCH }} + steps: + - name: Get the repository name and current branch + id: get_repo_properties + run: | + echo "REPO_LOWERCASE=${GITHUB_REPOSITORY,,}" >> "$GITHUB_OUTPUT" + echo "REPOTITLE_LOWERCASE=$(basename ${GITHUB_REPOSITORY,,})" >> "$GITHUB_OUTPUT" + echo "REPO_BRANCH=${{ github.event.inputs.testbranch || 'dev' }}" >> "$GITHUB_OUTPUT" + download: runs-on: ubuntu-latest + needs: configure steps: - name: Install Nextflow uses: nf-core/setup-nextflow@v2 @@ -35,52 +44,91 @@ jobs: - name: Disk space cleanup uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 - - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 + - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 with: python-version: "3.12" architecture: "x64" - - uses: eWaterCycle/setup-singularity@931d4e31109e875b13309ae1d07c70ca8fbc8537 # v7 + + - name: Setup Apptainer + uses: eWaterCycle/setup-apptainer@4bb22c52d4f63406c49e94c804632975787312b3 # v2.0.0 with: - singularity-version: 3.8.3 + apptainer-version: 1.3.4 - name: Install dependencies run: | python -m pip install --upgrade pip pip install git+https://github.com/nf-core/tools.git@dev - - name: Get the repository name and current branch set as environment variable + - name: Make a cache directory for the container images run: | - echo "REPO_LOWERCASE=${GITHUB_REPOSITORY,,}" >> ${GITHUB_ENV} - echo "REPOTITLE_LOWERCASE=$(basename ${GITHUB_REPOSITORY,,})" >> ${GITHUB_ENV} - echo "REPO_BRANCH=${{ github.event.inputs.testbranch || 'dev' }}" >> ${GITHUB_ENV} + mkdir -p ./singularity_container_images - name: Download the pipeline env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images run: | - nf-core download ${{ env.REPO_LOWERCASE }} \ - --revision ${{ env.REPO_BRANCH }} \ - --outdir ./${{ env.REPOTITLE_LOWERCASE }} \ + nf-core pipelines download ${{ needs.configure.outputs.REPO_LOWERCASE }} \ + --revision ${{ needs.configure.outputs.REPO_BRANCH }} \ + --outdir ./${{ needs.configure.outputs.REPOTITLE_LOWERCASE }} \ --compress "none" \ --container-system 'singularity' \ - --container-library "quay.io" -l "docker.io" -l "ghcr.io" \ + --container-library "quay.io" -l "docker.io" -l "community.wave.seqera.io/library/" \ --container-cache-utilisation 'amend' \ - --download-configuration + --download-configuration 'yes' - name: Inspect download - run: tree ./${{ env.REPOTITLE_LOWERCASE }} + run: tree ./${{ needs.configure.outputs.REPOTITLE_LOWERCASE }} + + - name: Inspect container images + run: tree ./singularity_container_images | tee ./container_initial + + - name: Count the downloaded number of container images + id: count_initial + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Initial container image count: $image_count" + echo "IMAGE_COUNT_INITIAL=$image_count" >> "$GITHUB_OUTPUT" - name: Run the downloaded pipeline (stub) id: stub_run_pipeline continue-on-error: true env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images 
NXF_SINGULARITY_HOME_MOUNT: true - run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -stub -profile test,singularity --outdir ./results + run: nextflow run ./${{ needs.configure.outputs.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ needs.configure.outputs.REPO_BRANCH }}) -stub -profile test,singularity --outdir ./results - name: Run the downloaded pipeline (stub run not supported) id: run_pipeline - if: ${{ job.steps.stub_run_pipeline.status == failure() }} + if: ${{ steps.stub_run_pipeline.outcome == 'failure' }} env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images NXF_SINGULARITY_HOME_MOUNT: true - run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -profile test,singularity --outdir ./results + run: nextflow run ./${{ needs.configure.outputs.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ needs.configure.outputs.REPO_BRANCH }}) -profile test,singularity --outdir ./results + + - name: Count the downloaded number of container images + id: count_afterwards + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Post-pipeline run container image count: $image_count" + echo "IMAGE_COUNT_AFTER=$image_count" >> "$GITHUB_OUTPUT" + + - name: Compare container image counts + id: count_comparison + run: | + if [ "${{ steps.count_initial.outputs.IMAGE_COUNT_INITIAL }}" -ne "${{ steps.count_afterwards.outputs.IMAGE_COUNT_AFTER }}" ]; then + initial_count=${{ steps.count_initial.outputs.IMAGE_COUNT_INITIAL }} + final_count=${{ steps.count_afterwards.outputs.IMAGE_COUNT_AFTER }} + difference=$((final_count - initial_count)) + echo "$difference additional container images were downloaded at runtime. The pipeline has no support for offline runs!" + tree ./singularity_container_images > ./container_afterwards + diff ./container_initial ./container_afterwards + exit 1 + else + echo "The pipeline can be downloaded successfully!" + fi + + - name: Upload Nextflow logfile for debugging purposes + uses: actions/upload-artifact@v4 + with: + name: nextflow_logfile.txt + path: .nextflow.log* + include-hidden-files: true diff --git a/.github/workflows/fix-linting.yml b/.github/workflows/fix-linting.yml index f0aa68f2..5a7d4983 100644 --- a/.github/workflows/fix-linting.yml +++ b/.github/workflows/fix-linting.yml @@ -13,7 +13,7 @@ jobs: runs-on: ubuntu-latest steps: # Use the @nf-core-bot token to check out so we can push later - - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 with: token: ${{ secrets.nf_core_bot_auth_token }} @@ -32,7 +32,7 @@ jobs: GITHUB_TOKEN: ${{ secrets.nf_core_bot_auth_token }} # Install and run pre-commit - - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 + - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 with: python-version: "3.12" diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index 1fcafe88..dbd52d5a 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -1,6 +1,6 @@ name: nf-core linting # This workflow is triggered on pushes and PRs to the repository. -# It runs the `nf-core lint` and markdown lint tests to ensure +# It runs the `nf-core pipelines lint` and markdown lint tests to ensure # that the code meets the nf-core guidelines.
on: push: @@ -14,10 +14,10 @@ jobs: pre-commit: runs-on: ubuntu-latest steps: - - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 - name: Set up Python 3.12 - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 + uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 with: python-version: "3.12" @@ -31,27 +31,42 @@ jobs: runs-on: ubuntu-latest steps: - name: Check out pipeline code - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 - name: Install Nextflow uses: nf-core/setup-nextflow@v2 - - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 + - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 with: python-version: "3.12" architecture: "x64" + - name: read .nf-core.yml + uses: pietrobolcato/action-read-yaml@1.1.0 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + - name: Install dependencies run: | python -m pip install --upgrade pip - pip install nf-core + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Run nf-core pipelines lint + if: ${{ github.base_ref != 'master' }} + env: + GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} + run: nf-core -l lint_log.txt pipelines lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md - - name: Run nf-core lint + - name: Run nf-core pipelines lint --release + if: ${{ github.base_ref == 'master' }} env: GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} - run: nf-core -l lint_log.txt lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md + run: nf-core -l lint_log.txt pipelines lint --release --dir ${GITHUB_WORKSPACE} --markdown lint_results.md - name: Save PR number if: ${{ always() }} @@ -59,7 +74,7 @@ jobs: - name: Upload linting log file artifact if: ${{ always() }} - uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4 + uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4 with: name: linting-logs path: | diff --git a/.github/workflows/linting_comment.yml b/.github/workflows/linting_comment.yml index 40acc23f..95b6b6af 100644 --- a/.github/workflows/linting_comment.yml +++ b/.github/workflows/linting_comment.yml @@ -11,7 +11,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Download lint results - uses: dawidd6/action-download-artifact@09f2f74827fd3a8607589e5ad7f9398816f540fe # v3 + uses: dawidd6/action-download-artifact@20319c5641d495c8a52e688b7dc5fada6c3a9fbc # v8 with: workflow: linting.yml workflow_conclusion: completed diff --git a/.github/workflows/release-announcements.yml b/.github/workflows/release-announcements.yml index 03ecfcf7..76a9e67e 100644 --- a/.github/workflows/release-announcements.yml +++ b/.github/workflows/release-announcements.yml @@ -12,7 +12,7 @@ jobs: - name: get topics and convert to hashtags id: get_topics run: | - echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" >> $GITHUB_OUTPUT + echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | 
select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" | sed 's/-//g' >> $GITHUB_OUTPUT - uses: rzr/fediverse-action@master with: @@ -27,39 +27,6 @@ jobs: ${{ steps.get_topics.outputs.topics }} #nfcore #openscience #nextflow #bioinformatics - send-tweet: - runs-on: ubuntu-latest - - steps: - - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 - with: - python-version: "3.10" - - name: Install dependencies - run: pip install tweepy==4.14.0 - - name: Send tweet - shell: python - run: | - import os - import tweepy - - client = tweepy.Client( - access_token=os.getenv("TWITTER_ACCESS_TOKEN"), - access_token_secret=os.getenv("TWITTER_ACCESS_TOKEN_SECRET"), - consumer_key=os.getenv("TWITTER_CONSUMER_KEY"), - consumer_secret=os.getenv("TWITTER_CONSUMER_SECRET"), - ) - tweet = os.getenv("TWEET") - client.create_tweet(text=tweet) - env: - TWEET: | - Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! - - Please see the changelog: ${{ github.event.release.html_url }} - TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }} - TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }} - TWITTER_ACCESS_TOKEN: ${{ secrets.TWITTER_ACCESS_TOKEN }} - TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }} - bsky-post: runs-on: ubuntu-latest steps: diff --git a/.github/workflows/template_version_comment.yml b/.github/workflows/template_version_comment.yml new file mode 100644 index 00000000..537529bc --- /dev/null +++ b/.github/workflows/template_version_comment.yml @@ -0,0 +1,46 @@ +name: nf-core template version comment +# This workflow is triggered on PRs to check if the pipeline template version matches the latest nf-core version. +# It posts a comment to the PR, even if it comes from a fork. + +on: pull_request_target + +jobs: + template_version: + runs-on: ubuntu-latest + steps: + - name: Check out pipeline code + uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 + with: + ref: ${{ github.event.pull_request.head.sha }} + + - name: Read template version from .nf-core.yml + uses: nichmor/minimal-read-yaml@v0.0.2 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + + - name: Install nf-core + run: | + python -m pip install --upgrade pip + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Check nf-core outdated + id: nf_core_outdated + run: echo "OUTPUT=$(pip list --outdated | grep nf-core)" >> ${GITHUB_ENV} + + - name: Post nf-core template version comment + uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2 + if: | + contains(env.OUTPUT, 'nf-core') + with: + repo-token: ${{ secrets.NF_CORE_BOT_AUTH_TOKEN }} + allow-repeats: false + message: | + > [!WARNING] + > Newer version of the nf-core template is available. + > + > Your pipeline is using an old version of the nf-core template: ${{ steps.read_yml.outputs['nf_core_version'] }}. + > Please update your pipeline to the latest version. + > + > For more documentation on how to update your pipeline, please see the [nf-core documentation](https://github.com/nf-core/tools?tab=readme-ov-file#sync-a-pipeline-with-the-template) and [Synchronisation documentation](https://nf-co.re/docs/contributing/sync). 
+ # diff --git a/.gitignore b/.gitignore index 2eef655b..23b0c7de 100644 --- a/.gitignore +++ b/.gitignore @@ -6,4 +6,5 @@ results/ testing/ testing* *.pyc +null/ .nf-test* diff --git a/.gitpod.yml b/.gitpod.yml index 105a1821..83599f63 100644 --- a/.gitpod.yml +++ b/.gitpod.yml @@ -4,17 +4,7 @@ tasks: command: | pre-commit install --install-hooks nextflow self-update - - name: unset JAVA_TOOL_OPTIONS - command: | - unset JAVA_TOOL_OPTIONS vscode: - extensions: # based on nf-core.nf-core-extensionpack - - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code - - EditorConfig.EditorConfig # override user/workspace settings with settings found in .editorconfig files - - Gruntfuggly.todo-tree # Display TODO and FIXME in a tree view in the activity bar - - mechatroner.rainbow-csv # Highlight columns in csv files in different colors - # - nextflow.nextflow # Nextflow syntax highlighting - - oderwat.indent-rainbow # Highlight indentation level - - streetsidesoftware.code-spell-checker # Spelling checker for source code - - charliermarsh.ruff # Code linter Ruff + extensions: + - nf-core.nf-core-extensionpack # https://github.com/nf-core/vscode-extensionpack diff --git a/.nf-core.yml b/.nf-core.yml index 318ad93d..33bbf802 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -1,4 +1,20 @@ -repository_type: pipeline lint: - actions_ci: False ## TODO: re-activate once nf-test ci.yml structure updated -nf_core_version: "2.14.1" + actions_ci: false + files_exist: + - conf/igenomes.config + - conf/igenomes_ignored.config +nf_core_version: 3.2.0 +repository_type: pipeline +template: + author: Jasmin Frangenberg, Anan Ibrahim, Louisa Perelo, Moritz E. Beber, James + A. Fellows Yates + description: Pipeline for screening for functional components of assembled contigs + force: false + is_nfcore: true + name: funcscan + org: nf-core + outdir: . + skip_features: + - igenomes + - fastqc + version: 2.1.0 diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 4dc0f1dc..1dec8650 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -7,7 +7,7 @@ repos: - prettier@3.2.5 - repo: https://github.com/editorconfig-checker/editorconfig-checker.python - rev: "2.7.3" + rev: "3.1.2" hooks: - id: editorconfig-checker alias: ec diff --git a/.prettierignore b/.prettierignore index abb4b4d6..f6400d75 100644 --- a/.prettierignore +++ b/.prettierignore @@ -10,4 +10,5 @@ testing/ testing* *.pyc bin/ +ro-crate-metadata.json tests/ diff --git a/CHANGELOG.md b/CHANGELOG.md index 67b1238a..79cffe10 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,7 +3,44 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). -## v2.0.0 - [2024-09-05] +## v2.1.0 - Egyptian Koshari - [2025-03-05] + +### `Added` + +- [#421](https://github.com/nf-core/funcscan/pull/421), [#429](https://github.com/nf-core/funcscan/pull/429), [#433](https://github.com/nf-core/funcscan/pull/433), [#438](https://github.com/nf-core/funcscan/pull/438), [#441](https://github.com/nf-core/funcscan/pull/441) Updated to nf-core template 3.0.2, 3.1.0, 3.1.1, 3.1.2, and 3.2.0. (by @jfy133 and @jasmezz) +- [#427](https://github.com/nf-core/funcscan/pull/427) AMPcombi now can use multiple other databases for classifications. (by @darcy220606) +- [#428](https://github.com/nf-core/funcscan/pull/428) Added InterProScan annotation workflow to the pipeline. 
The results are coupled to the AMPcombi final table. (by @darcy220606) +- [#431](https://github.com/nf-core/funcscan/pull/431) Updated AMPcombi, Macrel, all MMseqs2 modules, MultiQC, Pyrodigal, and seqkit, added `--taxa_classification_mmseqs_compressed` parameter. (by @jasmezz) +- [#441](https://github.com/nf-core/funcscan/pull/441) Updated MultiQC. (by @jasmezz and @jfy133) +- [#440](https://github.com/nf-core/funcscan/pull/440) Updated Bakta and introduced new parameter `--annotation_bakta_hmms`. (by @jasmezz) + +### `Fixed` + +- [#427](https://github.com/nf-core/funcscan/pull/427) Fixed the AMP reference database issues reported by users due to non-ASCII characters. (by @darcy220606) +- [#430](https://github.com/nf-core/funcscan/pull/430) Updated `rgi/main` module to fix incorrect variable name. (by @amizeranschi and @jasmezz) +- [#435](https://github.com/nf-core/funcscan/pull/435) Fixed dependency errors within taxonomy merging scripts, updated the code and output for all three workflows. Bumped to version 0.1.1. (by @darcy220606) +- [#437](https://github.com/nf-core/funcscan/pull/437) Fixed file name error when supplying already preprocessed CARD database for ARG workflow. (by @jasmezz) +- [#446](https://github.com/nf-core/funcscan/pull/446) Updated antiSMASH modules to fix apptainer execution. (by @jasmezz and @jfy133) +- [#448](https://github.com/nf-core/funcscan/pull/448) Fixed taxonomy merge to work with output from GTDB/SILVA/KALAMARI. (by @darcy220606) +- [#447](https://github.com/nf-core/funcscan/pull/447) Added `--annotation_pyrodigal_usespecialstopcharacter` parameter to improve AMPlify screening. (by @jasmezz) +- [#454](https://github.com/nf-core/funcscan/pull/454) Updated default CPU requirement of `ampcombi2/parsetables`. (by @jasmezz) + +### `Dependencies` + +| Tool | Previous Version | New Version | +| ------------ | ---------------- | ----------- | +| AMPcombi | 0.2.2 | 2.0.1 | +| Bakta | 1.9.3 | 1.10.4 | +| InterProScan | - | 5.59_91.0 | +| Macrel | 1.2.0 | 1.4.0 | +| MMseqs2 | 15.6f452 | 17.b804f | +| MultiQC | 1.24.0 | 1.27 | +| Pyrodigal | 3.3.0 | 3.6.3 | +| seqkit | 2.8.1 | 2.9.0 | + +### `Deprecated` + +## v2.0.0 - Brazilian Escondidinho - [2024-09-05] ### `Breaking change` @@ -79,7 +116,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | GECCO | 0.9.8 | 0.9.10 | | hAMRonization | 1.1.1 | 1.1.4 | | HMMER | 3.3.2 | 3.4 | -| MMSeqs | NA | 2:15.6f452 | +| MMseqs2 | NA | 15.6f452 | | MultiQC | 1.15 | 1.24 | | Pyrodigal | 2.1.0 | 3.3.0 | | RGI | 5.2.1 | 6.0.3 | diff --git a/CITATIONS.md b/CITATIONS.md index 80493194..37e595d2 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -26,13 +26,13 @@ > Feldgarden, M., Brover, V., Gonzalez-Escalona, N., Frye, J. G., Haendiges, J., Haft, D. H., Hoffmann, M., Pettengill, J. B., Prasad, A. B., Tillman, G. E., Tyson, G. H., & Klimke, W. (2021). AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Scientific reports, 11(1), 12728. [DOI: 10.1038/s41598-021-91456-0](https://doi.org/10.1038/s41598-021-91456-0) -- [AntiSMASH](https://doi.org/10.1093/nar/gkab335) +- [AntiSMASH](https://doi.org/10.1093/nar/gkad344) - > Blin, K., Shaw, S., Kloosterman, A. M., Charlop-Powers, Z., van Wezel, G. P., Medema, M. H., & Weber, T. (2021). antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic acids research, 49(W1), W29–W35.
[DOI: 10.1093/nar/gkab335](https://doi.org/10.1093/nar/gkab335) + > Blin, K., Shaw, S., Augustijn, H. E., Reitz, Z. L., Biermann, F., Alanjary, M., Fetter, A., Terlouw B. R., Metcalf, W. W., Helfrich, E. J. N., van Wezel, G. P., Medema, M. H., & Weber, T. (2023). antiSMASH 7.0: new and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic acids research, 51(W1), W46–W50. [DOI: 10.1093/nar/gkad344](https://doi.org/10.1093/nar/gkad344) -- [argNorm](https://github.com/BigDataBiology/argNorm) +- [argNorm](https://doi.org/10.5204/rep.eprints.252448) - > Perovic, S. U., Ramji, V., Chong, H., Duan, Y., Maguire, F., Coelho, L. P. (2024). BigDataBiology/argNorm. [DOI: 10.5281/zenodo.10963591](https://zenodo.org/doi/10.5281/zenodo.10963591) + > Ugarcina Perovic, S., Ramji, V., Chong, H., Duan, Y., Maguire, F., Coelho, L. P. (2024). argNorm: Normalization of antibiotic resistance gene annotations to the Antibiotic Resistance Ontology (ARO). [Preprint] (Unpublished) [DOI: 10.5204/rep.eprints.252448](https://doi.org/10.5204/rep.eprints.252448) - [Bakta](https://doi.org/10.1099/mgen.0.000685) @@ -70,6 +70,14 @@ > Eddy S. R. (2011). Accelerated Profile HMM Searches. PLoS computational biology, 7(10), e1002195. [DOI: 10.1371/journal.pcbi.1002195](https://doi.org/10.1371/journal.pcbi.1002195) +- [InterPro](https://doi.org/10.1093/nar/gkaa977) + + > Blum, M., Chang, H-Y., Chuguransky, S., Grego, T., Kandasaamy, S., Mitchell, A., Nuka, G., Paysan-Lafosse, T., Qureshi, M., Raj, S., Richardson, L., Salazar, G. A., Williams, L., Bork, P., Bridge, A., Gough, J., Haft, D. H., Letunic, I., Marchler-Bauer, A., Mi, H., Natale, D. A., Necci, M., Orengo, C. A., Pandurangan, A. P., Rivoire, C., Sigrist, C. A., Sillitoe, I., Thanki, N., Thomas, P. D., Tosatto, S. C. E, Wu, C. H., Bateman, A., Finn, R. D. (2021) The InterPro protein families and domains database: 20 years on. Nucleic Acids Research, 49(D1), D344–D354. [DOI: 10.1093/nar/gkaa977](https://doi.org/10.1093/nar/gkaa977) + +- [InterProScan](https://doi.org/10.1093/bioinformatics/btu031) + + > Jones, P., Binns, D., Chang, H-Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A. F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S-Y., Lopez, R., Hunter, S. (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics, 30(9), 1236–1240. [DOI: 10.1093/bioinformatics/btu031](https://doi.org/10.1093/bioinformatics/btu031) + - [Macrel](https://doi.org/10.7717/peerj.10555) > Santos-Júnior, C. D., Pan, S., Zhao, X. M., & Coelho, L. P. (2020). Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ, 8, e10555. [DOI: 10.7717/peerj.10555](https://doi.org/10.7717/peerj.10555) @@ -90,7 +98,7 @@ > Larralde, M. (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296. [DOI: 10.21105/joss.04296](https://doi.org/10.21105/joss.04296) -- [RGI](https://doi.org/10.1093/nar/gkz935) +- [RGI](https://doi.org/10.1093/nar/gkac920) > Alcock, B. P., Huynh, W., Chalil, R., Smith, K. W., Raphenya, A. R., Wlodarski, M. A., Edalatmand, A., Petkau, A., Syed, S. A., Tsang, K. K., Baker, S. J. C., Dave, M., McCarthy, M. C., Mukiri, K. M., Nasir, J. A., Golbon, B., Imtiaz, H., Jiang, X., Kaur, K., Kwong, M., Liang, Z. C., Niu, K. C., Shan, P., Yang, J. Y. J., Gray, K. L., Hoad, G. R., Jia, B., Bhando, T., Carfrae, L. A., Farha, M. 
A., French, S., Gordzevich, R., Rachwalski, K., Tu, M. M., Bordeleau, E., Dooley, D., Griffiths, E., Zubyk, H. L., Brown, E. D., Maguire, F., Beiko, R. G., Hsiao, W. W. L., Brinkman F. S. L., Van Domselaar, G., McArthur, A. G. (2023). CARD 2023: expanded curation, support for machine learning, and resistome prediction at the Comprehensive Antibiotic Resistance Database. Nucleic acids research, 51(D1):D690-D699. [DOI: 10.1093/nar/gkac920](https://doi.org/10.1093/nar/gkac920) diff --git a/LICENSE b/LICENSE index a5c91c03..5df67c3a 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -Copyright (c) Jasmin Frangenberg, Anan Ibrahim, Louisa Perelo, Moritz E. Beber, James A. Fellows Yates +Copyright (c) The nf-core/funcscan team Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/README.md b/README.md index d5c9fcda..6d12d5a9 100644 --- a/README.md +++ b/README.md @@ -9,8 +9,7 @@ [![GitHub Actions Linting Status](https://github.com/nf-core/funcscan/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/funcscan/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/funcscan/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.7643099-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.7643099) [![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com) -[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A523.04.0-23aa62.svg)](https://www.nextflow.io/) - +[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/) [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) @@ -33,11 +32,12 @@ The nf-core/funcscan AWS full test dataset are contigs generated by the MGnify s 1. Quality control of input sequences with [`SeqKit`](https://bioinf.shenwei.me/seqkit/) 2. Taxonomic classification of contigs of **prokaryotic origin** with [`MMseqs2`](https://github.com/soedinglab/MMseqs2) 3. Annotation of assembled prokaryotic contigs with [`Prodigal`](https://github.com/hyattpd/Prodigal), [`Pyrodigal`](https://github.com/althonos/pyrodigal), [`Prokka`](https://github.com/tseemann/prokka), or [`Bakta`](https://github.com/oschwengers/bakta) -4. Screening contigs for antimicrobial peptide-like sequences with [`ampir`](https://cran.r-project.org/web/packages/ampir/index.html), [`Macrel`](https://github.com/BigDataBiology/macrel), [`HMMER`](http://hmmer.org/), [`AMPlify`](https://github.com/bcgsc/AMPlify) -5. Screening contigs for antibiotic resistant gene-like sequences with [`ABRicate`](https://github.com/tseemann/abricate), [`AMRFinderPlus`](https://github.com/ncbi/amr), [`fARGene`](https://github.com/fannyhb/fargene), [`RGI`](https://card.mcmaster.ca/analyze/rgi), [`DeepARG`](https://bench.cs.vt.edu/deeparg). 
[`argNorm`](https://github.com/BigDataBiology/argNorm) is used to map the outputs of `DeepARG`, `AMRFinderPlus`, and `ABRicate` to the [`Antibiotic Resistance Ontology`](https://www.ebi.ac.uk/ols4/ontologies/aro) for consistent ARG classification terms. -6. Screening contigs for biosynthetic gene cluster-like sequences with [`antiSMASH`](https://antismash.secondarymetabolites.org), [`DeepBGC`](https://github.com/Merck/deepbgc), [`GECCO`](https://gecco.embl.de/), [`HMMER`](http://hmmer.org/) -7. Creating aggregated reports for all samples across the workflows with [`AMPcombi`](https://github.com/Darcy220606/AMPcombi) for AMPs, [`hAMRonization`](https://github.com/pha4ge/hAMRonization) for ARGs, and [`comBGC`](https://raw.githubusercontent.com/nf-core/funcscan/master/bin/comBGC.py) for BGCs -8. Software version and methods text reporting with [`MultiQC`](http://multiqc.info/) +4. Annotation of coding sequences from step 3 to obtain general protein families and domains with [`InterProScan`](https://github.com/ebi-pf-team/interproscan) +5. Screening contigs for antimicrobial peptide-like sequences with [`ampir`](https://cran.r-project.org/web/packages/ampir/index.html), [`Macrel`](https://github.com/BigDataBiology/macrel), [`HMMER`](http://hmmer.org/), [`AMPlify`](https://github.com/bcgsc/AMPlify) +6. Screening contigs for antibiotic resistance gene-like sequences with [`ABRicate`](https://github.com/tseemann/abricate), [`AMRFinderPlus`](https://github.com/ncbi/amr), [`fARGene`](https://github.com/fannyhb/fargene), [`RGI`](https://card.mcmaster.ca/analyze/rgi), [`DeepARG`](https://bench.cs.vt.edu/deeparg). [`argNorm`](https://github.com/BigDataBiology/argNorm) is used to map the outputs of `DeepARG`, `AMRFinderPlus`, and `ABRicate` to the [`Antibiotic Resistance Ontology`](https://www.ebi.ac.uk/ols4/ontologies/aro) for consistent ARG classification terms. +7. Screening contigs for biosynthetic gene cluster-like sequences with [`antiSMASH`](https://antismash.secondarymetabolites.org), [`DeepBGC`](https://github.com/Merck/deepbgc), [`GECCO`](https://gecco.embl.de/), [`HMMER`](http://hmmer.org/) +8. Creating aggregated reports for all samples across the workflows with [`AMPcombi`](https://github.com/Darcy220606/AMPcombi) for AMPs, [`hAMRonization`](https://github.com/pha4ge/hAMRonization) for ARGs, and [`comBGC`](https://raw.githubusercontent.com/nf-core/funcscan/master/bin/comBGC.py) for BGCs +9. Software version and methods text reporting with [`MultiQC`](http://multiqc.info/) ![funcscan metro workflow](docs/images/funcscan_metro_workflow.png) @@ -72,8 +72,7 @@ nextflow run nf-core/funcscan \ ``` > [!WARNING] -> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; > see [docs](https://nf-co.re/usage/configuration#custom-configuration-files). +> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files). For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/funcscan/usage) and the [parameter documentation](https://nf-co.re/funcscan/parameters).
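To make the warning above concrete, here is a hedged sketch separating the two kinds of input; the file names `params.yml`, `custom.config`, and `samplesheet.csv` are hypothetical:

```bash
# Pipeline parameters go via the CLI or a -params-file, never via -c
cat > params.yml <<'EOF'
input: samplesheet.csv
outdir: results
EOF

# Non-parameter settings (e.g. process resource overrides) may go in a -c config
nextflow run nf-core/funcscan \
    -profile docker \
    -params-file params.yml \
    -c custom.config
```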
diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml index 5471c44d..8a8682d8 100644 --- a/assets/multiqc_config.yml +++ b/assets/multiqc_config.yml @@ -1,7 +1,8 @@ report_comment: > - This report has been generated by the nf-core/funcscan - analysis pipeline. For information about how to interpret these results, please see the - documentation. + This report has been generated by the nf-core/funcscan analysis pipeline. For information about how + to interpret these results, please see the documentation. report_section_order: "nf-core-funcscan-methods-description": order: -1000 @@ -16,7 +17,7 @@ run_modules: table_columns_visible: Prokka: - organism: False + organism: false export_plots: true @@ -27,4 +28,4 @@ custom_logo_url: https://nf-co.re/funcscan custom_logo_title: "nf-core/funcscan" ## Tool specific configuration -prokka_fn_snames: True +prokka_fn_snames: true diff --git a/assets/schema_input.json b/assets/schema_input.json index 62b4ece9..be402b5f 100644 --- a/assets/schema_input.json +++ b/assets/schema_input.json @@ -1,5 +1,5 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/funcscan/master/assets/schema_input.json", "title": "nf-core/funcscan pipeline - params.input schema", "description": "Schema for the file provided with params.input", @@ -11,36 +11,35 @@ "type": "string", "pattern": "^\\S+$", "errorMessage": "Sample name must be provided and cannot contain spaces", - "meta": ["id"], - "unique": true + "meta": ["id"] }, "fasta": { "type": "string", "format": "file-path", "exists": true, "pattern": "^\\S+\\.(fasta|fas|fna|fa)(\\.gz)?$", - "errorMessage": "Fasta file for reads must be provided, cannot contain spaces and must have extension `.fa.gz`, `.fna.gz` or `.fasta.gz`", - "unique": true + "errorMessage": "Fasta file for reads must be provided, cannot contain spaces and must have extension `.fa`, `.fa.gz`, `.fna`, `.fna.gz`, `.fasta`, or `.fasta.gz`" }, "protein": { "type": "string", "format": "file-path", "exists": true, "pattern": "^\\S+\\.(faa|fasta)(\\.gz)?$", - "errorMessage": "Input file for peptide annotations has incorrect file format. File must end in `.fasta` or `.faa`", - "unique": true, - "dependentRequired": ["gbk"] + "errorMessage": "Input file for peptide annotations has incorrect file format. File must end in `.fasta`, `.fasta.gz`, `.faa`, or `.faa.gz`" }, "gbk": { "type": "string", "format": "file-path", "exists": true, "pattern": "^\\S+\\.(gbk|gbff)(\\.gz)?$", - "errorMessage": "Input file for feature annotations has incorrect file format. File must end in `.gbk.gz` or `.gbff.gz`", - "unique": true, - "dependentRequired": ["protein"] + "errorMessage": "Input file for feature annotations has incorrect file format. 
File must end in `.gbk`, `.gbk.gz`, `.gbff`, or `.gbff.gz`" } }, - "required": ["sample", "fasta"] - } + "required": ["sample", "fasta"], + "dependentRequired": { + "protein": ["gbk"], + "gbk": ["protein"] + } + }, + "uniqueItems": true } diff --git a/bin/ampcombi_download.py b/bin/ampcombi_download.py index dd1373ce..c9a4f639 100755 --- a/bin/ampcombi_download.py +++ b/bin/ampcombi_download.py @@ -1,78 +1,144 @@ #!/usr/bin/env python3 ######################################### -# Authors: [Anan Ibrahim](https://github.com/brianjohnhaas), [Louisa Perelo](https://github.com/louperelo) +# Authors: [Anan Ibrahim](https://github.com/Darcy220606/AMPcombi), [Louisa Perelo](https://github.com/louperelo) # File: amp_database.py # Source: https://github.com/Darcy220606/AMPcombi/blob/main/ampcombi/amp_database.py -# Source+commit: https://github.com/Darcy220606/AMPcombi/commit/a75bc00c32ecf873a133b18cf01f172ad9cf0d2d/ampcombi/amp_database.py -# Download Date: 2023-03-08, commit: a75bc00c # This source code is licensed under the MIT license ######################################### -# TITLE: Download the DRAMP database if input db empty AND and make database compatible for diamond +# TITLE: Download the reference database specified by the user. import pandas as pd import requests import os -from datetime import datetime +import re import subprocess -from Bio import SeqIO -import tempfile -import shutil +import argparse +from datetime import datetime +from Bio.Seq import Seq +from Bio.SeqRecord import SeqRecord +from Bio import SeqIO ######################################## -# FUNCTION: DOWNLOAD DRAMP DATABASE AND CLEAN IT +# FUNCTION: DOWNLOAD DATABASES AND CLEAN DRAMP and APD ######################################### -def download_DRAMP(db): - ##Download the (table) file and store it in a results directory - url = "http://dramp.cpu-bioinfor.org/downloads/download.php?filename=download_data/DRAMP3.0_new/general_amps.xlsx" - r = requests.get(url, allow_redirects=True) - with open(db + "/" + "general_amps.xlsx", "wb") as f: - f.write(r.content) - ##Convert excel to tab sep file and write it to a file in the DRAMP_db directly with the date its downloaded - date = datetime.now().strftime("%Y_%m_%d") - ref_amps = pd.read_excel(db + "/" + r"general_amps.xlsx") - ref_amps.to_csv(db + "/" + f"general_amps_{date}.tsv", index=None, header=True, sep="\t") - ##Download the (fasta) file and store it in a results directory - urlfasta = ( - "http://dramp.cpu-bioinfor.org/downloads/download.php?filename=download_data/DRAMP3.0_new/general_amps.fasta" - ) - z = requests.get(urlfasta) - fasta_path = os.path.join(db + "/" + f"general_amps_{date}.fasta") - with open(fasta_path, "wb") as f: - f.write(z.content) - ##Cleaning step to remove ambigous aminoacids from sequences in the database (e.g.
zeros and brackets) - new_fasta = db + "/" + f"general_amps_{date}_clean.fasta" - seq_record = SeqIO.parse(open(fasta_path), "fasta") - with open(new_fasta, "w") as f: - for record in seq_record: - id, sequence = record.id, str(record.seq) - letters = [ - "A", - "C", - "D", - "E", - "F", - "G", - "H", - "I", - "K", - "L", - "M", - "N", - "P", - "Q", - "R", - "S", - "T", - "V", - "W", - "Y", - ] - new = "".join(i for i in sequence if i in letters) - f.write(">" + id + "\n" + new + "\n") - return os.remove(fasta_path), os.remove(db + "/" + r"general_amps.xlsx") +def download_ref_db(database, threads): + """ + Downloads a specified AMP (antimicrobial peptide) reference database based on the + provided database name and saves it to a database-specific output directory. + This supports downloading databases only from DRAMP, APD, and UniRef100. + Parameters: + ---------- + database : str + The name of the database to download. Must be one of 'DRAMP', 'APD', or 'UniRef100'. + threads : int + Number of threads to use when downloading the UniRef100 database with `mmseqs`. + """ + # Check which database was given + if database == 'DRAMP': + # Create dir + db = 'amp_DRAMP_database' + os.makedirs(db, exist_ok=True) + # Download the file + try: + url = 'http://dramp.cpu-bioinfor.org/downloads/download.php?filename=download_data/DRAMP3.0_new/general_amps.txt' + response = requests.get(url, allow_redirects=True) + response.raise_for_status() # Check for any download errors + date = datetime.now().strftime("%Y_%m_%d") + with open(db + '/' + f'general_amps_{date}.txt', 'wb') as file: + file.write(response.content) + print(f"File downloaded successfully and saved to {db}/general_amps_{date}.txt") + # Create fasta version and clean it + db_df = pd.read_csv(f'{db}/general_amps_{date}.txt', sep='\t') + records = [] + valid_sequence_pattern = re.compile("^[ACDEFGHIKLMNPQRSTVWY]+$") + for index, row in db_df.iterrows(): + sequence = row['Sequence'] + if valid_sequence_pattern.match(sequence): + record = SeqRecord(Seq(sequence), id=str(row['DRAMP_ID']), description="") + records.append(record) + output_file = f'{db}/general_amps_{date}.fasta' + SeqIO.write(records, output_file, "fasta") + except requests.exceptions.RequestException as e: + print(f"Failed to download DRAMP AMP general database file: {e}") + return + + if database == 'APD': + # Create dir + db = 'amp_APD_database' + os.makedirs(db, exist_ok=True) + # Download the file + try: + url = 'https://aps.unmc.edu/assets/sequences/APD_sequence_release_09142020.fasta' + response = requests.get(url, allow_redirects=True, verify=False) # Disable SSL verification due to site certificate issue + response.raise_for_status() + content = response.text + print("APD AMP database downloaded successfully.") + except requests.exceptions.RequestException as e: + print(f"Failed to download content: {e}") + return + # Save the content line-by-line exactly as is + try: + with open(db + '/' + 'APD_orig.fasta', 'w') as file: + file.write(content) + with open(f'{db}/APD.fasta', 'w') as output_handle: + valid_sequence_pattern = re.compile("^[ACDEFGHIKLMNPQRSTVWY]+$") + for record in SeqIO.parse(f'{db}/APD_orig.fasta', "fasta"): + sequence = str(record.seq) + if valid_sequence_pattern.match(sequence): + SeqIO.write(record, output_handle, "fasta") + os.remove(db + '/' + 'APD_orig.fasta') + print(f"APD AMP database saved successfully to {db}/APD.fasta") + # Fasta to table + headers = [] + sequences = [] + seq_ids = []
+            for record in SeqIO.parse(f'{db}/APD.fasta', "fasta"):
+                sequence_id = record.description.split('|')[0]
+                headers.append(record.description)
+                sequences.append(str(record.seq))
+                seq_ids.append(sequence_id)
+            db_df = pd.DataFrame({
+                "APD_ID": seq_ids,
+                "APD_Description": headers,
+                "APD_Sequence": sequences})
+            db_df.to_csv(f'{db}/APD.txt', sep='\t', index=False, header=True)
+            os.remove(db + '/' + 'APD.fasta')
+            # Table to fasta
+            records = []
+            for index, row in db_df.iterrows():
+                sequence = row['APD_Sequence']
+                record = SeqRecord(Seq(sequence), id=str(row['APD_ID']), description="")
+                records.append(record)
+            output_file = f'{db}/APD.fasta'
+            SeqIO.write(records, output_file, "fasta")
+        except Exception as e:
+            print(f"Failed to save APD AMP database: {e}")
+
+    if database == 'UniRef100':
+        # Create dir
+        db = 'amp_UniRef100_database'
+        os.makedirs(db, exist_ok=True)
+        # Download the file
+        try:
+            os.makedirs(f'{db}/mmseqs2', exist_ok=True)
+            command = f"mmseqs databases UniRef100 {db}/mmseqs2/ref_DB {db}/mmseqs2/tmp --remove-tmp-files true --threads {threads} -v 0"
+            subprocess.run(command, shell=True, check=True)
+            print(f"UniRef100 protein database downloaded successfully and saved to {db}/mmseqs2/ref_DB")
+        except subprocess.CalledProcessError as e:
+            print(f"Failed to download UniRef100 protein database: {e}")
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(
+        description="Downloads a specified AMP (antimicrobial peptide) reference database based on the provided database name and saves it to a database-specific directory.")
+    parser.add_argument("--database_id", dest="database", type=str, required=True, choices=["DRAMP", "APD", "UniRef100"],
+                        help="Database ID - one of DRAMP, APD, or UniRef100. This parameter is required.")
+    parser.add_argument("--threads", type=int, default=4,
+                        help="Number of threads supplied to mmseqs databases. Only relevant in the case of 'UniRef100'. Default is 4.")
 
-download_DRAMP("amp_ref_database")
+    args = parser.parse_args()
+    download_ref_db(args.database, args.threads)
diff --git a/bin/merge_taxonomy.py b/bin/merge_taxonomy.py
index 44eed31a..ef54e1c2 100755
--- a/bin/merge_taxonomy.py
+++ b/bin/merge_taxonomy.py
@@ -3,7 +3,7 @@
 # Written by Anan Ibrahim and released under the MIT license.
 # See git repository (https://github.com/Darcy220606/AMPcombi) for full license text.
 # Date: March 2024
-# Version: 0.1.0
+# Version: 0.1.1
 
 # Required modules
 import sys
@@ -12,7 +12,7 @@
 import numpy as np
 import argparse
 
-tool_version = "0.1.0"
+tool_version = "0.1.1"
 #########################################
 # TOP LEVEL: AMPCOMBI
 #########################################
@@ -66,9 +66,24 @@
 # TAXONOMY
 #########################################
 def reformat_mmseqs_taxonomy(mmseqs_taxonomy):
-    mmseqs2_df = pd.read_csv(mmseqs_taxonomy, sep='\t', header=None, names=['contig_id', 'taxid', 'rank_label', 'scientific_name', 'lineage', 'mmseqs_lineage_contig'])
-    # remove the lineage column
-    mmseqs2_df.drop('lineage', axis=1, inplace=True)
+    """
+    Reformats an MMseqs2 taxonomy table so it can be passed on to the tool-specific merging functions.
+    Note: every MMseqs2 database outputs a different number of columns. Only the first four columns
+    and the last column are constant and the most informative.
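+    For example, a taxonomy file with 8 columns is read as columns 0-3 plus column 7
+    (0-based); any intervening lineage columns are skipped.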
+
+    Args:
+        mmseqs_taxonomy (str): path to the MMseqs2 taxonomy output file (TSV) of one sample
+
+    Returns:
+        pandas.DataFrame: the reformatted taxonomy table
+    """
+    col_numbers = pd.read_csv(mmseqs_taxonomy, sep='\t', header=None, nrows=1).shape[1]
+    selected_cols_numbers = [0, 1, 2, 3, col_numbers - 1]
+    mmseqs2_df = pd.read_csv(mmseqs_taxonomy,
+                             sep='\t',
+                             header=None,
+                             usecols=selected_cols_numbers,
+                             names=['contig_id', 'taxid', 'rank_label', 'scientific_name', 'mmseqs_lineage_contig'])
     mmseqs2_df['mmseqs_lineage_contig'].unique()
     # convert any classification that has Eukaryota/root to NaN as funcscan targets bacteria ONLY **
     for i, row in mmseqs2_df.iterrows():
@@ -85,7 +100,19 @@ def reformat_mmseqs_taxonomy(mmseqs_taxonomy):
 # FUNCTION: AMPCOMBI
 #########################################
 def ampcombi_taxa(args):
-    merged_df = pd.DataFrame()
+    """
+    Merges AMPcombi tool output with taxonomy information.
+
+    Parameters:
+    ----------
+    args:
+        Contains arguments for the AMPcombi file path (`amp`) and the list of taxonomy file paths (`taxa1`).
+
+    Outputs:
+    -------
+    Creates a file named `ampcombi_complete_summary_taxonomy.tsv` containing the merged results.
+    """
+    combined_dfs = []
 
     # assign input args to variables
     ampcombi = args.amp
@@ -100,13 +127,6 @@ def ampcombi_taxa(args):
 
     # filter the tool df
     tool_df = pd.read_csv(ampcombi, sep='\t')
-    # remove the column with contig_id - duplicate #NOTE: will be fixed in AMPcombi v2.0.0
-    tool_df = tool_df.drop('contig_id', axis=1)
-    # make sure 1st and 2nd column have the same column labels
-    tool_df.rename(columns={tool_df.columns[0]: 'sample_id'}, inplace=True)
-    tool_df.rename(columns={tool_df.columns[1]: 'contig_id'}, inplace=True)
-    # grab the real contig id in another column copy for merging
-    tool_df['contig_id_merge'] = tool_df['contig_id'].str.rsplit('_', 1).str[0]
 
     # merge rows from taxa to ampcombi_df based on substring match in sample_id
     # grab the unique sample names from the taxonomy table
@@ -114,17 +134,18 @@ def ampcombi_taxa(args):
     # for every sampleID in taxadf merge the results
     for sampleID in samples_taxa:
         # subset ampcombi
-        subset_tool = tool_df.loc[tool_df['sample_id'].str.contains(sampleID)]
+        subset_tool = tool_df[tool_df['sample_id'].str.contains(sampleID, na=False)]
         # subset taxa
-        subset_taxa = taxa_df.loc[taxa_df['sample_id'].str.contains(sampleID)]
+        subset_taxa = taxa_df[taxa_df['sample_id'].str.contains(sampleID, na=False)]
         # merge
-        subset_df = pd.merge(subset_tool, subset_taxa, left_on = 'contig_id_merge', right_on='contig_id', how='left')
+        subset_df = pd.merge(subset_tool, subset_taxa, on='contig_id', how='left')
         # cleanup the table
-        columnsremove = ['contig_id_merge','contig_id_y', 'sample_id_y']
+        columnsremove = ['sample_id_y']
         subset_df.drop(columnsremove, axis=1, inplace=True)
-        subset_df.rename(columns={'contig_id_x': 'contig_id', 'sample_id_x':'sample_id'},inplace=True)
+        subset_df.rename(columns={'sample_id_x':'sample_id'},inplace=True)
         # append in the combined_df
-        merged_df = merged_df.append(subset_df, ignore_index=True)
+        combined_dfs.append(subset_df)
+    merged_df = pd.concat(combined_dfs, ignore_index=True)
 
     # write to file
     merged_df.to_csv('ampcombi_complete_summary_taxonomy.tsv', sep='\t', index=False)
@@ -133,7 +154,20 @@ def ampcombi_taxa(args):
 # FUNCTION: COMBGC
 #########################################
 def combgc_taxa(args):
-    merged_df = pd.DataFrame()
+    """
+    Merges comBGC tool output with taxonomy information.
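+    Rows are matched within each sample on exact `contig_id` equality, using a
+    left join so that BGC rows without a taxonomic assignment are retained.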
+
+    Parameters:
+    ----------
+    args:
+        Contains arguments for the comBGC file path (`bgc`) and the list of taxonomy file paths (`taxa2`).
+
+    Outputs:
+    -------
+    Creates a file named `combgc_complete_summary_taxonomy.tsv` containing the merged results.
+    """
+    combined_dfs = []
 
     # assign input args to variables
     combgc = args.bgc
@@ -152,23 +186,24 @@ def combgc_taxa(args):
     tool_df.rename(columns={tool_df.columns[0]: 'sample_id'}, inplace=True)
     tool_df.rename(columns={tool_df.columns[1]: 'contig_id'}, inplace=True)
 
-    # merge rows from taxa to ampcombi_df based on substring match in sample_id
+    # merge rows from taxa to combgc_df based on substring match in sample_id
     # grab the unique sample names from the taxonomy table
     samples_taxa = taxa_df['sample_id'].unique()
     # for every sampleID in taxadf merge the results
     for sampleID in samples_taxa:
-        # subset ampcombi
-        subset_tool = tool_df.loc[tool_df['sample_id'].str.contains(sampleID)]
+        # subset tool
+        subset_tool = tool_df[tool_df['sample_id'].str.contains(sampleID, na=False)]
         # subset taxa
-        subset_taxa = taxa_df.loc[taxa_df['sample_id'].str.contains(sampleID)]
+        subset_taxa = taxa_df[taxa_df['sample_id'].str.contains(sampleID, na=False)]
         # merge
-        subset_df = pd.merge(subset_tool, subset_taxa, left_on = 'contig_id', right_on='contig_id', how='left')
+        subset_df = pd.merge(subset_tool, subset_taxa, on='contig_id', how='left')
         # cleanup the table
         columnsremove = ['sample_id_y']
         subset_df.drop(columnsremove, axis=1, inplace=True)
         subset_df.rename(columns={'sample_id_x':'sample_id'},inplace=True)
         # append in the combined_df
-        merged_df = merged_df.append(subset_df, ignore_index=True)
+        combined_dfs.append(subset_df)
+    merged_df = pd.concat(combined_dfs, ignore_index=True)
 
     # write to file
     merged_df.to_csv('combgc_complete_summary_taxonomy.tsv', sep='\t', index=False)
@@ -177,7 +212,19 @@ def combgc_taxa(args):
 # FUNCTION: HAMRONIZATION
 #########################################
 def hamronization_taxa(args):
-    merged_df = pd.DataFrame()
+    """
+    Merges hAMRonization tool output with taxonomy information.
+
+    Parameters:
+    ----------
+    args:
+        Contains arguments for the hAMRonization file path (`arg`) and the list of taxonomy file paths (`taxa2`).
+
+    Outputs:
+    -------
+    Creates a file named `hamronization_complete_summary_taxonomy.tsv` containing the merged results.
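+    Note: taxonomy rows are matched by substring containment rather than exact
+    equality, i.e. a taxonomy `contig_id` matches if it occurs within the tool's
+    `contig_id`; tool rows without any match are kept unmerged.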
+ """ + combined_dfs = [] # assign input args to variables hamronization = args.arg @@ -197,29 +244,46 @@ def hamronization_taxa(args): # reorder the columns new_order = ['sample_id', 'contig_id'] + [col for col in tool_df.columns if col not in ['sample_id', 'contig_id']] tool_df = tool_df.reindex(columns=new_order) - # grab the real contig id in another column copy for merging - tool_df['contig_id_merge'] = tool_df['contig_id'].str.rsplit('_', 1).str[0] - # merge rows from taxa to ampcombi_df based on substring match in sample_id + # merge rows from taxa to hamronization_df based on substring match in sample_id # grab the unique sample names from the taxonomy table samples_taxa = taxa_df['sample_id'].unique() # for every sampleID in taxadf merge the results for sampleID in samples_taxa: - # subset ampcombi - subset_tool = tool_df.loc[tool_df['sample_id'].str.contains(sampleID)] + # subset tool + subset_tool = tool_df[tool_df['sample_id'].str.contains(sampleID, na=False)] # subset taxa - subset_taxa = taxa_df.loc[taxa_df['sample_id'].str.contains(sampleID)] - # merge - subset_df = pd.merge(subset_tool, subset_taxa, left_on = 'contig_id_merge', right_on='contig_id', how='left') - # cleanup the table - columnsremove = ['contig_id_merge','contig_id_y', 'sample_id_y'] - subset_df.drop(columnsremove, axis=1, inplace=True) - subset_df.rename(columns={'contig_id_x': 'contig_id', 'sample_id_x':'sample_id'},inplace=True) - # append in the combined_df - merged_df = merged_df.append(subset_df, ignore_index=True) + subset_taxa = taxa_df[taxa_df['sample_id'].str.contains(sampleID, na=False)] + # ensure strings + subset_tool['contig_id'] = subset_tool['contig_id'].astype(str) + subset_taxa['contig_id'] = subset_taxa['contig_id'].astype(str) + # rename columns to avoid dropping of mutual ones + rename_dict = {col: f"{col}_taxa" for col in subset_taxa.columns if col in subset_tool.columns} + subset_taxa = subset_taxa.rename(columns=rename_dict) + + # merge by string + merged_rows = [] + # iterate and find all matches + for _, tool_row in subset_tool.iterrows(): + tool_contig_id = tool_row['contig_id'] + matches = subset_taxa[subset_taxa['contig_id_taxa'].apply(lambda x: str(x) in tool_contig_id)] + # if match, merge row + if not matches.empty: + for _, taxa_row in matches.iterrows(): + merged_row = {**tool_row.to_dict(), **taxa_row.to_dict()} + merged_rows.append(merged_row) + else: + # if no match keep row as is + merged_row = {**tool_row.to_dict()} + merged_rows.append(merged_row) + + merged_df = pd.DataFrame(merged_rows) + combined_dfs.append(merged_df) + + merged_df_final = pd.concat(combined_dfs, ignore_index=True) # write to file - merged_df.to_csv('hamronization_complete_summary_taxonomy.tsv', sep='\t', index=False) + merged_df_final.to_csv('hamronization_complete_summary_taxonomy.tsv', sep='\t', index=False) ######################################### # SUBPARSERS: DEFAULT diff --git a/conf/base.config b/conf/base.config index a928e380..02314571 100644 --- a/conf/base.config +++ b/conf/base.config @@ -10,51 +10,51 @@ process { - cpus = { check_max( 1 * task.attempt, 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } + cpus = { 1 * task.attempt } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } errorStrategy = { task.exitStatus in ((130..145) + 104) ? 
'retry' : 'finish' } maxRetries = 1 maxErrors = '-1' // Process-specific resource requirements - // NOTE - Please try and re-use the labels below as much as possible. + // NOTE - Please try and reuse the labels below as much as possible. // These labels are used and recognised by default in DSL2 files hosted on nf-core/modules. // If possible, it would be nice to keep the same label naming convention when // adding in your local modules too. // See https://www.nextflow.io/docs/latest/config.html#config-process-selectors - withLabel:process_single { - cpus = { check_max( 1 , 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 1.h * task.attempt, 'time' ) } + withLabel: process_single { + cpus = { 1 } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } } - withLabel:process_low { - cpus = { check_max( 2 * task.attempt, 'cpus' ) } - memory = { check_max( 12.GB * task.attempt, 'memory' ) } - time = { check_max( 1.h * task.attempt, 'time' ) } + withLabel: process_low { + cpus = { 2 * task.attempt } + memory = { 12.GB * task.attempt } + time = { 4.h * task.attempt } } - withLabel:process_medium { - cpus = { check_max( 6 * task.attempt, 'cpus' ) } - memory = { check_max( 36.GB * task.attempt, 'memory' ) } - time = { check_max( 1.h * task.attempt, 'time' ) } + withLabel: process_medium { + cpus = { 6 * task.attempt } + memory = { 36.GB * task.attempt } + time = { 8.h * task.attempt } } - withLabel:process_high { - cpus = { check_max( 12 * task.attempt, 'cpus' ) } - memory = { check_max( 72.GB * task.attempt, 'memory' ) } - time = { check_max( 1.h * task.attempt, 'time' ) } + withLabel: process_high { + cpus = { 12 * task.attempt } + memory = { 72.GB * task.attempt } + time = { 16.h * task.attempt } } - withLabel:process_long { - time = { check_max( 20.h * task.attempt, 'time' ) } + withLabel: process_long { + time = { 20.h * task.attempt } } - withLabel:process_high_memory { - memory = { check_max( 200.GB * task.attempt, 'memory' ) } + withLabel: process_high_memory { + memory = { 200.GB * task.attempt } } - withLabel:error_ignore { + withLabel: error_ignore { errorStrategy = 'ignore' } - withLabel:error_retry { + withLabel: error_retry { errorStrategy = 'retry' maxRetries = 2 } @@ -64,169 +64,178 @@ process { */ withName: GUNZIP { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 } withName: UNTAR { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 } withName: PROKKA { - memory = { check_max( 8.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } - time = { check_max( 8.h * task.attempt, 'time' ) } + memory = { 8.GB * task.attempt } + cpus = { 4 * task.attempt } + time = { 8.h * task.attempt } } withName: PRODIGAL_GBK { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 } withName: BAKTA_BAKTA { - memory = { check_max( 64.GB * task.attempt, 'memory' ) } - cpus = { check_max( 8 * task.attempt, 'cpus' ) } - time = { check_max( 8.h * task.attempt, 'time' ) } + memory = { 64.GB * task.attempt } + cpus = { 8 * task.attempt } + time = { 8.h * task.attempt } } withName: ABRICATE_RUN { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 2.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: AMRFINDERPLUS_RUN { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * 
task.attempt } cpus = 1 } withName: DEEPARG_DOWNLOADDATA { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 - time = { check_max( 2.h * task.attempt, 'time' ) } + time = { 2.h * task.attempt } } withName: DEEPARG_PREDICT { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 } withName: FARGENE { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 2.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: RGI_MAIN { - memory = { check_max( 28.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 28.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: AMPIR { - memory = { check_max( 8.GB * task.attempt, 'memory' ) } + memory = { 8.GB * task.attempt } cpus = 1 } withName: AMPLIFY_PREDICT { - memory = { check_max( 16.GB * task.attempt, 'memory' ) } + memory = { 16.GB * task.attempt } cpus = 1 - time = { check_max( 24.h * task.attempt, 'time' ) } + time = { 24.h * task.attempt } } withName: AMP_HMMER_HMMSEARCH { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 2.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: MACREL_CONTIGS { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 4.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: BGC_HMMER_HMMSEARCH { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 2.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: ANTISMASH_ANTISMASHLITE { - memory = { check_max( 64.GB * task.attempt, 'memory' ) } - cpus = { check_max( 8 * task.attempt, 'cpus' ) } - time = { check_max( 12.h * task.attempt, 'time' ) } + memory = { 64.GB * task.attempt } + cpus = { 8 * task.attempt } + time = { 12.h * task.attempt } } withName: ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 + time = { 2.h * task.attempt } } withName: DEEPBGC_DOWNLOAD { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 } withName: DEEPBGC_PIPELINE { - memory = { check_max( 2.GB * task.attempt, 'memory' ) } + memory = { 2.GB * task.attempt } cpus = 1 - time = { check_max( 24.h * task.attempt, 'time' ) } + time = { 24.h * task.attempt } } withName: GECCO_RUN { - memory = { check_max( 16.GB * task.attempt, 'memory' ) } - cpus = { check_max( 4 * task.attempt, 'cpus' ) } + memory = { 16.GB * task.attempt } + cpus = { 4 * task.attempt } } withName: HAMRONIZATION_ABRICATE { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: HAMRONIZATION_AMRFINDERPLUS { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: HAMRONIZATION_DEEPARG { - memory = { check_max( 8.GB * task.attempt, 'memory' ) } + memory = { 8.GB * task.attempt } cpus = 1 } withName: HAMRONIZATION_RGI { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: HAMRONIZATION_FARGENE { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: HAMRONIZATION_SUMMARIZE { - memory = { check_max( 4.GB * task.attempt, 
'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: ARGNORM_DEEPARG { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: ARGNORM_ABRICATE { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: ARGNORM_AMRFINDERPLUS { - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } cpus = 1 } withName: AMPCOMBI2_PARSETABLES { - memory = { check_max( 8.GB * task.attempt, 'memory' ) } - time = { check_max( 2.h * task.attempt, 'time' ) } + memory = { 8.GB * task.attempt } + time = { 2.h * task.attempt } + cpus = { 16 * task.attempt } + errorStrategy = { task.exitStatus == 1 ? 'retry' : 'finish' } + maxRetries = 2 // Retry the process up to 2 times } withName: AMPCOMBI2_CLUSTER { - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 2.h * task.attempt, 'time' ) } + memory = { 6.GB * task.attempt } + time = { 2.h * task.attempt } + } + + withName: INTERPROSCAN_DATABASE { + memory = { 6.GB * task.attempt } + cpus = { 6 * task.attempt } } } diff --git a/conf/igenomes.config b/conf/igenomes.config deleted file mode 100644 index 3f114377..00000000 --- a/conf/igenomes.config +++ /dev/null @@ -1,440 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Nextflow config file for iGenomes paths -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Defines reference genomes using iGenome paths. - Can be used by any config that customises the base path using: - $params.igenomes_base / --igenomes_base ----------------------------------------------------------------------------------------- -*/ - -params { - // illumina iGenomes reference file paths - genomes { - 'GRCh37' { - fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt" - mito_name = "MT" - macs_gsize = "2.7e9" - blacklist = "${projectDir}/assets/blacklists/GRCh37-blacklist.bed" - } - 'GRCh38' { - fasta = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.bed" - mito_name = "chrM" - macs_gsize = "2.7e9" - blacklist = "${projectDir}/assets/blacklists/hg38-blacklist.bed" - } - 'CHM13' { - fasta = "${params.igenomes_base}/Homo_sapiens/UCSC/CHM13/Sequence/WholeGenomeFasta/genome.fa" - 
bwa = "${params.igenomes_base}/Homo_sapiens/UCSC/CHM13/Sequence/BWAIndex/" - bwamem2 = "${params.igenomes_base}/Homo_sapiens/UCSC/CHM13/Sequence/BWAmem2Index/" - gtf = "${params.igenomes_base}/Homo_sapiens/NCBI/CHM13/Annotation/Genes/genes.gtf" - gff = "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/914/755/GCF_009914755.1_T2T-CHM13v2.0/GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz" - mito_name = "chrM" - } - 'GRCm38' { - fasta = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/README.txt" - mito_name = "MT" - macs_gsize = "1.87e9" - blacklist = "${projectDir}/assets/blacklists/GRCm38-blacklist.bed" - } - 'TAIR10' { - fasta = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/README.txt" - mito_name = "Mt" - } - 'EB2' { - fasta = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Annotation/README.txt" - } - 'UMD3.1' { - fasta = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/Genes/genes.bed" - readme = 
"${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/README.txt" - mito_name = "MT" - } - 'WBcel235' { - fasta = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Annotation/Genes/genes.bed" - mito_name = "MtDNA" - macs_gsize = "9e7" - } - 'CanFam3.1' { - fasta = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/README.txt" - mito_name = "MT" - } - 'GRCz10' { - fasta = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Annotation/Genes/genes.bed" - mito_name = "MT" - } - 'BDGP6' { - fasta = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Annotation/Genes/genes.bed" - mito_name = "M" - macs_gsize = "1.2e8" - } - 'EquCab2' { - fasta = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/STARIndex/" - bismark = 
"${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Annotation/README.txt" - mito_name = "MT" - } - 'EB1' { - fasta = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Annotation/README.txt" - } - 'Galgal4' { - fasta = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Annotation/Genes/genes.bed" - mito_name = "MT" - } - 'Gm01' { - fasta = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Annotation/README.txt" - } - 'Mmul_1' { - fasta = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Annotation/README.txt" - mito_name = "MT" - } - 'IRGSP-1.0' { - fasta = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/WholeGenomeFasta/genome.fa" - bwa = 
"${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Annotation/Genes/genes.bed" - mito_name = "Mt" - } - 'CHIMP2.1.4' { - fasta = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Annotation/README.txt" - mito_name = "MT" - } - 'Rnor_5.0' { - fasta = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_5.0/Annotation/Genes/genes.bed" - mito_name = "MT" - } - 'Rnor_6.0' { - fasta = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Annotation/Genes/genes.bed" - mito_name = "MT" - } - 'R64-1-1' { - fasta = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Genes/genes.gtf" - bed12 = 
"${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Genes/genes.bed" - mito_name = "MT" - macs_gsize = "1.2e7" - } - 'EF2' { - fasta = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Annotation/README.txt" - mito_name = "MT" - macs_gsize = "1.21e7" - } - 'Sbi1' { - fasta = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Annotation/README.txt" - } - 'Sscrofa10.2' { - fasta = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Annotation/README.txt" - mito_name = "MT" - } - 'AGPv3' { - fasta = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Annotation/Genes/genes.bed" - mito_name = "Mt" - } - 'hg38' { - fasta = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/STARIndex/" - bismark = 
"${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Annotation/Genes/genes.bed" - mito_name = "chrM" - macs_gsize = "2.7e9" - blacklist = "${projectDir}/assets/blacklists/hg38-blacklist.bed" - } - 'hg19' { - fasta = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Annotation/README.txt" - mito_name = "chrM" - macs_gsize = "2.7e9" - blacklist = "${projectDir}/assets/blacklists/hg19-blacklist.bed" - } - 'mm10' { - fasta = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Annotation/README.txt" - mito_name = "chrM" - macs_gsize = "1.87e9" - blacklist = "${projectDir}/assets/blacklists/mm10-blacklist.bed" - } - 'bosTau8' { - fasta = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Bos_taurus/UCSC/bosTau8/Annotation/Genes/genes.bed" - mito_name = "chrM" - } - 'ce10' { - fasta = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Caenorhabditis_elegans/UCSC/ce10/Annotation/README.txt" - mito_name = "chrM" - macs_gsize = "9e7" - } - 'canFam3' { - fasta = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Sequence/WholeGenomeFasta/genome.fa" 
- bwa = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Canis_familiaris/UCSC/canFam3/Annotation/README.txt" - mito_name = "chrM" - } - 'danRer10' { - fasta = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Danio_rerio/UCSC/danRer10/Annotation/Genes/genes.bed" - mito_name = "chrM" - macs_gsize = "1.37e9" - } - 'dm6' { - fasta = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Drosophila_melanogaster/UCSC/dm6/Annotation/Genes/genes.bed" - mito_name = "chrM" - macs_gsize = "1.2e8" - } - 'equCab2' { - fasta = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Equus_caballus/UCSC/equCab2/Annotation/README.txt" - mito_name = "chrM" - } - 'galGal4' { - fasta = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Annotation/Genes/genes.bed" - readme = 
"${params.igenomes_base}/Gallus_gallus/UCSC/galGal4/Annotation/README.txt" - mito_name = "chrM" - } - 'panTro4' { - fasta = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Pan_troglodytes/UCSC/panTro4/Annotation/README.txt" - mito_name = "chrM" - } - 'rn6' { - fasta = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Rattus_norvegicus/UCSC/rn6/Annotation/Genes/genes.bed" - mito_name = "chrM" - } - 'sacCer3' { - fasta = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Sequence/BismarkIndex/" - readme = "${params.igenomes_base}/Saccharomyces_cerevisiae/UCSC/sacCer3/Annotation/README.txt" - mito_name = "chrM" - macs_gsize = "1.2e7" - } - 'susScr3' { - fasta = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Sequence/WholeGenomeFasta/genome.fa" - bwa = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Sequence/BWAIndex/version0.6.0/" - bowtie2 = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Sequence/Bowtie2Index/" - star = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Sequence/STARIndex/" - bismark = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Sequence/BismarkIndex/" - gtf = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Annotation/Genes/genes.gtf" - bed12 = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Annotation/Genes/genes.bed" - readme = "${params.igenomes_base}/Sus_scrofa/UCSC/susScr3/Annotation/README.txt" - mito_name = "chrM" - } - } -} diff --git a/conf/modules.config b/conf/modules.config index 96b1eb98..34528cd9 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -15,15 +15,15 @@ process { publishDir = [ path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - withName: 'MULTIQC' { - ext.args = { params.multiqc_title ? 
"--title \"$params.multiqc_title\"" : '' } + withName: MULTIQC { + ext.args = { params.multiqc_title ? "--title \"${params.multiqc_title}\"" : '' } publishDir = [ path: { "${params.outdir}/multiqc" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -38,10 +38,11 @@ process { path: { "${params.outdir}/databases/mmseqs/" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ - params.taxa_classification_mmseqs_db_savetmp ? "" : "--remove-tmp-files" , + ext.args = [ + params.taxa_classification_mmseqs_db_savetmp ? "" : "--remove-tmp-files", + "--compressed ${params.taxa_classification_mmseqs_compressed}" ].join(' ').trim() } @@ -50,8 +51,11 @@ process { path: { "${params.outdir}/databases/mmseqs/mmseqs_createdb/" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] + ext.args = [ + "--compressed ${params.taxa_classification_mmseqs_compressed}" + ].join(' ').trim() } withName: MMSEQS_TAXONOMY { @@ -59,9 +63,9 @@ process { path: { "${params.outdir}/databases/mmseqs/mmseqs_taxonomy/" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ params.taxa_classification_mmseqs_taxonomy_savetmp ? "" : "--remove-tmp-files", "--search-type ${params.taxa_classification_mmseqs_taxonomy_searchtype}", "--lca-ranks ${params.taxa_classification_mmseqs_taxonomy_lcaranks}", @@ -70,6 +74,7 @@ process { "--orf-filter-s ${params.taxa_classification_mmseqs_taxonomy_orffilters}", "--lca-mode ${params.taxa_classification_mmseqs_taxonomy_lcamode}", "--vote-mode ${params.taxa_classification_mmseqs_taxonomy_votemode}", + "--compressed ${params.taxa_classification_mmseqs_compressed}" ].join(' ').trim() } @@ -79,39 +84,81 @@ process { mode: params.publish_dir_mode, enabled: params.run_taxa_classification, pattern: "*.tsv", - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] + ext.args = [ + "--compressed ${params.taxa_classification_mmseqs_compressed}" + ].join(' ').trim() } - withName: SEQKIT_SEQ { + withName: SEQKIT_SEQ_LENGTH { ext.prefix = { "${meta.id}_long" } publishDir = [ path: { "${params.outdir}/bgc/seqkit/" }, mode: params.publish_dir_mode, enabled: params.bgc_savefilteredcontigs, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--min-len ${params.bgc_mincontiglength}" ].join(' ').trim() } + withName: SEQKIT_SEQ_FILTER { + ext.prefix = { "${meta.id}_cleaned.faa" } + publishDir = [ + path: { "${params.outdir}/protein_annotation/interproscan/" }, + mode: params.publish_dir_mode, + enabled: { params.run_protein_annotation }, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + ext.args = [ + "--gap-letters '* \t.' 
--remove-gaps" + ].join(' ').trim() + } + + withName: INTERPROSCAN_DATABASE { + publishDir = [ + path: { "${params.outdir}/databases/interproscan/" }, + mode: params.publish_dir_mode, + enabled: params.save_db, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: INTERPROSCAN { + ext.prefix = { "${meta.id}_interproscan.faa" } + publishDir = [ + path: { "${params.outdir}/protein_annotation/interproscan/" }, + mode: params.publish_dir_mode, + enabled: params.run_protein_annotation, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + ext.args = [ + "--applications ${params.protein_annotation_interproscan_applications}", + params.protein_annotation_interproscan_enableprecalc ? '' : '--disable-precalc', + '--disable-residue-annot', + '--enable-tsv-residue-annot', + "--formats tsv" + ].join(' ').trim() // Warning: Do not disable the flags "--enable-tsv-residue-annot" and "--formats tsv"! This would cause a run failure because the format of the resulting files would no longer be adequate for parsing by AMPcombi2. + } + withName: PROKKA { - ext.prefix = { "${meta.id}_prokka" } // to prevent pigz symlink problems of input files if already uncompressed during post-annotation gzipping + ext.prefix = { "${meta.id}_prokka" } publishDir = [ path: { "${params.outdir}/annotation/prokka/${meta.category}/" }, mode: params.publish_dir_mode, enabled: params.save_annotations, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--kingdom ${params.annotation_prokka_kingdom}", "--gcode ${params.annotation_prokka_gcode}", "--mincontiglen ${params.annotation_prokka_mincontiglen}", "--evalue ${params.annotation_prokka_evalue}", "--coverage ${params.annotation_prokka_coverage}", - params.annotation_prokka_retaincontigheaders ? "--force" : "--locustag PROKKA --centre CENTER" , - params.annotation_prokka_singlemode ? '' : '--metagenome' , + params.annotation_prokka_retaincontigheaders ? "--force" : "--locustag PROKKA --centre CENTER", + params.annotation_prokka_singlemode ? '' : '--metagenome', params.annotation_prokka_cdsrnaolap ? '--cdsrnaolap' : '', params.annotation_prokka_rawproduct ? '--rawproduct' : '', params.annotation_prokka_rnammer ? '--rnammer' : '', @@ -125,22 +172,22 @@ process { path: { "${params.outdir}/databases/bakta" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--type ${params.annotation_bakta_db_downloadtype}" ].join(' ').trim() } withName: BAKTA_BAKTA { - ext.prefix = { "${meta.id}_bakta" } // to prevent pigz symlink problems of input files if already uncompressed during post-annotation gzipping + ext.prefix = { "${meta.id}_bakta" } publishDir = [ path: { "${params.outdir}/annotation/bakta/${meta.category}/" }, mode: params.publish_dir_mode, enabled: params.save_annotations, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ params.annotation_bakta_singlemode ? 
'' : '--meta',
"--min-contig-length ${params.annotation_bakta_mincontiglen}",
"--translation-table ${params.annotation_bakta_translationtable}",
@@ -159,7 +206,8 @@ process {
params.annotation_bakta_skipsorf ? '--skip-sorf' : '',
params.annotation_bakta_gap ? '' : '--skip-gap',
params.annotation_bakta_ori ? '' : '--skip-ori',
-            params.annotation_bakta_activate_plot ? '' : '--skip-plot'
+            params.annotation_bakta_activate_plot ? '' : '--skip-plot',
+            params.annotation_bakta_hmms ? "--hmms ${params.annotation_bakta_hmms}" : '',
].join(' ').trim()
}
@@ -169,30 +217,31 @@ process {
path: { "${params.outdir}/annotation/prodigal/${meta.category}/" },
mode: params.publish_dir_mode,
enabled: params.save_annotations,
pattern: "*.{faa,fna,gbk,faa.gz,faa.gz,fna.gz,gbk.gz}",
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
-        ext.args = [
+        ext.args = [
params.annotation_prodigal_singlemode ? "-p single" : "-p meta",
params.annotation_prodigal_closed ? "-c" : "",
params.annotation_prodigal_forcenonsd ? "-n" : "",
-            "-g ${params.annotation_prodigal_transtable}"
+            "-g ${params.annotation_prodigal_transtable}",
].join(' ').trim()
}
withName: PYRODIGAL {
-        ext.prefix = { "${meta.id}_pyrodigal" } // to prevent pigz symlink problems of input files if already uncompressed during post-annotation gzipping
+        ext.prefix = { "${meta.id}_pyrodigal" }
publishDir = [
path: { "${params.outdir}/annotation/pyrodigal/${meta.category}/" },
mode: params.publish_dir_mode,
enabled: params.save_annotations,
pattern: "*.{faa,fna,gbk,score}.gz",
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
-        ext.args = [
+        ext.args = [
params.annotation_pyrodigal_singlemode ? "-p single" : "-p meta",
params.annotation_pyrodigal_closed ? "-c" : "",
params.annotation_pyrodigal_forcenonsd ? "-n" : "",
-            "-g ${params.annotation_pyrodigal_transtable}"
+            params.annotation_pyrodigal_usespecialstopcharacter ? '' : '--no-stop-codon',
+            "-g ${params.annotation_pyrodigal_transtable}",
].join(' ').trim()
}
@@ -200,12 +249,12 @@ process {
publishDir = [
path: { "${params.outdir}/arg/abricate/${meta.id}" },
mode: params.publish_dir_mode,
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
-        ext.args = [
+        ext.args = [
"--db ${params.arg_abricate_db_id}",
"--minid ${params.arg_abricate_minid}",
-            "--mincov ${params.arg_abricate_mincov}"
+            "--mincov ${params.arg_abricate_mincov}",
].join(' ').trim()
}
@@ -214,7 +263,7 @@ process {
path: { "${params.outdir}/databases/amrfinderplus" },
mode: params.publish_dir_mode,
enabled: params.save_db,
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}
@@ -222,15 +271,15 @@ process {
publishDir = [
path: { "${params.outdir}/arg/amrfinderplus/${meta.id}" },
mode: params.publish_dir_mode,
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
-        ext.args = {
+        ext.args = { [
"--ident_min ${params.arg_amrfinderplus_identmin}",
"--coverage_min ${params.arg_amrfinderplus_coveragemin}",
"--translation_table ${params.arg_amrfinderplus_translationtable}",
params.arg_amrfinderplus_plus ? '--plus' : '',
-            params.arg_amrfinderplus_name ?
"--name ${meta.id}" : '' + params.arg_amrfinderplus_name ? "--name ${meta.id}" : '', ].join(' ').trim() } } @@ -240,7 +289,7 @@ process { path: { "${params.outdir}/databases/deeparg" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -248,64 +297,62 @@ process { publishDir = [ path: { "${params.outdir}/arg/deeparg/${meta.id}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--type prot", "--min-prob ${params.arg_deeparg_minprob}", "--arg-alignment-identity ${params.arg_deeparg_alignmentidentity}", "--arg-alignment-evalue ${params.arg_deeparg_alignmentevalue}", "--arg-alignment-overlap ${params.arg_deeparg_alignmentoverlap}", - "--arg-num-alignments-per-entry ${params.arg_deeparg_numalignmentsperentry}" + "--arg-num-alignments-per-entry ${params.arg_deeparg_numalignmentsperentry}", ].join(' ').trim() } withName: FARGENE { - tag = {"${meta.id}|${hmm_model}"} + tag = { "${meta.id}|${hmm_model}" } publishDir = [ [ path: { "${params.outdir}/arg/fargene/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - pattern: "*.log" + pattern: "*.log", ], [ path: { "${params.outdir}/arg/fargene/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - pattern: "*/results_summary.txt" + pattern: "*/results_summary.txt", ], [ path: { "${params.outdir}/arg/fargene/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - pattern: "*/{hmmsearchresults,predictedGenes,retrievedFragments}/*" + pattern: "*/{hmmsearchresults,predictedGenes,retrievedFragments}/*", ], [ path: { "${params.outdir}/arg/fargene/${meta.id}/" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, pattern: "*/{tmpdir}/*", - enabled: params.arg_fargene_savetmpfiles - ] + enabled: params.arg_fargene_savetmpfiles, + ], ] ext.prefix = { "${meta.hmm_class}" } - ext.args = { "--hmm-model ${params.arg_fargene_hmmmodel} --logfile ${meta.id}-${meta.hmm_class}.log --min-orf-length ${params.arg_fargene_minorflength} --score ${params.arg_fargene_score} --translation-format ${params.arg_fargene_translationformat}" } - ext.args = params.arg_fargene_orffinder ? '--orf-finder' : '' + ext.args = { "--hmm-model ${params.arg_fargene_hmmmodel} --logfile ${meta.id}-${meta.hmm_class}.log --min-orf-length ${params.arg_fargene_minorflength} --score ${params.arg_fargene_score} --translation-format ${params.arg_fargene_translationformat}" } + ext.args = params.arg_fargene_orffinder ? '--orf-finder' : '' } withName: UNTAR_CARD { - ext.prefix = "card_database" publishDir = [ [ path: { "${params.outdir}/databases/rgi" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ] - } withName: RGI_CARDANNOTATION { @@ -314,7 +361,7 @@ process { path: { "${params.outdir}/databases/rgi" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ] } @@ -325,30 +372,30 @@ process { path: { "${params.outdir}/arg/rgi/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - pattern: "*.{txt}" - ], + pattern: "*.{txt}", + ], [ path: { "${params.outdir}/arg/rgi/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, pattern: "*.{json}", - enabled: params.arg_rgi_savejson - ], + enabled: params.arg_rgi_savejson, + ], [ path: { "${params.outdir}/arg/rgi/${meta.id}/" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, pattern: "*temp*", - enabled: params.arg_rgi_savetmpfiles - ] + enabled: params.arg_rgi_savetmpfiles, + ], ] - ext.args2 = [ + ext.args2 = [ "--alignment_tool ${params.arg_rgi_alignmenttool}", "--data ${params.arg_rgi_data}", params.arg_rgi_includeloose ? '--include_loose' : '', params.arg_rgi_includenudge ? '--include_nudge' : '', params.arg_rgi_lowquality ? '--low_quality' : '', - params.arg_rgi_split_prodigal_jobs ? '--split_prodigal_jobs' : '' + params.arg_rgi_split_prodigal_jobs ? '--split_prodigal_jobs' : '', ].join(' ').trim() } @@ -357,7 +404,7 @@ process { publishDir = [ path: { "${params.outdir}/amp/ampir/${meta.id}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -366,16 +413,16 @@ process { publishDir = [ path: { "${params.outdir}/amp/amplify/${meta.id}/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } withName: AMP_HMMER_HMMSEARCH { - label = { "${meta.id}_${meta.hmm_id}" } + label = { "${meta.id}_${meta.hmm_id}" } publishDir = [ path: { "${params.outdir}/amp/hmmer_hmmsearch/${meta.id}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${meta.id}_${meta.hmm_id}.hmmer_hmmsearch" } } @@ -385,17 +432,17 @@ process { publishDir = [ path: { "${params.outdir}/amp/macrel" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = "--keep-negatives" + ext.args = "--keep-negatives" } withName: BGC_HMMER_HMMSEARCH { - label = { "${meta.id}_${meta.hmm_id}" } + label = { "${meta.id}_${meta.hmm_id}" } publishDir = [ path: { "${params.outdir}/bgc/hmmer_hmmsearch/${meta.id}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${meta.id}_${meta.hmm_id}" } } @@ -404,7 +451,7 @@ process { publishDir = [ path: { "${params.outdir}/bgc/antismash" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.args = [ params.bgc_antismash_cbgeneral ? 
'--cb-general' : '', @@ -427,7 +474,7 @@ process { path: { "${params.outdir}/databases/antismash" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -436,7 +483,7 @@ process { path: { "${params.outdir}/databases/deepbgc" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -444,9 +491,9 @@ process { publishDir = [ path: { "${params.outdir}/bgc/deepbgc/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--score ${params.bgc_deepbgc_score}", params.bgc_deepbgc_prodigalsinglemode ? '' : '--prodigal-meta-mode', "--merge-max-protein-gap ${params.bgc_deepbgc_mergemaxproteingap}", @@ -455,7 +502,7 @@ process { "--min-proteins ${params.bgc_deepbgc_minproteins}", "--min-domains ${params.bgc_deepbgc_mindomains}", "--min-bio-domains ${params.bgc_deepbgc_minbiodomains}", - "--classifier-score ${params.bgc_deepbgc_classifierscore}" + "--classifier-score ${params.bgc_deepbgc_classifierscore}", ].join(' ').trim() } @@ -463,14 +510,14 @@ process { publishDir = [ path: { "${params.outdir}/bgc/gecco/${meta.id}" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - ext.args = [ + ext.args = [ "--cds ${params.bgc_gecco_cds}", "--threshold ${params.bgc_gecco_threshold}", "--p-filter ${params.bgc_gecco_pfilter}", "--edge-distance ${params.bgc_gecco_edgedistance}", - params.bgc_gecco_mask ? '--mask' : '' + params.bgc_gecco_mask ? '--mask' : '', ].join(' ').trim() } @@ -478,7 +525,7 @@ process { publishDir = [ path: { "${params.outdir}/arg/hamronization/abricate" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${report}.abricate" } } @@ -487,7 +534,7 @@ process { publishDir = [ path: { "${params.outdir}/arg/hamronization/amrfinderplus" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${report}.amrfinderplus" } } @@ -496,7 +543,7 @@ process { publishDir = [ path: { "${params.outdir}/arg/hamronization/deeparg" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${report}.deeparg" } } @@ -505,7 +552,7 @@ process { publishDir = [ path: { "${params.outdir}/arg/hamronization/rgi" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${report}.rgi" } } @@ -514,7 +561,7 @@ process { publishDir = [ path: { "${params.outdir}/arg/hamronization/fargene" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${meta.id}_${report}.fargene" } } @@ -523,7 +570,7 @@ process { publishDir = [ path: { "${params.outdir}/reports/hamronization_summarize" }, mode: params.publish_dir_mode, - saveAs: { (params.run_taxa_classification == false) ? it : null } + saveAs: { params.run_taxa_classification == false ? it : null }, ] } @@ -531,26 +578,26 @@ process { publishDir = [ path: { "${params.outdir}/reports/hamronization_summarize" }, mode: params.publish_dir_mode, - saveAs: { _ -> null } // do not save the file + saveAs: { _ -> null }, ] } withName: ARG_TABIX_BGZIP { + ext.prefix = { "hamronization_complete_summary_taxonomy" } publishDir = [ path: { "${params.outdir}/reports/hamronization_summarize" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } withName: AMPCOMBI2_PARSETABLES { - publishDir = [ + publishDir = [ path: { "${params.outdir}/reports/ampcombi2/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] - // Have to use a custom `ext` due to deep nested quotes - ext.args = [ + ext.args = [ "--aminoacid_length ${params.amp_ampcombi_parsetables_aalength}", "--db_evalue ${params.amp_ampcombi_parsetables_dbevalue}", "--amp_cutoff ${params.amp_ampcombi_parsetables_cutoff}", @@ -562,7 +609,7 @@ process { "--hmm_evalue ${params.amp_ampcombi_parsetables_hmmevalue}", "--window_size_stop_codon ${params.amp_ampcombi_parsetables_windowstopcodon}", "--window_size_transporter ${params.amp_ampcombi_parsetables_windowtransport}", - params.amp_ampcombi_parsetables_removehitswostopcodons ? '--remove_stop_codons' : '' + params.amp_ampcombi_parsetables_removehitswostopcodons ? '--remove_stop_codons' : '', ].join(' ').trim() ext.prefix = { "${meta.id}" } } @@ -573,12 +620,13 @@ process { saveAs: { filename -> if (filename.equals('versions.yml')) { return filename - } else { + } + else { return !params.run_taxa_classification ? filename : null } }, ] - ext.args = "--log TRUE" + ext.args = "--log TRUE" } withName: AMPCOMBI2_CLUSTER { @@ -588,12 +636,13 @@ process { saveAs: { filename -> if (filename.equals('versions.yml')) { return filename - } else { + } + else { return !params.run_taxa_classification ? filename : null } }, ] - ext.args = [ + ext.args = [ "--cluster_cov_mode ${params.amp_ampcombi_cluster_covmode}", "--cluster_mode ${params.amp_ampcombi_cluster_mode}", "--cluster_coverage ${params.amp_ampcombi_cluster_coverage}", @@ -601,7 +650,7 @@ process { "--cluster_sensitivity ${params.amp_ampcombi_cluster_sensitivity}", "--cluster_min_member ${params.amp_ampcombi_cluster_minmembers}", "--log TRUE", - params.amp_ampcombi_cluster_removesingletons ? '--cluster_remove_singletons' : '' + params.amp_ampcombi_cluster_removesingletons ? '--cluster_remove_singletons' : '', ].join(' ').trim() } @@ -609,7 +658,7 @@ process { publishDir = [ path: { "${params.outdir}/reports/ampcombi2" }, mode: params.publish_dir_mode, - saveAs: { _ -> null } // do not save the file + saveAs: { _ -> null }, ] } @@ -617,7 +666,7 @@ process { publishDir = [ path: { "${params.outdir}/reports/ampcombi2" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } @@ -625,45 +674,45 @@ process { publishDir = [ path: { "${params.outdir}/reports/combgc" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } withName: ARGNORM_ABRICATE { publishDir = [ - path: {"${params.outdir}/arg/argnorm/abricate/"}, + path: { "${params.outdir}/arg/argnorm/abricate/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${meta.id}.normalized.tsv" } - ext.args = "--hamronized" + ext.args = "--hamronized" } withName: ARGNORM_AMRFINDERPLUS { publishDir = [ - path: {"${params.outdir}/arg/argnorm/amrfinderplus/"}, + path: { "${params.outdir}/arg/argnorm/amrfinderplus/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { "${meta.id}.normalized.tsv" } - ext.args = "--hamronized" + ext.args = "--hamronized" } withName: ARGNORM_DEEPARG { publishDir = [ - path: {"${params.outdir}/arg/argnorm/deeparg/"}, + path: { "${params.outdir}/arg/argnorm/deeparg/" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] ext.prefix = { input_tsv.toString().endsWith(".potential.ARG.deeparg.tsv") ? "${meta.id}.potential.ARG.normalized.tsv" : "${meta.id}.ARG.normalized.tsv" } - ext.args = "--hamronized" + ext.args = "--hamronized" } withName: MERGE_TAXONOMY_COMBGC { publishDir = [ path: { "${params.outdir}/reports/combgc" }, mode: params.publish_dir_mode, - saveAs: { _ -> null } // do not save the file + saveAs: { _ -> null }, ] } @@ -671,16 +720,16 @@ process { publishDir = [ path: { "${params.outdir}/reports/combgc" }, mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, ] } - withName: DRAMP_DOWNLOAD { + withName: AMP_DATABASE_DOWNLOAD { publishDir = [ - path: { "${params.outdir}/databases/dramp" }, + path: { "${params.outdir}/databases/ampcombi/" }, mode: params.publish_dir_mode, enabled: params.save_db, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename }, ] } } diff --git a/conf/test.config b/conf/test.config index 907bdd69..61ad1c4d 100644 --- a/conf/test.config +++ b/conf/test.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'AMP/ARG Pyrodigal test profile' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' + input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' - annotation_tool = 'pyrodigal' + annotation_tool = 'pyrodigal' - run_arg_screening = true - arg_fargene_hmmmodel = 'class_a,class_b_1_2' + run_arg_screening = true + arg_fargene_hmmmodel = 'class_a,class_b_1_2' - run_amp_screening = true - amp_run_hmmsearch = true - amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' + run_amp_screening = true + amp_run_hmmsearch = true + amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' } diff --git a/conf/test_bakta.config b/conf/test_bakta.config index 72c540c5..4cd2dacb 100644 --- a/conf/test_bakta.config +++ b/conf/test_bakta.config @@ -10,14 +10,17 @@ ---------------------------------------------------------------------------------------- */ -params { - config_profile_name = 'AMP/ARG Bakta test profile' - config_profile_description = 'Minimal test dataset to check pipeline function' +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' +params { + config_profile_name = 'AMP/ARG Bakta test profile' + config_profile_description = 'Minimal test dataset to check pipeline function' // Input data input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' diff --git a/conf/test_bgc_bakta.config b/conf/test_bgc_bakta.config index d879fe38..fba6c3ea 100644 --- a/conf/test_bgc_bakta.config +++ b/conf/test_bgc_bakta.config @@ -10,14 +10,17 @@ ---------------------------------------------------------------------------------------- */ -params { - config_profile_name = 'BGC Bakta test profile' - config_profile_description = 'Minimal test dataset to check BGC workflow function' +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' +params { + config_profile_name = 'BGC Bakta test profile' + config_profile_description = 'Minimal test dataset to check BGC workflow function' // Input data input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' diff --git a/conf/test_bgc_prokka.config b/conf/test_bgc_prokka.config index 0a7b4e18..ece6902b 100644 --- a/conf/test_bgc_prokka.config +++ b/conf/test_bgc_prokka.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'BGC Prokka test profile' config_profile_description = 'Minimal test dataset to check BGC workflow function' - // Limit resources so that this 
can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' + input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' - annotation_tool = 'prokka' + annotation_tool = 'prokka' - run_arg_screening = false - run_amp_screening = false - run_bgc_screening = true + run_arg_screening = false + run_amp_screening = false + run_bgc_screening = true - bgc_run_hmmsearch = true - bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' + bgc_run_hmmsearch = true + bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' } diff --git a/conf/test_bgc_pyrodigal.config b/conf/test_bgc_pyrodigal.config index f5ef07a9..da83cbd6 100644 --- a/conf/test_bgc_pyrodigal.config +++ b/conf/test_bgc_pyrodigal.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'BGC Pyrodigal test profile' config_profile_description = 'Minimal test dataset to check BGC workflow function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' + input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' - annotation_tool = 'pyrodigal' + annotation_tool = 'pyrodigal' - run_arg_screening = false - run_amp_screening = false - run_bgc_screening = true + run_arg_screening = false + run_amp_screening = false + run_bgc_screening = true - bgc_run_hmmsearch = true - bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' + bgc_run_hmmsearch = true + bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' } diff --git a/conf/test_minimal.config b/conf/test_minimal.config new file mode 100644 index 00000000..f25a8c1e --- /dev/null +++ b/conf/test_minimal.config @@ -0,0 +1,56 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. 
+
+    Although in this case we turn everything off.
+
+    Use as follows:
+        nextflow run nf-core/funcscan -profile test_minimal,<docker/singularity> --outdir <OUTDIR>
+
+----------------------------------------------------------------------------------------
+*/
+
+process {
+    resourceLimits = [
+        cpus: 4,
+        memory: '15.GB',
+        time: '1.h'
+    ]
+}
+
+params {
+    config_profile_name        = 'Minimal test profile'
+    config_profile_description = 'Minimal test dataset to check pipeline function'
+
+    // Input data
+    input                = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
+    amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm'
+    bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm'
+
+    annotation_tool = 'pyrodigal'
+
+    run_arg_screening = false
+    run_amp_screening = false
+    run_bgc_screening = false
+
+    arg_fargene_hmmmodel = 'class_a,class_b_1_2'
+
+    amp_skip_amplify  = true
+    amp_skip_macrel   = true
+    amp_skip_ampir    = true
+    amp_run_hmmsearch = false
+
+    arg_skip_deeparg       = true
+    arg_skip_fargene       = true
+    arg_skip_rgi           = true
+    arg_skip_amrfinderplus = true
+    arg_skip_abricate      = true
+
+    bgc_skip_antismash = true
+    bgc_skip_deepbgc   = true
+    bgc_skip_gecco     = true
+    bgc_run_hmmsearch  = false
+}
diff --git a/conf/test_nothing.config b/conf/test_nothing.config
deleted file mode 100644
index 87a2e06b..00000000
--- a/conf/test_nothing.config
+++ /dev/null
@@ -1,53 +0,0 @@
-/*
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    Nextflow config file for running minimal tests
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    Defines input files and everything required to run a fast and simple pipeline test.
- - Although in this case we turn everything off - - Use as follows: - nextflow run nf-core/funcscan -profile test_nothing, --outdir - ----------------------------------------------------------------------------------------- -*/ - -params { - config_profile_name = 'Test nothing profile' - config_profile_description = 'Minimal test dataset to check pipeline function' - - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' - - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' - amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' - bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' - - annotation_tool = 'pyrodigal' - - run_arg_screening = false - run_amp_screening = false - run_bgc_screening = false - - arg_fargene_hmmmodel = 'class_a,class_b_1_2' - - amp_skip_amplify = true - amp_skip_macrel = true - amp_skip_ampir = true - amp_run_hmmsearch = false - - arg_skip_deeparg = true - arg_skip_fargene = true - arg_skip_rgi = true - arg_skip_amrfinderplus = true - arg_skip_deeparg = true - arg_skip_abricate = true - - bgc_skip_antismash = true - bgc_skip_deepbgc = true - bgc_skip_gecco = true - bgc_run_hmmsearch = false -} diff --git a/conf/test_preannotated.config b/conf/test_preannotated.config index 38a5e1d1..764304e2 100644 --- a/conf/test_preannotated.config +++ b/conf/test_preannotated.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'ARG/AMP test profile - preannotated input' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_preannotated.csv' + input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_preannotated.csv' - annotation_tool = 'pyrodigal' + annotation_tool = 'pyrodigal' - run_arg_screening = true - arg_fargene_hmmmodel = 'class_a,class_b_1_2' + run_arg_screening = true + arg_fargene_hmmmodel = 'class_a,class_b_1_2' - run_amp_screening = true - amp_run_hmmsearch = true - amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' + run_amp_screening = true + amp_run_hmmsearch = true + amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' } diff --git a/conf/test_preannotated_bgc.config b/conf/test_preannotated_bgc.config index 039656d3..70d5d1d5 100644 --- a/conf/test_preannotated_bgc.config +++ b/conf/test_preannotated_bgc.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'BGC test profile - preannotated input' config_profile_description = 'Minimal test dataset to check BGC workflow function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_preannotated.csv' + input = 
params.pipelines_testdata_base_path + 'funcscan/samplesheet_preannotated.csv' - annotation_tool = 'pyrodigal' + annotation_tool = 'pyrodigal' - run_arg_screening = false - run_amp_screening = false - run_bgc_screening = true + run_arg_screening = false + run_amp_screening = false + run_bgc_screening = true - bgc_run_hmmsearch = true - bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' + bgc_run_hmmsearch = true + bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm' } diff --git a/conf/test_prokka.config b/conf/test_prokka.config index eb346bcb..fd576b81 100644 --- a/conf/test_prokka.config +++ b/conf/test_prokka.config @@ -10,24 +10,27 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'AMP/ARG Prokka test profile' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '8.GB' - max_time = '6.h' - // Input data - input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' + input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' - annotation_tool = 'prokka' + annotation_tool = 'prokka' - run_arg_screening = true - arg_fargene_hmmmodel = 'class_a,class_b_1_2' + run_arg_screening = true + arg_fargene_hmmmodel = 'class_a,class_b_1_2' - run_amp_screening = true - amp_run_hmmsearch = true - amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' + run_amp_screening = true + amp_run_hmmsearch = true + amp_hmmsearch_models = params.pipelines_testdata_base_path + 'funcscan/hmms/mybacteriocin.hmm' } diff --git a/conf/test_taxonomy_bakta.config b/conf/test_taxonomy_bakta.config index e7bc923d..6763c48e 100644 --- a/conf/test_taxonomy_bakta.config +++ b/conf/test_taxonomy_bakta.config @@ -10,14 +10,20 @@ ---------------------------------------------------------------------------------------- */ -params { - config_profile_name = 'Taxonomic classification test profile' - config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function' +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] + withName: MMSEQS_DATABASES { + memory = '14.GB' + } +} - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '14.GB' - max_time = '6.h' +params { + config_profile_name = 'Taxonomic classification test profile' + config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function' // Input data input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv' @@ -42,9 +48,3 @@ params { bgc_antismash_contigminlength = 1000 bgc_run_hmmsearch = true } - -process { - withName: MMSEQS_DATABASES { - memory = '14.GB' - } -} diff --git a/conf/test_taxonomy_prokka.config b/conf/test_taxonomy_prokka.config index 39eefdfc..e126624f 100644 --- a/conf/test_taxonomy_prokka.config +++ b/conf/test_taxonomy_prokka.config @@ -10,14 +10,20 @@ ---------------------------------------------------------------------------------------- */ -params { - config_profile_name = 'Taxonomic classification test profile' - 
config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function'
+process {
+    resourceLimits = [
+        cpus: 4,
+        memory: '15.GB',
+        time: '1.h'
+    ]
+    withName: MMSEQS_DATABASES {
+        memory = '14.GB'
+    }
+}
-    // Limit resources so that this can run on GitHub Actions
-    max_cpus   = 2
-    max_memory = '14.GB'
-    max_time   = '6.h'
+params {
+    config_profile_name        = 'Taxonomic classification test profile'
+    config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function'
// Input data
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
@@ -41,9 +47,3 @@
bgc_antismash_contigminlength = 1000
bgc_run_hmmsearch = true
}
-
-process {
-    withName: MMSEQS_DATABASES {
-        memory = '14.GB'
-    }
-}
diff --git a/conf/test_taxonomy_pyrodigal.config b/conf/test_taxonomy_pyrodigal.config
index 4ad970f9..cbe89dc3 100644
--- a/conf/test_taxonomy_pyrodigal.config
+++ b/conf/test_taxonomy_pyrodigal.config
@@ -10,14 +10,20 @@
----------------------------------------------------------------------------------------
*/
-params {
-    config_profile_name        = 'Taxonomic classification test profile'
-    config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function'
+process {
+    resourceLimits = [
+        cpus: 4,
+        memory: '15.GB',
+        time: '1.h'
+    ]
+    withName: MMSEQS_DATABASES {
+        memory = '14.GB'
+    }
+}
-    // Limit resources so that this can run on GitHub Actions
-    max_cpus   = 2
-    max_memory = '14.GB'
-    max_time   = '6.h'
+params {
+    config_profile_name        = 'Taxonomic classification test profile'
+    config_profile_description = 'Minimal test dataset to check taxonomic classification workflow function'
// Input data
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
@@ -41,9 +47,3 @@
bgc_antismash_contigminlength = 1000
bgc_run_hmmsearch = true
}
-
-process {
-    withName: MMSEQS_DATABASES {
-        memory = '14.GB'
-    }
-}
diff --git a/docs/images/funcscan_metro_workflow.png b/docs/images/funcscan_metro_workflow.png
index 7fda2756..6fada251 100644
Binary files a/docs/images/funcscan_metro_workflow.png and b/docs/images/funcscan_metro_workflow.png differ
diff --git a/docs/images/funcscan_metro_workflow.svg b/docs/images/funcscan_metro_workflow.svg
index c9d291fb..bde2f3cf 100644
--- a/docs/images/funcscan_metro_workflow.svg
+++ b/docs/images/funcscan_metro_workflow.svg
[SVG markup diff omitted — unrecoverable image text. Recoverable labels: the metro workflow diagram gains MMseqs2, InterProScan and SeqKit nodes, and its legend now lists Annotation (Taxonomic, Protein), Screening Tools, Preprocessing Tools, Postprocessing Tools and Optional Input alongside the AMP/BGC/ARG tracks.]
diff --git a/docs/images/funcscan_metro_workflow_dark.png b/docs/images/funcscan_metro_workflow_dark.png
new file mode 100644
index 00000000..fc612c50
Binary files /dev/null and b/docs/images/funcscan_metro_workflow_dark.png differ
diff --git a/docs/images/funcscan_metro_workflow_dark.svg b/docs/images/funcscan_metro_workflow_dark.svg
new file mode 100644
index 00000000..dd618d9d
--- /dev/null
+++ b/docs/images/funcscan_metro_workflow_dark.svg
@@ -0,0 +1,3294 @@
[New SVG file omitted — unrecoverable image markup. Recoverable labels: a dark-mode variant of the metro workflow diagram with the same nodes (hAMRonization, ABRicate, AMRFinderPlus, DeepARG, fARGene, RGI, gunzip, InterProScan, SeqKit, Prokka, Prodigal, Pyrodigal, Bakta, AMPcombi, comBGC, antiSMASH, GECCO, DeepBGC, hmmsearch, ampir, AMPlify, Macrel, MMseqs2, argNorm) and the same legend as the light-mode version.]
diff --git a/docs/images/mqc_fastqc_adapter.png b/docs/images/mqc_fastqc_adapter.png
deleted file mode 100755
index 361d0e47..00000000
Binary files a/docs/images/mqc_fastqc_adapter.png and /dev/null differ
diff --git a/docs/images/mqc_fastqc_counts.png b/docs/images/mqc_fastqc_counts.png
deleted file mode 100755
index cb39ebb8..00000000
Binary files a/docs/images/mqc_fastqc_counts.png and /dev/null differ
diff --git a/docs/images/mqc_fastqc_quality.png b/docs/images/mqc_fastqc_quality.png
deleted file mode 100755
index a4b89bf5..00000000
Binary files a/docs/images/mqc_fastqc_quality.png and /dev/null differ
diff --git a/docs/output.md b/docs/output.md
index 9f71278a..289d9086 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -18,40 +18,42 @@ The directories listed below will be created in the results directory (specified
```tree
results/
-├── taxonomic_classification/
-|   └── mmseqs_createtsv/
-├── annotation/
-|   ├── bakta/
-|   ├── prodigal/
-|   ├── prokka/
-|   └── pyrodigal/
├── amp/
|   ├── ampir/
|   ├── amplify/
|   ├── hmmsearch/
|   └── macrel/
+├── annotation/
+|   ├── bakta/
+|   ├── prodigal/
+|   ├── prokka/
+|   └── pyrodigal/
├── arg/
|   ├── abricate/
|   ├── amrfinderplus/
+|   ├── argnorm/
|   ├── deeparg/
|   ├── fargene/
-|   ├── rgi/
|   ├── hamronization/
-|   └── argnorm/
+|   └── rgi/
├── bgc/
|   ├── antismash/
|   ├── deepbgc/
|   ├── gecco/
|   └── hmmsearch/
+├── databases/
+├── multiqc/
+├── pipeline_info/
+├── protein_annotation/
+|   └── interproscan/
├── qc/
|   └── seqkit/
├── reports/
|   ├── ampcombi/
|   ├── combgc/
|   └── hamronization_summarize/
-├── databases/
-├── multiqc/
-└── pipeline_info/
+└── taxonomic_classification/
+    └── mmseqs_createtsv/
work/
```
@@ -74,6 +76,10 @@ ORF prediction and annotation with any of:
- [Prokka](#prokka) – open reading frame prediction and functional protein annotation.
- [Bakta](#bakta) – open reading frame prediction and functional protein annotation.
+CDS domain annotation:
+
+- [InterProScan](#interproscan) (default) – for open reading frame protein and domain predictions.
+
Antimicrobial Resistance Genes (ARGs):
- [ABRicate](#abricate) – antimicrobial resistance gene detection, based on alignment to one of several databases.
@@ -216,6 +222,23 @@ Output Summaries:
[Bakta](https://github.com/oschwengers/bakta) is a tool for the rapid & standardised annotation of bacterial genomes and plasmids from both isolates and MAGs. It provides dbxref-rich, sORF-including and taxon-independent annotations in machine-readable JSON & bioinformatics standard file formats for automated downstream analysis. The output is used by some of the functional screening tools.
+### Protein annotation
+
+[InterProScan](#interproscan)
+
+#### InterProScan
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `interproscan/`
+  - `_cleaned.faa`: cleaned version of the FASTA files (in amino acid format) generated by one of the annotation tools (i.e. Pyrodigal, Prokka, Bakta). These contain sequences with no special characters (e.g. `*` or `-`).
+  - `_interproscan_faa.tsv`: predicted proteins and domains using the InterPro database, in TSV format.
+
+</details>
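For orientation, a minimal sketch of switching this step on via a params block — the parameter names below are taken from the pipeline's configuration, and the application list is simply the documented default:

```groovy
// Hedged sketch (e.g. in a custom profile or a -params-file config) enabling
// the protein annotation subworkflow that produces the files listed above.
params {
    run_protein_annotation                       = true
    // Default application list; adding other applications is not guaranteed
    // to be parsed correctly by AMPcombi downstream.
    protein_annotation_interproscan_applications = 'PANTHER,ProSiteProfiles,ProSitePatterns,Pfam'
}
```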
+ +[InterProScan](https://github.com/ebi-pf-team/interproscan) is designed to predict protein functions and provide possible domain and motif information of the coding regions. It utilizes the InterPro database that consists of multiple sister databases such as PANTHER, ProSite, Pfam, etc. More details can be found in the [documentation](https://interproscan-docs.readthedocs.io/en/latest/index.html). + ### AMP detection tools [ampir](#ampir), [AMPlify](#amplify), [hmmsearch](#hmmsearch), [Macrel](#macrel) @@ -227,7 +250,7 @@ Output Summaries: - `ampir/` - `.ampir.faa`: predicted AMP sequences in FAA format - - `.ampir.tsv`: predicted AMP metadata in TSV format, contains contig name, sequence and probability score + - `.ampir.tsv`: predicted AMP metadata in TSV format; contains contig name, sequence and probability score. @@ -239,7 +262,7 @@ Output Summaries: Output files - `amplify/` - - `*_results.tsv`: table of contig amino-acid sequences with prediction result (AMP or non-AMP) and information on sequence length, charge, probability score, AMPlify log-scaled score) + - `*_results.tsv`: table of contig amino-acid sequences with prediction result (AMP or non-AMP) and information on sequence length, charge, probability score, AMPlify log-scaled score @@ -457,15 +480,23 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation - `Ampcombi_parse_tables.log`: log file containing the run information from AMPcombi submodule `ampcombi2/parsetables` - `Ampcombi_complete.log`: log file containing the run information from AMPcombi submodule `ampcombi2/complete` - `Ampcombi_summary_cluster.tsv`: tab-separated table containing the clustered AMP hits. This is the output given when the taxonomic classification is not activated (pipeline default). - - `Ampcombi_summary_cluster_representative_seq.tsv`: tab-separated table containing the representative sequence of each cluster. This can be used in AMPcombi for constructing 3D structures using ColabFold. For more details on how to do this, please refer to the [AMPcombi documentation](https://github.com/Darcy220606/AMPcombi/blob/main/README.md). + - `Ampcombi_summary_cluster_representative_seq.tsv`: tab-separated table containing the representative sequence of each cluster. This can be used in AMPcombi for constructing 3D structures using ColabFold. For more details on how to do this, please refer to the [AMPcombi documentation](https://ampcombi.readthedocs.io/en/main/). - `Ampcombi_cluster.log`: log file containing the run information from AMPcombi submodule `ampcombi2/cluster` - `ampcombi_complete_summary_taxonomy.tsv.gz`: summarised output from all AMP workflow tools with taxonomic assignment in compressed tsv format. This is the same output as `Ampcombi_summary_cluster.tsv` file but with taxonomic classification of the contig. 
- `/contig_gbks`: contains all the contigs in gbk format that an AMP was found on using the custom parameters
  - `/*_ampcombi.log`: a log file generated by AMPcombi
  - `/*_ampcombi.tsv`: summarised output in tsv format for each sample
  - `/*_amp.faa*`: fasta file containing the amino acid sequences for all AMP hits for each sample
-  - `/*_diamond_matches.txt*`: alignment file generated by DIAMOND for each sample
-<summary>AMP summary table header descriptions</summary>
+  - `/*_mmseqs_matches.txt*`: alignment file generated by MMseqs2 for each sample
+
+:::info
+In some cases when the AMP and the taxonomic classification subworkflows are turned on, it can happen that only per-sample summary files are created in the output folder, with **no** `Ampcombi_summary.tsv` or `Ampcombi_summary_cluster.tsv` files and no merged taxonomic classifications.
+This can occur if some AMP prediction parameters are 'too strict' or only one AMP tool is run, which can lead to no AMP hits being found in any of the samples or in only one sample.
+Look out for the warning `[nf-core/funcscan] AMPCOMBI2: 0/1 file passed. Skipping AMPCOMBI2_COMPLETE, AMPCOMBI2_CLUSTER, and TAXONOMY MERGING steps.` in the stdout or `.nextflow.log` file.
+In that case we recommend lowering the AMP prediction thresholds and running more than one AMP prediction tool.
+:::
+
+<summary>AMP summary table header descriptions using DRAMP as reference database</summary>

| Table column | Description |
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `prob_amplify` | Probability associated with the AMP prediction using `AMPLIFY` |
| `evalue_hmmer` | Expected number of false positives (nonhomologous sequences) with a similar or higher score. This stands for how significant the hit is; the lower the evalue, the more significant the hit |
| `aa_sequence` | Amino acid sequence that forms part of the contig and is AMP encoding |
-| `target_id` | [DRAMP](http://dramp.cpu-bioinfor.org/) ID within the database found to be similar to the predicted AMP by `DIAMOND` alignment |
+| `target_id` | [DRAMP](http://dramp.cpu-bioinfor.org/) ID within the database found to be similar to the predicted AMP by `MMseqs2` alignment |
| `pident` | Percentage identity of amino acid residues that fully aligned between the `DRAMP` sequence and the predicted AMP sequence |
-| `evalue` | Number of alignments of similar or better qualities that can be expected when searching a database of similar size with a random sequence distribution. This is generated by `DIAMOND` alignments using the [DRAMP](http://dramp.cpu-bioinfor.org/) AMP database. The lower the value the more significant that the hit is positive. An e-value of < 0.001 means that the this hit will be found by chance once per 1,0000 queries |
+| `evalue` | Number of alignments of similar or better qualities that can be expected when searching a database of similar size with a random sequence distribution. This is generated by `MMseqs2` alignments using the [DRAMP](http://dramp.cpu-bioinfor.org/) AMP database. The lower the value, the more significant the hit. An e-value of < 0.001 means that this hit will be found by chance once per 10,000 queries |
| `Sequence` | Sequence corresponding to the `DRAMP` ID found to be similar to the predicted AMP sequence |
| `Sequence_length` | Number of amino acid residues in the `DRAMP` sequence |
| `Name` | Full name of the peptide copied from the database it was uploaded to |
@@ -510,7 +541,12 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation
-[AMPcombi](https://github.com/Darcy220606/AMPcombi) summarizes the results of **antimicrobial peptide (AMP)** prediction tools (ampir, AMPlify, Macrel, and other non-nf-core tools) into a single table and aligns the hits against a reference AMP database for functional and taxonomic classification. It assigns the physiochemical properties (e.g. hydrophobicity, molecular weight) using the [Biopython toolkit](https://github.com/biopython/biopython). Additionally, it clusters the resulting AMP hits from all samples using [MMseqs2](https://github.com/soedinglab/MMseqs2). For further filtering for AMPs with signaling peptides, the output file `Ampcombi_summary_cluster.tsv` or `ampcombi_complete_summary_taxonomy.tsv.gz` can be used downstream as detailed [here](https://github.com/Darcy220606/AMPcombi/blob/main/README.md).
+[AMPcombi](https://github.com/Darcy220606/AMPcombi) summarizes the results of **antimicrobial peptide (AMP)** prediction tools (ampir, AMPlify, Macrel, and other non-nf-core supported tools) into a single table and aligns the hits against a reference AMP database for functional, structural and taxonomic classification using [MMseqs2](https://github.com/soedinglab/MMseqs2).
+It further assigns the physiochemical properties (e.g. hydrophobicity, molecular weight) using the [Biopython toolkit](https://github.com/biopython/biopython) and clusters the resulting AMP hits from all samples using [MMseqs2](https://github.com/soedinglab/MMseqs2).
+To further filter the recovered AMPs using the presence of signaling peptides, the output file `Ampcombi_summary_cluster.tsv` or `ampcombi_complete_summary_taxonomy.tsv.gz` can be used downstream as detailed [here](https://ampcombi.readthedocs.io/en/main/usage.html#signal-peptide).
+The final tables generated may also be visualized and explored using an interactive [user interface](https://ampcombi.readthedocs.io/en/main/visualization.html).
+
+[Image: AMPcombi interface]

#### hAMRonization

<details markdown="1">
<summary>Output files</summary>

- `hamronization_summarize/` one of the following:
  - `hamronization_combined_report.json`: summarised output in .json format
  - `hamronization_combined_report.tsv`: summarised output in .tsv format when the taxonomic classification is turned off (pipeline default).
-  - `hamronization_combined_report.tsv.gz`: summarised output in gzipped format when the taxonomic classification is turned on by `--run_taxa_classification`.
+  - `hamronization_complete_summary_taxonomy.tsv.gz`: summarised output in gzipped format when the taxonomic classification is turned on by `--run_taxa_classification`.
  - `hamronization_combined_report.html`: interactive output in .html format

</details>

@@ -645,7 +681,9 @@ argNorm takes the outputs of the [hAMRonization](#hamronization) tool of [ABRicate]

[MultiQC](http://multiqc.info) is used in nf-core/funcscan to report the versions of all software used in the given pipeline run, and provides a suggested methods text. This allows for reproducible analysis and transparency in method reporting in publications.

-#### Pipeline information
+Results generated by MultiQC collate pipeline QC from supported tools. The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see <http://multiqc.info>.
+
+### Pipeline information

<details markdown="1">
<summary>Output files</summary>

diff --git a/docs/usage.md b/docs/usage.md
index 6c3c1088..74da7840 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -18,20 +18,20 @@ nextflow run nf-core/funcscan --input samplesheet.csv --outdir <OUTDIR> -profile

This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.

-To run any of the three screening workflows (AMP, ARG, and/or BGC) or taxonomic classification, switch them on by adding the respective flag(s) to the command:
+To run any of the three screening workflows (AMP, ARG, and/or BGC), taxonomic classification, and/or protein annotation, switch them on by adding the respective flag(s) to the command:

- `--run_amp_screening`
- `--run_arg_screening`
- `--run_bgc_screening`
-- `--run_taxa_classification`
+- `--run_taxa_classification` (for optional additional taxonomic annotations)
+- `--run_protein_annotation` (for optional additional protein family and domain annotation)

-When switched on, all tools of the given workflow will be run by default. If you don't need specific tools, you can explicitly skip them. The exception is HMMsearch, which needs to be explicitly switched on and provided with HMM screening files (AMP and BGC workflows, see [parameter documentation](/funcscan/parameters)). For the taxonomic classification, MMseqs2 is currently the only tool implemented in the pipline.
+When switched on, all tools of the given workflow will be run by default. If you don't need specific tools, you can explicitly skip them. The exception is HMMsearch, which needs to be explicitly switched on and provided with HMM screening files (AMP and BGC workflows, see [parameter documentation](/funcscan/parameters)). For the taxonomic classification, MMseqs2 is currently the only tool implemented in the pipeline. Likewise, InterProScan is the only tool for protein sequence annotation.

**Example:** You want to run AMP and ARG screening but you don't need the DeepARG tool of the ARG workflow and the Macrel tool of the AMP workflow. Your command would be:

```bash
nextflow run nf-core/funcscan --input samplesheet.csv --outdir <OUTDIR> -profile docker --run_arg_screening --arg_skip_deeparg --run_amp_screening --amp_skip_macrel
-
```

Note that the pipeline will create the following files in your working directory:

@@ -48,9 +48,8 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than

Pipeline settings can be provided in a `yaml` or `json` file via `-params-file <file>`.

-:::warning
-Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
-:::
+> [!WARNING]
+> Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
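A custom config passed via `-c`, by contrast, is the right place for resource tuning. As a minimal, hedged sketch — the process name `MMSEQS_TAXONOMY` is taken from this pipeline, while the resource values are placeholders to adapt to your infrastructure:

```groovy
// custom.config — pass with: nextflow run nf-core/funcscan -c custom.config ...
// Tunes process resources only; no pipeline parameters are set here.
process {
    withName: MMSEQS_TAXONOMY {
        cpus   = 8        // placeholder: size to your taxonomy database
        memory = '32.GB'
        time   = '12.h'
    }
}
```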
The above pipeline run specified with a params file in yaml format:

```bash
nextflow run nf-core/funcscan -profile docker -params-file params.yaml
```

-with `params.yaml` containing:
+with:

-```yaml
+```yaml title="params.yaml"
input: './samplesheet.csv'
outdir: './results/'
-genome: 'GRCh37'
<...>
```

@@ -79,7 +77,7 @@ nf-core/funcscan takes FASTA files as input, typically contigs or whole genome sequences

The input samplesheet has to be a comma-separated file (`.csv`) with 2 (`sample` and `fasta`) or 4 columns (`sample`, `fasta`, `protein`, `gbk`), and a header row as shown in the examples below.

-If you already have annotated contigs with peptide sequences and an annotation file in Genbank format (`.gbk.` or `.gbff`), you can supply these to the pipeline using the optional `protein` and `gbk` columns. If these additional columns are supplied, pipeline annotation (i.e. with bakta, prodigal, pyrodigal or prokka) will be skipped and the corresponding annotation files used instead.
+If you already have annotated contigs with peptide sequences and an annotation file in Genbank format (`.gbk` or `.gbff`), you can supply these to the pipeline using the optional `protein` and `gbk` columns. If these additional columns are supplied, pipeline annotation (i.e. with bakta, prodigal, pyrodigal or prokka) will be skipped and your corresponding annotation files used instead.

For two columns (without pre-annotated data):

@@ -109,10 +107,10 @@ An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline

:::danger
We highly recommend performing quality control on input contigs before running the pipeline. You may not receive results for some tools if none of the contigs in a FASTA file reach certain thresholds. Check parameter documentation for relevant minimum contig parameters.
-For example, ideally BGC screening requires contigs of at least 3,000 bp else downstream tools may crash.
+For example, ideally BGC screening requires contigs of at least 3,000 bp, otherwise downstream tools may crash.
:::

-## Notes on screening tools and taxonomic classification
+## Notes on screening tools, taxonomic and functional classifications

The implementation of some tools in the pipeline may have some particular behaviours that you should be aware of before you run the pipeline.

@@ -128,27 +126,40 @@ MMseqs2 is currently the only taxonomic classification tool used in the pipeline

The contents of the directory should have files such as `.version` and `.taxonomy` in the top level.

-- An MMseqs2 ready database. These databases were compiled by the developers of MMseqs2 and can be called using their labels. All available options can be found [here](https://github.com/soedinglab/MMseqs2/wiki#downloading-databases). Only use those databases that have taxonomy files available (i.e., Taxonomy == Yes). By default mmseqs2 in the pipeline uses '[Kalamari](https://github.com/lskatz/Kalamari)', and runs an aminoacid based alignment. However, if the user requires a more comprehensive taxonomic classification, we recommend the use of [GTDB](https://gtdb.ecogenomic.org/), but for that please remember to increase the memory, CPU threads and time required for the process `MMSEQS_TAXONOMY`.
+- An MMseqs2 ready database. These databases were compiled by the developers of MMseqs2 and can be called using their labels. All available options can be found [here](https://github.com/soedinglab/MMseqs2/wiki#downloading-databases).
Only use those databases that have taxonomy files available (i.e. Taxonomy column shows "yes"). By default, MMseqs2 in the pipeline uses '[Kalamari](https://github.com/lskatz/Kalamari)', and runs an amino acid-based alignment. However, if the user requires a more comprehensive taxonomic classification, we recommend the use of [GTDB](https://gtdb.ecogenomic.org/); if you do, please remember to increase the memory, CPU threads and time required for the process `MMSEQS_TAXONOMY`.

 ```bash
 --taxa_classification_mmseqs_db_id 'Kalamari'
 ```

+### InterProScan
+
+[InterProScan](https://github.com/ebi-pf-team/interproscan) is currently the only protein annotation tool in this pipeline; it gives a snapshot of the protein families and domains for each coding region.
+
+The protein annotation workflow is activated with the flag `--run_protein_annotation`.
+The [InterPro database](http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0) version 5.72-103.0 is downloaded and prepared so that the input sequences can be screened against it.
+
+Since the database download is large (5.5 GB) and might take quite some time, you can skip the automatic database download (see section [Databases and reference files](usage/#interproscan-1) for details).
+
+:::info
+By default, the list of applications used by InterProScan is set to `PANTHER,ProSiteProfiles,ProSitePatterns,Pfam`. Adding other applications to the list does not guarantee that the results will be integrated correctly within `AMPcombi`.
+:::
+
 ### antiSMASH

-antiSMASH has a minimum contig parameter, in which only contigs of a certain length (or longer) will be screened. In cases where no hits are found in these, the tool ends successfully without hits. However if no contigs in an input file reach that minimum threshold, the tool will end with a 'failure' code, and cause the pipeline to crash.
+antiSMASH has a minimum contig-length parameter, whereby only contigs of a certain length (or longer) will be screened. If no contigs in an input file reach that minimum threshold, the tool will end with a 'failure' code and cause the pipeline to crash.

 When the annotation is run with Prokka, the resulting `.gbk` file passed to antiSMASH may produce the error `translation longer than location allows` and end the pipeline run. This Prokka bug has been reported before (see [discussion on GitHub](https://github.com/antismash/antismash/discussions/450)) and is not likely to be fixed soon.

 :::warning
-If antiSMASH is run for BGC detection, we recommend to **not** run Prokka for annotation but instead use the default annotation tool (Pyrodigal) or switch to Prodigal, or (for bacteria only!) Bakta.
+If antiSMASH is run for BGC detection, we recommend **not** running Prokka for annotation but instead using the default annotation tool (Pyrodigal), or switching to Prodigal or (for bacteria only!) Bakta.
 :::

 ## Databases and reference files

 Various tools of nf-core/funcscan use databases and reference files to operate.

-nf-core/funcscan offers the functionality to auto-download databases for you, and as these databases can be very large, and we suggest to store these files in a central place from where you can reuse them across pipeline runs.
+nf-core/funcscan offers the functionality to auto-download databases for you, and as these databases can be very large, we suggest storing these files in a central place from which you can reuse them across pipeline runs.
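Once the databases have been saved from a first run (see below), a subsequent run can simply point at that central location. The following is a minimal sketch of such a run: the `/shared/funcscan_dbs` path and the directory names under it are hypothetical placeholders, while the database parameters are the ones documented in the tool-specific sections that follow.

```bash
# Hypothetical central database location; adjust to your infrastructure.
DB=/shared/funcscan_dbs

nextflow run nf-core/funcscan \
    --input samplesheet.csv \
    --outdir <OUTDIR> \
    -profile docker \
    --run_amp_screening \
    --amp_ampcombi_db "$DB/amp_DRAMP_database" \
    --run_bgc_screening \
    --bgc_deepbgc_db "$DB/deepbgc_db" \
    --run_protein_annotation \
    --protein_annotation_interproscan_db "$DB/interproscan_db"
```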
If your infrastructure has internet access (particularly on compute nodes), we **highly recommend** allowing the pipeline to download these databases for you on a first run, saving them to your results directory with `--save_db`, and then moving them to a different location (in case you wish to delete the results directory of this first run). An exception to this is HMM files, for which no auto-downloading functionality is available.
@@ -224,21 +235,48 @@ wget https://github.com/nf-core/funcscan/raw//bin/ampcombi_dow
 python3 ampcombi_download.py
 ```

-However, the user can also supply their own custom AMP database by following the guidelines in [AMPcombi](https://github.com/Darcy220606/AMPcombi).
+In addition to [DRAMP](http://dramp.cpu-bioinfor.org/), two more reference databases can be used to classify the recovered AMPs in the AMP workflow: [APD](https://aps.unmc.edu/) and [UniRef100](https://academic.oup.com/bioinformatics/article/23/10/1282/197795). Only one database can be used at a time using `--amp_ampcombi_db_id <database_id>`.
+
+However, the user can also supply their own custom AMP database by following the guidelines in [AMPcombi](https://ampcombi.readthedocs.io/en/main/). This can then be passed to the pipeline with:

 ```bash
 --amp_ampcombi_db '/<path>/<to>/<amp_database>/'
 ```

-The contents of the directory should have files such as `*.dmnd` and `*.fasta` in the top level.
+The contents of the directory should have files such as `*.fasta` and `*.tsv` in the top level: a FASTA file and the corresponding table with structural, functional and (if reported) taxonomic classifications. AMPcombi will then generate the corresponding `mmseqs2` directory, in which all binary files are prepared for downstream alignment of the recovered AMPs with [MMseqs2](https://github.com/soedinglab/MMseqs2). These can also be provided by the user by setting up an MMseqs2-compatible database using `mmseqs createdb *.fasta` in a directory called `mmseqs2`. An example file structure for [DRAMP](http://dramp.cpu-bioinfor.org/) used as the reference database:
+
+```tree
+amp_DRAMP_database/
+├── general_amps_2024_11_13.fasta
+├── general_amps_2024_11_13.txt
+└── mmseqs2
+    ├── ref_DB
+    ├── ref_DB.dbtype
+    ├── ref_DB_h
+    ├── ref_DB_h.dbtype
+    ├── ref_DB_h.index
+    ├── ref_DB.index
+    ├── ref_DB.lookup
+    └── ref_DB.source
+```
+
+:::note
+For both [DRAMP](http://dramp.cpu-bioinfor.org/) and [APD](https://aps.unmc.edu/), AMPcombi removes entries that contain any non-amino acid residues by default.
+:::

 :::warning
 The pipeline will automatically run Pyrodigal instead of Prodigal if the parameters `--run_annotation_tool prodigal --run_amp_screening` are both provided. This is due to an incompatibility issue of Prodigal's output `.gbk` file with multiple downstream tools.
 :::

-### Abricate
+:::tip
+
+- If `--run_protein_annotation` is activated, protein and domain classifications of the coding regions are generated and then used by the `ampcombi2/parsetables` module to create a table for every sample, and afterwards the combined summary files, e.g. `Ampcombi_summary.tsv`.
+- In some cases when both the AMP and the taxonomic classification subworkflows are turned on, it can happen that only per-sample summary files are created in the output folder, with **no** `Ampcombi_summary.tsv` and `Ampcombi_summary_cluster.tsv` files and no taxonomic classifications merged. This can occur if some AMP prediction parameters are 'too strict' or if only one AMP tool is run, which can lead to AMP hits being found in none of the samples or in only one sample.
Look out for the warning `[nf-core/funcscan] AMPCOMBI2: 0/1 file passed. Skipping AMPCOMBI2_COMPLETE, AMPCOMBI2_CLUSTER, and TAXONOMY MERGING steps.` in the stdout or `.nextflow.log` file. In that case, we recommend lowering the AMP prediction thresholds and running more than one AMP prediction tool.
+  :::
+
+### ABRicate

 The default ABRicate installation comes with a series of 'default' databases:
@@ -349,10 +387,7 @@ conda activate deeparg

 2. Run `deeparg download_data -o /<path>/<to>/<database_location>/`

-Or download the files directly from
-
-1. the [DeepARG FTP site](https://bench.cs.vt.edu/ftp/data/gustavo1/deeparg/database/)
-2. the [DeepARG database Zenodo archive](https://zenodo.org/record/8280582)
+Or download the files directly from the [DeepARG database Zenodo archive](https://zenodo.org/record/8280582).

 Note that more recent database versions may be available from the [ARGMiner service](https://bench.cs.vt.edu/argminer/#/home).
@@ -446,7 +481,6 @@ conda activate antismash-lite

 --bgc_antismash_installdir '/<path>/<to>/<antismash_dir>/antismash'
 ```

-Note that the names of the supplied folders must differ from each other (e.g. `antismash_db` and `antismash_dir`).
 The contents of the database directory should include directories such as `as-js/`, `clusterblast/`, `clustercompare/` etc. in the top level.
 The contents of the installation directory should include directories such as `common/`, `config/` and files such as `custom_typing.py`, `custom_typing.pyi` etc. in the top level.
@@ -464,7 +498,7 @@ The flag `--save_db` saves the pipeline-downloaded databases in your results dir

 ### DeepBGC

 DeepBGC relies on trained models and Pfams to run its analysis.
-nf-core/funcscan will download these databases for you. If the flag `--save_db` is set, the downloaded files will be stored in the output directory under `databases/deepbgc/`.
+nf-core/funcscan will download these databases for you. If the flag `--save_db` is set, the downloaded files will be stored in the output directory under `/databases/deepbgc/`.

 Alternatively, you can download the database locally with:
@@ -478,10 +512,9 @@ deepbgc download

 You can then indicate the path to the database folder in the pipeline with `--bgc_deepbgc_db /<path>/<to>/<deepbgc_db>/`. The contents of the database directory should include directories such as `common` and `0.1.0` in the top level:

-```console
+```tree
 deepbgc_db/
 ├── common
-  └── Pfam-hmm-models*.hmm.*
 └── [0.1.0]
   ├── classifier
   | └── myClassifiers*.pkl
@@ -489,9 +522,52 @@ deepbgc_db/
   └── myDetectors*.pkl
 ```

+### InterProScan
+
+[InterProScan](https://github.com/ebi-pf-team/interproscan) is used to provide more information about the proteins annotated on the contigs. By default, turning on this subworkflow with `--run_protein_annotation` will download and unzip the [InterPro database](http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0/) version 5.72-103.0. The database can be saved in the output directory `/databases/interproscan/` if `--save_db` is turned on.
+
+:::note
+The large database download (5.5 GB) can take up to 4 hours depending on bandwidth.
+:::
+
+A local version of the database can be supplied to the pipeline by passing the InterProScan database directory to `--protein_annotation_interproscan_db <path>`. The directory can be created by running (e.g.
for database version 5.72-103.0):

+```bash
+mkdir -p interproscan_db/
+curl -L https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0/interproscan-5.72-103.0-64-bit.tar.gz -o interproscan_db/interproscan-5.72-103.0-64-bit.tar.gz
+tar -xzf interproscan_db/interproscan-5.72-103.0-64-bit.tar.gz -C interproscan_db/
+```
+
+The contents of the database directory should include the directory `data` in the top level with a couple of subdirectories:
+
+```tree
+interproscan_db/
+ └── data/
+     ├── antifam
+     ├── cdd
+     ├── funfam
+     ├── gene3d
+     ├── hamap
+     ├── ncbifam
+     ├── panther
+     |   └── [18.0]
+     ├── pfam
+     |   └── [36.0]
+     ├── phobius
+     ├── pirsf
+     ├── pirsr
+     ├── prints
+     ├── prosite
+     |   └── [2023_05]
+     ├── sfld
+     ├── smart
+     ├── superfamily
+     └── tmhmm
+```
+
 ## Updating the pipeline

-When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:
+When you run the below command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:

 ```bash
 nextflow pull nf-core/funcscan
@@ -499,23 +575,21 @@

 ## Reproducibility

-It is a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.
+It is a good idea to specify the pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.

 First, go to the [nf-core/funcscan releases page](https://github.com/nf-core/funcscan/releases) and find the latest pipeline version - numeric only (e.g. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - e.g. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag.

 This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports.

-To further assist in reproducibility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
+To further assist in reproducibility, you can share and reuse [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.

-:::tip
-If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles.
-:::
+> [!TIP]
+> If you wish to share such a profile (e.g. uploading it as supplementary material for academic publications), make sure NOT to include cluster-specific paths to files or institution-specific profiles.

 ## Core Nextflow arguments

-:::note
-These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
-:::
+> [!NOTE]
+> These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).

### `-profile`
@@ -523,11 +597,10 @@ Use this parameter to choose a configuration profile. Profiles can give configur

 Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below.

-:::info
-We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
-:::
+> [!IMPORTANT]
+> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility; however, when this is not possible, Conda is also supported.

-The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).
+The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to check if your system is supported, please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).

 Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important! They are loaded in sequence, so later profiles can overwrite earlier profiles.
@@ -568,13 +641,13 @@ Specify the path to a specific config file (this is a core Nextflow command). Se

 ### Resource requests

-Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with higher requests (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped.
+Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time.
For most of the pipeline steps, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with a higher resource request (2 x original, then 3 x original). If it still fails after the third attempt, the pipeline execution is stopped.

 To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website.

 ### Custom Containers

-In some cases you may wish to change which container or conda environment a step of the pipeline uses for a particular tool. By default nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However in some cases the pipeline specified version maybe out of date.
+In some cases, you may wish to change the container or conda environment used by a pipeline step for a particular tool. By default, nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However, in some cases the pipeline-specified version may be out of date.

 To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website.
@@ -592,14 +665,6 @@ See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config

 If you have any questions or issues, please send us a message on [Slack](https://nf-co.re/join/slack) on the [`#configs` channel](https://nfcore.slack.com/channels/configs).

-## Azure Resource Requests
-
-To be used with the `azurebatch` profile by specifying the `-profile azurebatch`.
-We recommend providing a compute `params.vm_type` of `Standard_D16_v3` VMs by default but these options can be changed if required.
-
-Note that the choice of VM size depends on your quota and the overall workload during the analysis.
-For a thorough list, please refer the [Azure Sizes for virtual machines in Azure](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes).
-
 ## Running in the background

 Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished.
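One way to keep that process alive after you log out is Nextflow's built-in `-bg` option, which launches the run in the background, detached from your terminal. The following is a minimal sketch: the log file name is arbitrary, and the screening flags are only illustrative.

```bash
# Run the pipeline in the background and capture the console output
# in a log file, so the run survives closing the terminal session.
nextflow run nf-core/funcscan \
    --input samplesheet.csv \
    --outdir <OUTDIR> \
    -profile docker \
    --run_amp_screening \
    -bg > funcscan_run.log 2>&1
```

Alternatively, a terminal multiplexer such as `screen` or `tmux` achieves the same effect without detaching Nextflow itself.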
diff --git a/main.nf b/main.nf index 529aa3ee..30994327 100644 --- a/main.nf +++ b/main.nf @@ -9,8 +9,6 @@ ---------------------------------------------------------------------------------------- */ -nextflow.enable.dsl = 2 - /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IMPORT FUNCTIONS / MODULES / SUBWORKFLOWS / WORKFLOWS @@ -21,14 +19,6 @@ include { FUNCSCAN } from './workflows/funcscan' include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_funcscan_pipeline' include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_funcscan_pipeline' - -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - GENOME PARAMETER VALUES -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - - /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NAMED WORKFLOWS FOR PIPELINE @@ -51,10 +41,8 @@ workflow NFCORE_FUNCSCAN { FUNCSCAN ( samplesheet ) - emit: multiqc_report = FUNCSCAN.out.multiqc_report // channel: /path/to/multiqc_report.html - } /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -65,13 +53,11 @@ workflow NFCORE_FUNCSCAN { workflow { main: - // // SUBWORKFLOW: Run initialisation tasks // PIPELINE_INITIALISATION ( params.version, - params.help, params.validate_params, params.monochrome_logs, args, @@ -85,7 +71,6 @@ workflow { NFCORE_FUNCSCAN ( PIPELINE_INITIALISATION.out.samplesheet ) - // // SUBWORKFLOW: Run completion tasks // diff --git a/modules.json b/modules.json index c8fdad1a..b4ef3688 100644 --- a/modules.json +++ b/modules.json @@ -7,209 +7,213 @@ "nf-core": { "abricate/run": { "branch": "master", - "git_sha": "9837ac7d7bb2e2362c021e8dc08efa96190b49a4", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "ampcombi2/cluster": { "branch": "master", - "git_sha": "900f6c970712e41b783e21e5dfc30f052174b5cd", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "ampcombi2/complete": { "branch": "master", - "git_sha": "900f6c970712e41b783e21e5dfc30f052174b5cd", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "ampcombi2/parsetables": { "branch": "master", - "git_sha": "900f6c970712e41b783e21e5dfc30f052174b5cd", + "git_sha": "637c3e1796ab13d4c91f3030932598aed94a4f87", "installed_by": ["modules"] }, "ampir": { "branch": "master", - "git_sha": "9bfc81874554e87740bcb3e5e07acf0a153c9ecb", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "amplify/predict": { "branch": "master", - "git_sha": "730f3aee80d5f8d0b5fc532202ac59361414d006", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "amrfinderplus/run": { "branch": "master", - "git_sha": "c0514dfc403fa97c96f549de6abe99f03c78fe8d", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "amrfinderplus/update": { "branch": "master", - "git_sha": "8f4a5d5ad55715f6c905ab73ce49f677cf6092fc", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "antismash/antismashlite": { "branch": "master", - "git_sha": "b20be35facfc5acdc1259f132ed79339d79e989f", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "antismash/antismashlitedownloaddatabases": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", + "git_sha": 
"81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "argnorm": { "branch": "master", - "git_sha": "e4fc46af5ec30070e6aef780aba14f89a28caa88", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "bakta/bakta": { "branch": "master", - "git_sha": "52507581f62929f98dd6e6c5c5824583fa6ef94d", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "bakta/baktadbdownload": { "branch": "master", - "git_sha": "7c06e6820fa3918bc28a040e794f8a2b39fabadb", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "deeparg/downloaddata": { "branch": "master", - "git_sha": "0af92e0fe6a34f31ee41eae66f04d71850fb4beb", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "deeparg/predict": { "branch": "master", - "git_sha": "90b63cde0f838ca4da3a88a37a5309888cae97b9", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "deepbgc/download": { "branch": "master", - "git_sha": "f315f85d9ac6c321f6e3596493fd61019340df2a", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "deepbgc/pipeline": { "branch": "master", - "git_sha": "34ac993e081b32d2170ab790d0386b74122f9d36", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "fargene": { "branch": "master", - "git_sha": "5e8481d994963871e3faf061d6fbf02fe33d8cad", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "gecco/run": { "branch": "master", - "git_sha": "f9707f9499a90a46208873d23440e22ac8ad5ebc", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "gunzip": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/abricate": { "branch": "master", - "git_sha": "9837ac7d7bb2e2362c021e8dc08efa96190b49a4", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/amrfinderplus": { "branch": "master", - "git_sha": "52ddbb3ad754d870e485bcfcb680fe6a49d83567", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/deeparg": { "branch": "master", - "git_sha": "9837ac7d7bb2e2362c021e8dc08efa96190b49a4", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/fargene": { "branch": "master", - "git_sha": "9cf6f5e4ad9cc11a670a94d56021f1c4f9a91ec1", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/rgi": { "branch": "master", - "git_sha": "483e4838a2a009e826ea14da0dfc6bcaccef5ad1", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hamronization/summarize": { "branch": "master", - "git_sha": "9837ac7d7bb2e2362c021e8dc08efa96190b49a4", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "hmmer/hmmsearch": { "branch": "master", - "git_sha": "b046a286c8240ebe3412ddf8ae901d47008d1ca7", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", + "installed_by": ["modules"] + }, + "interproscan": { + "branch": "master", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "macrel/contigs": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", 
"installed_by": ["modules"] }, "mmseqs/createdb": { "branch": "master", - "git_sha": "89fe39b745da3dca14ad1a361784812ea3aa3a43", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "mmseqs/createtsv": { "branch": "master", - "git_sha": "89fe39b745da3dca14ad1a361784812ea3aa3a43", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "mmseqs/databases": { "branch": "master", - "git_sha": "151460db852d636979d9ff3ee631e2268060d4c3", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "mmseqs/taxonomy": { "branch": "master", - "git_sha": "89fe39b745da3dca14ad1a361784812ea3aa3a43", + "git_sha": "2dc4c0474a77f5f8709eb970d890ad102e92af6f", "installed_by": ["modules"] }, "multiqc": { "branch": "master", - "git_sha": "878d2adbb911aa6e15c06a4d1e93d01bd6f26c74", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "prodigal": { "branch": "master", - "git_sha": "5e8481d994963871e3faf061d6fbf02fe33d8cad", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "prokka": { "branch": "master", - "git_sha": "697d97d46d56b12ff46a1a848a36849527cea0b8", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "pyrodigal": { "branch": "master", - "git_sha": "c00055a0b13d622b4f1f51a8e5be31deaf99ded7", + "git_sha": "938e803109104e30773f76a7142442722498fef1", "installed_by": ["modules"] }, "rgi/cardannotation": { "branch": "master", - "git_sha": "dbbb0c509e044d2680b429ba622049d4a23426dc", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "rgi/main": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "seqkit/seq": { "branch": "master", - "git_sha": "03fbf6c89e551bd8d77f3b751fb5c955f75b34c5", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "tabix/bgzip": { "branch": "master", - "git_sha": "b20be35facfc5acdc1259f132ed79339d79e989f", + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", "installed_by": ["modules"] }, "untar": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", - "installed_by": ["modules"], - "patch": "modules/nf-core/untar/untar.diff" + "git_sha": "81880787133db07d9b4c1febd152c090eb8325dc", + "installed_by": ["modules"] } } }, @@ -217,17 +221,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "c2b22d85f30a706a3073387f30380704fcae013b", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "92de218a329bfc9a9033116eb5f65fd270e72ba3", + "git_sha": "51ae5406a030d4da1e49e4dab49756844fdd6c7a", "installed_by": ["subworkflows"] }, - "utils_nfvalidation_plugin": { + "utils_nfschema_plugin": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", "installed_by": ["subworkflows"] } } diff --git a/modules/local/dramp_download.nf b/modules/local/amp_database_download.nf similarity index 50% rename from modules/local/dramp_download.nf rename to modules/local/amp_database_download.nf index 8b7eb2d1..8e2bc05a 100644 --- a/modules/local/dramp_download.nf +++ b/modules/local/amp_database_download.nf @@ -1,22 +1,26 @@ -process DRAMP_DOWNLOAD { +process AMP_DATABASE_DOWNLOAD { label 
'process_single' - conda "bioconda::ampcombi=0.2.2" + conda "bioconda::ampcombi=2.0.1" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ampcombi:0.2.2--pyhdfd78af_0': - 'biocontainers/ampcombi:0.2.2--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/ampcombi:2.0.1--pyhdfd78af_0': + 'biocontainers/ampcombi:2.0.1--pyhdfd78af_0' }" + + input: + val database_id output: - path "amp_ref_database/" , emit: db - path "versions.yml" , emit: versions + path "amp_${database_id}_database" , emit: db + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when script: // This script is bundled with the pipeline, in nf-core/funcscan/bin/ """ - mkdir amp_ref_database/ - ampcombi_download.py + ampcombi_download.py \\ + --database_id $database_id \\ + --threads ${task.cpus} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/local/interproscan_download.nf b/modules/local/interproscan_download.nf new file mode 100644 index 00000000..119e6027 --- /dev/null +++ b/modules/local/interproscan_download.nf @@ -0,0 +1,35 @@ +process INTERPROSCAN_DATABASE { + tag "interproscan_database_download" + label 'process_long' + + conda "conda-forge::sed=4.7" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/curl:7.80.0' : + 'biocontainers/curl:7.80.0' }" + + input: + val database_url + + output: + path("interproscan_db/*"), emit: db + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + """ + mkdir -p interproscan_db/ + + filename=\$(basename ${database_url}) + + curl -L ${database_url} -o interproscan_db/\$filename + tar -xzf interproscan_db/\$filename -C interproscan_db/ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + tar: \$(tar --version 2>&1 | sed -n '1s/tar (busybox) //p') + curl: "\$(curl --version 2>&1 | sed -n '1s/^curl \\([0-9.]*\\).*/\\1/p')" + END_VERSIONS + """ +} diff --git a/modules/nf-core/abricate/run/environment.yml b/modules/nf-core/abricate/run/environment.yml index 4b2a1d2a..53fe9857 100644 --- a/modules/nf-core/abricate/run/environment.yml +++ b/modules/nf-core/abricate/run/environment.yml @@ -1,7 +1,7 @@ -name: abricate_run +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::abricate=1.0.1 diff --git a/modules/nf-core/abricate/run/meta.yml b/modules/nf-core/abricate/run/meta.yml index 927c21f6..dce78f3c 100644 --- a/modules/nf-core/abricate/run/meta.yml +++ b/modules/nf-core/abricate/run/meta.yml @@ -11,34 +11,38 @@ tools: documentation: https://github.com/tseemann/abricate tool_dev_url: https://github.com/tseemann/abricate licence: ["GPL v2"] + identifier: biotools:ABRicate input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - assembly: - type: file - description: FASTA, GenBank or EMBL formatted file - pattern: "*.{fa,fasta,fna,fa.gz,fasta.gz,fna.gz,gbk,gbk.gz,embl,embl.gz}" - - databasedir: - type: directory - description: Optional location of local copy of database files, possibly with custom databases set up with `abricate --setupdb` - pattern: "*/" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - assembly: + type: file + description: FASTA, GenBank or EMBL formatted file + pattern: "*.{fa,fasta,fna,fa.gz,fasta.gz,fna.gz,gbk,gbk.gz,embl,embl.gz}" + - - databasedir: + type: directory + description: Optional location of local copy of database files, possibly with + custom databases set up with `abricate --setupdb` + pattern: "*/" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - report: - type: file - description: Tab-delimited report of results - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.txt": + type: file + description: Tab-delimited report of results + pattern: "*.{txt}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" maintainers: diff --git a/modules/nf-core/ampcombi2/cluster/environment.yml b/modules/nf-core/ampcombi2/cluster/environment.yml index aa5e5fe4..e88b26ba 100644 --- a/modules/nf-core/ampcombi2/cluster/environment.yml +++ b/modules/nf-core/ampcombi2/cluster/environment.yml @@ -1,9 +1,7 @@ --- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json -name: "ampcombi2_cluster" channels: - conda-forge - bioconda - - defaults dependencies: - - "bioconda::ampcombi=0.2.2" + - bioconda::ampcombi=2.0.1 diff --git a/modules/nf-core/ampcombi2/cluster/main.nf b/modules/nf-core/ampcombi2/cluster/main.nf index 90495dba..98a19a96 100644 --- a/modules/nf-core/ampcombi2/cluster/main.nf +++ b/modules/nf-core/ampcombi2/cluster/main.nf @@ -4,8 +4,8 @@ process AMPCOMBI2_CLUSTER { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ampcombi:0.2.2--pyhdfd78af_0': - 'biocontainers/ampcombi:0.2.2--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/ampcombi:2.0.1--pyhdfd78af_0': + 'biocontainers/ampcombi:2.0.1--pyhdfd78af_0' }" input: path(summary_file) diff --git a/modules/nf-core/ampcombi2/cluster/meta.yml b/modules/nf-core/ampcombi2/cluster/meta.yml index 60949dc3..2e37a0c2 100644 --- a/modules/nf-core/ampcombi2/cluster/meta.yml +++ b/modules/nf-core/ampcombi2/cluster/meta.yml @@ -1,7 +1,7 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json name: "ampcombi2_cluster" -description: A submodule that clusters the merged AMP hits generated from ampcombi2/parsetables and ampcombi2/complete using MMseqs2 cluster. +description: A submodule that clusters the merged AMP hits generated from ampcombi2/parsetables + and ampcombi2/complete using MMseqs2 cluster. keywords: - antimicrobial peptides - amps @@ -12,36 +12,46 @@ keywords: - mmseqs2 tools: - ampcombi2/cluster: - description: "A tool for clustering all AMP hits found across many samples and supporting many AMP prediction tools." + description: "A tool for clustering all AMP hits found across many samples and + supporting many AMP prediction tools." 
homepage: "https://github.com/Darcy220606/AMPcombi" documentation: "https://github.com/Darcy220606/AMPcombi" tool_dev_url: "https://github.com/Darcy220606/AMPcombi/tree/dev" licence: ["MIT"] + identifier: "" input: - - summary_file: - type: file - description: A file corresponding to the Ampcombi_summary.tsv that is generated by running 'ampcombi complete'. It is a file containing all the merged AMP results from all samples and all tools. - pattern: "*.tsv" - + - - summary_file: + type: file + description: A file corresponding to the Ampcombi_summary.tsv that is generated + by running 'ampcombi complete'. It is a file containing all the merged AMP + results from all samples and all tools. + pattern: "*.tsv" output: - cluster_tsv: - type: file - description: A file containing all the results from the merged input table 'Ampcombi_summary.tsv', but also including the cluster id number. The clustering is done using MMseqs2 cluster. - pattern: "*.tsv" + - Ampcombi_summary_cluster.tsv: + type: file + description: A file containing all the results from the merged input table 'Ampcombi_summary.tsv', + but also including the cluster id number. The clustering is done using MMseqs2 + cluster. + pattern: "*.tsv" - rep_cluster_tsv: - type: file - description: A file containing the representative sequences of the clusters estimated by the tool. The clustering is done using MMseqs2 cluster. - pattern: "*.tsv" + - Ampcombi_summary_cluster_representative_seq.tsv: + type: file + description: A file containing the representative sequences of the clusters + estimated by the tool. The clustering is done using MMseqs2 cluster. + pattern: "*.tsv" - log: - type: file - description: A log file that captures the standard output for the entire process in a log file. Can be activated by `--log`. - pattern: "*.log" + - Ampcombi_cluster.log: + type: file + description: A log file that captures the standard output for the entire process + in a log file. Can be activated by `--log`. 
+ pattern: "*.log" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@darcy220606" maintainers: diff --git a/modules/nf-core/ampcombi2/cluster/tests/main.nf.test.snap b/modules/nf-core/ampcombi2/cluster/tests/main.nf.test.snap index f4123c76..fd79a83b 100644 --- a/modules/nf-core/ampcombi2/cluster/tests/main.nf.test.snap +++ b/modules/nf-core/ampcombi2/cluster/tests/main.nf.test.snap @@ -4,14 +4,14 @@ true, true, [ - "versions.yml:md5,4e9aa3812bfee6ec22a1b6ccb62de2ca" + "versions.yml:md5,b629089d44775078dce5e664a455422b" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-24T12:11:40.928513749" + "timestamp": "2024-12-03T07:57:01.869983435" }, "ampcombi2_cluster - metagenome - stub": { "content": [ @@ -26,7 +26,7 @@ "Ampcombi_cluster.log:md5,d41d8cd98f00b204e9800998ecf8427e" ], "3": [ - "versions.yml:md5,4e9aa3812bfee6ec22a1b6ccb62de2ca" + "versions.yml:md5,b629089d44775078dce5e664a455422b" ], "cluster_tsv": [ "Ampcombi_summary_cluster.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" @@ -38,14 +38,14 @@ "Ampcombi_summary_cluster_representative_seq.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ], "versions": [ - "versions.yml:md5,4e9aa3812bfee6ec22a1b6ccb62de2ca" + "versions.yml:md5,b629089d44775078dce5e664a455422b" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-24T12:12:08.780718892" + "timestamp": "2024-12-03T07:57:23.939137628" } } \ No newline at end of file diff --git a/modules/nf-core/ampcombi2/complete/environment.yml b/modules/nf-core/ampcombi2/complete/environment.yml index fa640b77..e88b26ba 100644 --- a/modules/nf-core/ampcombi2/complete/environment.yml +++ b/modules/nf-core/ampcombi2/complete/environment.yml @@ -1,9 +1,7 @@ --- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json -name: "ampcombi2_complete" channels: - conda-forge - bioconda - - defaults dependencies: - - "bioconda::ampcombi=0.2.2" + - bioconda::ampcombi=2.0.1 diff --git a/modules/nf-core/ampcombi2/complete/main.nf b/modules/nf-core/ampcombi2/complete/main.nf index 0e4d5d53..98f62347 100644 --- a/modules/nf-core/ampcombi2/complete/main.nf +++ b/modules/nf-core/ampcombi2/complete/main.nf @@ -4,8 +4,8 @@ process AMPCOMBI2_COMPLETE { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ampcombi:0.2.2--pyhdfd78af_0': - 'biocontainers/ampcombi:0.2.2--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/ampcombi:2.0.1--pyhdfd78af_0': + 'biocontainers/ampcombi:2.0.1--pyhdfd78af_0' }" input: path(summaries) diff --git a/modules/nf-core/ampcombi2/complete/meta.yml b/modules/nf-core/ampcombi2/complete/meta.yml index e9ae632c..13a7468b 100644 --- a/modules/nf-core/ampcombi2/complete/meta.yml +++ b/modules/nf-core/ampcombi2/complete/meta.yml @@ -1,7 +1,7 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json name: "ampcombi2_complete" -description: A submodule that merges all output summary tables from ampcombi/parsetables in one summary file. 
+description: A submodule that merges all output summary tables from ampcombi/parsetables + in one summary file. keywords: - antimicrobial peptides - amps @@ -18,32 +18,38 @@ keywords: - DRAMP tools: - ampcombi2/complete: - description: "This merges the per sample AMPcombi summaries generated by running 'ampcombi2/parsetables'." + description: "This merges the per sample AMPcombi summaries generated by running + 'ampcombi2/parsetables'." homepage: "https://github.com/Darcy220606/AMPcombi" documentation: "https://github.com/Darcy220606/AMPcombi" tool_dev_url: "https://github.com/Darcy220606/AMPcombi/tree/dev" licence: ["MIT"] + identifier: "" input: - - summaries: - type: list - description: The path to the list of files corresponding to each sample as generated by ampcombi2/parsetables. - pattern: "[*_ampcombi.tsv, *_ampcombi.tsv]" - + - - summaries: + type: list + description: The path to the list of files corresponding to each sample as generated + by ampcombi2/parsetables. + pattern: "[*_ampcombi.tsv, *_ampcombi.tsv]" output: - tsv: - type: file - description: A file containing the complete AMPcombi summaries from all processed samples. - pattern: "*.tsv" + - Ampcombi_summary.tsv: + type: file + description: A file containing the complete AMPcombi summaries from all processed + samples. + pattern: "*.tsv" - log: - type: file - description: A log file that captures the standard output for the entire process in a log file. Can be activated by `--log`. - pattern: "*.log" + - Ampcombi_complete.log: + type: file + description: A log file that captures the standard output for the entire process + in a log file. Can be activated by `--log`. + pattern: "*.log" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@darcy220606" maintainers: diff --git a/modules/nf-core/ampcombi2/complete/tests/main.nf.test.snap b/modules/nf-core/ampcombi2/complete/tests/main.nf.test.snap index cd8fa18f..87435e5b 100644 --- a/modules/nf-core/ampcombi2/complete/tests/main.nf.test.snap +++ b/modules/nf-core/ampcombi2/complete/tests/main.nf.test.snap @@ -6,39 +6,39 @@ "Ampcombi_summary.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ], "1": [ - + ], "2": [ - "versions.yml:md5,0aa35e86761a6c160482b8b8dbfc5440" + "versions.yml:md5,bfba0046e0cfa7b0b6d79663823f94c0" ], "log": [ - + ], "tsv": [ "Ampcombi_summary.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ], "versions": [ - "versions.yml:md5,0aa35e86761a6c160482b8b8dbfc5440" + "versions.yml:md5,bfba0046e0cfa7b0b6d79663823f94c0" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-29T11:55:16.030399422" + "timestamp": "2024-12-03T07:57:53.385349848" }, "ampcombi2_complete - contigs": { "content": [ true, [ - "versions.yml:md5,0aa35e86761a6c160482b8b8dbfc5440" + "versions.yml:md5,bfba0046e0cfa7b0b6d79663823f94c0" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-29T11:54:54.334224301" + "timestamp": "2024-12-03T07:57:40.263912946" } -} \ No newline at end of file +} diff --git a/modules/nf-core/ampcombi2/parsetables/environment.yml b/modules/nf-core/ampcombi2/parsetables/environment.yml index 7a4b37ab..e88b26ba 100644 --- a/modules/nf-core/ampcombi2/parsetables/environment.yml +++ b/modules/nf-core/ampcombi2/parsetables/environment.yml @@ 
-1,9 +1,7 @@ --- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json -name: "ampcombi2_parsetables" channels: - conda-forge - bioconda - - defaults dependencies: - - "bioconda::ampcombi=0.2.2" + - bioconda::ampcombi=2.0.1 diff --git a/modules/nf-core/ampcombi2/parsetables/main.nf b/modules/nf-core/ampcombi2/parsetables/main.nf index d779440b..088497f4 100644 --- a/modules/nf-core/ampcombi2/parsetables/main.nf +++ b/modules/nf-core/ampcombi2/parsetables/main.nf @@ -1,31 +1,33 @@ process AMPCOMBI2_PARSETABLES { - tag "$meta.id" + tag "${meta.id}" label 'process_medium' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ampcombi:0.2.2--pyhdfd78af_0': - 'biocontainers/ampcombi:0.2.2--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/ampcombi:2.0.1--pyhdfd78af_0': + 'biocontainers/ampcombi:2.0.1--pyhdfd78af_0' }" input: tuple val(meta), path(amp_input) - path(faa_input) - path(gbk_input) - path(opt_amp_db) + path faa_input + path gbk_input + val opt_amp_db + path opt_amp_db_dir + path opt_interproscan output: - tuple val(meta), path("${meta.id}/") , emit: sample_dir - tuple val(meta), path("${meta.id}/contig_gbks/") , emit: contig_gbks - tuple val(meta), path("${meta.id}/${meta.id}_diamond_matches.txt"), emit: txt - tuple val(meta), path("${meta.id}/${meta.id}_ampcombi.tsv") , emit: tsv - tuple val(meta), path("${meta.id}/${meta.id}_amp.faa") , emit: faa - tuple val(meta), path("${meta.id}/${meta.id}_ampcombi.log") , emit: sample_log, optional:true - tuple val(meta), path("Ampcombi_parse_tables.log") , emit: full_log, optional:true - tuple val(meta), path("amp_ref_database/") , emit: results_db, optional:true - tuple val(meta), path("amp_ref_database/*.dmnd") , emit: results_db_dmnd, optional:true - tuple val(meta), path("amp_ref_database/*.clean.fasta") , emit: results_db_fasta, optional:true - tuple val(meta), path("amp_ref_database/*.tsv") , emit: results_db_tsv, optional:true - path "versions.yml" , emit: versions + tuple val(meta), path("${meta.id}/") , emit: sample_dir + tuple val(meta), path("${meta.id}/contig_gbks/") , emit: contig_gbks , optional:true + tuple val(meta), path("${meta.id}/${meta.id}_mmseqs_matches.tsv") , emit: db_tsv , optional:true + tuple val(meta), path("${meta.id}/${meta.id}_ampcombi.tsv") , emit: tsv , optional:true + tuple val(meta), path("${meta.id}/${meta.id}_amp.faa") , emit: faa , optional:true + tuple val(meta), path("${meta.id}/${meta.id}_ampcombi.log") , emit: sample_log , optional:true + tuple val(meta), path("Ampcombi_parse_tables.log") , emit: full_log , optional:true + tuple val(meta), path("amp_${opt_amp_db}_database/") , emit: db , optional:true + tuple val(meta), path("amp_${opt_amp_db}_database/*.txt") , emit: db_txt , optional:true + tuple val(meta), path("amp_${opt_amp_db}_database/*.fasta") , emit: db_fasta , optional:true + tuple val(meta), path("amp_${opt_amp_db}_database/mmseqs2/") , emit: db_mmseqs , optional:true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -33,16 +35,20 @@ process AMPCOMBI2_PARSETABLES { script: def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def db = opt_amp_db? "--amp_database $opt_amp_db": "" + def db_dir = opt_amp_db_dir ? "--amp_database_dir ${opt_amp_db_dir}" : "" + def interpro = opt_interproscan ? 
"--interproscan_output ${opt_interproscan}" : "" + """ ampcombi parse_tables \\ - --path_list '${amp_input.collect{"$it"}.join("' '")}' \\ - --faa ${faa_input} \\ - --gbk ${gbk_input} \\ - --sample_list ${prefix} \\ - ${db} \\ - $args \\ - --threads ${task.cpus} + --path_list '${amp_input.collect { "${it}" }.join("' '")}' \\ + --faa ${faa_input} \\ + --gbk ${gbk_input} \\ + --sample_list ${prefix} \\ + --amp_database ${opt_amp_db} \\ + ${db_dir} \\ + ${interpro} \\ + ${args} \\ + --threads ${task.cpus} cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -53,20 +59,30 @@ process AMPCOMBI2_PARSETABLES { stub: def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def db = opt_amp_db? "--amp_database $opt_amp_db": "" + def db_dir = opt_amp_db_dir ? "--amp_database_dir ${opt_amp_db_dir}" : "" + def interpro = opt_interproscan ? "--interproscan_output ${opt_interproscan}" : "" + """ mkdir -p ${prefix} mkdir -p ${prefix}/contig_gbks - touch ${prefix}/${meta.id}_diamond_matches.txt + touch ${prefix}/${meta.id}_mmseqs_matches.tsv touch ${prefix}/${meta.id}_ampcombi.tsv touch ${prefix}/${meta.id}_amp.faa touch ${prefix}/${meta.id}_ampcombi.log touch Ampcombi_parse_tables.log - mkdir -p amp_ref_database - touch amp_ref_database/*.dmnd - touch amp_ref_database/*.clean.fasta - touch amp_ref_database/*.tsv + mkdir -p amp_${opt_amp_db}_database + mkdir -p amp_${opt_amp_db}_database/mmseqs2 + touch amp_${opt_amp_db}_database/*.fasta + touch amp_${opt_amp_db}_database/*.txt + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB.dbtype + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB_h + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB_h.dbtype + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB_h.index + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB.index + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB.lookup + touch amp_${opt_amp_db}_database/mmseqs2/ref_DB.source cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/ampcombi2/parsetables/meta.yml b/modules/nf-core/ampcombi2/parsetables/meta.yml index eeea5586..a8d1a4f7 100644 --- a/modules/nf-core/ampcombi2/parsetables/meta.yml +++ b/modules/nf-core/ampcombi2/parsetables/meta.yml @@ -1,7 +1,7 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json name: "ampcombi2_parsetables" -description: A submodule that parses and standardizes the results from various antimicrobial peptide identification tools. +description: A submodule that parses and standardizes the results from various antimicrobial + peptide identification tools. keywords: - antimicrobial peptides - amps @@ -16,91 +16,183 @@ keywords: - ampgram - amptransformer - DRAMP + - MMseqs2 + - InterProScan tools: - ampcombi2/parsetables: - description: "A parsing tool to convert and summarise the outputs from multiple AMP detection tools in a standardized format." + description: "A parsing tool to convert and summarise the outputs from multiple + AMP detection tools in a standardized format." homepage: "https://github.com/Darcy220606/AMPcombi" - documentation: "https://github.com/Darcy220606/AMPcombi" + documentation: "https://ampcombi.readthedocs.io/en/main/" tool_dev_url: "https://github.com/Darcy220606/AMPcombi/tree/dev" licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
`[ id:'sample1', single_end:false ]` - - amp_input: - type: list - description: The path to the directory containing the results for the AMP tools for each processed sample or a list of files corresponding to each file generated by AMP tools. - pattern: "[*amptool.tsv, *amptool.tsv]" - - faa_input: - type: file - description: The path to the file corresponding to the respective protein fasta files with '.faa' extension. File names have to contain the corresponding sample name, i.e. sample_1.faa - pattern: "*.faa" - - gbk_input: - type: file - description: The path to the file corresponding to the respective annotated files with either '.gbk' or '.gbff' extensions. File names must contain the corresponding sample name, i.e. sample_1.faa where "sample_1" is the sample name. - pattern: "*.gbk" - - opt_amp_db: - type: directory - description: The path to the folder containing the fasta and tsv database files. - pattern: "*/" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - amp_input: + type: list + description: The path to the directory containing the results for the AMP tools + for each processed sample or a list of files corresponding to each file generated + by AMP tools. + pattern: "[*amptool.tsv, *amptool.tsv]" + - - faa_input: + type: file + description: The path to the file corresponding to the respective protein fasta + files with '.faa' extension. File names have to contain the corresponding + sample name, e.g. sample_1.faa + pattern: "*.faa" + - - gbk_input: + type: file + description: The path to the file corresponding to the respective annotated + files with either '.gbk' or '.gbff' extensions. File names must contain the + corresponding sample name, e.g. sample_1.gbk where "sample_1" is the sample + name. + pattern: "*.gbk" + - - opt_amp_db: + type: string + description: The name of the database to download and set up. This can either be 'DRAMP', 'APD' or 'UniRef100'. + pattern: "DRAMP|APD|UniRef100" + - - opt_amp_db_dir: + type: directory + description: The path to the folder containing the fasta and tsv database files. + pattern: "path/to/amp_*_database" + - - opt_interproscan: + type: file + description: The path to a tsv file containing protein classifications of the annotated CDSs. The file must be the raw output from InterProScan. + pattern: "*.tsv" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'sample1', single_end:false ]` - - sample_dir: - type: directory - description: The output directory that contains the summary output and related alignment files for one sample. - pattern: "/*" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - ${meta.id}/: + type: directory + description: The output directory that contains the summary output and related + alignment files for one sample. + pattern: "/*" - contig_gbks: - type: directory - description: The output subdirectory that contains the gbk files containing the AMP hits for each sample. - pattern: "/*/contig_gbks" - - txt: - type: file - description: An alignment file containing the results from the DIAMOND alignment step done on all AMP hits. - pattern: "/*/*_diamond_matches.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - ${meta.id}/contig_gbks/: + type: directory + description: The output subdirectory that contains the gbk files containing + the AMP hits for each sample. + pattern: "/*/contig_gbks" + - db_tsv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - ${meta.id}/${meta.id}_mmseqs_matches.tsv: + type: file + description: An alignment file containing the results from the MMseqs2 alignment + step performed on all AMP hits. + pattern: "/*/*_mmseqs_matches.tsv" - tsv: - type: file - description: A file containing the summary report of all predicted AMP hits from all AMP tools given as input, the corresponding taxonomic and functional classification from the alignment step and the estimated physiochemical properties. - pattern: "/*/*_ampcombi.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - ${meta.id}/${meta.id}_ampcombi.tsv: + type: file + description: A file containing the summary report of all predicted AMP hits + from all AMP tools given as input, the corresponding taxonomic and functional + classification from the alignment step and the estimated physicochemical properties. + pattern: "/*/*_ampcombi.tsv" - faa: - type: file - description: A fasta file containing the amino acid sequences of all predicted AMP hits. - pattern: "/*/*_amp.faa" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - ${meta.id}/${meta.id}_amp.faa: + type: file + description: A fasta file containing the amino acid sequences of all predicted + AMP hits. + pattern: "/*/*_amp.faa" - sample_log: - type: file - description: A log file that captures the standard output per sample in a log file. Can be activated by `--log`. - pattern: "/*/*.log" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - ${meta.id}/${meta.id}_ampcombi.log: + type: file + description: A log file that captures the standard output per sample. Can be + activated by `--log`. + pattern: "/*/*.log" - full_log: - type: file - description: A log file that captures the standard output for the entire process in a log file. Can be activated by `--log`. - pattern: "Ampcombi_parse_tables.log" - - results_db: - type: directory - description: If the AMP reference database is not provided by the user using the flag `--amp_database', by default the DRAMP database will be downloaded, filtered and stored in this folder. - pattern: "/amp_ref_database" - - results_db_dmnd: - type: file - description: AMP reference database converted to DIAMOND database format. - pattern: "/amp_ref_database/*.dmnd" - - results_db_fasta: - type: file - description: AMP reference database fasta file, cleaned of diamond-uncompatible characters. - pattern: "/amp_ref_database/*.clean.fasta" - - results_db_tsv: - type: file - description: AMP reference database in tsv-format with two columns containing header and sequence. - pattern: "/amp_ref_database/*.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - Ampcombi_parse_tables.log: + type: file + description: A log file that captures the standard output for the entire process. + Can be activated by `--log`. 
+ pattern: "Ampcombi_parse_tables.log" + - db: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - amp_${opt_amp_db}_database/: + type: directory + description: If the AMP reference database ID is not provided by the user using + the flag `--amp_database`, by default the DRAMP database will be downloaded, + filtered and stored in this folder. + pattern: "/amp_*_database" + - db_txt: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - amp_${opt_amp_db}_database/*.txt: + type: file + description: AMP reference database in tab-separated format with two columns containing + header and sequence. + pattern: "/amp_*_database/*.txt" + - db_fasta: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - amp_${opt_amp_db}_database/*.fasta: + type: file + description: AMP reference database fasta file, cleaned of incompatible characters. + pattern: "/amp_*_database/*.fasta" + - db_mmseqs: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - amp_${opt_amp_db}_database/mmseqs2/: + type: directory + description: As alignment to the reference database is carried out by MMseqs2, this directory + contains all the files generated by MMseqs2 from the fasta file of the database. + pattern: "/amp_*_database/mmseqs2" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@darcy220606" maintainers: diff --git a/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test b/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test index 2d775179..272d31e6 100644 --- a/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test +++ b/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test @@ -28,7 +28,9 @@ nextflow_process { input[0] = amp_input input[1] = faa_input input[2] = gbk_input - input[3] = [] + input[3] = 'DRAMP' + input[4] = [] + input[5] = [] """ } } @@ -37,15 +39,17 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot(process.out.sample_dir.collect { file(it[1]).getName() } + - process.out.results_db.collect { file(it[1]).getName() } + - process.out.contig_gbks.collect { file(it[1]).getName() } + - process.out.full_log.collect { file(it[1]).readLines().contains("<--AMP_database>") } + - process.out.sample_log.collect { file(it[1]).readLines().contains("found ampir file") } + - process.out.txt.collect { file(it[1]).readLines()[0] } + - process.out.tsv.collect { file(it[1]).readLines()[0] } + - process.out.faa.collect { file(it[1]).readLines()[0] } + - process.out.summary_csv.collect { file(it[1]).readLines().contains("Structure_Description") } + - process.out.versions ).match() } + process.out.contig_gbks.collect { file(it[1]).getName() } + + process.out.db_tsv.collect { file(it[1]).readLines()[0] } + + process.out.tsv.collect { file(it[1]).readLines()[0] } + + process.out.faa.collect { file(it[1]).readLines()[0] } + + process.out.full_log.collect { file(it[1]).readLines().contains("File downloaded successfully") } + + process.out.sample_log.collect { file(it[1]).readLines().contains("found ampir file") } + + process.out.db.collect { file(it[1]).getName() } + + process.out.db_txt.collect { file(it[1]).readLines()[0] } + + process.out.db_fasta.collect { 
file(it[1]).readLines()[0] } + + process.out.db_mmseqs.collect { file(it[1]).getName() } + + process.out.versions ).match() } ) } } @@ -67,7 +71,9 @@ nextflow_process { input[0] = amp_input input[1] = faa_input input[2] = gbk_input - input[3] = [] + input[3] = 'DRAMP' + input[4] = [] + input[5] = [] """ } } diff --git a/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test.snap b/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test.snap index 54faf69f..47102283 100644 --- a/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test.snap +++ b/modules/nf-core/ampcombi2/parsetables/tests/main.nf.test.snap @@ -3,21 +3,24 @@ "content": [ [ "sample_1", - "amp_ref_database", "contig_gbks", + null, + "sample_id\tCDS_id\tprob_ampir\tprob_amplify\taa_sequence\tmolecular_weight\thelix_fraction\tturn_fraction\tsheet_fraction\tisoelectric_point\thydrophobicity\ttransporter_protein\tcontig_id\tCDS_start\tCDS_end\tCDS_dir\tCDS_stop_codon_found", + ">BAONEE_00005", false, true, - "contig_id\ttarget_id\tpident\tevalue\tnident\tfull_qseq\tfull_sseq\tqseq\tsseq\tqcovhsp\tscovhsp", - "sample_id\tCDS_id\tprob_ampir\tprob_amplify\taa_sequence\ttarget_id\tpident\tevalue\tSequence\tFamily\tSource\tPDB_ID\tLinear/Cyclic/Branched\tOther_Modifications\tPubmed_ID\tReference\tmolecular_weight\thelix_fraction\tturn_fraction\tsheet_fraction\tisoelectric_point\thydrophobicity\ttransporter_protein\tcontig_id\tCDS_start\tCDS_end\tCDS_dir\tCDS_stop_codon_found", - ">BAONEE_00005", - "versions.yml:md5,f32ab4ba79e66feba755b78d7d7a1f36" + "amp_DRAMP_database", + "DRAMP_ID\tSequence\tSequence_Length\tName\tSwiss_Prot_Entry\tFamily\tGene\tSource\tActivity\tProtein_existence\tStructure\tStructure_Description\tPDB_ID\tComments\tTarget_Organism\tHemolytic_activity\tLinear/Cyclic/Branched\tN-terminal_Modification\tC-terminal_Modification\tOther_Modifications\tStereochemistry\tCytotoxicity\tBinding_Traget\tPubmed_ID\tReference\tAuthor\tTitle", + ">DRAMP00005", + "mmseqs2", + "versions.yml:md5,09f086e07825d96816d792d73eee90ca" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-24T12:05:11.848363584" + "timestamp": "2024-12-11T13:58:57.988191067" }, "ampcombi2_parsetables - metagenome - stub": { "content": [ @@ -34,7 +37,7 @@ "sample_1_amp.faa:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_1_ampcombi.log:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_1_ampcombi.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", - "sample_1_diamond_matches.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + "sample_1_mmseqs_matches.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ] ], @@ -53,18 +56,27 @@ { "id": "sample_1" }, - "*.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + [ + "ref_DB:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.index:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.lookup:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.source:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.index:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ] ], "11": [ - "versions.yml:md5,f32ab4ba79e66feba755b78d7d7a1f36" + "versions.yml:md5,09f086e07825d96816d792d73eee90ca" ], "2": [ [ { "id": "sample_1" }, - "sample_1_diamond_matches.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + "sample_1_mmseqs_matches.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "3": [ @@ -105,9 +117,18 @@ "id": "sample_1" }, [ - 
"*.clean.fasta:md5,d41d8cd98f00b204e9800998ecf8427e", - "*.dmnd:md5,d41d8cd98f00b204e9800998ecf8427e", - "*.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + "*.fasta:md5,d41d8cd98f00b204e9800998ecf8427e", + "*.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + [ + "ref_DB:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.index:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.lookup:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.source:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.index:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ] ] ], @@ -116,7 +137,7 @@ { "id": "sample_1" }, - "*.dmnd:md5,d41d8cd98f00b204e9800998ecf8427e" + "*.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "9": [ @@ -124,7 +145,7 @@ { "id": "sample_1" }, - "*.clean.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" + "*.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "contig_gbks": [ @@ -137,56 +158,82 @@ ] ] ], - "faa": [ + "db": [ [ { "id": "sample_1" }, - "sample_1_amp.faa:md5,d41d8cd98f00b204e9800998ecf8427e" + [ + "*.fasta:md5,d41d8cd98f00b204e9800998ecf8427e", + "*.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + [ + "ref_DB:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.index:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.lookup:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.source:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.index:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] ] ], - "full_log": [ + "db_fasta": [ [ { "id": "sample_1" }, - "Ampcombi_parse_tables.log:md5,d41d8cd98f00b204e9800998ecf8427e" + "*.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "results_db": [ + "db_mmseqs": [ [ { "id": "sample_1" }, [ - "*.clean.fasta:md5,d41d8cd98f00b204e9800998ecf8427e", - "*.dmnd:md5,d41d8cd98f00b204e9800998ecf8427e", - "*.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + "ref_DB:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.index:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.lookup:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB.source:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e", + "ref_DB_h.index:md5,d41d8cd98f00b204e9800998ecf8427e" ] ] ], - "results_db_dmnd": [ + "db_tsv": [ + [ + { + "id": "sample_1" + }, + "sample_1_mmseqs_matches.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "db_txt": [ [ { "id": "sample_1" }, - "*.dmnd:md5,d41d8cd98f00b204e9800998ecf8427e" + "*.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "results_db_fasta": [ + "faa": [ [ { "id": "sample_1" }, - "*.clean.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" + "sample_1_amp.faa:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "results_db_tsv": [ + "full_log": [ [ { "id": "sample_1" }, - "*.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + "Ampcombi_parse_tables.log:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "sample_dir": [ @@ -201,7 +248,7 @@ "sample_1_amp.faa:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_1_ampcombi.log:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_1_ampcombi.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", - "sample_1_diamond_matches.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + "sample_1_mmseqs_matches.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ] ], @@ 
-221,23 +268,15 @@ "sample_1_ampcombi.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "txt": [ - [ - { - "id": "sample_1" - }, - "sample_1_diamond_matches.txt:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], "versions": [ - "versions.yml:md5,f32ab4ba79e66feba755b78d7d7a1f36" + "versions.yml:md5,09f086e07825d96816d792d73eee90ca" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-24T12:05:34.675308615" + "timestamp": "2024-12-05T13:03:22.741430379" } } \ No newline at end of file diff --git a/modules/nf-core/ampcombi2/parsetables/tests/nextflow.config b/modules/nf-core/ampcombi2/parsetables/tests/nextflow.config index d39b0509..75396b7d 100644 --- a/modules/nf-core/ampcombi2/parsetables/tests/nextflow.config +++ b/modules/nf-core/ampcombi2/parsetables/tests/nextflow.config @@ -12,7 +12,8 @@ process { "--hmmsearch_file 'candidates.txt'", "--ampgram_file '.tsv'", "--amptransformer_file '.txt'", - "--log true" + "--log true", + "--interproscan_filter 'nonsense'" ].join(' ') ext.prefix = "sample_1" diff --git a/modules/nf-core/ampir/environment.yml b/modules/nf-core/ampir/environment.yml index 8cb475d1..3c6f4793 100644 --- a/modules/nf-core/ampir/environment.yml +++ b/modules/nf-core/ampir/environment.yml @@ -1,7 +1,7 @@ -name: ampir +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - conda-forge::r-ampir=1.1.0 diff --git a/modules/nf-core/ampir/meta.yml b/modules/nf-core/ampir/meta.yml index 231cec54..571ddd86 100644 --- a/modules/nf-core/ampir/meta.yml +++ b/modules/nf-core/ampir/meta.yml @@ -1,57 +1,69 @@ name: "ampir" -description: A fast and user-friendly method to predict antimicrobial peptides (AMPs) from any given size protein dataset. ampir uses a supervised statistical machine learning approach to predict AMPs. +description: A fast and user-friendly method to predict antimicrobial peptides (AMPs) + from any given size protein dataset. ampir uses a supervised statistical machine + learning approach to predict AMPs. keywords: - ampir - amp - antimicrobial peptide prediction tools: - "ampir": - description: "A toolkit to predict antimicrobial peptides from protein sequences on a genome-wide scale." + description: "A toolkit to predict antimicrobial peptides from protein sequences + on a genome-wide scale." homepage: "https://github.com/Legana/ampir" documentation: "https://cran.r-project.org/web/packages/ampir/index.html" tool_dev_url: "https://github.com/Legana/ampir" doi: "10.1093/bioinformatics/btaa653" licence: ["GPL v2"] + identifier: biotools:ampir input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - faa: - type: file - description: FASTA file containing amino acid sequences - pattern: "*.{faa,fasta}" - - model: - type: string - description: Built-in model for AMP prediction - pattern: "{precursor,mature}" - - min_length: - type: integer - description: Minimum protein length for which predictions will be generated - pattern: "[0-9]+" - - min_probability: - type: float - description: Cut-off for AMP prediction - pattern: "[0-9].[0-9]+" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - faa: + type: file + description: FASTA file containing amino acid sequences + pattern: "*.{faa,fasta}" + - - model: + type: string + description: Built-in model for AMP prediction + pattern: "{precursor,mature}" + - - min_length: + type: integer + description: Minimum protein length for which predictions will be generated + pattern: "[0-9]+" + - - min_probability: + type: float + description: Cut-off for AMP prediction + pattern: "[0-9].[0-9]+" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - amps_faa: - type: file - description: File containing AMP predictions in amino acid FASTA format - pattern: "*.{faa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.faa": + type: file + description: File containing AMP predictions in amino acid FASTA format + pattern: "*.{faa}" - amps_tsv: - type: file - description: File containing AMP predictions in TSV format - pattern: "*.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: File containing AMP predictions in TSV format + pattern: "*.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jasmezz" maintainers: diff --git a/modules/nf-core/amplify/predict/environment.yml b/modules/nf-core/amplify/predict/environment.yml index c980cf5e..872115b4 100644 --- a/modules/nf-core/amplify/predict/environment.yml +++ b/modules/nf-core/amplify/predict/environment.yml @@ -1,7 +1,7 @@ -name: amplify_predict +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::amplify=2.0.0 diff --git a/modules/nf-core/amplify/predict/meta.yml b/modules/nf-core/amplify/predict/meta.yml index 5ef93c83..cbe19f33 100644 --- a/modules/nf-core/amplify/predict/meta.yml +++ b/modules/nf-core/amplify/predict/meta.yml @@ -1,5 +1,6 @@ name: "amplify_predict" -description: AMPlify is an attentive deep learning model for antimicrobial peptide prediction. +description: AMPlify is an attentive deep learning model for antimicrobial peptide + prediction. keywords: - antimicrobial peptides - AMPs @@ -13,33 +14,37 @@ tools: tool_dev_url: "https://github.com/bcgsc/AMPlify" doi: "10.1186/s12864-022-08310-4" licence: ["GPL v3"] + identifier: biotools:amplify input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - faa: - type: file - description: amino acid sequences fasta - pattern: "*.{fa,fa.gz,faa,faa.gz,fasta,fasta.gz}" - - model_dir: - type: directory - description: Directory of where models are stored (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - faa: + type: file + description: amino acid sequences fasta + pattern: "*.{fa,fa.gz,faa,faa.gz,fasta,fasta.gz}" + - - model_dir: + type: directory + description: Directory of where models are stored (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - tsv: - type: file - description: amino acid sequences with prediction (AMP, non-AMP) and probability scores - pattern: "*.{tsv}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: amino acid sequences with prediction (AMP, non-AMP) and probability + scores + pattern: "*.{tsv}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/amplify/predict/tests/main.nf.test b/modules/nf-core/amplify/predict/tests/main.nf.test index 835c409c..d9ca94ae 100644 --- a/modules/nf-core/amplify/predict/tests/main.nf.test +++ b/modules/nf-core/amplify/predict/tests/main.nf.test @@ -13,7 +13,7 @@ nextflow_process { test("AMPlify predict (with Prodigal) - sarscov2 - contigs.fasta") { - setup { + setup { run("PRODIGAL") { script "../../../prodigal/main.nf" process { @@ -31,7 +31,7 @@ nextflow_process { process { """ input[0] = PRODIGAL.out.amino_acid_fasta - + """ } } @@ -55,7 +55,7 @@ nextflow_process { } - test("AMPlify predict - stub") { + test("AMPlify predict (with Prodigal) - sarscov2 - contigs.fasta - stub") { options "-stub" diff --git a/modules/nf-core/amplify/predict/tests/main.nf.test.snap b/modules/nf-core/amplify/predict/tests/main.nf.test.snap index d70e80eb..9803e2b7 100644 --- a/modules/nf-core/amplify/predict/tests/main.nf.test.snap +++ b/modules/nf-core/amplify/predict/tests/main.nf.test.snap @@ -29,10 +29,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-04-05T12:58:56.67316521" + "timestamp": "2024-12-13T12:43:54.777959891" }, "AMPlify predict (with Prodigal) - sarscov2 - contigs.fasta": { "content": [ @@ -69,4 +69,4 @@ }, "timestamp": "2024-04-05T12:58:49.894554665" } -} \ No newline at end of file +} diff --git a/modules/nf-core/amrfinderplus/run/environment.yml b/modules/nf-core/amrfinderplus/run/environment.yml index 214f44f4..0487b72d 100644 --- a/modules/nf-core/amrfinderplus/run/environment.yml +++ b/modules/nf-core/amrfinderplus/run/environment.yml @@ -1,7 +1,7 @@ -name: amrfinderplus_run +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::ncbi-amrfinderplus=3.12.8 diff --git a/modules/nf-core/amrfinderplus/run/meta.yml b/modules/nf-core/amrfinderplus/run/meta.yml index 465927df..d081a2bd 100644 --- a/modules/nf-core/amrfinderplus/run/meta.yml +++ b/modules/nf-core/amrfinderplus/run/meta.yml @@ -6,50 +6,64 @@ keywords: - antibiotic resistance tools: - amrfinderplus: - description: AMRFinderPlus finds antimicrobial resistance and other genes in protein or nucleotide sequences. + description: AMRFinderPlus finds antimicrobial resistance and other genes in protein + or nucleotide sequences. homepage: https://github.com/ncbi/amr/wiki documentation: https://github.com/ncbi/amr/wiki tool_dev_url: https://github.com/ncbi/amr doi: "10.1038/s41598-021-91456-0" licence: ["Public Domain"] + identifier: biotools:amrfinderplus input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - fasta: - type: file - description: Nucleotide or protein sequences in FASTA format - pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,faa,faa.gz}" - - db: - type: file - description: A compressed tarball of the AMRFinderPlus database to query - pattern: "*.tar.gz" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Nucleotide or protein sequences in FASTA format + pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,faa,faa.gz}" + - - db: + type: file + description: A compressed tarball of the AMRFinderPlus database to query + pattern: "*.tar.gz" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - report: - type: file - description: AMRFinder+ final report - pattern: "*.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.tsv: + type: file + description: AMRFinder+ final report + pattern: "*.tsv" - mutation_report: - type: file - description: Report of organism-specific point-mutations - pattern: "*-mutations.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}-mutations.tsv: + type: file + description: Report of organism-specific point-mutations + pattern: "*-mutations.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" - tool_version: - type: string - description: The version of the tool in string format (useful for downstream tools such as hAMRronization) + - VER: + type: string + description: The version of the tool in string format (useful for downstream + tools such as hAMRronization) - db_version: - type: string - description: The version of the used database in string format (useful for downstream tools such as hAMRronization) + - DBVER: + type: string + description: The version of the used database in string format (useful for downstream + tools such as hAMRronization) authors: - "@rpetit3" - "@louperelo" diff --git a/modules/nf-core/amrfinderplus/update/environment.yml b/modules/nf-core/amrfinderplus/update/environment.yml index d08f0725..0487b72d 100644 --- a/modules/nf-core/amrfinderplus/update/environment.yml +++ b/modules/nf-core/amrfinderplus/update/environment.yml @@ -1,7 +1,7 @@ -name: amrfinderplus_update +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::ncbi-amrfinderplus=3.12.8 diff --git a/modules/nf-core/amrfinderplus/update/meta.yml b/modules/nf-core/amrfinderplus/update/meta.yml index 7a9345d6..574957e1 100644 --- a/modules/nf-core/amrfinderplus/update/meta.yml +++ b/modules/nf-core/amrfinderplus/update/meta.yml @@ -6,27 +6,26 @@ keywords: - antibiotic resistance tools: - amrfinderplus: - description: AMRFinderPlus finds antimicrobial resistance and other genes in protein or nucleotide sequences. + description: AMRFinderPlus finds antimicrobial resistance and other genes in protein + or nucleotide sequences. 
homepage: https://github.com/ncbi/amr/wiki documentation: https://github.com/ncbi/amr/wiki tool_dev_url: https://github.com/ncbi/amr doi: "10.1038/s41598-021-91456-0" licence: ["Public Domain"] + identifier: biotools:amrfinderplus # this module does not have any input. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - db: - type: file - description: The latest AMRFinder+ database in a compressed tarball - pattern: "*.tar.gz" + - amrfinderdb.tar.gz: + type: file + description: The latest AMRFinder+ database in a compressed tarball + pattern: "*.tar.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" maintainers: diff --git a/modules/nf-core/antismash/antismashlite/environment.yml b/modules/nf-core/antismash/antismashlite/environment.yml index 227b5264..dc2807d5 100644 --- a/modules/nf-core/antismash/antismashlite/environment.yml +++ b/modules/nf-core/antismash/antismashlite/environment.yml @@ -1,7 +1,7 @@ -name: antismash_antismashlite +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::antismash-lite=7.1.0 diff --git a/modules/nf-core/antismash/antismashlite/main.nf b/modules/nf-core/antismash/antismashlite/main.nf index 422e7be0..3a521557 100644 --- a/modules/nf-core/antismash/antismashlite/main.nf +++ b/modules/nf-core/antismash/antismashlite/main.nf @@ -1,44 +1,45 @@ process ANTISMASH_ANTISMASHLITE { - tag "$meta.id" + tag "${meta.id}" label 'process_medium' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/antismash-lite:7.1.0--pyhdfd78af_0' : - 'biocontainers/antismash-lite:7.1.0--pyhdfd78af_0' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 'https://depot.galaxyproject.org/singularity/antismash-lite:7.1.0--pyhdfd78af_0' + : 'biocontainers/antismash-lite:7.1.0--pyhdfd78af_0'}" containerOptions { - workflow.containerEngine == 'singularity' ? - "-B $antismash_dir:/usr/local/lib/python3.10/site-packages/antismash" : - workflow.containerEngine == 'docker' ? - "-v \$PWD/$antismash_dir:/usr/local/lib/python3.10/site-packages/antismash" : - '' - } + ['singularity', 'apptainer'].contains(workflow.containerEngine) + ? "-B ${antismash_dir}:/usr/local/lib/python3.10/site-packages/antismash" + : workflow.containerEngine == 'docker' + ? "-v \$PWD/${antismash_dir}:/usr/local/lib/python3.10/site-packages/antismash" + : '' + } input: tuple val(meta), path(sequence_input) - path(databases) - path(antismash_dir) // Optional input: AntiSMASH installation folder. It is not needed for using this module with conda, but required for docker/singularity (see meta.yml). - path(gff) + path databases + path antismash_dir + // Optional input: AntiSMASH installation folder. It is not needed for using this module with conda, but required for docker/singularity (see meta.yml). 
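// Editorial illustration (not part of this diff): how a caller might wire these inputs.
// Channel names below are hypothetical; passing an empty list ([]) for an optional path
// input follows the same convention the nf-test files in this PR use (e.g. input[4] = []).
//
//   ANTISMASH_ANTISMASHLITE ( ch_annotated_gbk, ch_databases, [], ch_gff )                // conda: no installation dir needed
//   ANTISMASH_ANTISMASHLITE ( ch_annotated_gbk, ch_databases, ch_antismash_dir, ch_gff )  // docker/singularity: mount a local copy,
//                                                                                         // e.g. from ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES.out.antismash_dir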
+ path gff output: - tuple val(meta), path("${prefix}/clusterblast/*_c*.txt") , optional: true, emit: clusterblast_file - tuple val(meta), path("${prefix}/{css,images,js}") , emit: html_accessory_files - tuple val(meta), path("${prefix}/knownclusterblast/region*/ctg*.html") , optional: true, emit: knownclusterblast_html - tuple val(meta), path("${prefix}/knownclusterblast/") , optional: true, emit: knownclusterblast_dir - tuple val(meta), path("${prefix}/knownclusterblast/*_c*.txt") , optional: true, emit: knownclusterblast_txt - tuple val(meta), path("${prefix}/svg/clusterblast*.svg") , optional: true, emit: svg_files_clusterblast - tuple val(meta), path("${prefix}/svg/knownclusterblast*.svg") , optional: true, emit: svg_files_knownclusterblast - tuple val(meta), path("${prefix}/*.gbk") , emit: gbk_input - tuple val(meta), path("${prefix}/*.json") , emit: json_results - tuple val(meta), path("${prefix}/*.log") , emit: log - tuple val(meta), path("${prefix}/*.zip") , emit: zip - tuple val(meta), path("${prefix}/*region*.gbk") , optional: true, emit: gbk_results - tuple val(meta), path("${prefix}/clusterblastoutput.txt") , optional: true, emit: clusterblastoutput - tuple val(meta), path("${prefix}/index.html") , emit: html - tuple val(meta), path("${prefix}/knownclusterblastoutput.txt") , optional: true, emit: knownclusterblastoutput - tuple val(meta), path("${prefix}/regions.js") , emit: json_sideloading - path "versions.yml" , emit: versions + tuple val(meta), path("${prefix}/clusterblast/*_c*.txt"), optional: true, emit: clusterblast_file + tuple val(meta), path("${prefix}/{css,images,js}"), emit: html_accessory_files + tuple val(meta), path("${prefix}/knownclusterblast/region*/ctg*.html"), optional: true, emit: knownclusterblast_html + tuple val(meta), path("${prefix}/knownclusterblast/"), optional: true, emit: knownclusterblast_dir + tuple val(meta), path("${prefix}/knownclusterblast/*_c*.txt"), optional: true, emit: knownclusterblast_txt + tuple val(meta), path("${prefix}/svg/clusterblast*.svg"), optional: true, emit: svg_files_clusterblast + tuple val(meta), path("${prefix}/svg/knownclusterblast*.svg"), optional: true, emit: svg_files_knownclusterblast + tuple val(meta), path("${prefix}/*.gbk"), emit: gbk_input + tuple val(meta), path("${prefix}/*.json"), emit: json_results + tuple val(meta), path("${prefix}/*.log"), emit: log + tuple val(meta), path("${prefix}/*.zip"), emit: zip + tuple val(meta), path("${prefix}/*region*.gbk"), optional: true, emit: gbk_results + tuple val(meta), path("${prefix}/clusterblastoutput.txt"), optional: true, emit: clusterblastoutput + tuple val(meta), path("${prefix}/index.html"), emit: html + tuple val(meta), path("${prefix}/knownclusterblastoutput.txt"), optional: true, emit: knownclusterblastoutput + tuple val(meta), path("${prefix}/regions.js"), emit: json_sideloading + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -53,25 +54,24 @@ process ANTISMASH_ANTISMASHLITE { ## this should be run as a separate module for versioning purposes antismash \\ - $args \\ - $gff_flag \\ - -c $task.cpus \\ - --output-dir $prefix \\ - --output-basename $prefix \\ + ${args} \\ + ${gff_flag} \\ + -c ${task.cpus} \\ + --output-dir ${prefix} \\ + --output-basename ${prefix} \\ --genefinding-tool none \\ - --logfile $prefix/${prefix}.log \\ - --databases $databases \\ - $sequence_input + --logfile ${prefix}/${prefix}.log \\ + --databases ${databases} \\ + ${sequence_input} cat <<-END_VERSIONS > versions.yml "${task.process}": - 
antismash-lite: \$(echo \$(antismash --version) | sed 's/antiSMASH //') + antismash-lite: \$(echo \$(antismash --version) | sed 's/antiSMASH //;s/-.*//g') END_VERSIONS """ stub: prefix = task.ext.suffix ? "${meta.id}${task.ext.suffix}" : "${meta.id}" - def VERSION = '7.1.0' // WARN: Version information not provided by tool during stub run. Please update this string when bumping container versions. """ mkdir -p ${prefix}/css mkdir ${prefix}/images @@ -91,7 +91,7 @@ process ANTISMASH_ANTISMASHLITE { cat <<-END_VERSIONS > versions.yml "${task.process}": - antismash-lite: $VERSION + antismash-lite: \$(echo \$(antismash --version) | sed 's/antiSMASH //;s/-.*//g') END_VERSIONS """ } diff --git a/modules/nf-core/antismash/antismashlite/meta.yml b/modules/nf-core/antismash/antismashlite/meta.yml index 21f506bd..63828343 100644 --- a/modules/nf-core/antismash/antismashlite/meta.yml +++ b/modules/nf-core/antismash/antismashlite/meta.yml @@ -23,110 +23,207 @@ tools: tool_dev_url: "https://github.com/antismash/antismash" doi: "10.1093/nar/gkab335" licence: ["AGPL v3"] + identifier: biotools:antismash input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - sequence_input: - type: file - description: nucleotide sequence file (annotated) - pattern: "*.{gbk, gb, gbff, genbank, embl, fasta, fna}" - - databases: - type: directory - description: | - Downloaded AntiSMASH databases (e.g. in the AntiSMASH installation directory - "data/databases") - pattern: "*/" - - antismash_dir: - type: directory - description: | - A local copy of an AntiSMASH installation folder. This is required when running with - docker and singularity (not required for conda), due to attempted 'modifications' of - files during database checks in the installation directory, something that cannot - be done in immutable docker/singularity containers. Therefore, a local installation - directory needs to be mounted (including all modified files from the downloading step) - to the container as a workaround. - pattern: "*/" - - gff: - type: file - description: Annotations in GFF3 format (only if sequence_input is in FASTA format) - pattern: "*.gff" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - sequence_input: + type: file + description: nucleotide sequence file (annotated) + pattern: "*.{gbk, gb, gbff, genbank, embl, fasta, fna}" + - - databases: + type: directory + description: | + Downloaded AntiSMASH databases (e.g. in the AntiSMASH installation directory + "data/databases") + pattern: "*/" + - - antismash_dir: + type: directory + description: | + A local copy of an AntiSMASH installation folder. This is required when running with + docker and singularity (not required for conda), due to attempted 'modifications' of + files during database checks in the installation directory, something that cannot + be done in immutable docker/singularity containers. Therefore, a local installation + directory needs to be mounted (including all modified files from the downloading step) + to the container as a workaround. + pattern: "*/" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - clusterblast_file: - type: file - description: Output of ClusterBlast algorithm - pattern: "clusterblast/*_c*.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/clusterblast/*_c*.txt: + type: file + description: Output of ClusterBlast algorithm + pattern: "clusterblast/*_c*.txt" - html_accessory_files: - type: directory - description: Accessory files for the HTML output - pattern: "{css/,images/,js/}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/{css,images,js}: + type: directory + description: Accessory files for the HTML output + pattern: "{css/,images/,js/}" - knownclusterblast_html: - type: file - description: Tables with MIBiG hits in HTML format - pattern: "knownclusterblast/region*/ctg*.html" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/knownclusterblast/region*/ctg*.html: + type: file + description: Tables with MIBiG hits in HTML format + pattern: "knownclusterblast/region*/ctg*.html" - knownclusterblast_dir: - type: directory - description: Directory with MIBiG hits - pattern: "knownclusterblast/" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/knownclusterblast/: + type: directory + description: Directory with MIBiG hits + pattern: "knownclusterblast/" - knownclusterblast_txt: - type: file - description: Tables with MIBiG hits - pattern: "knownclusterblast/*_c*.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/knownclusterblast/*_c*.txt: + type: file + description: Tables with MIBiG hits + pattern: "knownclusterblast/*_c*.txt" - svg_files_clusterblast: - type: file - description: SVG images showing the % identity of the aligned hits against their queries - pattern: "svg/clusterblast*.svg" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/svg/clusterblast*.svg: + type: file + description: SVG images showing the % identity of the aligned hits against their + queries + pattern: "svg/clusterblast*.svg" - svg_files_knownclusterblast: - type: file - description: SVG images showing the % identity of the aligned hits against their queries - pattern: "svg/knownclusterblast*.svg" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/svg/knownclusterblast*.svg: + type: file + description: SVG images showing the % identity of the aligned hits against their + queries + pattern: "svg/knownclusterblast*.svg" - gbk_input: - type: file - description: Nucleotide sequence and annotations in GenBank format; converted from input file - pattern: "*.gbk" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/*.gbk: + type: file + description: Nucleotide sequence and annotations in GenBank format; converted + from input file + pattern: "*.gbk" - json_results: - type: file - description: Nucleotide sequence and annotations in JSON format; converted from GenBank file (gbk_input) - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.json: + type: file + description: Nucleotide sequence and annotations in JSON format; converted from + GenBank file (gbk_input) + pattern: "*.json" - log: - type: file - description: Contains all the logging output that antiSMASH produced during its run - pattern: "*.log" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.log: + type: file + description: Contains all the logging output that antiSMASH produced during + its run + pattern: "*.log" - zip: - type: file - description: Contains a compressed version of the output folder in zip format - pattern: "*.zip" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.zip: + type: file + description: Contains a compressed version of the output folder in zip format + pattern: "*.zip" - gbk_results: - type: file - description: Nucleotide sequence and annotations in GenBank format; one file per antiSMASH hit - pattern: "*region*.gbk" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*region*.gbk: + type: file + description: Nucleotide sequence and annotations in GenBank format; one file + per antiSMASH hit + pattern: "*region*.gbk" - clusterblastoutput: - type: file - description: Raw BLAST output of known clusters previously predicted by antiSMASH using the built-in ClusterBlast algorithm - pattern: "clusterblastoutput.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/clusterblastoutput.txt: + type: file + description: Raw BLAST output of known clusters previously predicted by antiSMASH + using the built-in ClusterBlast algorithm + pattern: "clusterblastoutput.txt" - html: - type: file - description: Graphical web view of results in HTML format - patterN: "index.html" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/index.html: + type: file + description: Graphical web view of results in HTML format + pattern: "index.html" - knownclusterblastoutput: - type: file - description: Raw BLAST output of known clusters of the MIBiG database - pattern: "knownclusterblastoutput.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/knownclusterblastoutput.txt: + type: file + description: Raw BLAST output of known clusters of the MIBiG database + pattern: "knownclusterblastoutput.txt" - json_sideloading: - type: file - description: Sideloaded annotations of protoclusters and/or subregions (see antiSMASH documentation "Annotation sideloading") - pattern: "regions.js" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/regions.js: + type: file + description: Sideloaded annotations of protoclusters and/or subregions (see + antiSMASH documentation "Annotation sideloading") + pattern: "regions.js" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jasmezz" maintainers: diff --git a/modules/nf-core/antismash/antismashlite/tests/main.nf.test b/modules/nf-core/antismash/antismashlite/tests/main.nf.test index 5ee21d6d..d58fcc4c 100644 --- a/modules/nf-core/antismash/antismashlite/tests/main.nf.test +++ b/modules/nf-core/antismash/antismashlite/tests/main.nf.test @@ -3,6 +3,7 @@ nextflow_process { name "Test Process ANTISMASH_ANTISMASHLITE" script "../main.nf" process "ANTISMASH_ANTISMASHLITE" + config './nextflow.config' tag "modules" tag "modules_nfcore" @@ -96,7 +97,11 @@ nextflow_process { { assert path(process.out.html.get(0).get(1)).text.contains("https://antismash.secondarymetabolites.org/") }, { assert path(process.out.json_sideloading.get(0).get(1)).text.contains("\"seq_id\": \"NZ_CP069563.1\"") }, { assert path(process.out.log.get(0).get(1)).text.contains("antiSMASH status: SUCCESS") }, - { assert snapshot(process.out.versions).match("versions") } + { assert snapshot( + path(process.out.versions[0]).yaml, + file(process.out.versions[0]).name, + ).match("versions") + } ) } } @@ -119,7 +124,10 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out).match() } + { assert snapshot( + file(process.out.versions[0]).name, + ).match("versions_stub") + } ) } } diff --git a/modules/nf-core/antismash/antismashlite/tests/main.nf.test.snap b/modules/nf-core/antismash/antismashlite/tests/main.nf.test.snap index 618b06f9..7d2febc9 100644 --- a/modules/nf-core/antismash/antismashlite/tests/main.nf.test.snap +++ b/modules/nf-core/antismash/antismashlite/tests/main.nf.test.snap @@ -1,15 +1,28 @@ { + "versions_stub": { + "content": [ + "versions.yml" + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.4" + }, + "timestamp": "2025-01-30T14:55:41.041351955" + }, "versions": { "content": [ - [ - "versions.yml:md5,2a1c54c017741b59c057a05453fc067d" - ] + { + "ANTISMASH_ANTISMASHLITE": { + "antismash-lite": "7.1.0" + } + }, + "versions.yml" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-02-09T17:06:08.439031477" + "timestamp": "2025-01-30T13:48:51.158220245" }, "html_accessory_files": { "content": [ @@ -64,238 +77,9 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-09T17:06:08.392236617" - }, - "antismashlite - bacteroides_fragilis - genome - stub": { - "content": [ - { - "0": [ - - ], - "1": [ - [ - { - "id": "test" - }, - [ - [ - "bacteria.css:md5,d41d8cd98f00b204e9800998ecf8427e" - ], - [ - "about.svg:md5,d41d8cd98f00b204e9800998ecf8427e" - ], - [ - "antismash.js:md5,d41d8cd98f00b204e9800998ecf8427e", - "jquery.js:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ] - ], - "10": [ - [ - { - "id": "test" - }, - "genome.zip:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "11": [ - [ - { - "id": "test" - }, - [ - "NZ_CP069563.1.region001.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "NZ_CP069563.1.region002.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ], - "12": [ - - ], - "13": [ - [ - { - "id": "test" - }, - "index.html:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "14": [ - - ], - "15": [ - [ - { - "id": "test" - }, - 
"regions.js:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "16": [ - "versions.yml:md5,2a1c54c017741b59c057a05453fc067d" - ], - "2": [ - - ], - "3": [ - - ], - "4": [ - - ], - "5": [ - - ], - "6": [ - - ], - "7": [ - [ - { - "id": "test" - }, - [ - "NZ_CP069563.1.region001.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "NZ_CP069563.1.region002.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "genome.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ], - "8": [ - [ - { - "id": "test" - }, - "genome.json:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "9": [ - [ - { - "id": "test" - }, - "test.log:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "clusterblast_file": [ - - ], - "clusterblastoutput": [ - - ], - "gbk_input": [ - [ - { - "id": "test" - }, - [ - "NZ_CP069563.1.region001.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "NZ_CP069563.1.region002.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "genome.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ], - "gbk_results": [ - [ - { - "id": "test" - }, - [ - "NZ_CP069563.1.region001.gbk:md5,d41d8cd98f00b204e9800998ecf8427e", - "NZ_CP069563.1.region002.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ], - "html": [ - [ - { - "id": "test" - }, - "index.html:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "html_accessory_files": [ - [ - { - "id": "test" - }, - [ - [ - "bacteria.css:md5,d41d8cd98f00b204e9800998ecf8427e" - ], - [ - "about.svg:md5,d41d8cd98f00b204e9800998ecf8427e" - ], - [ - "antismash.js:md5,d41d8cd98f00b204e9800998ecf8427e", - "jquery.js:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - ] - ], - "json_results": [ - [ - { - "id": "test" - }, - "genome.json:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "json_sideloading": [ - [ - { - "id": "test" - }, - "regions.js:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "knownclusterblast_dir": [ - - ], - "knownclusterblast_html": [ - - ], - "knownclusterblast_txt": [ - - ], - "knownclusterblastoutput": [ - - ], - "log": [ - [ - { - "id": "test" - }, - "test.log:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ], - "svg_files_clusterblast": [ - - ], - "svg_files_knownclusterblast": [ - - ], - "versions": [ - "versions.yml:md5,2a1c54c017741b59c057a05453fc067d" - ], - "zip": [ - [ - { - "id": "test" - }, - "genome.zip:md5,d41d8cd98f00b204e9800998ecf8427e" - ] - ] - } - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-11T16:35:51.079804" + "timestamp": "2025-01-30T14:47:32.466485783" } } \ No newline at end of file diff --git a/modules/nf-core/antismash/antismashlite/tests/nextflow.config b/modules/nf-core/antismash/antismashlite/tests/nextflow.config new file mode 100644 index 00000000..eedb39ae --- /dev/null +++ b/modules/nf-core/antismash/antismashlite/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: ANTISMASH_ANTISMASHLITE { + memory = 7.GB + } +} diff --git a/modules/nf-core/antismash/antismashlitedownloaddatabases/environment.yml b/modules/nf-core/antismash/antismashlitedownloaddatabases/environment.yml index b9323a93..dc2807d5 100644 --- a/modules/nf-core/antismash/antismashlitedownloaddatabases/environment.yml +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/environment.yml @@ -1,7 +1,7 @@ -name: antismash_antismashlitedownloaddatabases +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::antismash-lite=7.1.0 diff --git 
a/modules/nf-core/antismash/antismashlitedownloaddatabases/main.nf b/modules/nf-core/antismash/antismashlitedownloaddatabases/main.nf index e63f20d2..52452dc2 100644 --- a/modules/nf-core/antismash/antismashlitedownloaddatabases/main.nf +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/main.nf @@ -2,9 +2,9 @@ process ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/antismash-lite:7.1.0--pyhdfd78af_0' : - 'biocontainers/antismash-lite:7.1.0--pyhdfd78af_0' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 'https://depot.galaxyproject.org/singularity/antismash-lite:7.1.0--pyhdfd78af_0' + : 'biocontainers/antismash-lite:7.1.0--pyhdfd78af_0'}" /* These files are normally downloaded/created by download-antismash-databases itself, and must be retrieved for input by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. This is solely for use for CI tests of the nf-core/module version of antiSMASH. @@ -13,12 +13,12 @@ process ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { */ containerOptions { - workflow.containerEngine == 'singularity' ? - "-B $database_css:/usr/local/lib/python3.10/site-packages/antismash/outputs/html/css,$database_detection:/usr/local/lib/python3.10/site-packages/antismash/detection,$database_modules:/usr/local/lib/python3.10/site-packages/antismash/modules" : - workflow.containerEngine == 'docker' ? - "-v \$PWD/$database_css:/usr/local/lib/python3.10/site-packages/antismash/outputs/html/css -v \$PWD/$database_detection:/usr/local/lib/python3.10/site-packages/antismash/detection -v \$PWD/$database_modules:/usr/local/lib/python3.10/site-packages/antismash/modules" : - '' - } + ['singularity', 'apptainer'].contains(workflow.containerEngine) + ? "-B ${database_css}:/usr/local/lib/python3.10/site-packages/antismash/outputs/html/css,${database_detection}:/usr/local/lib/python3.10/site-packages/antismash/detection,${database_modules}:/usr/local/lib/python3.10/site-packages/antismash/modules" + : workflow.containerEngine == 'docker' + ? "-v \$PWD/${database_css}:/usr/local/lib/python3.10/site-packages/antismash/outputs/html/css -v \$PWD/${database_detection}:/usr/local/lib/python3.10/site-packages/antismash/detection -v \$PWD/${database_modules}:/usr/local/lib/python3.10/site-packages/antismash/modules" + : '' + } input: path database_css @@ -26,8 +26,8 @@ process ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { path database_modules output: - path("antismash_db") , emit: database - path("antismash_dir"), emit: antismash_dir + path ("antismash_db"), emit: database + path ("antismash_dir"), emit: antismash_dir path "versions.yml", emit: versions when: @@ -35,35 +35,34 @@ process ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { script: def args = task.ext.args ?: '' - cp_cmd = ( session.config.conda && session.config.conda.enabled ) ? "cp -r \$(python -c 'import antismash;print(antismash.__file__.split(\"/__\")[0])') antismash_dir;" : "cp -r /usr/local/lib/python3.10/site-packages/antismash antismash_dir;" + cp_cmd = session.config.conda && session.config.conda.enabled ? 
"cp -r \$(python -c 'import antismash;print(antismash.__file__.split(\"/__\")[0])') antismash_dir;" : "cp -r /usr/local/lib/python3.10/site-packages/antismash antismash_dir;" """ download-antismash-databases \\ --database-dir antismash_db \\ - $args + ${args} - $cp_cmd + ${cp_cmd} cat <<-END_VERSIONS > versions.yml "${task.process}": - antismash-lite: \$(antismash --version | sed 's/antiSMASH //') + antismash-lite: \$(echo \$(antismash --version) | sed 's/antiSMASH //;s/-.*//g') END_VERSIONS """ stub: def args = task.ext.args ?: '' - cp_cmd = (session.config.conda && session.config.conda.enabled ) ? "cp -r \$(python -c 'import antismash;print(antismash.__file__.split(\"/__\")[0])') antismash_dir;" : "cp -r /usr/local/lib/python3.10/site-packages/antismash antismash_dir;" - def VERSION = '7.1.0' // WARN: Version information not provided by tool during stub run. Please update this string when bumping container versions. + cp_cmd = session.config.conda && session.config.conda.enabled ? "cp -r \$(python -c 'import antismash;print(antismash.__file__.split(\"/__\")[0])') antismash_dir;" : "cp -r /usr/local/lib/python3.10/site-packages/antismash antismash_dir;" """ - echo "download-antismash-databases --database-dir antismash_db $args" + echo "download-antismash-databases --database-dir antismash_db ${args}" - echo "$cp_cmd" + echo "${cp_cmd}" mkdir antismash_dir mkdir antismash_db cat <<-END_VERSIONS > versions.yml "${task.process}": - antismash-lite: $VERSION + antismash-lite: \$(echo \$(antismash --version) | sed 's/antiSMASH //;s/-.*//g') END_VERSIONS """ } diff --git a/modules/nf-core/antismash/antismashlitedownloaddatabases/meta.yml b/modules/nf-core/antismash/antismashlitedownloaddatabases/meta.yml index 010c6267..fdca8294 100644 --- a/modules/nf-core/antismash/antismashlitedownloaddatabases/meta.yml +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/meta.yml @@ -1,5 +1,7 @@ name: antismash_antismashlitedownloaddatabases -description: antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters. This module downloads the antiSMASH databases for conda and docker/singularity runs. +description: antiSMASH allows the rapid genome-wide identification, annotation and + analysis of secondary metabolite biosynthesis gene clusters. This module downloads + the antiSMASH databases for conda and docker/singularity runs. keywords: - secondary metabolites - BGC @@ -22,36 +24,40 @@ tools: tool_dev_url: https://github.com/antismash/antismash doi: "10.1093/nar/gkab335" licence: ["AGPL v3"] + identifier: biotools:antismash input: - - database_css: - type: directory - description: | - antismash/outputs/html/css folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. - pattern: "css" - - database_detection: - type: directory - description: | - antismash/detection folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. 
Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. - pattern: "detection" - - database_modules: - type: directory - description: | - antismash/modules folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. - pattern: "modules" + - - database_css: + type: directory + description: | + antismash/outputs/html/css folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. + pattern: "css" + - - database_detection: + type: directory + description: | + antismash/detection folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. + pattern: "detection" + - - database_modules: + type: directory + description: | + antismash/modules folder which is being created during the antiSMASH database downloading step. These files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database in pipelines. + pattern: "modules" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - database: - type: directory - description: Download directory for antiSMASH databases - pattern: "antismash_db" + - antismash_db: + type: directory + description: Download directory for antiSMASH databases + pattern: "antismash_db" - antismash_dir: - type: directory - description: | - antismash installation folder which is being modified during the antiSMASH database downloading step. The modified files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database and installation folder in pipelines. - pattern: "antismash_dir" + - antismash_dir: + type: directory + description: | + antismash installation folder which is being modified during the antiSMASH database downloading step. 
The modified files are normally downloaded by download-antismash-databases itself, and must be retrieved by the user by manually running the command with conda or a standalone installation of antiSMASH. Therefore we do not recommend using this module for production pipelines, but rather require users to specify their own local copy of the antiSMASH database and installation folder in pipelines. + pattern: "antismash_dir" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jasmezz" maintainers: diff --git a/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test index 55f5f2f5..72e5d7dd 100644 --- a/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test @@ -3,6 +3,7 @@ nextflow_process { name "Test Process ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES" script "../main.nf" process "ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES" + config './nextflow.config' tag "modules" tag "modules_nfcore" @@ -64,10 +65,12 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot ( + { assert snapshot( file(process.out.database.get(0)).list().sort(), - process.out.versions, - ).match() } + path(process.out.versions[0]).yaml, + file(process.out.versions[0]).name, + ).match() + } ) } } @@ -128,7 +131,11 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out).match() } + { assert snapshot( + file(process.out.database.get(0)).list().sort(), + file(process.out.versions[0]).name, + ).match() + } ) } } diff --git a/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test.snap b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test.snap index 21ee9d41..04f98af8 100644 --- a/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test.snap +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/main.nf.test.snap @@ -1,40 +1,16 @@ { "antiSMASH-lite downloaddatabases - stub": { "content": [ - { - "0": [ - [ - - ] - ], - "1": [ - [ - - ] - ], - "2": [ - "versions.yml:md5,9eccc775a12d25ca5dfe334e8874f12a" - ], - "antismash_dir": [ - [ - - ] - ], - "database": [ - [ - - ] - ], - "versions": [ - "versions.yml:md5,9eccc775a12d25ca5dfe334e8874f12a" - ] - } + [ + + ], + "versions.yml" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-12T13:41:29.456143" + "timestamp": "2025-01-30T13:47:43.854140981" }, "antiSMASH-lite downloaddatabases": { "content": [ @@ -49,14 +25,17 @@ "resfam", "tigrfam" ], - [ - "versions.yml:md5,9eccc775a12d25ca5dfe334e8874f12a" - ] + { + "ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES": { + "antismash-lite": "7.1.0" + } + }, + "versions.yml" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-12T13:41:08.116244" + "timestamp": "2025-01-30T13:57:10.845020955" } } \ No newline at end of file diff --git a/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/nextflow.config b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/nextflow.config new file mode 100644 index 00000000..972dd7b0 --- /dev/null +++ b/modules/nf-core/antismash/antismashlitedownloaddatabases/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 
ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES { + memory = 7.GB + } +} diff --git a/modules/nf-core/argnorm/environment.yml b/modules/nf-core/argnorm/environment.yml index 771b87c9..91971001 100644 --- a/modules/nf-core/argnorm/environment.yml +++ b/modules/nf-core/argnorm/environment.yml @@ -1,7 +1,7 @@ -name: "argnorm" +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - "bioconda::argnorm=0.5.0" + - bioconda::argnorm=0.5.0 diff --git a/modules/nf-core/argnorm/meta.yml b/modules/nf-core/argnorm/meta.yml index a977e863..84842b9c 100644 --- a/modules/nf-core/argnorm/meta.yml +++ b/modules/nf-core/argnorm/meta.yml @@ -1,5 +1,6 @@ name: "argnorm" -description: Normalize antibiotic resistance genes (ARGs) using the ARO ontology (developed by CARD). +description: Normalize antibiotic resistance genes (ARGs) using the ARO ontology (developed + by CARD). keywords: - amr - antimicrobial resistance @@ -11,49 +12,48 @@ keywords: - drug categorization tools: - "argnorm": - description: "Normalize antibiotic resistance genes (ARGs) using the ARO ontology (developed by CARD)." + description: "Normalize antibiotic resistance genes (ARGs) using the ARO ontology + (developed by CARD)." homepage: "https://argnorm.readthedocs.io/en/latest/" documentation: "https://argnorm.readthedocs.io/en/latest/" tool_dev_url: "https://github.com/BigDataBiology/argNorm" licence: ["MIT"] + identifier: biotools:argnorm input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'sample1', single_end:false ]` - - - input_tsv: - type: file - description: ARG annotation output - pattern: "*.tsv" - - - tool: - type: string - description: ARG annotation tool used - pattern: "argsoap|abricate|deeparg|resfinder|amrfinderplus" - - - db: - type: string - description: Database used for ARG annotation - pattern: "sarg|ncbi|resfinder|deeparg|megares|argannot|resfinderfg" - + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - input_tsv: + type: file + description: ARG annotation output + pattern: "*.tsv" + - - tool: + type: string + description: ARG annotation tool used + pattern: "argsoap|abricate|deeparg|resfinder|amrfinderplus" + - - db: + type: string + description: Database used for ARG annotation + pattern: "sarg|ncbi|resfinder|deeparg|megares|argannot|resfinderfg" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'sample1', single_end:false ]` - tsv: - type: file - description: Normalized argNorm output - pattern: "*.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - "*.tsv": + type: file + description: Normalized argNorm output + pattern: "*.tsv" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Vedanth-Ramji" maintainers: diff --git a/modules/nf-core/bakta/bakta/environment.yml b/modules/nf-core/bakta/bakta/environment.yml index efb92265..c1b616a4 100644 --- a/modules/nf-core/bakta/bakta/environment.yml +++ b/modules/nf-core/bakta/bakta/environment.yml @@ -1,7 +1,7 @@ -name: bakta_bakta +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bakta=1.9.3 + - bioconda::bakta=1.10.4 diff --git a/modules/nf-core/bakta/bakta/main.nf b/modules/nf-core/bakta/bakta/main.nf index 9a32c3da..4d192e45 100644 --- a/modules/nf-core/bakta/bakta/main.nf +++ b/modules/nf-core/bakta/bakta/main.nf @@ -4,8 +4,8 @@ process BAKTA_BAKTA { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bakta:1.9.3--pyhdfd78af_0' : - 'biocontainers/bakta:1.9.3--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/bakta:1.10.4--pyhdfd78af_0' : + 'biocontainers/bakta:1.10.4--pyhdfd78af_0' }" input: tuple val(meta), path(fasta) diff --git a/modules/nf-core/bakta/bakta/meta.yml b/modules/nf-core/bakta/bakta/meta.yml index c0e53e2a..7d734f28 100644 --- a/modules/nf-core/bakta/bakta/meta.yml +++ b/modules/nf-core/bakta/bakta/meta.yml @@ -12,76 +12,134 @@ tools: tool_dev_url: https://github.com/oschwengers/bakta doi: "10.1099/mgen.0.000685" licence: ["GPL v3"] + identifier: biotools:bakta input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: | - FASTA file to be annotated. Has to contain at least a non-empty string dummy value. - - db: - type: file - description: | - Path to the Bakta database. Must have amrfinderplus database directory already installed within it (in a directory called 'amrfinderplus-db/'). - - proteins: - type: file - description: FASTA/GenBank file of trusted proteins to first annotate from (optional) - - prodigal_tf: - type: file - description: Training file to use for Prodigal (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: | + FASTA file to be annotated. Has to contain at least a non-empty string dummy value. + - - db: + type: file + description: | + Path to the Bakta database. Must have amrfinderplus database directory already installed within it (in a directory called 'amrfinderplus-db/'). + - - proteins: + type: file + description: FASTA/GenBank file of trusted proteins to first annotate from (optional) + - - prodigal_tf: + type: file + description: Training file to use for Prodigal (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - txt: - type: file - description: genome statistics and annotation summary - pattern: "*.txt" - - tsv: - type: file - description: annotations as simple human readble tab separated values - pattern: "*.tsv" - - gff: - type: file - description: annotations & sequences in GFF3 format - pattern: "*.gff3" - - gbff: - type: file - description: annotations & sequences in (multi) GenBank format - pattern: "*.gbff" - embl: - type: file - description: annotations & sequences in (multi) EMBL format - pattern: "*.embl" - - fna: - type: file - description: replicon/contig DNA sequences as FASTA - pattern: "*.fna" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.embl: + type: file + description: annotations & sequences in (multi) EMBL format + pattern: "*.embl" - faa: - type: file - description: CDS/sORF amino acid sequences as FASTA - pattern: "*.faa" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.faa: + type: file + description: CDS/sORF amino acid sequences as FASTA + pattern: "*.faa" - ffn: - type: file - description: feature nucleotide sequences as FASTA - pattern: "*.ffn" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.ffn: + type: file + description: feature nucleotide sequences as FASTA + pattern: "*.ffn" + - fna: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.fna: + type: file + description: replicon/contig DNA sequences as FASTA + pattern: "*.fna" + - gbff: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.gbff: + type: file + description: annotations & sequences in (multi) GenBank format + pattern: "*.gbff" + - gff: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.gff3: + type: file + description: annotations & sequences in GFF3 format + pattern: "*.gff3" - hypotheticals_tsv: - type: file - description: additional information on hypothetical protein CDS as simple human readble tab separated values - pattern: "*.hypotheticals.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.hypotheticals.tsv: + type: file + description: additional information on hypothetical protein CDS as simple human + readable tab separated values + pattern: "*.hypotheticals.tsv" - hypotheticals_faa: - type: file - description: hypothetical protein CDS amino acid sequences as FASTA - pattern: "*.hypotheticals.faa" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.hypotheticals.faa: + type: file + description: hypothetical protein CDS amino acid sequences as FASTA + pattern: "*.hypotheticals.faa" + - tsv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.tsv: + type: file + description: annotations as simple human readable tab separated values + pattern: "*.tsv" + - txt: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.txt: + type: file + description: genome statistics and annotation summary + pattern: "*.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" - "@oschwengers" diff --git a/modules/nf-core/bakta/bakta/tests/main.nf.test.snap b/modules/nf-core/bakta/bakta/tests/main.nf.test.snap index 40e30c36..cee06343 100644 --- a/modules/nf-core/bakta/bakta/tests/main.nf.test.snap +++ b/modules/nf-core/bakta/bakta/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "versions": { "content": [ [ - "versions.yml:md5,f8b70ceb2a328c25a190699384e6152d" + "versions.yml:md5,c40bd66294f6eb4520f194325ef24f24" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-03-14T09:11:06.657602394" + "timestamp": "2025-01-25T11:59:09.981137" }, "Bakta - stub": { "content": [ @@ -31,7 +31,7 @@ ] ], "10": [ - "versions.yml:md5,f8b70ceb2a328c25a190699384e6152d" + "versions.yml:md5,c40bd66294f6eb4520f194325ef24f24" ], "2": [ [ @@ -178,14 +178,14 @@ ] ], "versions": [ - "versions.yml:md5,f8b70ceb2a328c25a190699384e6152d" + "versions.yml:md5,c40bd66294f6eb4520f194325ef24f24" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-03-14T09:11:15.532858932" + "timestamp": "2025-01-25T11:09:05.864545" } } \ No newline at end of file diff --git a/modules/nf-core/bakta/baktadbdownload/environment.yml b/modules/nf-core/bakta/baktadbdownload/environment.yml index f6a53ff7..c1b616a4 100644 --- a/modules/nf-core/bakta/baktadbdownload/environment.yml +++ b/modules/nf-core/bakta/baktadbdownload/environment.yml @@ -1,7 +1,7 @@ -name: bakta_baktadbdownload +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bakta=1.9.3 + - bioconda::bakta=1.10.4 diff --git a/modules/nf-core/bakta/baktadbdownload/main.nf b/modules/nf-core/bakta/baktadbdownload/main.nf index e512d77d..cc2f445e 100644 --- a/modules/nf-core/bakta/baktadbdownload/main.nf +++ b/modules/nf-core/bakta/baktadbdownload/main.nf @@ -3,8 +3,8 @@ process BAKTA_BAKTADBDOWNLOAD { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bakta:1.9.3--pyhdfd78af_0' : - 'biocontainers/bakta:1.9.3--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/bakta:1.10.4--pyhdfd78af_0' : + 'biocontainers/bakta:1.10.4--pyhdfd78af_0' }" output: path "db*" , emit: db diff --git a/modules/nf-core/bakta/baktadbdownload/meta.yml b/modules/nf-core/bakta/baktadbdownload/meta.yml index 21acacda..a0a3a455 100644 --- a/modules/nf-core/bakta/baktadbdownload/meta.yml +++ b/modules/nf-core/bakta/baktadbdownload/meta.yml @@ -15,15 +15,18 @@ tools: tool_dev_url: https://github.com/oschwengers/bakta doi: "10.1099/mgen.0.000685" licence: ["GPL v3"] + identifier: biotools:bakta output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - db: - type: directory - description: BAKTA database directory - pattern: "db*/" + - db*: + type: directory + description: BAKTA database directory + pattern: "db*/" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" - "@jasmezz" diff --git a/modules/nf-core/bakta/baktadbdownload/tests/main.nf.test.snap b/modules/nf-core/bakta/baktadbdownload/tests/main.nf.test.snap index b1c82267..ef6aabe7 100644 --- a/modules/nf-core/bakta/baktadbdownload/tests/main.nf.test.snap +++ b/modules/nf-core/bakta/baktadbdownload/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "Bakta database download": { "content": [ [ - "versions.yml:md5,df9b091b08a41b7d5eef95727b7eac29" + "versions.yml:md5,29d6ec77dc88492b2c53141e6541c289" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-03-19T11:34:41.812416438" + "timestamp": "2025-01-25T12:30:51.853371" }, "Bakta database download - stub": { "content": [ @@ -17,13 +17,13 @@ [ ], - "versions.yml:md5,df9b091b08a41b7d5eef95727b7eac29" + "versions.yml:md5,29d6ec77dc88492b2c53141e6541c289" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-03-19T11:35:01.082923401" + "timestamp": "2025-01-25T12:31:08.390845" } } \ No newline at end of file diff --git a/modules/nf-core/deeparg/downloaddata/environment.yml b/modules/nf-core/deeparg/downloaddata/environment.yml index 87435be5..91c8f5cf 100644 --- a/modules/nf-core/deeparg/downloaddata/environment.yml +++ b/modules/nf-core/deeparg/downloaddata/environment.yml @@ -1,7 +1,7 @@ -name: deeparg_downloaddata +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::deeparg=1.0.4 diff --git a/modules/nf-core/deeparg/downloaddata/main.nf b/modules/nf-core/deeparg/downloaddata/main.nf index 787c0027..7f17ebab 100644 --- a/modules/nf-core/deeparg/downloaddata/main.nf +++ b/modules/nf-core/deeparg/downloaddata/main.nf @@ -2,32 +2,33 @@ process DEEPARG_DOWNLOADDATA { label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/deeparg:1.0.4--pyhdfd78af_0' : - 'biocontainers/deeparg:1.0.4--pyhdfd78af_0' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 
'https://depot.galaxyproject.org/singularity/deeparg:1.0.4--pyhdfd78af_0' + : 'biocontainers/deeparg:1.0.4--pyhdfd78af_0'}" /* We have to force docker/singularity to mount a fake file to allow reading of a problematic file with borked read-write permissions in an upstream dependency (theanos). Original report: https://github.com/nf-core/funcscan/issues/23 */ containerOptions { - "${workflow.containerEngine}" == 'singularity' ? '-B $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' : - "${workflow.containerEngine}" == 'docker' ? '-v $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' : - '' + ['singularity', 'apptainer'].contains(workflow.containerEngine) + ? '-B $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' + : "${workflow.containerEngine}" == 'docker' + ? '-v $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' + : '' } - input: - output: - path "db/" , emit: db - path "versions.yml" , emit: versions + path "db/", emit: db + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when script: def args = task.ext.args ?: '' - def VERSION='1.0.4' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. + def VERSION = '1.0.4' + // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. """ # Theano needs a writable space and uses the home directory by default, @@ -38,24 +39,30 @@ process DEEPARG_DOWNLOADDATA { deeparg \\ download_data \\ - $args \\ + ${args} \\ -o db/ cat <<-END_VERSIONS > versions.yml "${task.process}": - deeparg: $VERSION + deeparg: ${VERSION} END_VERSIONS """ stub: def args = task.ext.args ?: '' - def VERSION='1.0.4' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. + def VERSION = '1.0.4' + // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. """ + echo "deeparg \\ + download_data \\ + ${args} \\ + -o db/" + mkdir db/ cat <<-END_VERSIONS > versions.yml "${task.process}": - deeparg: $VERSION + deeparg: ${VERSION} END_VERSIONS """ } diff --git a/modules/nf-core/deeparg/downloaddata/meta.yml b/modules/nf-core/deeparg/downloaddata/meta.yml index 65fb3903..5df2887b 100644 --- a/modules/nf-core/deeparg/downloaddata/meta.yml +++ b/modules/nf-core/deeparg/downloaddata/meta.yml @@ -1,5 +1,6 @@ name: deeparg_downloaddata -description: A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes +description: A deep learning based approach to predict Antibiotic Resistance Genes + (ARGs) from metagenomes keywords: - download - database @@ -9,22 +10,26 @@ keywords: - prediction tools: - deeparg: - description: A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes + description: A deep learning based approach to predict Antibiotic Resistance Genes + (ARGs) from metagenomes homepage: https://github.com/gaarangoa/deeparg documentation: https://github.com/gaarangoa/deeparg tool_dev_url: https://github.com/gaarangoa/deeparg doi: "10.1186/s40168-018-0401-z" licence: ["MIT"] + identifier: "" # No input required for download module. 
output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - db: - type: directory - description: Directory containing database required for deepARG. - pattern: "db/" + - db/: + type: directory + description: Directory containing database required for deepARG. + pattern: "db/" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/deeparg/predict/environment.yml b/modules/nf-core/deeparg/predict/environment.yml index aa686701..91c8f5cf 100644 --- a/modules/nf-core/deeparg/predict/environment.yml +++ b/modules/nf-core/deeparg/predict/environment.yml @@ -1,7 +1,7 @@ -name: deeparg_predict +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::deeparg=1.0.4 diff --git a/modules/nf-core/deeparg/predict/main.nf b/modules/nf-core/deeparg/predict/main.nf index 20fd0a93..2ac258a8 100644 --- a/modules/nf-core/deeparg/predict/main.nf +++ b/modules/nf-core/deeparg/predict/main.nf @@ -1,32 +1,34 @@ process DEEPARG_PREDICT { - tag "$meta.id" + tag "${meta.id}" label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/deeparg:1.0.4--pyhdfd78af_0' : - 'biocontainers/deeparg:1.0.4--pyhdfd78af_0' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 'https://depot.galaxyproject.org/singularity/deeparg:1.0.4--pyhdfd78af_0' + : 'biocontainers/deeparg:1.0.4--pyhdfd78af_0'}" /* We have to force docker/singularity to mount a fake file to allow reading of a problematic file with borked read-write permissions in an upstream dependency (theanos). Original report: https://github.com/nf-core/funcscan/issues/23 */ containerOptions { - "${workflow.containerEngine}" == 'singularity' ? '-B $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' : - "${workflow.containerEngine}" == 'docker' ? '-v $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' : - '' + ['singularity', 'apptainer'].contains(workflow.containerEngine) + ? '-B $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' + : "${workflow.containerEngine}" == 'docker' + ? '-v $(which bash):/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO' + : '' } input: tuple val(meta), path(fasta), val(model) - path(db) + path db output: - tuple val(meta), path("*.align.daa") , emit: daa - tuple val(meta), path("*.align.daa.tsv") , emit: daa_tsv - tuple val(meta), path("*.mapping.ARG") , emit: arg + tuple val(meta), path("*.align.daa"), emit: daa + tuple val(meta), path("*.align.daa.tsv"), emit: daa_tsv + tuple val(meta), path("*.mapping.ARG"), emit: arg tuple val(meta), path("*.mapping.potential.ARG"), emit: potential_arg - path "versions.yml" , emit: versions + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -34,9 +36,10 @@ process DEEPARG_PREDICT { script: def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def VERSION='1.0.4' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. 
+    def VERSION = '1.0.4'
+    // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
     """
-    DATABASE=`find -L $db -type d -name "database" | sed 's/database//'`
+    DATABASE=`find -L ${db} -type d -name "database" | sed 's/database//'`

     # Theano needs a writable space and uses the home directory by default,
     # but the latter is not always writable, for instance when Singularity
@@ -46,22 +49,23 @@ process DEEPARG_PREDICT {

     deeparg \\
         predict \\
-        $args \\
-        -i $fasta \\
+        ${args} \\
+        -i ${fasta} \\
        -o ${prefix} \\
         -d \$DATABASE \\
-        --model $model
+        --model ${model}

     cat <<-END_VERSIONS > versions.yml
     "${task.process}":
-        deeparg: $VERSION
+        deeparg: ${VERSION}
     END_VERSIONS
     """

     stub:
     def args = task.ext.args ?: ''
     def prefix = task.ext.prefix ?: "${meta.id}"
-    def VERSION='1.0.4' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
+    def VERSION = '1.0.4'
+    // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
     """
     touch ${prefix}.align.daa
     touch ${prefix}.align.daa.tsv
@@ -70,7 +74,7 @@ process DEEPARG_PREDICT {

     cat <<-END_VERSIONS > versions.yml
     "${task.process}":
-        deeparg: $VERSION
+        deeparg: ${VERSION}
     END_VERSIONS
     """
 }
diff --git a/modules/nf-core/deeparg/predict/meta.yml b/modules/nf-core/deeparg/predict/meta.yml
index d62c2c5f..dbd63945 100644
--- a/modules/nf-core/deeparg/predict/meta.yml
+++ b/modules/nf-core/deeparg/predict/meta.yml
@@ -1,5 +1,6 @@
 name: deeparg_predict
-description: A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes
+description: A deep learning based approach to predict Antibiotic Resistance Genes
+  (ARGs) from metagenomes
 keywords:
   - deeparg
   - antimicrobial resistance
@@ -11,56 +12,81 @@ keywords:
   - metagenomes
 tools:
   - deeparg:
-      description: A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes
+      description: A deep learning based approach to predict Antibiotic Resistance Genes
+        (ARGs) from metagenomes
       homepage: https://github.com/gaarangoa/deeparg
       documentation: https://github.com/gaarangoa/deeparg
       tool_dev_url: https://github.com/gaarangoa/deeparg
       doi: "10.1186/s40168-018-0401-z"
       licence: ["MIT"]
+      identifier: ""
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test' ]
-  - fasta:
-      type: file
-      description: FASTA file containing gene-like sequences
-      pattern: "*.{fasta,fa,fna}"
-  - model:
-      type: string
-      description: Which model to use, depending on input data. Either 'LS' or 'SS' for long or short sequences respectively
-      pattern: "LS|LS"
-  - db:
-      type: directory
-      description: Path to a directory containing the deepARG pre-built models
-      pattern: "*/"
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. [ id:'test' ]
+    - fasta:
+        type: file
+        description: FASTA file containing gene-like sequences
+        pattern: "*.{fasta,fa,fna}"
+    - model:
+        type: string
+        description: Which model to use, depending on input data. Either 'LS' or 'SS'
+          for long or short sequences respectively
+        pattern: "LS|SS"
+  - - db:
+        type: directory
+        description: Path to a directory containing the deepARG pre-built models
+        pattern: "*/"
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g.
[ id:'test', single_end:false ]
-  - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
   - daa:
-      type: file
-      description: Sequences of ARG-like sequences from DIAMOND alignment
-      pattern: "*.align.daa"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.align.daa":
+          type: file
+          description: Sequences of ARG-like sequences from DIAMOND alignment
+          pattern: "*.align.daa"
   - daa_tsv:
-      type: file
-      description: Alignments scores against ARG-like sequences from DIAMOND alignment
-      pattern: "*.align.daa.tsv"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.align.daa.tsv":
+          type: file
+          description: Alignment scores against ARG-like sequences from DIAMOND alignment
+          pattern: "*.align.daa.tsv"
   - arg:
-      type: file
-      description: Table containing sequences with an ARG-like probability of more than specified thresholds
-      pattern: "*.mapping.ARG"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.mapping.ARG":
+          type: file
+          description: Table containing sequences with an ARG-like probability of more
+            than specified thresholds
+          pattern: "*.mapping.ARG"
   - potential_arg:
-      type: file
-      description: Table containing sequences with an ARG-like probability of less than specified thresholds, and requires manual inspection
-      pattern: "*.mapping.potential.ARG"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.mapping.potential.ARG":
+          type: file
+          description: Table containing sequences with an ARG-like probability of less
+            than specified thresholds, and requires manual inspection
+          pattern: "*.mapping.potential.ARG"
+  - versions:
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@jfy133"
 maintainers:
diff --git a/modules/nf-core/deepbgc/download/environment.yml b/modules/nf-core/deepbgc/download/environment.yml
index 84d467f0..999c6864 100644
--- a/modules/nf-core/deepbgc/download/environment.yml
+++ b/modules/nf-core/deepbgc/download/environment.yml
@@ -1,7 +1,7 @@
-name: deepbgc_download
+---
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
   - bioconda::deepbgc=0.1.31
diff --git a/modules/nf-core/deepbgc/download/main.nf b/modules/nf-core/deepbgc/download/main.nf
index b141142c..6818a135 100644
--- a/modules/nf-core/deepbgc/download/main.nf
+++ b/modules/nf-core/deepbgc/download/main.nf
@@ -2,13 +2,13 @@ process DEEPBGC_DOWNLOAD {
     label 'process_single'

     conda "${moduleDir}/environment.yml"
-    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-        'https://depot.galaxyproject.org/singularity/deepbgc:0.1.31--pyhca03a8a_0':
-        'biocontainers/deepbgc:0.1.31--pyhca03a8a_0' }"
+    container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container
+        ?
'https://depot.galaxyproject.org/singularity/deepbgc:0.1.31--pyhca03a8a_0' + : 'biocontainers/deepbgc:0.1.31--pyhca03a8a_0'}" output: - path "deepbgc_db/" , emit: db - path "versions.yml" , emit: versions + path "deepbgc_db/", emit: db + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -27,4 +27,14 @@ process DEEPBGC_DOWNLOAD { deepbgc: \$(echo \$(deepbgc info 2>&1 /dev/null/ | grep 'version' | cut -d " " -f3) ) END_VERSIONS """ + + stub: + """ + mkdir -p deepbgc_db + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + deepbgc: \$(echo \$(deepbgc info 2>&1 /dev/null/ | grep 'version' | cut -d " " -f3) ) + END_VERSIONS + """ } diff --git a/modules/nf-core/deepbgc/download/meta.yml b/modules/nf-core/deepbgc/download/meta.yml index 6444dd41..4551e9a0 100644 --- a/modules/nf-core/deepbgc/download/meta.yml +++ b/modules/nf-core/deepbgc/download/meta.yml @@ -1,5 +1,6 @@ name: "deepbgc_download" -description: Database download module for DeepBGC which detects BGCs in bacterial and fungal genomes using deep learning. +description: Database download module for DeepBGC which detects BGCs in bacterial + and fungal genomes using deep learning. keywords: - database - download @@ -19,15 +20,18 @@ tools: tool_dev_url: "https://github.com/Merck/deepbgc" doi: "10.1093/nar/gkz654" licence: ["MIT"] + identifier: biotools:DeepBGC output: + - db: + - deepbgc_db/: + type: directory + description: Directory containing the DeepBGC database + pattern: "deepbgc_db/" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - deepbgc_db: - type: directory - description: Contains reference database files - pattern: "deepbgc_db" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/deepbgc/download/tests/main.nf.test b/modules/nf-core/deepbgc/download/tests/main.nf.test new file mode 100644 index 00000000..25db42ab --- /dev/null +++ b/modules/nf-core/deepbgc/download/tests/main.nf.test @@ -0,0 +1,47 @@ +nextflow_process { + + name "Test Process DEEPBGC_DOWNLOAD" + script "../main.nf" + process "DEEPBGC_DOWNLOAD" + + tag "modules" + tag "modules_nfcore" + tag "deepbgc" + tag "deepbgc/download" + + test("deepbgc download db") { + + when { + process { + """ + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("deepbgc download db - stub") { + + options "-stub" + + when { + process { + """ + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } +} diff --git a/modules/nf-core/deepbgc/download/tests/main.nf.test.snap b/modules/nf-core/deepbgc/download/tests/main.nf.test.snap new file mode 100644 index 00000000..8d0c7fbb --- /dev/null +++ b/modules/nf-core/deepbgc/download/tests/main.nf.test.snap @@ -0,0 +1,94 @@ +{ + "deepbgc download db - stub": { + "content": [ + { + "0": [ + [ + + ] + ], + "1": [ + "versions.yml:md5,4130f2ce0a4d43fc3d8e04f4935f908b" + ], + "db": [ + [ + + ] + ], + "versions": [ + "versions.yml:md5,4130f2ce0a4d43fc3d8e04f4935f908b" + ] + } + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-16T13:16:09.517281467" + }, + "deepbgc download db": { + "content": [ + { + "0": [ + [ + [ + [ + "product_activity.pkl:md5,90f0c010460e9df882cb057664a49f30", + "product_class.pkl:md5,f78a2eda240403d2f40643d42202f3ac" + ], + [ + 
"clusterfinder_geneborder.pkl:md5,ca4be7031ae9f70780f17c616a4fa5b5", + "clusterfinder_original.pkl:md5,2ca2429bb9bc99a401d1093c376b37aa", + "clusterfinder_retrained.pkl:md5,65679a3b61c562ff4b84bdb574bb6d93", + "deepbgc.pkl:md5,7e9218be79ba45bc9adb23bed3845dc1" + ] + ], + [ + "Pfam-A.31.0.clans.tsv:md5,a0a4590ffb2b33b83ef2b28f6ead886b", + "Pfam-A.31.0.hmm:md5,79a3328e4c95b13949a4489b19959fc5", + "Pfam-A.31.0.hmm.h3f:md5,cbca323cf8dd4e5e7c109114ec444162", + "Pfam-A.31.0.hmm.h3i:md5,5242332a3f6a60cd1ab634cd9331afd6", + "Pfam-A.31.0.hmm.h3m:md5,1fe946fa2b3bcde1d4b2bad732bce612", + "Pfam-A.31.0.hmm.h3p:md5,27b98a1ded123b6a1ef72db01927017c" + ] + ] + ], + "1": [ + "versions.yml:md5,4130f2ce0a4d43fc3d8e04f4935f908b" + ], + "db": [ + [ + [ + [ + "product_activity.pkl:md5,90f0c010460e9df882cb057664a49f30", + "product_class.pkl:md5,f78a2eda240403d2f40643d42202f3ac" + ], + [ + "clusterfinder_geneborder.pkl:md5,ca4be7031ae9f70780f17c616a4fa5b5", + "clusterfinder_original.pkl:md5,2ca2429bb9bc99a401d1093c376b37aa", + "clusterfinder_retrained.pkl:md5,65679a3b61c562ff4b84bdb574bb6d93", + "deepbgc.pkl:md5,7e9218be79ba45bc9adb23bed3845dc1" + ] + ], + [ + "Pfam-A.31.0.clans.tsv:md5,a0a4590ffb2b33b83ef2b28f6ead886b", + "Pfam-A.31.0.hmm:md5,79a3328e4c95b13949a4489b19959fc5", + "Pfam-A.31.0.hmm.h3f:md5,cbca323cf8dd4e5e7c109114ec444162", + "Pfam-A.31.0.hmm.h3i:md5,5242332a3f6a60cd1ab634cd9331afd6", + "Pfam-A.31.0.hmm.h3m:md5,1fe946fa2b3bcde1d4b2bad732bce612", + "Pfam-A.31.0.hmm.h3p:md5,27b98a1ded123b6a1ef72db01927017c" + ] + ] + ], + "versions": [ + "versions.yml:md5,4130f2ce0a4d43fc3d8e04f4935f908b" + ] + } + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-16T13:15:16.323023111" + } +} \ No newline at end of file diff --git a/modules/nf-core/deepbgc/download/tests/tags.yml b/modules/nf-core/deepbgc/download/tests/tags.yml new file mode 100644 index 00000000..6f1c7569 --- /dev/null +++ b/modules/nf-core/deepbgc/download/tests/tags.yml @@ -0,0 +1,2 @@ +deepbgc/download: + - "modules/nf-core/deepbgc/download/**" diff --git a/modules/nf-core/deepbgc/pipeline/environment.yml b/modules/nf-core/deepbgc/pipeline/environment.yml index fe0087a2..999c6864 100644 --- a/modules/nf-core/deepbgc/pipeline/environment.yml +++ b/modules/nf-core/deepbgc/pipeline/environment.yml @@ -1,7 +1,7 @@ -name: deepbgc_pipeline +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::deepbgc=0.1.31 diff --git a/modules/nf-core/deepbgc/pipeline/main.nf b/modules/nf-core/deepbgc/pipeline/main.nf index fc72d238..1f93dff1 100644 --- a/modules/nf-core/deepbgc/pipeline/main.nf +++ b/modules/nf-core/deepbgc/pipeline/main.nf @@ -1,29 +1,29 @@ process DEEPBGC_PIPELINE { - tag "$meta.id" + tag "${meta.id}" label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/deepbgc:0.1.31--pyhca03a8a_0': - 'biocontainers/deepbgc:0.1.31--pyhca03a8a_0' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 
'https://depot.galaxyproject.org/singularity/deepbgc:0.1.31--pyhca03a8a_0' + : 'biocontainers/deepbgc:0.1.31--pyhca03a8a_0'}" input: tuple val(meta), path(genome) - path(db) + path db output: - tuple val(meta), path("${prefix}/README.txt") , optional: true, emit: readme - tuple val(meta), path("${prefix}/LOG.txt") , emit: log - tuple val(meta), path("${prefix}/${prefix}.antismash.json") , optional: true, emit: json - tuple val(meta), path("${prefix}/${prefix}.bgc.gbk") , optional: true, emit: bgc_gbk - tuple val(meta), path("${prefix}/${prefix}.bgc.tsv") , optional: true, emit: bgc_tsv - tuple val(meta), path("${prefix}/${prefix}.full.gbk") , optional: true, emit: full_gbk - tuple val(meta), path("${prefix}/${prefix}.pfam.tsv") , optional: true, emit: pfam_tsv - tuple val(meta), path("${prefix}/evaluation/${prefix}.bgc.png") , optional: true, emit: bgc_png - tuple val(meta), path("${prefix}/evaluation/${prefix}.pr.png") , optional: true, emit: pr_png - tuple val(meta), path("${prefix}/evaluation/${prefix}.roc.png") , optional: true, emit: roc_png - tuple val(meta), path("${prefix}/evaluation/${prefix}.score.png"), optional: true, emit: score_png - path "versions.yml" , emit: versions + tuple val(meta), path("${prefix}/README.txt"), optional: true, emit: readme + tuple val(meta), path("${prefix}/LOG.txt"), emit: log + tuple val(meta), path("${prefix}/${prefix}.antismash.json"), optional: true, emit: json + tuple val(meta), path("${prefix}/${prefix}.bgc.gbk"), optional: true, emit: bgc_gbk + tuple val(meta), path("${prefix}/${prefix}.bgc.tsv"), optional: true, emit: bgc_tsv + tuple val(meta), path("${prefix}/${prefix}.full.gbk"), optional: true, emit: full_gbk + tuple val(meta), path("${prefix}/${prefix}.pfam.tsv"), optional: true, emit: pfam_tsv + tuple val(meta), path("${prefix}/evaluation/${prefix}.bgc.png"), optional: true, emit: bgc_png + tuple val(meta), path("${prefix}/evaluation/${prefix}.pr.png"), optional: true, emit: pr_png + tuple val(meta), path("${prefix}/evaluation/${prefix}.roc.png"), optional: true, emit: roc_png + tuple val(meta), path("${prefix}/evaluation/${prefix}.score.png"), optional: true, emit: score_png + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -36,8 +36,8 @@ process DEEPBGC_PIPELINE { deepbgc \\ pipeline \\ - $args \\ - $genome + ${args} \\ + ${genome} if [[ "${genome.baseName}/" != "${prefix}/" ]]; then mv "${genome.baseName}/" "${prefix}/" @@ -55,7 +55,6 @@ process DEEPBGC_PIPELINE { """ stub: - def args = task.ext.args ?: '' prefix = task.ext.prefix ?: "${meta.id}" """ mkdir -p ${prefix}/evaluation diff --git a/modules/nf-core/deepbgc/pipeline/meta.yml b/modules/nf-core/deepbgc/pipeline/meta.yml index 5f939eaa..186c7d30 100644 --- a/modules/nf-core/deepbgc/pipeline/meta.yml +++ b/modules/nf-core/deepbgc/pipeline/meta.yml @@ -17,73 +17,138 @@ tools: tool_dev_url: "https://github.com/Merck/deepbgc" doi: "10.1093/nar/gkz654" licence: ["MIT"] + identifier: biotools:DeepBGC input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - genome: - type: file - description: FASTA/GenBank/Pfam CSV file - pattern: "*.{fasta,fa,fna,gbk,csv}" - - db: - type: directory - description: Database path + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test' ] + - genome: + type: file + description: FASTA/GenBank/Pfam CSV file + pattern: "*.{fasta,fa,fna,gbk,csv}" + - - db: + type: directory + description: Database path output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - readme: - type: file - description: txt file containing description of output files - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/README.txt: + type: file + description: txt file containing description of output files + pattern: "*.{txt}" - log: - type: file - description: Log output of DeepBGC - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/LOG.txt: + type: file + description: Log output of DeepBGC + pattern: "*.{txt}" - json: - type: file - description: AntiSMASH JSON file for sideloading - pattern: "*.{json}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/${prefix}.antismash.json: + type: file + description: AntiSMASH JSON file for sideloading + pattern: "*.{json}" - bgc_gbk: - type: file - description: Sequences and features of all detected BGCs in GenBank format - pattern: "*.{bgc.gbk}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/${prefix}.bgc.gbk: + type: file + description: Sequences and features of all detected BGCs in GenBank format + pattern: "*.{bgc.gbk}" - bgc_tsv: - type: file - description: Table of detected BGCs and their properties - pattern: "*.{bgc.tsv}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/${prefix}.bgc.tsv: + type: file + description: Table of detected BGCs and their properties + pattern: "*.{bgc.tsv}" - full_gbk: - type: file - description: Fully annotated input sequence with proteins, Pfam domains (PFAM_domain features) and BGCs (cluster features) - pattern: "*.{full.gbk}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/${prefix}.full.gbk: + type: file + description: Fully annotated input sequence with proteins, Pfam domains (PFAM_domain + features) and BGCs (cluster features) + pattern: "*.{full.gbk}" - pfam_tsv: - type: file - description: Table of Pfam domains (pfam_id) from given sequence (sequence_id) in genomic order, with BGC detection scores - pattern: "*.{pfam.tsv}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/${prefix}.pfam.tsv: + type: file + description: Table of Pfam domains (pfam_id) from given sequence (sequence_id) + in genomic order, with BGC detection scores + pattern: "*.{pfam.tsv}" - bgc_png: - type: file - description: Detected BGCs plotted by their nucleotide coordinates - pattern: "*.{bgc.png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test'] + - ${prefix}/evaluation/${prefix}.bgc.png: + type: file + description: Detected BGCs plotted by their nucleotide coordinates + pattern: "*.{bgc.png}" - pr_png: - type: file - description: Precision-Recall curve based on predicted per-Pfam BGC scores - pattern: "*.{pr.png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/evaluation/${prefix}.pr.png: + type: file + description: Precision-Recall curve based on predicted per-Pfam BGC scores + pattern: "*.{pr.png}" - roc_png: - type: file - description: ROC curve based on predicted per-Pfam BGC scores - pattern: "*.{roc.png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/evaluation/${prefix}.roc.png: + type: file + description: ROC curve based on predicted per-Pfam BGC scores + pattern: "*.{roc.png}" - score_png: - type: file - description: BGC detection scores of each Pfam domain in genomic order - pattern: "*.{score.png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - ${prefix}/evaluation/${prefix}.score.png: + type: file + description: BGC detection scores of each Pfam domain in genomic order + pattern: "*.{score.png}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" - "@jfy133" diff --git a/modules/nf-core/deepbgc/pipeline/tests/main.nf.test b/modules/nf-core/deepbgc/pipeline/tests/main.nf.test index 9dd24049..7e284269 100644 --- a/modules/nf-core/deepbgc/pipeline/tests/main.nf.test +++ b/modules/nf-core/deepbgc/pipeline/tests/main.nf.test @@ -12,42 +12,99 @@ nextflow_process { tag "gunzip" tag "prodigal" - setup { - run("DEEPBGC_DOWNLOAD") { - script "../..//download/main.nf" - process { - """ - """ + test("deepbgc pipeline gbk - bacteroides fragilis - test1_contigs.fa.gz") { + + setup { + run("DEEPBGC_DOWNLOAD") { + script "../..//download/main.nf" + process { + """ + """ + } } - } - run("GUNZIP") { - script "../../../gunzip/main.nf" - process { - """ - input[0] = Channel.fromList([ - tuple([ id:'test_gbk', single_end:false ], // meta map - file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)) - ]) - """ + run("GUNZIP") { + script "../../../gunzip/main.nf" + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test_gbk', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)) + ]) + """ + } + } + run("PRODIGAL") { + script "../../../prodigal/main.nf" + process { + """ + input[0] = GUNZIP.out.gunzip + input[1] = 'gbk' + """ + } } } - run("PRODIGAL") { - script "../../../prodigal/main.nf" + + when { process { """ - input[0] = GUNZIP.out.gunzip - input[1] = 'gbk' + input [0] = PRODIGAL.out.gene_annotations + input [1] = DEEPBGC_DOWNLOAD.out.db """ } } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + file(process.out.bgc_gbk[0][1]).name, + file(process.out.full_gbk[0][1]).name, + file(process.out.log[0][1]).name, + file(process.out.readme[0][1]).name, + process.out.json, + process.out.versions).match() + } + ) + } + } - test("deepbgc pipeline gbk - bacteroides fragilis - test1_contigs.fa.gz") { + test("deepbgc pipeline fa - bacteroides fragilis - test1_contigs.fa.gz") { + + setup { + 
run("DEEPBGC_DOWNLOAD") { + script "../..//download/main.nf" + process { + """ + """ + } + } + run("GUNZIP") { + script "../../../gunzip/main.nf" + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test_gbk', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)) + ]) + """ + } + } + run("PRODIGAL") { + script "../../../prodigal/main.nf" + process { + """ + input[0] = GUNZIP.out.gunzip + input[1] = 'gbk' + """ + } + } + } when { process { """ - input [0] = PRODIGAL.out.gene_annotations + input [0] = GUNZIP.out.gunzip input [1] = DEEPBGC_DOWNLOAD.out.db """ } @@ -56,22 +113,59 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.versions).match("gbk_versions") }, - { assert snapshot(process.out.json).match("gbk_json") }, - { assert path(process.out.log.get(0).get(1)).exists() }, - { assert path(process.out.bgc_gbk.get(0).get(1)).exists() }, - { assert path(process.out.full_gbk.get(0).get(1)).exists() } + { assert snapshot( + file(process.out.bgc_tsv[0][1]).name, + file(process.out.full_gbk[0][1]).name, + file(process.out.json[0][1]).name, + file(process.out.log[0][1]).name, + process.out.bgc_gbk, + process.out.bgc_png, + process.out.pfam_tsv, + process.out.score_png, + process.out.versions).match() + } ) } - } - test("deepbgc pipeline fa - bacteroides fragilis - test1_contigs.fa.gz") { + test("deepbgc pipeline gbk - bacteroides fragilis - test1_contigs.fa.gz - stub") { + + options "-stub" + + setup { + run("DEEPBGC_DOWNLOAD") { + script "../..//download/main.nf" + process { + """ + """ + } + } + run("GUNZIP") { + script "../../../gunzip/main.nf" + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test_gbk', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)) + ]) + """ + } + } + run("PRODIGAL") { + script "../../../prodigal/main.nf" + process { + """ + input[0] = GUNZIP.out.gunzip + input[1] = 'gbk' + """ + } + } + } when { process { """ - input [0] = GUNZIP.out.gunzip + input [0] = PRODIGAL.out.gene_annotations input [1] = DEEPBGC_DOWNLOAD.out.db """ } @@ -80,21 +174,45 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.versions).match("fa_versions") }, - { assert snapshot(process.out.bgc_gbk).match("fa_bgc_gbk") }, - { assert snapshot(process.out.bgc_png).match("fa_bgc_png") }, - { assert snapshot(process.out.score_png).match("fa_score_png") }, - { assert snapshot(process.out.pfam_tsv).match("fa_pfam_tsv") }, - { assert path(process.out.json.get(0).get(1)).exists() }, - { assert path(process.out.log.get(0).get(1)).exists() }, - { assert path(process.out.bgc_tsv.get(0).get(1)).exists() }, - { assert path(process.out.full_gbk.get(0).get(1)).exists() } + { assert snapshot(process.out).match() } ) } } test("deepbgc pipeline fa - bacteroides fragilis - test1_contigs.fa.gz - stub") { + options "-stub" + + setup { + run("DEEPBGC_DOWNLOAD") { + script "../..//download/main.nf" + process { + """ + """ + } + } + run("GUNZIP") { + script "../../../gunzip/main.nf" + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test_gbk', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true)) + ]) + """ + } + } + 
run("PRODIGAL") { + script "../../../prodigal/main.nf" + process { + """ + input[0] = GUNZIP.out.gunzip + input[1] = 'gbk' + """ + } + } + } + when { process { """ @@ -111,6 +229,5 @@ nextflow_process { ) } } - } diff --git a/modules/nf-core/deepbgc/pipeline/tests/main.nf.test.snap b/modules/nf-core/deepbgc/pipeline/tests/main.nf.test.snap index ef64db97..751b9228 100644 --- a/modules/nf-core/deepbgc/pipeline/tests/main.nf.test.snap +++ b/modules/nf-core/deepbgc/pipeline/tests/main.nf.test.snap @@ -1,33 +1,218 @@ { - "gbk_versions": { + "deepbgc pipeline gbk - bacteroides fragilis - test1_contigs.fa.gz - stub": { "content": [ - [ - "versions.yml:md5,988a1db70bd9e95ad22c25b4d6d40e6e" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.1" - }, - "timestamp": "2023-12-01T18:29:41.728695197" - }, - "fa_bgc_png": { - "content": [ - [ - [ - { - "id": "test_gbk", - "single_end": false - }, - "test_gbk.bgc.png:md5,f4a0fc6cd260e2d7ad16f7a1fa103f96" + { + "0": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "README.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "LOG.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.score.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "11": [ + "versions.yml:md5,988a1db70bd9e95ad22c25b4d6d40e6e" + ], + "2": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.antismash.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.full.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.pfam.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.pr.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "9": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.roc.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bgc_gbk": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bgc_png": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bgc_tsv": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "full_gbk": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.full.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.antismash.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "LOG.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "pfam_tsv": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.pfam.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "pr_png": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + 
"test_gbk.pr.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "readme": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "README.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "roc_png": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.roc.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "score_png": [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.score.png:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,988a1db70bd9e95ad22c25b4d6d40e6e" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-03T16:29:32.389704368" + "timestamp": "2024-12-16T14:41:33.07308826" }, "deepbgc pipeline fa - bacteroides fragilis - test1_contigs.fa.gz - stub": { "content": [ @@ -239,31 +424,35 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-03T16:32:11.354631831" + "timestamp": "2024-12-16T14:11:05.81892322" }, - "fa_score_png": { + "deepbgc pipeline fa - bacteroides fragilis - test1_contigs.fa.gz": { "content": [ + "test_gbk.bgc.tsv", + "test_gbk.full.gbk", + "test_gbk.antismash.json", + "LOG.txt", [ [ { "id": "test_gbk", "single_end": false }, - "test_gbk.score.png:md5,572e8882031f667580d8c8e13c2cbb91" + "test_gbk.bgc.gbk:md5,7fc70dd034903622dae273bf71b402f2" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" - }, - "timestamp": "2024-06-03T16:29:32.401051746" - }, - "fa_pfam_tsv": { - "content": [ + ], + [ + [ + { + "id": "test_gbk", + "single_end": false + }, + "test_gbk.bgc.png:md5,f4a0fc6cd260e2d7ad16f7a1fa103f96" + ] + ], [ [ { @@ -272,60 +461,49 @@ }, "test_gbk.pfam.tsv:md5,1179eb4e6df0c83aaeec18d7d34e7524" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" - }, - "timestamp": "2024-06-03T16:29:32.411632144" - }, - "gbk_json": { - "content": [ + ], [ [ { "id": "test_gbk", "single_end": false }, - "test_gbk.antismash.json:md5,889ac1efb6a9a7d7b8c65e4cd2233bba" + "test_gbk.score.png:md5,572e8882031f667580d8c8e13c2cbb91" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" - }, - "timestamp": "2024-06-03T16:25:25.861672633" - }, - "fa_versions": { - "content": [ + ], [ "versions.yml:md5,988a1db70bd9e95ad22c25b4d6d40e6e" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2023-12-01T18:44:16.352023677" + "timestamp": "2024-12-16T14:23:52.269487956" }, - "fa_bgc_gbk": { + "deepbgc pipeline gbk - bacteroides fragilis - test1_contigs.fa.gz": { "content": [ + "test_gbk.bgc.gbk", + "test_gbk.full.gbk", + "LOG.txt", + "README.txt", [ [ { "id": "test_gbk", "single_end": false }, - "test_gbk.bgc.gbk:md5,7fc70dd034903622dae273bf71b402f2" + "test_gbk.antismash.json:md5,889ac1efb6a9a7d7b8c65e4cd2233bba" ] + ], + [ + "versions.yml:md5,988a1db70bd9e95ad22c25b4d6d40e6e" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-03T16:29:32.383560585" + "timestamp": "2024-12-16T14:17:20.991066496" } } \ No newline at end of file diff --git a/modules/nf-core/fargene/environment.yml b/modules/nf-core/fargene/environment.yml index 56629ff4..197b2b32 100644 --- a/modules/nf-core/fargene/environment.yml +++ b/modules/nf-core/fargene/environment.yml @@ -1,7 +1,7 @@ -name: fargene +--- +# yaml-language-server: 
$schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::fargene=0.1 diff --git a/modules/nf-core/fargene/meta.yml b/modules/nf-core/fargene/meta.yml index 9fc5ce0f..e1bcc5ea 100644 --- a/modules/nf-core/fargene/meta.yml +++ b/modules/nf-core/fargene/meta.yml @@ -1,5 +1,6 @@ name: fargene -description: tool that takes either fragmented metagenomic data or longer sequences as input and predicts and delivers full-length antiobiotic resistance genes as output. +description: tool that takes either fragmented metagenomic data or longer sequences + as input and predicts and delivers full-length antiobiotic resistance genes as output. keywords: - antibiotic resistance genes - ARGs @@ -8,94 +9,192 @@ keywords: - contigs tools: - fargene: - description: Fragmented Antibiotic Resistance Gene Identifier takes either fragmented metagenomic data or longer sequences as input and predicts and delivers full-length antiobiotic resistance genes as output + description: Fragmented Antibiotic Resistance Gene Identifier takes either fragmented + metagenomic data or longer sequences as input and predicts and delivers full-length + antiobiotic resistance genes as output homepage: https://github.com/fannyhb/fargene documentation: https://github.com/fannyhb/fargene tool_dev_url: https://github.com/fannyhb/fargene licence: ["MIT"] + identifier: biotools:fargene input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: uncompressed fasta file or paired-end fastq files containing either genomes or longer contigs as nucleotide or protein sequences (fasta) or fragmented metagenomic reads (fastq) - pattern: "*.{fasta}" - - hmm_model: - type: string - description: name of custom hidden markov model to be used [pre-defined class_a, class_b_1_2, class_b_3, class_c, class_d_1, class_d_2, qnr, tet_efflux, tet_rpg, tet_enzyme] + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: uncompressed fasta file or paired-end fastq files containing either + genomes or longer contigs as nucleotide or protein sequences (fasta) or fragmented + metagenomic reads (fastq) + pattern: "*.{fasta}" + - - hmm_model: + type: string + description: name of custom hidden markov model to be used [pre-defined class_a, + class_b_1_2, class_b_3, class_c, class_d_1, class_d_2, qnr, tet_efflux, tet_rpg, + tet_enzyme] output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - log: - type: file - description: log file - pattern: "*.{log}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.log": + type: file + description: log file + pattern: "*.{log}" - txt: - type: file - description: analysis summary text file - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/results_summary.txt: + type: file + description: analysis summary text file + pattern: "*.{txt}" - hmm: - type: file - description: output from hmmsearch (both single gene annotations + contigs) - pattern: "*.{out}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/hmmsearchresults/*.out: + type: file + description: output from hmmsearch (both single gene annotations + contigs) + pattern: "*.{out}" - hmm_genes: - type: file - description: output from hmmsearch (single gene annotations only) - pattern: "retrieved-*.{out}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/hmmsearchresults/retrieved-*.out: + type: file + description: output from hmmsearch (single gene annotations only) + pattern: "retrieved-*.{out}" - orfs: - type: file - description: open reading frames (ORFs) - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/predictedGenes/predicted-orfs.fasta: + type: file + description: open reading frames (ORFs) + pattern: "*.{fasta}" - orfs_amino: - type: file - description: protein translation of open reading frames (ORFs) - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/predictedGenes/predicted-orfs-amino.fasta: + type: file + description: protein translation of open reading frames (ORFs) + pattern: "*.{fasta}" - contigs: - type: file - description: (complete) contigs that passed the final full-length classification - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/predictedGenes/retrieved-contigs.fasta: + type: file + description: (complete) contigs that passed the final full-length classification + pattern: "*.{fasta}" - contigs_pept: - type: file - description: parts of the contigs that passed the final classification step that aligned with the HMM, as amino acid sequences - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/predictedGenes/retrieved-contigs-peptides.fasta: + type: file + description: parts of the contigs that passed the final classification step + that aligned with the HMM, as amino acid sequences + pattern: "*.{fasta}" - filtered: - type: file - description: sequences that passed the final classification step, but only the parts that where predicted by the HMM to be part of the gene - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/predictedGenes/*filtered.fasta: + type: file + description: sequences that passed the final classification step, but only the + parts that where predicted by the HMM to be part of the gene + pattern: "*.{fasta}" - filtered_pept: - type: file - description: sequences from filtered.fasta, translated in the same frame as the gene is predicted to be located - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/predictedGenes/*filtered-peptides.fasta: + type: file + description: sequences from filtered.fasta, translated in the same frame as + the gene is predicted to be located + pattern: "*.{fasta}" - fragments: - type: file - description: All quality controlled retrieved fragments that were classified as positive, together with its read-pair, gathered in two files - pattern: "*.{fastq}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/retrievedFragments/all_retrieved_*.fastq: + type: file + description: All quality controlled retrieved fragments that were classified + as positive, together with its read-pair, gathered in two files + pattern: "*.{fastq}" - trimmed: - type: file - description: The quality controlled retrieved fragments from each input file. - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/retrievedFragments/trimmedReads/*.fasta: + type: file + description: The quality controlled retrieved fragments from each input file. + pattern: "*.{fasta}" - spades: - type: directory - description: The output from the SPAdes assembly - pattern: "spades_assembly" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/spades_assembly/*: + type: directory + description: The output from the SPAdes assembly + pattern: "spades_assembly" - metagenome: - type: file - description: The FASTQ to FASTA converted input files from metagenomic reads. - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/tmpdir/*.fasta: + type: file + description: The FASTQ to FASTA converted input files from metagenomic reads. + pattern: "*.{fasta}" - tmp: - type: file - description: The from FASTQ to FASTA converted input files and their translated input sequences. Are only saved if option --store-peptides is used. - pattern: "*.{fasta}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/tmpdir/*.out: + type: file + description: The from FASTQ to FASTA converted input files and their translated + input sequences. Are only saved if option --store-peptides is used. + pattern: "*.{fasta}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/gecco/run/environment.yml b/modules/nf-core/gecco/run/environment.yml index 9d7cde8d..bb47bc85 100644 --- a/modules/nf-core/gecco/run/environment.yml +++ b/modules/nf-core/gecco/run/environment.yml @@ -1,7 +1,7 @@ -name: gecco_run +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gecco=0.9.10 diff --git a/modules/nf-core/gecco/run/meta.yml b/modules/nf-core/gecco/run/meta.yml index a2f4a726..6a557cea 100644 --- a/modules/nf-core/gecco/run/meta.yml +++ b/modules/nf-core/gecco/run/meta.yml @@ -1,5 +1,7 @@ name: "gecco_run" -description: GECCO is a fast and scalable method for identifying putative novel Biosynthetic Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields (CRFs). 
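For orientation, a minimal wiring sketch for the reworked gecco/run module follows; the channel shape and emit names are taken from the meta.yml entries in this hunk, while the include path and input FASTA are illustrative assumptions, not part of this changeset:

    // Sketch only: wires GECCO_RUN as described by the meta.yml in this hunk.
    include { GECCO_RUN } from './modules/nf-core/gecco/run/main'

    workflow {
        // [ meta map, sequence file, optional custom HMM ([] = use the built-in models) ]
        ch_input = Channel.of( [ [ id:'sample1' ], file('contigs.fasta'), [] ] )

        GECCO_RUN( ch_input, [] )   // second input: optional alternative CRF model directory

        GECCO_RUN.out.clusters.view()   // *.clusters.tsv, only produced when BGC hits are found
    }
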
+description: GECCO is a fast and scalable method for identifying putative novel Biosynthetic + Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields + (CRFs). keywords: - bgc - detection @@ -13,53 +15,86 @@ tools: tool_dev_url: "https://github.com/zellerlab/GECCO" doi: "10.1101/2021.05.03.442509" licence: ["GPL v3"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: A genomic file containing one or more sequences as input. Input type is any supported by Biopython (fasta, gbk, etc.) - pattern: "*" - - hmm: - type: file - description: Alternative HMM file(s) to use in HMMER format - pattern: "*.hmm" - - model_dir: - type: directory - description: Path to an alternative CRF (Conditional Random Fields) module to use + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: A genomic file containing one or more sequences as input. Input + type is any supported by Biopython (fasta, gbk, etc.) + pattern: "*" + - hmm: + type: file + description: Alternative HMM file(s) to use in HMMER format + pattern: "*.hmm" + - - model_dir: + type: directory + description: Path to an alternative CRF (Conditional Random Fields) module to + use output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - genes: - type: file - description: TSV file containing detected/predicted genes with BGC probability scores. Will not be generated if no hits are found. - pattern: "*.genes.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.genes.tsv": + type: file + description: TSV file containing detected/predicted genes with BGC probability + scores. Will not be generated if no hits are found. + pattern: "*.genes.tsv" - features: - type: file - description: TSV file containing identified domains - pattern: "*.features.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.features.tsv": + type: file + description: TSV file containing identified domains + pattern: "*.features.tsv" - clusters: - type: file - description: TSV file containing coordinates of predicted clusters and BGC types. Will not be generated if no hits are found. - pattern: "*.clusters.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.clusters.tsv": + type: file + description: TSV file containing coordinates of predicted clusters and BGC types. Will + not be generated if no hits are found. + pattern: "*.clusters.tsv" - gbk: - type: file - description: Per cluster GenBank file (if found) containing sequence with annotations. Will not be generated if no hits are found. - pattern: "*.gbk" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_cluster_*.gbk": + type: file + description: Per cluster GenBank file (if found) containing sequence with annotations. + Will not be generated if no hits are found. 
+ pattern: "*.gbk" - json: - type: file - description: AntiSMASH v6 sideload JSON file (if --antismash-sideload) supplied. Will not be generated if no hits are found. - pattern: "*.gbk" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: AntiSMASH v6 sideload JSON file (if --antismash-sideload) supplied. + Will not be generated if no hits are found. + pattern: "*.gbk" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/gunzip/environment.yml b/modules/nf-core/gunzip/environment.yml index dfc02a7b..9b926b1f 100644 --- a/modules/nf-core/gunzip/environment.yml +++ b/modules/nf-core/gunzip/environment.yml @@ -1,9 +1,12 @@ -name: gunzip +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: + - conda-forge::coreutils=9.5 - conda-forge::grep=3.11 + - conda-forge::gzip=1.13 + - conda-forge::lbzip2=2.5 - conda-forge::sed=4.8 - conda-forge::tar=1.34 diff --git a/modules/nf-core/gunzip/main.nf b/modules/nf-core/gunzip/main.nf index 5e67e3b9..3ffc8e92 100644 --- a/modules/nf-core/gunzip/main.nf +++ b/modules/nf-core/gunzip/main.nf @@ -1,37 +1,37 @@ process GUNZIP { - tag "$archive" + tag "${archive}" label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ubuntu:22.04' : - 'nf-core/ubuntu:22.04' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/52/52ccce28d2ab928ab862e25aae26314d69c8e38bd41ca9431c67ef05221348aa/data' + : 'community.wave.seqera.io/library/coreutils_grep_gzip_lbzip2_pruned:838ba80435a629f8'}" input: tuple val(meta), path(archive) output: - tuple val(meta), path("$gunzip"), emit: gunzip - path "versions.yml" , emit: versions + tuple val(meta), path("${gunzip}"), emit: gunzip + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when script: - def args = task.ext.args ?: '' - def extension = ( archive.toString() - '.gz' ).tokenize('.')[-1] - def name = archive.toString() - '.gz' - ".$extension" - def prefix = task.ext.prefix ?: name - gunzip = prefix + ".$extension" + def args = task.ext.args ?: '' + def extension = (archive.toString() - '.gz').tokenize('.')[-1] + def name = archive.toString() - '.gz' - ".${extension}" + def prefix = task.ext.prefix ?: name + gunzip = prefix + ".${extension}" """ # Not calling gunzip itself because it creates files # with the original group ownership rather than the # default one for that user / the work directory gzip \\ -cd \\ - $args \\ - $archive \\ - > $gunzip + ${args} \\ + ${archive} \\ + > ${gunzip} cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -40,13 +40,13 @@ process GUNZIP { """ stub: - def args = task.ext.args ?: '' - def extension = ( archive.toString() - '.gz' ).tokenize('.')[-1] - def name = archive.toString() - '.gz' - ".$extension" - def prefix = task.ext.prefix ?: name - gunzip = prefix + ".$extension" + def args = task.ext.args ?: '' + def extension = (archive.toString() - '.gz').tokenize('.')[-1] + def name = archive.toString() - '.gz' - ".${extension}" + def prefix = task.ext.prefix ?: name + gunzip = prefix + ".${extension}" """ - touch $gunzip + touch ${gunzip} cat <<-END_VERSIONS > versions.yml "${task.process}": gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//') diff --git a/modules/nf-core/gunzip/meta.yml b/modules/nf-core/gunzip/meta.yml index f32973a0..69d31024 100644 --- a/modules/nf-core/gunzip/meta.yml +++ b/modules/nf-core/gunzip/meta.yml @@ -10,25 +10,32 @@ tools: gzip is a file format and a software application used for file compression and decompression. documentation: https://www.gnu.org/software/gzip/manual/gzip.html licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Optional groovy Map containing meta information - e.g. [ id:'test', single_end:false ] - - archive: - type: file - description: File to be compressed/uncompressed - pattern: "*.*" + - - meta: + type: map + description: | + Optional groovy Map containing meta information + e.g. 
[ id:'test', single_end:false ] + - archive: + type: file + description: File to be compressed/uncompressed + pattern: "*.*" output: - gunzip: - type: file - description: Compressed/uncompressed file - pattern: "*.*" + - meta: + type: file + description: Compressed/uncompressed file + pattern: "*.*" + - ${gunzip}: + type: file + description: Compressed/uncompressed file + pattern: "*.*" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/gunzip/tests/main.nf.test.snap b/modules/nf-core/gunzip/tests/main.nf.test.snap index 069967e7..a0f0e67e 100644 --- a/modules/nf-core/gunzip/tests/main.nf.test.snap +++ b/modules/nf-core/gunzip/tests/main.nf.test.snap @@ -11,7 +11,7 @@ ] ], "1": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ], "gunzip": [ [ @@ -22,15 +22,15 @@ ] ], "versions": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-25T11:35:10.861293" + "timestamp": "2024-12-13T11:48:22.080222697" }, "Should run without failures - stub": { "content": [ @@ -44,7 +44,7 @@ ] ], "1": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ], "gunzip": [ [ @@ -55,15 +55,15 @@ ] ], "versions": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-25T11:35:05.857145" + "timestamp": "2024-12-13T11:48:14.593020264" }, "Should run without failures": { "content": [ @@ -77,7 +77,7 @@ ] ], "1": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ], "gunzip": [ [ @@ -88,15 +88,15 @@ ] ], "versions": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2023-10-17T15:35:37.690477896" + "timestamp": "2024-12-13T11:48:01.295397925" }, "Should run without failures - prefix": { "content": [ @@ -110,7 +110,7 @@ ] ], "1": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ], "gunzip": [ [ @@ -121,14 +121,14 @@ ] ], "versions": [ - "versions.yml:md5,54376d32aca20e937a4ec26dac228e84" + "versions.yml:md5,d327e4a19a6d5c5e974136cef8999d8c" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-06-25T11:33:32.921739" + "timestamp": "2024-12-13T11:48:07.414271387" } } \ No newline at end of file diff --git a/modules/nf-core/hamronization/abricate/environment.yml b/modules/nf-core/hamronization/abricate/environment.yml index 75f349f1..5826a865 100644 --- a/modules/nf-core/hamronization/abricate/environment.yml +++ b/modules/nf-core/hamronization/abricate/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_abricate +--- +# yaml-language-server: 
$schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/abricate/meta.yml b/modules/nf-core/hamronization/abricate/meta.yml index 4a0867d6..b1346892 100644 --- a/modules/nf-core/hamronization/abricate/meta.yml +++ b/modules/nf-core/hamronization/abricate/meta.yml @@ -1,5 +1,6 @@ name: "hamronization_abricate" -description: Tool to convert and summarize ABRicate outputs using the hAMRonization specification +description: Tool to convert and summarize ABRicate outputs using the hAMRonization + specification keywords: - amr - antimicrobial resistance @@ -7,51 +8,61 @@ keywords: - abricate tools: - "hamronization": - description: "Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification" + description: "Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification" homepage: "https://github.com/pha4ge/hAMRonization/" documentation: "https://github.com/pha4ge/hAMRonization/" tool_dev_url: "https://github.com/pha4ge/hAMRonization" licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - report: - type: file - description: Output TSV or CSV file from ABRicate - pattern: "*.{csv,tsv}" - - format: - type: string - description: Type of report file to be produced - pattern: "tsv|json" - - software_version: - type: string - description: Version of ABRicate used - pattern: "[0-9].[0-9].[0-9]" - - reference_db_version: - type: string - description: Database version of ABRicate used - pattern: "[0-9][0-9][0-9][0-9]-[A-Z][a-z][a-z]-[0-9][0-9]" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - report: + type: file + description: Output TSV or CSV file from ABRicate + pattern: "*.{csv,tsv}" + - - format: + type: string + description: Type of report file to be produced + pattern: "tsv|json" + - - software_version: + type: string + description: Version of ABRicate used + pattern: "[0-9].[0-9].[0-9]" + - - reference_db_version: + type: string + description: Database version of ABRicate used + pattern: "[0-9][0-9][0-9][0-9]-[A-Z][a-z][a-z]-[0-9][0-9]" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised report in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: hAMRonised report in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised report in TSV format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.tsv": + type: file + description: hAMRonised report in TSV format + pattern: "*.json" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jasmezz" maintainers: diff --git a/modules/nf-core/hamronization/amrfinderplus/environment.yml b/modules/nf-core/hamronization/amrfinderplus/environment.yml index 2f9cb27f..5826a865 100644 --- a/modules/nf-core/hamronization/amrfinderplus/environment.yml +++ b/modules/nf-core/hamronization/amrfinderplus/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_amrfinderplus +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/amrfinderplus/meta.yml b/modules/nf-core/hamronization/amrfinderplus/meta.yml index c0997150..aba55b1f 100644 --- a/modules/nf-core/hamronization/amrfinderplus/meta.yml +++ b/modules/nf-core/hamronization/amrfinderplus/meta.yml @@ -1,5 +1,6 @@ name: "hamronization_amrfinderplus" -description: Tool to convert and summarize AMRfinderPlus outputs using the hAMRonization specification. +description: Tool to convert and summarize AMRfinderPlus outputs using the hAMRonization + specification. keywords: - amr - antimicrobial resistance @@ -9,51 +10,61 @@ keywords: - amrfinderplus tools: - "hamronization": - description: "Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification" + description: "Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification" homepage: "https://github.com/pha4ge/hAMRonization/" documentation: "https://github.com/pha4ge/hAMRonization/" tool_dev_url: "https://github.com/pha4ge/hAMRonization" licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - report: - type: file - description: Output .tsv file from AMRfinderPlus - pattern: "*.tsv" - - format: - type: string - description: Type of report file to be produced - pattern: "tsv|json" - - software_version: - type: string - description: Version of AMRfinder used - pattern: "[0-9].[0-9].[0-9]" - - reference_db_version: - type: string - description: Database version of ncbi_AMRfinder used - pattern: "[0-9]-[0-9]-[0-9].[0-9]" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - report: + type: file + description: Output .tsv file from AMRfinderPlus + pattern: "*.tsv" + - - format: + type: string + description: Type of report file to be produced + pattern: "tsv|json" + - - software_version: + type: string + description: Version of AMRfinder used + pattern: "[0-9].[0-9].[0-9]" + - - reference_db_version: + type: string + description: Database version of ncbi_AMRfinder used + pattern: "[0-9]-[0-9]-[0-9].[0-9]" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised report in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.json": + type: file + description: hAMRonised report in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised report in TSV format - pattern: "*.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: hAMRonised report in TSV format + pattern: "*.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/hamronization/deeparg/environment.yml b/modules/nf-core/hamronization/deeparg/environment.yml index c9db54c6..5826a865 100644 --- a/modules/nf-core/hamronization/deeparg/environment.yml +++ b/modules/nf-core/hamronization/deeparg/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_deeparg +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/deeparg/meta.yml b/modules/nf-core/hamronization/deeparg/meta.yml index de01196e..39149a34 100644 --- a/modules/nf-core/hamronization/deeparg/meta.yml +++ b/modules/nf-core/hamronization/deeparg/meta.yml @@ -1,5 +1,6 @@ name: hamronization_deeparg -description: Tool to convert and summarize DeepARG outputs using the hAMRonization specification +description: Tool to convert and summarize DeepARG outputs using the hAMRonization + specification keywords: - amr - antimicrobial resistance @@ -7,51 +8,61 @@ keywords: - deeparg tools: - hamronization: - description: Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification + description: Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification homepage: https://github.com/pha4ge/hAMRonization/ documentation: https://github.com/pha4ge/hAMRonization/ tool_dev_url: https://github.com/pha4ge/hAMRonization licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - report: - type: file - description: Output .mapping.ARG file from DeepARG - pattern: "*.mapping.ARG" - - format: - type: string - description: Type of report file to be produced - pattern: "tsv|json" - - software_version: - type: string - description: Version of DeepARG used - pattern: "[0-9].[0-9].[0-9]" - - reference_db_version: - type: integer - description: Database version of DeepARG used - pattern: "[0-9]" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - report: + type: file + description: Output .mapping.ARG file from DeepARG + pattern: "*.mapping.ARG" + - - format: + type: string + description: Type of report file to be produced + pattern: "tsv|json" + - - software_version: + type: string + description: Version of DeepARG used + pattern: "[0-9].[0-9].[0-9]" + - - reference_db_version: + type: integer + description: Database version of DeepARG used + pattern: "[0-9]" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised report in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: hAMRonised report in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised report in TSV format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: hAMRonised report in TSV format + pattern: "*.json" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/hamronization/fargene/environment.yml b/modules/nf-core/hamronization/fargene/environment.yml index 6507e7d4..5826a865 100644 --- a/modules/nf-core/hamronization/fargene/environment.yml +++ b/modules/nf-core/hamronization/fargene/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_fargene +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/fargene/meta.yml b/modules/nf-core/hamronization/fargene/meta.yml index 45a3811d..efd3de36 100644 --- a/modules/nf-core/hamronization/fargene/meta.yml +++ b/modules/nf-core/hamronization/fargene/meta.yml @@ -1,5 +1,6 @@ name: "hamronization_fargene" -description: Tool to convert and summarize fARGene outputs using the hAMRonization specification +description: Tool to convert and summarize fARGene outputs using the hAMRonization + specification keywords: - amr - antimicrobial resistance @@ -9,51 +10,61 @@ keywords: - fARGene tools: - hamronization: - description: "Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification" + description: "Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification" homepage: "https://github.com/pha4ge/hAMRonization/" documentation: "https://github.com/pha4ge/hAMRonization/" tool_dev_url: "https://github.com/pha4ge/hAMRonization" licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - report: - type: file - description: Output .txt file from fARGene - pattern: "*.txt" - - format: - type: string - description: Type of report file to be produced - pattern: "tsv|json" - - software_version: - type: string - description: Version of fARGene used - pattern: "[0-9].[0-9].[0-9]" - - reference_db_version: - type: string - description: Database version of fARGene used - pattern: "[0-9].[0-9].[0-9]" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - report: + type: file + description: Output .txt file from fARGene + pattern: "*.txt" + - - format: + type: string + description: Type of report file to be produced + pattern: "tsv|json" + - - software_version: + type: string + description: Version of fARGene used + pattern: "[0-9].[0-9].[0-9]" + - - reference_db_version: + type: string + description: Database version of fARGene used + pattern: "[0-9].[0-9].[0-9]" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised report in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: hAMRonised report in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised report in TSV format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: hAMRonised report in TSV format + pattern: "*.json" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/hamronization/rgi/environment.yml b/modules/nf-core/hamronization/rgi/environment.yml index 91d03e49..5826a865 100644 --- a/modules/nf-core/hamronization/rgi/environment.yml +++ b/modules/nf-core/hamronization/rgi/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_rgi +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/rgi/meta.yml b/modules/nf-core/hamronization/rgi/meta.yml index 0cca8502..525148e5 100644 --- a/modules/nf-core/hamronization/rgi/meta.yml +++ b/modules/nf-core/hamronization/rgi/meta.yml @@ -9,51 +9,61 @@ keywords: - rgi tools: - hamronization: - description: "Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification" + description: "Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification" homepage: "https://github.com/pha4ge/hAMRonization/" documentation: "https://github.com/pha4ge/hAMRonization/" tool_dev_url: "https://github.com/pha4ge/hAMRonization" licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - report: - type: file - description: Output .txt file from RGI - pattern: "*.txt" - - format: - type: string - description: Type of report file to be produced - pattern: "tsv|json" - - software_version: - type: string - description: Version of DeepARG used - pattern: "[0-9].[0-9].[0-9]" - - reference_db_version: - type: string - description: Database version of DeepARG used - pattern: "[0-9].[0-9].[0-9]" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - report: + type: file + description: Output .txt file from RGI + pattern: "*.txt" + - - format: + type: string + description: Type of report file to be produced + pattern: "tsv|json" + - - software_version: + type: string + description: Version of DeepARG used + pattern: "[0-9].[0-9].[0-9]" + - - reference_db_version: + type: string + description: Database version of DeepARG used + pattern: "[0-9].[0-9].[0-9]" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised report in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: hAMRonised report in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised report in TSV format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tsv": + type: file + description: hAMRonised report in TSV format + pattern: "*.json" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/hamronization/summarize/environment.yml b/modules/nf-core/hamronization/summarize/environment.yml index 1872a689..5826a865 100644 --- a/modules/nf-core/hamronization/summarize/environment.yml +++ b/modules/nf-core/hamronization/summarize/environment.yml @@ -1,7 +1,7 @@ -name: hamronization_summarize +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hamronization=1.1.4 diff --git a/modules/nf-core/hamronization/summarize/meta.yml b/modules/nf-core/hamronization/summarize/meta.yml index 7d4c7b68..54ceeff3 100644 --- a/modules/nf-core/hamronization/summarize/meta.yml +++ b/modules/nf-core/hamronization/summarize/meta.yml @@ -1,42 +1,49 @@ name: hamronization_summarize -description: Tool to summarize and combine all hAMRonization reports into a single file +description: Tool to summarize and combine all hAMRonization reports into a single + file keywords: - amr - antimicrobial resistance - reporting tools: - hamronization: - description: Tool to convert and summarize AMR gene detection outputs using the hAMRonization specification + description: Tool to convert and summarize AMR gene detection outputs using the + hAMRonization specification homepage: https://github.com/pha4ge/hAMRonization/ documentation: https://github.com/pha4ge/hAMRonization/ tool_dev_url: https://github.com/pha4ge/hAMRonization licence: ["GNU Lesser General Public v3 (LGPL v3)"] + identifier: biotools:hamronization input: - - reports: - type: file - description: List of multiple hAMRonization reports in either JSON or TSV format - pattern: "*.{json,tsv}" - - format: - type: string - description: Type of final combined report file to be produced - pattern: "tsv|json|interactive" + - - reports: + type: file + description: List of multiple hAMRonization reports in either JSON or TSV format + pattern: "*.{json,tsv}" + - - format: + type: string + description: Type of final combined report file to be produced + pattern: 
"tsv|json|interactive" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: hAMRonised summary in JSON format - pattern: "*.json" + - hamronization_combined_report.json: + type: file + description: hAMRonised summary in JSON format + pattern: "*.json" - tsv: - type: file - description: hAMRonised summary in TSV format - pattern: "*.json" + - hamronization_combined_report.tsv: + type: file + description: hAMRonised summary in TSV format + pattern: "*.json" - html: - type: file - description: hAMRonised summary in HTML format - pattern: "*.html" + - hamronization_combined_report.html: + type: file + description: hAMRonised summary in HTML format + pattern: "*.html" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/hmmer/hmmsearch/environment.yml b/modules/nf-core/hmmer/hmmsearch/environment.yml index d672c2b3..1967d405 100644 --- a/modules/nf-core/hmmer/hmmsearch/environment.yml +++ b/modules/nf-core/hmmer/hmmsearch/environment.yml @@ -1,7 +1,7 @@ -name: hmmer_hmmsearch +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::hmmer=3.4 diff --git a/modules/nf-core/hmmer/hmmsearch/meta.yml b/modules/nf-core/hmmer/hmmsearch/meta.yml index 39893c3b..0e078659 100644 --- a/modules/nf-core/hmmer/hmmsearch/meta.yml +++ b/modules/nf-core/hmmer/hmmsearch/meta.yml @@ -13,55 +13,79 @@ tools: tool_dev_url: https://github.com/EddyRivasLab/hmmer doi: "10.1371/journal.pcbi.1002195" licence: ["BSD"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - hmmfile: - type: file - description: One or more HMM profiles created with hmmbuild - pattern: "*.{hmm,hmm.gz}" - - seqdb: - type: file - description: Database of sequences in FASTA format - pattern: "*.{fasta,fna,faa,fa,fasta.gz,fna.gz,faa.gz,fa.gz}" - - write_align: - type: boolean - description: Flag to save optional alignment output. Specify with 'true' to save. - - write_target: - type: boolean - description: Flag to save optional per target summary. Specify with 'true' to save. - - write_domain: - type: boolean - description: Flag to save optional per domain summary. Specify with 'true' to save. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - hmmfile: + type: file + description: One or more HMM profiles created with hmmbuild + pattern: "*.{hmm,hmm.gz}" + - seqdb: + type: file + description: Database of sequences in FASTA format + pattern: "*.{fasta,fna,faa,fa,fasta.gz,fna.gz,faa.gz,fa.gz}" + - write_align: + type: boolean + description: Flag to save optional alignment output. Specify with 'true' to + save. + - write_target: + type: boolean + description: Flag to save optional per target summary. Specify with 'true' to + save. + - write_domain: + type: boolean + description: Flag to save optional per domain summary. Specify with 'true' to + save. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - output: - type: file - description: Human readable output summarizing hmmsearch results - pattern: "*.{txt.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.txt.gz": + type: file + description: Human readable output summarizing hmmsearch results + pattern: "*.{txt.gz}" - alignments: - type: file - description: Optional multiple sequence alignment (MSA) in Stockholm format - pattern: "*.{sto.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.sto.gz": + type: file + description: Optional multiple sequence alignment (MSA) in Stockholm format + pattern: "*.{sto.gz}" - target_summary: - type: file - description: Optional tabular (space-delimited) summary of per-target output - pattern: "*.{tbl.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbl.gz": + type: file + description: Optional tabular (space-delimited) summary of per-target output + pattern: "*.{tbl.gz}" - domain_summary: - type: file - description: Optional tabular (space-delimited) summary of per-domain output - pattern: "*.{domtbl.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.domtbl.gz": + type: file + description: Optional tabular (space-delimited) summary of per-domain output + pattern: "*.{domtbl.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Midnighter" maintainers: diff --git a/modules/nf-core/interproscan/environment.yml b/modules/nf-core/interproscan/environment.yml new file mode 100644 index 00000000..8e82f003 --- /dev/null +++ b/modules/nf-core/interproscan/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::interproscan=5.59_91.0 diff --git a/modules/nf-core/interproscan/main.nf b/modules/nf-core/interproscan/main.nf new file mode 100644 index 00000000..add9b031 --- /dev/null +++ b/modules/nf-core/interproscan/main.nf @@ -0,0 +1,66 @@ +process INTERPROSCAN { + tag "$meta.id" + label 'process_medium' + label 'process_long' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/interproscan:5.59_91.0--hec16e2b_1' : + 'biocontainers/interproscan:5.59_91.0--hec16e2b_1' }" + + input: + tuple val(meta), path(fasta) + path(interproscan_database, stageAs: 'data') + + output: + tuple val(meta), path('*.tsv') , optional: true, emit: tsv + tuple val(meta), path('*.xml') , optional: true, emit: xml + tuple val(meta), path('*.gff3'), optional: true, emit: gff3 + tuple val(meta), path('*.json'), optional: true, emit: json + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def is_compressed = fasta.name.endsWith(".gz") + def fasta_name = fasta.name.replace(".gz", "") + """ + if [ -d 'data' ]; then + # Find interproscan.properties to link data/ from work directory + INTERPROSCAN_DIR="\$( dirname "\$( dirname "\$( which interproscan.sh )" )" )" + INTERPROSCAN_PROPERTIES="\$( find "\$INTERPROSCAN_DIR/share" -name "interproscan.properties" )" + cp "\$INTERPROSCAN_PROPERTIES" . + sed -i "/^bin\\.directory=/ s|.*|bin.directory=\$INTERPROSCAN_DIR/bin|" interproscan.properties + export INTERPROSCAN_CONF=interproscan.properties + fi # else use sample DB included with conda ( testing only! ) + + if ${is_compressed} ; then + gzip -c -d ${fasta} > ${fasta_name} + fi + + interproscan.sh \\ + --cpu ${task.cpus} \\ + --input ${fasta_name} \\ + ${args} \\ + --output-file-base ${prefix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + interproscan: \$( interproscan.sh --version | sed '1!d; s/.*version //' ) + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.{tsv,xml,json,gff3} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + interproscan: \$( interproscan.sh --version | sed '1!d; s/.*version //' ) + END_VERSIONS + """ +} diff --git a/modules/nf-core/interproscan/meta.yml b/modules/nf-core/interproscan/meta.yml new file mode 100644 index 00000000..0bb10f7d --- /dev/null +++ b/modules/nf-core/interproscan/meta.yml @@ -0,0 +1,82 @@ +name: "interproscan" +description: Produces protein annotations and predictions from an amino acids FASTA + file +keywords: + - annotation + - fasta + - protein + - dna + - interproscan +tools: + - "interproscan": + description: "InterPro integrates together predictive information about proteins + function from a number of partner resources" + homepage: "https://www.ebi.ac.uk/interpro/search/sequence/" + documentation: "https://interproscan-docs.readthedocs.io" + tool_dev_url: "https://github.com/ebi-pf-team/interproscan" + doi: "10.1093/bioinformatics/btu031" + licence: ["GPL v3"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ]
+    - fasta:
+        type: file
+        description: Input fasta file containing the amino acid or DNA query sequences
+        pattern: "*.{fa,fasta,fa.gz,fasta.gz}"
+  - - interproscan_database:
+        type: directory
+        description: Path to the interproscan database (untarred
+          http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/${version_major}-${version_minor}/interproscan-${version_major}-${version_minor}-64-bit.tar.gz)
+output:
+  - tsv:
+      - meta:
+          type: file
+          description: Tab separated file containing detailed hits
+          pattern: "*.{tsv}"
+      - "*.tsv":
+          type: file
+          description: Tab separated file containing detailed hits
+          pattern: "*.{tsv}"
+  - xml:
+      - meta:
+          type: file
+          description: XML file containing detailed hits
+          pattern: "*.{xml}"
+      - "*.xml":
+          type: file
+          description: XML file containing detailed hits
+          pattern: "*.{xml}"
+  - gff3:
+      - meta:
+          type: file
+          description: GFF3 file containing detailed hits
+          pattern: "*.{gff3}"
+      - "*.gff3":
+          type: file
+          description: GFF3 file containing detailed hits
+          pattern: "*.{gff3}"
+  - json:
+      - meta:
+          type: file
+          description: JSON file containing detailed hits
+          pattern: "*.{json}"
+      - "*.json":
+          type: file
+          description: JSON file containing detailed hits
+          pattern: "*.{json}"
+  - versions:
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
+authors:
+  - "@toniher"
+  - "@mahesh-panchal"
+maintainers:
+  - "@toniher"
+  - "@vagkaratzas"
+  - "@mahesh-panchal"
diff --git a/modules/nf-core/interproscan/tests/main.nf.test b/modules/nf-core/interproscan/tests/main.nf.test
new file mode 100644
index 00000000..1fe4625d
--- /dev/null
+++ b/modules/nf-core/interproscan/tests/main.nf.test
@@ -0,0 +1,100 @@
+nextflow_process {
+
+    name "Test Process INTERPROSCAN"
+    script "../main.nf"
+    process "INTERPROSCAN"
+    config "./nextflow.config"
+    tag "modules"
+    tag "modules_nfcore"
+    tag "interproscan"
+
+    // Note: Regular tests have been commented out because InterProScan has a hard-coded requirement of 10 GB of memory,
+    // and so will not run on the nf-core test runners without being killed.
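Given the memory constraint noted above, a minimal sketch of a local nextflow.config override that would let the commented-out tests below run on a machine with enough RAM; the exact resource values are illustrative assumptions, not part of this changeset:

    // Hypothetical local-only override: 12.GB clears InterProScan's
    // hard-coded ~10 GB memory floor mentioned in the note above.
    process {
        withName: INTERPROSCAN {
            memory = 12.GB
            cpus   = 4
        }
    }
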
+ + // test("sarscov2 - proteome_fasta") { + + // when { + // process { + // """ + // input[0] = [ + // [ id:'test' ], + // file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/proteome.fasta', checkIfExists: true) + // ] + // input[1] = [] + // """ + // } + // } + + // then { + // assertAll( + // { assert process.success }, + // { assert snapshot( + // path(process.out.tsv[0][1]).readLines()[0] + // .contains("ENSSASP00005000004.1 4c35f09aac2f7be4f3cffd30c6aecac8 1273 Coils Coil Coil 1176 1203 - T"), + // process.out.xml, + // process.out.json, + // path(process.out.gff3[0][1]).readLines()[0..4,6..-1], + // process.out.versions, + // ).match() + // } + // ) + // } + + // } + + // test("sarscov2 - proteome_fasta_gz") { + + // when { + // process { + // """ + // input[0] = [ + // [ id:'test' ], + // file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/proteome.fasta.gz', checkIfExists: true) + // ] + // input[1] = [] + // """ + // } + // } + + // then { + // assertAll( + // { assert process.success }, + // { assert snapshot( + // path(process.out.tsv[0][1]).readLines()[0] + // .contains("ENSSASP00005000004.1 4c35f09aac2f7be4f3cffd30c6aecac8 1273 Coils Coil Coil 1176 1203 - T"), + // process.out.xml, + // process.out.json, + // path(process.out.gff3[0][1]).readLines()[0..4,6..-1], + // process.out.versions, + // ).match() + // } + // ) + // } + + // } + + test("sarscov2 - proteome_fasta_gz - stub") { + + options '-stub' + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/proteome.fasta.gz', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } +} diff --git a/modules/nf-core/interproscan/tests/main.nf.test.snap b/modules/nf-core/interproscan/tests/main.nf.test.snap new file mode 100644 index 00000000..0529dfe4 --- /dev/null +++ b/modules/nf-core/interproscan/tests/main.nf.test.snap @@ -0,0 +1,207 @@ +{ + "sarscov2 - proteome_fasta_gz - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.xml:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test" + }, + "test.gff3:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + "versions.yml:md5,8bd8c66c2f1a7854faa29781761642c2" + ], + "gff3": [ + [ + { + "id": "test" + }, + "test.gff3:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test" + }, + "test.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tsv": [ + [ + { + "id": "test" + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,8bd8c66c2f1a7854faa29781761642c2" + ], + "xml": [ + [ + { + "id": "test" + }, + "test.xml:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-27T12:51:27.943051636" + }, + "sarscov2 - proteome_fasta_gz": { + "content": [ + true, + [ + [ + { + "id": "test" + }, + "test.xml:md5,7a211c1a4761e2b9b8700e6e9abbb15f" + ] + ], + [ + [ + { + "id": "test" + }, + "test.json:md5,b05cffc28b7bfeb3dabe43c2927b2024" + ] + ], + [ + "##gff-version 3", + "##feature-ontology http://song.cvs.sourceforge.net/viewvc/song/ontology/sofa.obo?revision=1.269", + "##interproscan-version 
5.59-91.0", + "##sequence-region ENSSASP00005000004.1 1 1273", + "ENSSASP00005000004.1\t.\tpolypeptide\t1\t1273\t.\t+\t.\tID=ENSSASP00005000004.1;md5=4c35f09aac2f7be4f3cffd30c6aecac8", + "##FASTA", + ">ENSSASP00005000004.1", + "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFS", + "NVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV", + "NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLE", + "GKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT", + "LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK", + "CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN", + "CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD", + "YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC", + "NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVN", + "FNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP", + "GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSY", + "ECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI", + "SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQE", + "VFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC", + "LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAM", + "QMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN", + "TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRA", + "SANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA", + "ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDP", + "LQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL", + "QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD", + "SEPVLKGVKLHYT", + ">match$1_1176_1203", + "VVNIQKEIDRLNEVAKNLNESLIDLQEL" + ], + [ + "versions.yml:md5,8bd8c66c2f1a7854faa29781761642c2" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-27T12:51:14.476645388" + }, + "sarscov2 - proteome_fasta": { + "content": [ + true, + [ + [ + { + "id": "test" + }, + "test.xml:md5,7a211c1a4761e2b9b8700e6e9abbb15f" + ] + ], + [ + [ + { + "id": "test" + }, + "test.json:md5,b05cffc28b7bfeb3dabe43c2927b2024" + ] + ], + [ + "##gff-version 3", + "##feature-ontology http://song.cvs.sourceforge.net/viewvc/song/ontology/sofa.obo?revision=1.269", + "##interproscan-version 5.59-91.0", + "##sequence-region ENSSASP00005000004.1 1 1273", + "ENSSASP00005000004.1\t.\tpolypeptide\t1\t1273\t.\t+\t.\tID=ENSSASP00005000004.1;md5=4c35f09aac2f7be4f3cffd30c6aecac8", + "##FASTA", + ">ENSSASP00005000004.1", + "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFS", + "NVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV", + "NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLE", + "GKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT", + "LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK", + "CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN", + "CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD", + "YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC", + "NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVN", + "FNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP", + "GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSY", + "ECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI", + "SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQE", + "VFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC", + "LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAM", + "QMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN", + 
"TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRA", + "SANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA", + "ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDP", + "LQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL", + "QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD", + "SEPVLKGVKLHYT", + ">match$1_1176_1203", + "VVNIQKEIDRLNEVAKNLNESLIDLQEL" + ], + [ + "versions.yml:md5,8bd8c66c2f1a7854faa29781761642c2" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-27T12:50:27.562653728" + } +} \ No newline at end of file diff --git a/modules/nf-core/interproscan/tests/nextflow.config b/modules/nf-core/interproscan/tests/nextflow.config new file mode 100644 index 00000000..2043e2c7 --- /dev/null +++ b/modules/nf-core/interproscan/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: INTERPROSCAN { + ext.args = '-appl Coils' + } +} diff --git a/modules/nf-core/interproscan/tests/tags.yml b/modules/nf-core/interproscan/tests/tags.yml new file mode 100644 index 00000000..ddb90f86 --- /dev/null +++ b/modules/nf-core/interproscan/tests/tags.yml @@ -0,0 +1,2 @@ +interproscan: + - modules/nf-core/interproscan/** diff --git a/modules/nf-core/macrel/contigs/environment.yml b/modules/nf-core/macrel/contigs/environment.yml index e6c11226..ea2b6ac6 100644 --- a/modules/nf-core/macrel/contigs/environment.yml +++ b/modules/nf-core/macrel/contigs/environment.yml @@ -1,7 +1,7 @@ -name: macrel_contigs +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::macrel=1.2.0 + - bioconda::macrel=1.4.0 diff --git a/modules/nf-core/macrel/contigs/main.nf b/modules/nf-core/macrel/contigs/main.nf index 6b62a868..b8f8f522 100644 --- a/modules/nf-core/macrel/contigs/main.nf +++ b/modules/nf-core/macrel/contigs/main.nf @@ -4,8 +4,8 @@ process MACREL_CONTIGS { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/macrel:1.2.0--pyh5e36f6f_0': - 'biocontainers/macrel:1.2.0--pyh5e36f6f_0' }" + 'https://depot.galaxyproject.org/singularity/macrel:1.4.0--pyh7e72e81_0': + 'biocontainers/macrel:1.4.0--pyh7e72e81_0' }" input: tuple val(meta), path(fasta) @@ -35,6 +35,24 @@ process MACREL_CONTIGS { gzip --no-name ${prefix}/*.faa + cat <<-END_VERSIONS > versions.yml + "${task.process}": + macrel: \$(echo \$(macrel --version | sed 's/macrel //g')) + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir ${prefix} + + touch ${prefix}/${prefix}_log.txt + echo | gzip > ${prefix}/${prefix}.smorfs.faa.gz + echo | gzip > ${prefix}/${prefix}.all_orfs.faa.gz + echo | gzip > ${prefix}/${prefix}.prediction.gz + touch ${prefix}/${prefix}.md + + cat <<-END_VERSIONS > versions.yml "${task.process}": macrel: \$(echo \$(macrel --version | sed 's/macrel //g')) diff --git a/modules/nf-core/macrel/contigs/meta.yml b/modules/nf-core/macrel/contigs/meta.yml index ba0b0e6f..c1c03f42 100644 --- a/modules/nf-core/macrel/contigs/meta.yml +++ b/modules/nf-core/macrel/contigs/meta.yml @@ -1,5 +1,7 @@ name: macrel_contigs -description: A tool that mines antimicrobial peptides (AMPs) from (meta)genomes by predicting peptides from genomes (provided as contigs) and outputs all the predicted anti-microbial peptides found. 
+description: A tool that mines antimicrobial peptides (AMPs) from (meta)genomes by + predicting peptides from genomes (provided as contigs) and outputs all the predicted + anti-microbial peptides found. keywords: - AMP - antimicrobial peptides @@ -14,46 +16,76 @@ tools: tool_dev_url: https://github.com/BigDataBiology/macrel doi: "10.7717/peerj.10555" licence: ["MIT"] + identifier: biotools:macrel input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: A fasta file with nucleotide sequences. - pattern: "*.{fasta,fa,fna,fasta.gz,fa.gz,fna.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: A fasta file with nucleotide sequences. + pattern: "*.{fasta,fa,fna,fasta.gz,fa.gz,fna.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - amp_prediction: - type: file - description: A zipped file, with all predicted amps in a table format. - pattern: "*.prediction.gz" - smorfs: - type: file - description: A zipped fasta file containing aminoacid sequences showing the general gene prediction information in the contigs. - pattern: "*.smorfs.faa.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*/*.smorfs.faa.gz": + type: file + description: A zipped fasta file containing aminoacid sequences showing the + general gene prediction information in the contigs. + pattern: "*.smorfs.faa.gz" - all_orfs: - type: file - description: A zipped fasta file containing amino acid sequences showing the general gene prediction information in the contigs. - pattern: "*.all_orfs.faa.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*/*.all_orfs.faa.gz": + type: file + description: A zipped fasta file containing amino acid sequences showing the + general gene prediction information in the contigs. + pattern: "*.all_orfs.faa.gz" + - amp_prediction: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*/*.prediction.gz": + type: file + description: A zipped file, with all predicted amps in a table format. + pattern: "*.prediction.gz" - readme_file: - type: file - description: A readme file containing tool specific information (e.g. citations, details about the output, etc.). - pattern: "*.md" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*/*.md": + type: file + description: A readme file containing tool specific information (e.g. citations, + details about the output, etc.). + pattern: "*.md" - log_file: - type: file - description: A log file containing the information pertaining to the run. - pattern: "*_log.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*/*_log.txt": + type: file + description: A log file containing the information pertaining to the run. 
+ pattern: "*_log.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@darcy220606" maintainers: diff --git a/modules/nf-core/macrel/contigs/tests/main.nf.test b/modules/nf-core/macrel/contigs/tests/main.nf.test new file mode 100644 index 00000000..5b641b1e --- /dev/null +++ b/modules/nf-core/macrel/contigs/tests/main.nf.test @@ -0,0 +1,66 @@ + +nextflow_process { + + name "Test Process MACREL_CONTIGS" + script "../main.nf" + process "MACREL_CONTIGS" + + tag "modules" + tag "modules_nfcore" + tag "macrel" + tag "macrel/contigs" + + test("test-macrel-contigs") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.smorfs, + path(process.out.all_orfs[0][1]).linesGzip[0], + process.out.amp_prediction, + process.out.readme_file, + file(process.out.log_file[0][1]).name, + process.out.versions + ).match() + } + ) + } + } + + test("test-macrel-contigs-stub") { + options '-stub' + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/prokaryotes/bacteroides_fragilis/illumina/fasta/test1.contigs.fa.gz', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/macrel/contigs/tests/main.nf.test.snap b/modules/nf-core/macrel/contigs/tests/main.nf.test.snap new file mode 100644 index 00000000..3908c49c --- /dev/null +++ b/modules/nf-core/macrel/contigs/tests/main.nf.test.snap @@ -0,0 +1,150 @@ +{ + "test-macrel-contigs": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.smorfs.faa.gz:md5,2433037a55de266a1203759834849669" + ] + ], + ">k141_0_1 # 235 # 468 # -1 # ID=1_1;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.333", + [ + [ + { + "id": "test", + "single_end": false + }, + "test.prediction.gz:md5,c929d870dc197f9d5d36d3d5f683cbf4" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "README.md:md5,cf088d9256ff7b7730699f17b64b4028" + ] + ], + "test_log.txt", + [ + "versions.yml:md5,ab072d9245c9b28a8bc694e98795c924" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-30T20:48:49.632715" + }, + "test-macrel-contigs-stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.smorfs.faa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.all_orfs.faa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.prediction.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.md:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + "versions.yml:md5,ab072d9245c9b28a8bc694e98795c924" + ], + "all_orfs": [ + [ + { + "id": "test", + "single_end": false + }, + "test.all_orfs.faa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "amp_prediction": [ + [ + { + "id": "test", + 
"single_end": false + }, + "test.prediction.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "log_file": [ + [ + { + "id": "test", + "single_end": false + }, + "test_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "readme_file": [ + [ + { + "id": "test", + "single_end": false + }, + "test.md:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "smorfs": [ + [ + { + "id": "test", + "single_end": false + }, + "test.smorfs.faa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ab072d9245c9b28a8bc694e98795c924" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-30T20:50:42.040416" + } +} \ No newline at end of file diff --git a/modules/nf-core/mmseqs/createdb/environment.yml b/modules/nf-core/mmseqs/createdb/environment.yml index 77b28f59..69afa609 100644 --- a/modules/nf-core/mmseqs/createdb/environment.yml +++ b/modules/nf-core/mmseqs/createdb/environment.yml @@ -1,7 +1,7 @@ -name: mmseqs_createdb +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::mmseqs2=15.6f452 + - bioconda::mmseqs2=17.b804f diff --git a/modules/nf-core/mmseqs/createdb/main.nf b/modules/nf-core/mmseqs/createdb/main.nf index 9487e5bc..6f8d5b15 100644 --- a/modules/nf-core/mmseqs/createdb/main.nf +++ b/modules/nf-core/mmseqs/createdb/main.nf @@ -4,8 +4,8 @@ process MMSEQS_CREATEDB { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mmseqs2:15.6f452--pl5321h6a68c12_0': - 'biocontainers/mmseqs2:15.6f452--pl5321h6a68c12_0' }" + 'https://depot.galaxyproject.org/singularity/mmseqs2:17.b804f--hd6d6fdc_1': + 'biocontainers/mmseqs2:17.b804f--hd6d6fdc_1' }" input: tuple val(meta), path(sequence) @@ -33,8 +33,7 @@ process MMSEQS_CREATEDB { createdb \\ ${sequence_name} \\ ${prefix}/${prefix} \\ - $args \\ - --compressed 1 + $args cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/mmseqs/createdb/meta.yml b/modules/nf-core/mmseqs/createdb/meta.yml index a011020b..c392a360 100644 --- a/modules/nf-core/mmseqs/createdb/meta.yml +++ b/modules/nf-core/mmseqs/createdb/meta.yml @@ -1,4 +1,3 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json name: "mmseqs_createdb" description: Create an MMseqs database from an existing FASTA/Q file @@ -11,35 +10,40 @@ keywords: - mmseqs2 tools: - "mmseqs": - description: "MMseqs2: ultra fast and sensitive sequence search and clustering suite" + description: "MMseqs2: ultra fast and sensitive sequence search and clustering + suite" homepage: "https://github.com/soedinglab/MMseqs2" documentation: "https://mmseqs.com/latest/userguide.pdf" tool_dev_url: "https://github.com/soedinglab/MMseqs2" doi: "10.1093/bioinformatics/btw006" licence: ["GPL v3"] + identifier: biotools:mmseqs input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'test', single_end:false ]` - - sequence: - type: file - description: Input sequences in FASTA/Q (zipped or unzipped) format to parse into an mmseqs database - pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,fastq,fastq.gz,fq,fq.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - sequence: + type: file + description: Input sequences in FASTA/Q (zipped or unzipped) format to parse + into an mmseqs database + pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,fastq,fastq.gz,fq,fq.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'test', single_end:false ]` - db: - type: directory - description: The created MMseqs2 database + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - ${prefix}/: + type: directory + description: The created MMseqs2 database - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Joon-Klaps" maintainers: diff --git a/modules/nf-core/mmseqs/createdb/tests/main.nf.test.snap b/modules/nf-core/mmseqs/createdb/tests/main.nf.test.snap index a24c4118..9eee149b 100644 --- a/modules/nf-core/mmseqs/createdb/tests/main.nf.test.snap +++ b/modules/nf-core/mmseqs/createdb/tests/main.nf.test.snap @@ -8,26 +8,26 @@ "single_end": false }, [ - "test:md5,7c3c2c5926cf8fa82e66b9628f680256", - "test.dbtype:md5,c8ed20c23ba91f4577f84c940c86c7db", - "test.index:md5,5b2fd8abd0ad3fee24738af7082e6a6e", + "test:md5,a2cda8768736a7a317a09d61556194bd", + "test.dbtype:md5,4352d88a78aa39750bf70cd6f27bcaa5", + "test.index:md5,4ba298b011e2472ce9f6b99fe6b6e3d5", "test.lookup:md5,32f88756dbcb6aaf7b239b0d61730f1b", "test.source:md5,9ada5b3ea6e1a7e16c4418eb98ae8d9d", - "test_h:md5,8c29f5ed94d83d7115e9c8a883ce358d", - "test_h.dbtype:md5,8895d3d8e9322aedbf45249dfb3ddb0a", - "test_h.index:md5,87c7c8c6d16018ebfaa6f408391a5ae2" + "test_h:md5,21c399702a071bdeecce09f9d1df4531", + "test_h.dbtype:md5,740bab4f9ec8808aedb68d6b1281aeb2", + "test_h.index:md5,d767fb43b37c0a644c676b00f9f93477" ] ] ], [ - "versions.yml:md5,e644cbe263d4560298438a24f268eb6f" + "versions.yml:md5,c62b08152082097334109fe08ec6333a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-08-09T10:01:44.163384" + "timestamp": "2025-01-20T14:11:57.883871" }, "Should build an mmseqs db from a zipped amino acid sequence file": { "content": [ @@ -37,25 +37,25 @@ "id": "test" }, [ - "test:md5,4b494965ed7ab67da8ca3f39523eb104", - "test.dbtype:md5,152afd7bf4dbe26f85032eee0269201a", - "test.index:md5,46f9d884e9a7f442fe1cd2ce339734e3", + "test:md5,1162504bc65aacf734abdcb0cdbe87de", + "test.dbtype:md5,f1d3ff8443297732862df21dc4e57262", + "test.index:md5,8cdcbc06c2b99fdb09f3d1735a76def9", "test.lookup:md5,3e27cb93d9ee875ad42a6f32f5651bdc", "test.source:md5,eaa64fc8a5f7ec1ee49b0dcbd1a72e9d", - "test_h:md5,6e798b81c70d191f78939c2dd6223a7f", - "test_h.dbtype:md5,8895d3d8e9322aedbf45249dfb3ddb0a", - "test_h.index:md5,d5ac49ff56df064b980fa0eb5da57673" + "test_h:md5,f258f8cc04f83c270a75e8b00a6d2d89", + "test_h.dbtype:md5,740bab4f9ec8808aedb68d6b1281aeb2", + "test_h.index:md5,844bf1950bcd37284fdc5d7117ee4241" ] ] ], [ - "versions.yml:md5,e644cbe263d4560298438a24f268eb6f" + "versions.yml:md5,c62b08152082097334109fe08ec6333a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-08-09T10:01:48.894044" + "timestamp": "2025-01-20T14:12:10.986433" } } \ No newline at end of file diff --git a/modules/nf-core/mmseqs/createdb/tests/tags.yml 
b/modules/nf-core/mmseqs/createdb/tests/tags.yml deleted file mode 100644 index 1f511ab0..00000000 --- a/modules/nf-core/mmseqs/createdb/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -mmseqs/createdb: - - modules/nf-core/mmseqs/createdb/** diff --git a/modules/nf-core/mmseqs/createtsv/environment.yml b/modules/nf-core/mmseqs/createtsv/environment.yml index 4840fc02..69afa609 100644 --- a/modules/nf-core/mmseqs/createtsv/environment.yml +++ b/modules/nf-core/mmseqs/createtsv/environment.yml @@ -1,7 +1,7 @@ -name: mmseqs_createtsv +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::mmseqs2=15.6f452 + - bioconda::mmseqs2=17.b804f diff --git a/modules/nf-core/mmseqs/createtsv/main.nf b/modules/nf-core/mmseqs/createtsv/main.nf index dcd4c13d..3ab0159a 100644 --- a/modules/nf-core/mmseqs/createtsv/main.nf +++ b/modules/nf-core/mmseqs/createtsv/main.nf @@ -5,8 +5,8 @@ process MMSEQS_CREATETSV { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mmseqs2:15.6f452--pl5321h6a68c12_0': - 'biocontainers/mmseqs2:15.6f452--pl5321h6a68c12_0' }" + 'https://depot.galaxyproject.org/singularity/mmseqs2:17.b804f--hd6d6fdc_1': + 'biocontainers/mmseqs2:17.b804f--hd6d6fdc_1' }" input: tuple val(meta), path(db_result) @@ -22,9 +22,9 @@ process MMSEQS_CREATETSV { script: def args = task.ext.args ?: '' - def args2 = task.ext.args ?: "*.dbtype" // database generated by mmyseqs cluster | search | taxonomy | ... - def args3 = task.ext.args ?: "*.dbtype" // database generated by mmyseqs/createdb - def args4 = task.ext.args ?: "*.dbtype" // database generated by mmyseqs/createdb + def args2 = task.ext.args2 ?: "*.dbtype" // database generated by mmyseqs cluster | search | taxonomy | ... 
+    def args3 = task.ext.args3 ?: "*.dbtype" // database generated by mmseqs/createdb
+    def args4 = task.ext.args4 ?: "*.dbtype" // database generated by mmseqs/createdb
     def prefix = task.ext.prefix ?: "${meta.id}"

     """
@@ -40,8 +40,7 @@ process MMSEQS_CREATETSV {
         \$DB_RESULT_PATH_NAME \\
         ${prefix}.tsv \\
         $args \\
-        --threads ${task.cpus} \\
-        --compressed 1
+        --threads ${task.cpus}

     cat <<-END_VERSIONS > versions.yml
     "${task.process}":
diff --git a/modules/nf-core/mmseqs/createtsv/meta.yml b/modules/nf-core/mmseqs/createtsv/meta.yml
index e85b066f..5a50ff34 100644
--- a/modules/nf-core/mmseqs/createtsv/meta.yml
+++ b/modules/nf-core/mmseqs/createtsv/meta.yml
@@ -1,7 +1,7 @@
----
 # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json
 name: "mmseqs_createtsv"
-description: Create a tsv file from a query and a target database as well as the result database
+description: Create a tsv file from a query and a target database as well as the result
+  database
 keywords:
   - protein sequence
   - databases
@@ -12,53 +12,58 @@ keywords:
   - tsv
 tools:
   - "mmseqs":
-      description: "MMseqs2: ultra fast and sensitive sequence search and clustering suite"
+      description: "MMseqs2: ultra fast and sensitive sequence search and clustering
+        suite"
       homepage: "https://github.com/soedinglab/MMseqs2"
       documentation: "https://mmseqs.com/latest/userguide.pdf"
       tool_dev_url: "https://github.com/soedinglab/MMseqs2"
       doi: "10.1093/bioinformatics/btw006"
       licence: ["GPL v3"]
+      identifier: biotools:mmseqs
 input:
   # Only when we have meta
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
-  - db_result:
-      type: directory
-      description: an MMseqs2 database with result data
-  - meta2:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
-  - db_query:
-      type: directory
-      description: an MMseqs2 database with query data
-  - meta3:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
-  - db_target:
-      type: directory
-      description: an MMseqs2 database with target data
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. `[ id:'test', single_end:false ]`
+    - db_result:
+        type: directory
+        description: an MMseqs2 database with result data
+  - - meta2:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. `[ id:'test', single_end:false ]`
+    - db_query:
+        type: directory
+        description: an MMseqs2 database with query data
+  - - meta3:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. `[ id:'test', single_end:false ]`
+    - db_target:
+        type: directory
+        description: an MMseqs2 database with target data
 output:
   #Only when we have meta
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
   - tsv:
-      type: file
-      description: The resulting tsv file created using the query, target and result MMseqs databases
-      pattern: "*.{tsv}"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. `[ id:'test', single_end:false ]`
+      - "*.tsv":
+          type: file
+          description: The resulting tsv file created using the query, target and result
+            MMseqs databases
+          pattern: "*.{tsv}"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@Joon-Klaps"
 maintainers:
diff --git a/modules/nf-core/mmseqs/createtsv/tests/main.nf.test.snap b/modules/nf-core/mmseqs/createtsv/tests/main.nf.test.snap
index 1087de88..a70f839f 100644
--- a/modules/nf-core/mmseqs/createtsv/tests/main.nf.test.snap
+++ b/modules/nf-core/mmseqs/createtsv/tests/main.nf.test.snap
@@ -12,7 +12,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ],
             "tsv": [
                 [
@@ -24,15 +24,15 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.3"
+        "nf-test": "0.9.2",
+        "nextflow": "24.10.3"
     },
-    "timestamp": "2024-07-12T13:55:17.642787"
+    "timestamp": "2025-01-20T17:29:15.220926"
 },
 "mmseqs/createtsv - sarscov2 - cluster - stub": {
     "content": [
@@ -47,7 +47,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ],
             "tsv": [
                 [
@@ -59,15 +59,15 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.3"
+        "nf-test": "0.9.2",
+        "nextflow": "24.10.3"
     },
-    "timestamp": "2024-07-12T13:55:33.645454"
+    "timestamp": "2025-01-20T17:29:32.089204"
 },
 "mmseqs/createtsv - bacteroides_fragilis - taxonomy": {
     "content": [
@@ -82,7 +82,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ],
             "tsv": [
                 [
@@ -94,15 +94,15 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+                "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.3"
+        "nf-test": "0.9.2",
+        "nextflow": "24.10.3"
     },
-    "timestamp": "2024-07-12T13:54:45.718678"
+    "timestamp": "2025-01-20T17:28:41.472818"
 },
 "mmseqs/createtsv - sarscov2 - cluster": {
     "content": [
@@ -113,11 +113,11 @@
                 {
                     "id": "test_result",
                     "single_end": true
                 },
-                "test_result.tsv:md5,4e7ba50ce2879660dc6595286bf0d097"
+                "test_result.tsv:md5,c81449fb936b76aad6f925b965e84bc5"
             ]
         ],
         "1": [
-            "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+            "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
         ],
         "tsv": [
@@ -125,18 +125,18 @@
                 {
                     "id": "test_result",
                     "single_end": true
                 },
-                "test_result.tsv:md5,4e7ba50ce2879660dc6595286bf0d097"
+                "test_result.tsv:md5,c81449fb936b76aad6f925b965e84bc5"
             ]
         ],
         "versions": [
-            "versions.yml:md5,20a853f50c920d431e5ab7593ca79e6f"
+            "versions.yml:md5,ce808eb9a57e201a48afec56168f9e77"
         ]
     }
 ],
 "meta": {
-    "nf-test": "0.8.4",
-    "nextflow": "24.04.3"
+    "nf-test": "0.9.2",
+    "nextflow": "24.10.3"
 },
-"timestamp": "2024-07-12T13:55:02.731974"
+"timestamp": "2025-01-20T17:28:58.633976"
 }
 }
\ No newline at end of file
diff --git a/modules/nf-core/mmseqs/createtsv/tests/tags.yml b/modules/nf-core/mmseqs/createtsv/tests/tags.yml
deleted file mode 100644
index e27827f5..00000000
--- a/modules/nf-core/mmseqs/createtsv/tests/tags.yml
+++ /dev/null
@@ -1,2 +0,0 @@
-mmseqs/createtsv:
-  - "modules/nf-core/mmseqs/createtsv/**"
diff --git a/modules/nf-core/mmseqs/databases/environment.yml b/modules/nf-core/mmseqs/databases/environment.yml
index 3bf8437d..69afa609 100644
--- a/modules/nf-core/mmseqs/databases/environment.yml
+++ b/modules/nf-core/mmseqs/databases/environment.yml
@@ -1,7 +1,7 @@
-name: mmseqs_databases
+---
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
-  - bioconda::mmseqs2=15.6f452
+  - bioconda::mmseqs2=17.b804f
diff --git a/modules/nf-core/mmseqs/databases/main.nf b/modules/nf-core/mmseqs/databases/main.nf
index 3e228b29..51d54ab7 100644
--- a/modules/nf-core/mmseqs/databases/main.nf
+++ b/modules/nf-core/mmseqs/databases/main.nf
@@ -4,15 +4,15 @@ process MMSEQS_DATABASES {
     conda "${moduleDir}/environment.yml"
     container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-        'https://depot.galaxyproject.org/singularity/mmseqs2:15.6f452--pl5321h6a68c12_0':
-        'biocontainers/mmseqs2:15.6f452--pl5321h6a68c12_0' }"
+        'https://depot.galaxyproject.org/singularity/mmseqs2:17.b804f--hd6d6fdc_1':
+        'biocontainers/mmseqs2:17.b804f--hd6d6fdc_1' }"

     input:
     val database

     output:
-    path "${prefix}/"   , emit: database
-    path "versions.yml" , emit: versions
+    path "${prefix}/"  , emit: database
+    path "versions.yml", emit: versions

     when:
     task.ext.when == null || task.ext.when
@@ -28,7 +28,6 @@ process MMSEQS_DATABASES {
         ${prefix}/database \\
         tmp/ \\
         --threads ${task.cpus} \\
-        --compressed 1 \\
         ${args}

     cat <<-END_VERSIONS > versions.yml
diff --git a/modules/nf-core/mmseqs/databases/meta.yml b/modules/nf-core/mmseqs/databases/meta.yml
index 803a87f6..be9380fb 100644
--- a/modules/nf-core/mmseqs/databases/meta.yml
+++ b/modules/nf-core/mmseqs/databases/meta.yml
@@ -1,4 +1,3 @@
----
 # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json
 name: "mmseqs_databases"
 description: Download an mmseqs-formatted database
@@ -9,24 +8,29 @@ keywords:
   - searching
 tools:
   - "mmseqs":
-      description: "MMseqs2: ultra fast and sensitive sequence search and clustering suite"
+      description: "MMseqs2: ultra fast and sensitive sequence search and clustering
+        suite"
       homepage: "https://github.com/soedinglab/MMseqs2"
       documentation: "https://mmseqs.com/latest/userguide.pdf"
       tool_dev_url: "https://github.com/soedinglab/MMseqs2"
       doi: "10.1093/bioinformatics/btw006"
       licence: ["GPL v3"]
+      identifier: biotools:mmseqs
 input:
-  - database:
-      type: string
-      description: Database available through the mmseqs2 databases interface - see https://github.com/soedinglab/MMseqs2/wiki#downloading-databases for details
+  - - database:
+        type: string
+        description: Database available through the mmseqs2 databases interface - see
+          https://github.com/soedinglab/MMseqs2/wiki#downloading-databases for details
 output:
-  - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
   - database:
-      type: directory
-      description: Directory containing processed mmseqs database
+      - ${prefix}/:
+          type: directory
+          description: Directory containing processed mmseqs database
+  - versions:
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@prototaxites"
 maintainers:
diff --git a/modules/nf-core/mmseqs/databases/tests/main.nf.test b/modules/nf-core/mmseqs/databases/tests/main.nf.test
new file mode 100644
index 00000000..3fe5d200
--- /dev/null
+++ b/modules/nf-core/mmseqs/databases/tests/main.nf.test
@@ -0,0 +1,55 @@
+
+nextflow_process {
+
+    name "Test Process MMSEQS_DATABASES"
+    script "../main.nf"
+    process "MMSEQS_DATABASES"
+
+    tag "modules"
+    tag "modules_nfcore"
+    tag "mmseqs"
+    tag "mmseqs/databases"
+
+    test("test-mmseqs-databases") {
+
+        when {
+            process {
+                """
+                input[0] = "SILVA"
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(
+                    file(process.out.database[0]).listFiles().collect { it.name }.toSorted(), // unstable
+                    process.out.versions
+                    ).match()
+                }
+            )
+        }
+    }
+
+    test("test-mmseqs-databases-stub") {
+        options '-stub'
+        when {
+            process {
+                """
+                input[0] = "SILVA"
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+}
diff --git a/modules/nf-core/mmseqs/databases/tests/main.nf.test.snap b/modules/nf-core/mmseqs/databases/tests/main.nf.test.snap
new file mode 100644
index 00000000..2805e1c0
--- /dev/null
+++ b/modules/nf-core/mmseqs/databases/tests/main.nf.test.snap
@@ -0,0 +1,74 @@
+{
+    "test-mmseqs-databases": {
+        "content": [
+            [
+                "database",
+                "database.dbtype",
+                "database.index",
+                "database.lookup",
+                "database.source",
+                "database.version",
+                "database_h",
+                "database_h.dbtype",
+                "database_h.index",
+                "database_mapping",
+                "database_taxonomy"
+            ],
+            [
+                "versions.yml:md5,387bbb2d1d6bac273e8158743af4c856"
+            ]
+        ],
+        "meta": {
+            "nf-test": "0.9.2",
+            "nextflow": "24.10.3"
+        },
+        "timestamp": "2025-01-20T15:32:42.284982"
+    },
+    "test-mmseqs-databases-stub": {
+        "content": [
+            {
+                "0": [
+                    [
+                        "database:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.index:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.lookup:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.source:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.version:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h.index:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_mapping:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_taxonomy:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,49082428ec974e4ddb09a6ca2e9f21b3"
+                ],
+                "database": [
+                    [
+                        "database:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.index:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.lookup:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.source:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database.version:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h.dbtype:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_h.index:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_mapping:md5,d41d8cd98f00b204e9800998ecf8427e",
+                        "database_taxonomy:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,49082428ec974e4ddb09a6ca2e9f21b3"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-05T17:00:20.527628"
+    }
+}
\ No newline at end of file
diff --git a/modules/nf-core/mmseqs/taxonomy/environment.yml b/modules/nf-core/mmseqs/taxonomy/environment.yml
index fa40c277..69afa609 100644
--- a/modules/nf-core/mmseqs/taxonomy/environment.yml
+++ b/modules/nf-core/mmseqs/taxonomy/environment.yml
@@ -1,9 +1,7 @@
 ---
 # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
-name: "mmseqs_taxonomy"
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
-  - "bioconda::mmseqs2=15.6f452"
+  - bioconda::mmseqs2=17.b804f
diff --git a/modules/nf-core/mmseqs/taxonomy/main.nf b/modules/nf-core/mmseqs/taxonomy/main.nf
index 54849885..d73bf03f 100644
--- a/modules/nf-core/mmseqs/taxonomy/main.nf
+++ b/modules/nf-core/mmseqs/taxonomy/main.nf
@@ -4,8 +4,8 @@ process MMSEQS_TAXONOMY {
     conda "${moduleDir}/environment.yml"
     container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-        'https://depot.galaxyproject.org/singularity/mmseqs2:15.6f452--pl5321h6a68c12_0':
-        'biocontainers/mmseqs2:15.6f452--pl5321h6a68c12_0' }"
+        'https://depot.galaxyproject.org/singularity/mmseqs2:17.b804f--hd6d6fdc_1':
+        'biocontainers/mmseqs2:17.b804f--hd6d6fdc_1' }"

     input:
     tuple val(meta), path(db_query)
@@ -38,8 +38,7 @@ process MMSEQS_TAXONOMY {
         ${prefix}_taxonomy/${prefix} \\
         tmp1 \\
         $args \\
-        --threads ${task.cpus} \\
-        --compressed 1
+        --threads ${task.cpus}

     cat <<-END_VERSIONS > versions.yml
     "${task.process}":
diff --git a/modules/nf-core/mmseqs/taxonomy/meta.yml b/modules/nf-core/mmseqs/taxonomy/meta.yml
index d836029c..15756feb 100644
--- a/modules/nf-core/mmseqs/taxonomy/meta.yml
+++ b/modules/nf-core/mmseqs/taxonomy/meta.yml
@@ -1,7 +1,7 @@
----
 # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
 name: "mmseqs_taxonomy"
-description: Computes the lowest common ancestor by identifying the query sequence homologs against the target database.
+description: Computes the lowest common ancestor by identifying the query sequence
+  homologs against the target database.
 keywords:
   - protein sequence
   - nucleotide sequence
@@ -11,37 +11,41 @@ keywords:
   - mmseqs2
 tools:
   - "mmseqs":
-      description: "MMseqs2: ultra fast and sensitive sequence search and clustering suite"
+      description: "MMseqs2: ultra fast and sensitive sequence search and clustering
+        suite"
       homepage: "https://github.com/soedinglab/MMseqs2"
       documentation: "https://mmseqs.com/latest/userguide.pdf"
       tool_dev_url: "https://github.com/soedinglab/MMseqs2"
       doi: "10.1093/bioinformatics/btw006"
       licence: ["GPL v3"]
+      identifier: biotools:mmseqs
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
-  - db_query:
-      type: directory
-      description: An MMseqs2 database with query data
-  - db_target:
-      type: directory
-      description: an MMseqs2 database with target data including the taxonomy classification
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. `[ id:'test', single_end:false ]`
+    - db_query:
+        type: directory
+        description: An MMseqs2 database with query data
+  - - db_target:
+        type: directory
+        description: an MMseqs2 database with target data including the taxonomy classification
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. `[ id:'test', single_end:false ]`
   - db_taxonomy:
-      type: directory
-      description: An MMseqs2 database with target data including the taxonomy classification
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. `[ id:'test', single_end:false ]`
+      - ${prefix}_taxonomy:
+          type: directory
+          description: An MMseqs2 database with target data including the taxonomy classification
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@darcy220606"
 maintainers:
diff --git a/modules/nf-core/mmseqs/taxonomy/tests/main.nf.test.snap b/modules/nf-core/mmseqs/taxonomy/tests/main.nf.test.snap
index 225680ac..4402c731 100644
--- a/modules/nf-core/mmseqs/taxonomy/tests/main.nf.test.snap
+++ b/modules/nf-core/mmseqs/taxonomy/tests/main.nf.test.snap
@@ -8,14 +8,14 @@
             "test_query.index"
         ],
         [
-            "versions.yml:md5,a8f24dca956a1c84099ff129f826c63f"
+            "versions.yml:md5,d86f3223ff4a4d664228707b581dca8a"
         ]
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.4"
+        "nf-test": "0.9.2",
+        "nextflow": "24.10.3"
     },
-    "timestamp": "2024-08-09T10:11:53.632751"
+    "timestamp": "2025-01-20T16:28:40.091017"
 },
 "mmseqs/taxonomy - bacteroides_fragilis - genome_nt - stub": {
     "content": [
@@ -59,7 +59,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,a8f24dca956a1c84099ff129f826c63f"
+                "versions.yml:md5,d86f3223ff4a4d664228707b581dca8a"
             ],
             "db_taxonomy": [
                 [
@@ -100,14 +100,14 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,a8f24dca956a1c84099ff129f826c63f"
+                "versions.yml:md5,d86f3223ff4a4d664228707b581dca8a"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.4"
+        "nf-test": "0.9.2",
+        "nextflow": "24.10.3"
     },
-    "timestamp": "2024-08-09T10:12:00.148815"
+    "timestamp": "2025-01-20T16:48:57.634552"
 }
 }
\ No newline at end of file
diff --git a/modules/nf-core/multiqc/environment.yml b/modules/nf-core/multiqc/environment.yml
index 0eb9d9c9..c3b3413f 100644
--- a/modules/nf-core/multiqc/environment.yml
+++ b/modules/nf-core/multiqc/environment.yml
@@ -1,7 +1,7 @@
-name: multiqc
+---
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
-  - bioconda::multiqc=1.24
+  - bioconda::multiqc=1.27
diff --git a/modules/nf-core/multiqc/main.nf b/modules/nf-core/multiqc/main.nf
index 9790c23c..58d9313c 100644
--- a/modules/nf-core/multiqc/main.nf
+++ b/modules/nf-core/multiqc/main.nf
@@ -3,8 +3,8 @@ process MULTIQC {

     conda "${moduleDir}/environment.yml"
     container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-        'https://depot.galaxyproject.org/singularity/multiqc:1.24--pyhdfd78af_0' :
-        'biocontainers/multiqc:1.24--pyhdfd78af_0' }"
+        'https://depot.galaxyproject.org/singularity/multiqc:1.27--pyhdfd78af_0' :
+        'biocontainers/multiqc:1.27--pyhdfd78af_0' }"

     input:
     path multiqc_files, stageAs: "?/*"
@@ -28,7 +28,7 @@ process MULTIQC {
     def prefix = task.ext.prefix ? "--filename ${task.ext.prefix}.html" : ''
     def config = multiqc_config ? "--config $multiqc_config" : ''
     def extra_config = extra_multiqc_config ? "--config $extra_multiqc_config" : ''
-    def logo = multiqc_logo ? /--cl-config 'custom_logo: "${multiqc_logo}"'/ : ''
+    def logo = multiqc_logo ? "--cl-config 'custom_logo: \"${multiqc_logo}\"'" : ''
     def replace = replace_names ? "--replace-names ${replace_names}" : ''
     def samples = sample_names ? "--sample-names ${sample_names}" : ''
     """
@@ -52,7 +52,7 @@ process MULTIQC {
     stub:
     """
     mkdir multiqc_data
-    touch multiqc_plots
+    mkdir multiqc_plots
     touch multiqc_report.html

     cat <<-END_VERSIONS > versions.yml
diff --git a/modules/nf-core/multiqc/meta.yml b/modules/nf-core/multiqc/meta.yml
index 382c08cb..b16c1879 100644
--- a/modules/nf-core/multiqc/meta.yml
+++ b/modules/nf-core/multiqc/meta.yml
@@ -1,5 +1,6 @@
 name: multiqc
-description: Aggregate results from bioinformatics analyses across many samples into a single report
+description: Aggregate results from bioinformatics analyses across many samples into
+  a single report
 keywords:
   - QC
   - bioinformatics tools
@@ -12,53 +13,59 @@ tools:
     homepage: https://multiqc.info/
     documentation: https://multiqc.info/docs/
     licence: ["GPL-3.0-or-later"]
+    identifier: biotools:multiqc
 input:
-  - multiqc_files:
-      type: file
-      description: |
-        List of reports / files recognised by MultiQC, for example the html and zip output of FastQC
-  - multiqc_config:
-      type: file
-      description: Optional config yml for MultiQC
-      pattern: "*.{yml,yaml}"
-  - extra_multiqc_config:
-      type: file
-      description: Second optional config yml for MultiQC. Will override common sections in multiqc_config.
-      pattern: "*.{yml,yaml}"
-  - multiqc_logo:
-      type: file
-      description: Optional logo file for MultiQC
-      pattern: "*.{png}"
-  - replace_names:
-      type: file
-      description: |
-        Optional two-column sample renaming file. First column a set of
-        patterns, second column a set of corresponding replacements. Passed via
-        MultiQC's `--replace-names` option.
-      pattern: "*.{tsv}"
-  - sample_names:
-      type: file
-      description: |
-        Optional TSV file with headers, passed to the MultiQC --sample_names
-        argument.
-      pattern: "*.{tsv}"
+  - - multiqc_files:
+        type: file
+        description: |
+          List of reports / files recognised by MultiQC, for example the html and zip output of FastQC
+  - - multiqc_config:
+        type: file
+        description: Optional config yml for MultiQC
+        pattern: "*.{yml,yaml}"
+  - - extra_multiqc_config:
+        type: file
+        description: Second optional config yml for MultiQC. Will override common sections
+          in multiqc_config.
+        pattern: "*.{yml,yaml}"
+  - - multiqc_logo:
+        type: file
+        description: Optional logo file for MultiQC
+        pattern: "*.{png}"
+  - - replace_names:
+        type: file
+        description: |
+          Optional two-column sample renaming file. First column a set of
+          patterns, second column a set of corresponding replacements. Passed via
+          MultiQC's `--replace-names` option.
+        pattern: "*.{tsv}"
+  - - sample_names:
+        type: file
+        description: |
+          Optional TSV file with headers, passed to the MultiQC --sample_names
+          argument.
+ pattern: "*.{tsv}" output: - report: - type: file - description: MultiQC report file - pattern: "multiqc_report.html" + - "*multiqc_report.html": + type: file + description: MultiQC report file + pattern: "multiqc_report.html" - data: - type: directory - description: MultiQC data dir - pattern: "multiqc_data" + - "*_data": + type: directory + description: MultiQC data dir + pattern: "multiqc_data" - plots: - type: file - description: Plots created by MultiQC - pattern: "*_data" + - "*_plots": + type: file + description: Plots created by MultiQC + pattern: "*_data" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@bunop" diff --git a/modules/nf-core/multiqc/tests/main.nf.test b/modules/nf-core/multiqc/tests/main.nf.test index 6aa27f4c..33316a7d 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test +++ b/modules/nf-core/multiqc/tests/main.nf.test @@ -8,6 +8,8 @@ nextflow_process { tag "modules_nfcore" tag "multiqc" + config "./nextflow.config" + test("sarscov2 single-end [fastqc]") { when { diff --git a/modules/nf-core/multiqc/tests/main.nf.test.snap b/modules/nf-core/multiqc/tests/main.nf.test.snap index ef35f6d5..7b7c1322 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test.snap +++ b/modules/nf-core/multiqc/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "multiqc_versions_single": { "content": [ [ - "versions.yml:md5,0c5c5c2a79011c26b34b0b0e80b7c8e2" + "versions.yml:md5,8f3b8c1cec5388cf2708be948c9fa42f" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-10T12:41:34.562023" + "timestamp": "2025-01-27T09:29:57.631982377" }, "multiqc_stub": { "content": [ @@ -17,25 +17,25 @@ "multiqc_report.html", "multiqc_data", "multiqc_plots", - "versions.yml:md5,0c5c5c2a79011c26b34b0b0e80b7c8e2" + "versions.yml:md5,8f3b8c1cec5388cf2708be948c9fa42f" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-10T11:27:11.933869532" + "timestamp": "2025-01-27T09:30:34.743726958" }, "multiqc_versions_config": { "content": [ [ - "versions.yml:md5,0c5c5c2a79011c26b34b0b0e80b7c8e2" + "versions.yml:md5,8f3b8c1cec5388cf2708be948c9fa42f" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-10T11:26:56.709849369" + "timestamp": "2025-01-27T09:30:21.44383553" } -} +} \ No newline at end of file diff --git a/modules/nf-core/multiqc/tests/nextflow.config b/modules/nf-core/multiqc/tests/nextflow.config new file mode 100644 index 00000000..c537a6a3 --- /dev/null +++ b/modules/nf-core/multiqc/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MULTIQC' { + ext.prefix = null + } +} diff --git a/modules/nf-core/prodigal/environment.yml b/modules/nf-core/prodigal/environment.yml index 85746534..b2c7efcf 100644 --- a/modules/nf-core/prodigal/environment.yml +++ b/modules/nf-core/prodigal/environment.yml @@ -1,8 +1,8 @@ -name: prodigal +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::prodigal=2.6.3 - conda-forge::pigz=2.6 diff --git a/modules/nf-core/prodigal/meta.yml b/modules/nf-core/prodigal/meta.yml index a5d15d58..7d3d459e 100644 --- 
a/modules/nf-core/prodigal/meta.yml +++ b/modules/nf-core/prodigal/meta.yml @@ -1,55 +1,78 @@ name: prodigal -description: Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program +description: Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a + microbial (bacterial and archaeal) gene finding program keywords: - prokaryotes - gene finding - microbial tools: - prodigal: - description: Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program + description: Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) + is a microbial (bacterial and archaeal) gene finding program homepage: https://github.com/hyattpd/Prodigal documentation: https://github.com/hyattpd/prodigal/wiki tool_dev_url: https://github.com/hyattpd/Prodigal doi: "10.1186/1471-2105-11-119" licence: ["GPL v3"] + identifier: biotools:prodigal input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - genome: - type: file - description: fasta/fasta.gz file - - output_format: - type: string - description: Output format ("gbk"/"gff"/"sqn"/"sco") + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - genome: + type: file + description: fasta/fasta.gz file + - - output_format: + type: string + description: Output format ("gbk"/"gff"/"sqn"/"sco") output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - gene_annotations: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${output_format}.gz: + type: file + description: gene annotations in output_format given as input + pattern: "*.{output_format}" - nucleotide_fasta: - type: file - description: nucleotide sequences file - pattern: "*.{fna}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.fna.gz: + type: file + description: nucleotide sequences file + pattern: "*.{fna}" - amino_acid_fasta: - type: file - description: protein translations file - pattern: "*.{faa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.faa.gz: + type: file + description: protein translations file + pattern: "*.{faa}" - all_gene_annotations: - type: file - description: complete starts file - pattern: "*.{_all.txt}" - - gene_annotations: - type: file - description: gene annotations in output_format given as input - pattern: "*.{output_format}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}_all.txt.gz: + type: file + description: complete starts file + pattern: "*.{_all.txt}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@grst" maintainers: diff --git a/modules/nf-core/prokka/environment.yml b/modules/nf-core/prokka/environment.yml index d7c44d5a..b4687037 100644 --- a/modules/nf-core/prokka/environment.yml +++ b/modules/nf-core/prokka/environment.yml @@ -1,7 +1,8 @@ -name: prokka +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::prokka=1.14.6 + - conda-forge::openjdk=8.0.412 diff --git a/modules/nf-core/prokka/main.nf b/modules/nf-core/prokka/main.nf index adfda037..bf5e64fc 100644 --- a/modules/nf-core/prokka/main.nf +++ b/modules/nf-core/prokka/main.nf @@ -1,11 +1,11 @@ process PROKKA { - tag "$meta.id" + tag "${meta.id}" label 'process_low' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/prokka:1.14.6--pl5321hdfd78af_4' : - 'biocontainers/prokka:1.14.6--pl5321hdfd78af_4' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/3a/3af46b047c8fe84112adeaecf300878217c629b97f111f923ecf327656ddd141/data' : + 'community.wave.seqera.io/library/prokka_openjdk:10546cadeef11472' }" input: tuple val(meta), path(fasta) @@ -31,18 +31,49 @@ process PROKKA { task.ext.when == null || task.ext.when script: - def args = task.ext.args ?: '' - prefix = task.ext.prefix ?: "${meta.id}" - def proteins_opt = proteins ? "--proteins ${proteins[0]}" : "" - def prodigal_tf = prodigal_tf ? "--prodigaltf ${prodigal_tf[0]}" : "" + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + def input = fasta.toString() - ~/\.gz$/ + def decompress = fasta.getExtension() == "gz" ? "gunzip -c ${fasta} > ${input}" : "" + def cleanup = fasta.getExtension() == "gz" ? "rm ${input}" : "" + def proteins_opt = proteins ? "--proteins ${proteins}" : "" + def prodigal_tf_in = prodigal_tf ? 
"--prodigaltf ${prodigal_tf}" : "" """ + ${decompress} + prokka \\ - $args \\ - --cpus $task.cpus \\ - --prefix $prefix \\ - $proteins_opt \\ - $prodigal_tf \\ - $fasta + ${args} \\ + --cpus ${task.cpus} \\ + --prefix ${prefix} \\ + ${proteins_opt} \\ + ${prodigal_tf_in} \\ + ${input} + + ${cleanup} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + prokka: \$(echo \$(prokka --version 2>&1) | sed 's/^.*prokka //') + END_VERSIONS + """ + + stub: + prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir ${prefix} + touch ${prefix}/${prefix}.gff + touch ${prefix}/${prefix}.gbk + touch ${prefix}/${prefix}.fna + touch ${prefix}/${prefix}.faa + touch ${prefix}/${prefix}.ffn + touch ${prefix}/${prefix}.sqn + touch ${prefix}/${prefix}.fsa + touch ${prefix}/${prefix}.tbl + touch ${prefix}/${prefix}.err + touch ${prefix}/${prefix}.log + touch ${prefix}/${prefix}.txt + touch ${prefix}/${prefix}.tsv + touch ${prefix}/${prefix}.gff cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/prokka/meta.yml b/modules/nf-core/prokka/meta.yml index 9d82ffac..90745735 100644 --- a/modules/nf-core/prokka/meta.yml +++ b/modules/nf-core/prokka/meta.yml @@ -10,80 +10,151 @@ tools: homepage: https://github.com/tseemann/prokka doi: "10.1093/bioinformatics/btu153" licence: ["GPL v2"] + identifier: biotools:prokka input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: | - FASTA file to be annotated. Has to contain at least a non-empty string dummy value. - - proteins: - type: file - description: FASTA file of trusted proteins to first annotate from (optional) - - prodigal_tf: - type: file - description: Training file to use for Prodigal (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: | + FASTA file to be annotated. Has to contain at least a non-empty string dummy value. + - - proteins: + type: file + description: FASTA file of trusted proteins to first annotate from (optional) + - - prodigal_tf: + type: file + description: Training file to use for Prodigal (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - gff: - type: file - description: annotation in GFF3 format, containing both sequences and annotations - pattern: "*.{gff}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.gff: + type: file + description: annotation in GFF3 format, containing both sequences and annotations + pattern: "*.{gff}" - gbk: - type: file - description: annotation in GenBank format, containing both sequences and annotations - pattern: "*.{gbk}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.gbk: + type: file + description: annotation in GenBank format, containing both sequences and annotations + pattern: "*.{gbk}" - fna: - type: file - description: nucleotide FASTA file of the input contig sequences - pattern: "*.{fna}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/*.fna: + type: file + description: nucleotide FASTA file of the input contig sequences + pattern: "*.{fna}" - faa: - type: file - description: protein FASTA file of the translated CDS sequences - pattern: "*.{faa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.faa: + type: file + description: protein FASTA file of the translated CDS sequences + pattern: "*.{faa}" - ffn: - type: file - description: nucleotide FASTA file of all the prediction transcripts (CDS, rRNA, tRNA, tmRNA, misc_RNA) - pattern: "*.{ffn}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.ffn: + type: file + description: nucleotide FASTA file of all the prediction transcripts (CDS, rRNA, + tRNA, tmRNA, misc_RNA) + pattern: "*.{ffn}" - sqn: - type: file - description: an ASN1 format "Sequin" file for submission to Genbank - pattern: "*.{sqn}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.sqn: + type: file + description: an ASN1 format "Sequin" file for submission to Genbank + pattern: "*.{sqn}" - fsa: - type: file - description: nucleotide FASTA file of the input contig sequences, used by "tbl2asn" to create the .sqn file - pattern: "*.{fsa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.fsa: + type: file + description: nucleotide FASTA file of the input contig sequences, used by "tbl2asn" + to create the .sqn file + pattern: "*.{fsa}" - tbl: - type: file - description: feature Table file, used by "tbl2asn" to create the .sqn file - pattern: "*.{tbl}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.tbl: + type: file + description: feature Table file, used by "tbl2asn" to create the .sqn file + pattern: "*.{tbl}" - err: - type: file - description: unacceptable annotations - the NCBI discrepancy report. - pattern: "*.{err}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.err: + type: file + description: unacceptable annotations - the NCBI discrepancy report. + pattern: "*.{err}" - log: - type: file - description: contains all the output that Prokka produced during its run - pattern: "*.{log}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.log: + type: file + description: contains all the output that Prokka produced during its run + pattern: "*.{log}" - txt: - type: file - description: statistics relating to the annotated features found - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*.txt: + type: file + description: statistics relating to the annotated features found + pattern: "*.{txt}" - tsv: - type: file - description: tab-separated file of all features (locus_tag,ftype,len_bp,gene,EC_number,COG,product) - pattern: "*.{tsv}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/*.tsv: + type: file + description: tab-separated file of all features (locus_tag,ftype,len_bp,gene,EC_number,COG,product) + pattern: "*.{tsv}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" maintainers: diff --git a/modules/nf-core/prokka/tests/main.nf.test b/modules/nf-core/prokka/tests/main.nf.test index dca19bba..68150b33 100644 --- a/modules/nf-core/prokka/tests/main.nf.test +++ b/modules/nf-core/prokka/tests/main.nf.test @@ -47,4 +47,69 @@ nextflow_process { } + test("Prokka - sarscov2 - genome.fasta.gz") { + + when { + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true)) + ]) + input[1] = [] + input[2] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.gbk.get(0).get(1)).exists() }, + { assert path(process.out.log.get(0).get(1)).exists() }, + { assert path(process.out.sqn.get(0).get(1)).exists() }, + { assert snapshot( + process.out.gff, + process.out.fna, + process.out.faa, + process.out.ffn, + process.out.fsa, + process.out.tbl, + process.out.err, + process.out.txt, + process.out.tsv, + process.out.versions + ).match() + } + ) + } + + } + + test("Prokka - sarscov2 - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.fromList([ + tuple([ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true)) + ]) + input[1] = [] + input[2] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + } diff --git a/modules/nf-core/prokka/tests/main.nf.test.snap b/modules/nf-core/prokka/tests/main.nf.test.snap index 874c989d..35713a8f 100644 --- a/modules/nf-core/prokka/tests/main.nf.test.snap +++ b/modules/nf-core/prokka/tests/main.nf.test.snap @@ -91,5 +91,331 @@ "nextflow": "24.04.3" }, "timestamp": "2024-07-30T12:34:20.447734" + }, + "Prokka - sarscov2 - genome.fasta.gz": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.gff:md5,5dbfb8fcf2db020564c16045976a0933" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.fna:md5,787307f29a263e5657cc276ebbf7e2b3" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.faa:md5,a4ceda83262b3c222a6b1f508fb9e24b" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.ffn:md5,80f474b5367b7ea5ed23791935f65e34" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.fsa:md5,71bbefcb7f12046bcd3263f58cfd5404" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.tbl:md5,d8f816a066ced94b62d9618b13fb8add" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.err:md5,b3daedc646fddd422824e2b3e5e9229d" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.txt:md5,b40e485ffc8eaf1feacf8d79d9751a33" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.tsv:md5,da7c720c3018c5081d6a70b517b7d450" + ] + ], + [ + "versions.yml:md5,e83a22fe02167e290d90853b45650db9" + ] + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.3" + }, + "timestamp": "2024-12-19T09:48:05.110188714" + }, + "Prokka - sarscov2 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + 
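In the nf-test assertions above, `process.out.<emit>` is a list with one entry per emission, and each entry is the `[ meta, file ]` tuple itself, which is what the chained `get()` calls index into:

    // process.out.gbk.get(0)        -> first emission: [ [id:'test', ...], '/.../test.gbk' ]
    // process.out.gbk.get(0).get(1) -> the file path inside that tuple
    assert path(process.out.gbk.get(0).get(1)).exists()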
"single_end": false + }, + "test.gff:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + "test.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + "versions.yml:md5,e83a22fe02167e290d90853b45650db9" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fna:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.faa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.ffn:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.sqn:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fsa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tbl:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + [ + { + "id": "test", + "single_end": false + }, + "test.err:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "9": [ + [ + { + "id": "test", + "single_end": false + }, + "test.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "err": [ + [ + { + "id": "test", + "single_end": false + }, + "test.err:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "faa": [ + [ + { + "id": "test", + "single_end": false + }, + "test.faa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "ffn": [ + [ + { + "id": "test", + "single_end": false + }, + "test.ffn:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fna": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fna:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fsa": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fsa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "gbk": [ + [ + { + "id": "test", + "single_end": false + }, + "test.gbk:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "gff": [ + [ + { + "id": "test", + "single_end": false + }, + "test.gff:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "sqn": [ + [ + { + "id": "test", + "single_end": false + }, + "test.sqn:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbl": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tbl:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tsv": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "txt": [ + [ + { + "id": "test", + "single_end": false + }, + "test.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e83a22fe02167e290d90853b45650db9" + ] + } + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.3" + }, + "timestamp": "2025-01-06T10:51:57.362187225" } } \ No newline at end of file diff --git a/modules/nf-core/pyrodigal/environment.yml b/modules/nf-core/pyrodigal/environment.yml index 3e538e8c..26be9f32 100644 --- a/modules/nf-core/pyrodigal/environment.yml +++ b/modules/nf-core/pyrodigal/environment.yml @@ -1,10 +1,8 @@ --- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json -name: "pyrodigal" channels: - conda-forge - 
bioconda - - defaults dependencies: - - bioconda::pyrodigal=3.3.0 + - bioconda::pyrodigal=3.6.3 - conda-forge::pigz=2.8 diff --git a/modules/nf-core/pyrodigal/main.nf b/modules/nf-core/pyrodigal/main.nf index 7cb97594..9cbe8fcc 100644 --- a/modules/nf-core/pyrodigal/main.nf +++ b/modules/nf-core/pyrodigal/main.nf @@ -1,11 +1,11 @@ process PYRODIGAL { tag "$meta.id" - label 'process_single' + label 'process_medium' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mulled-v2-2fe9a8ce513c91df34b43a6610df94c3a2eb3bd0:47e7d40834619419f202394563267d74cef857be-0': - 'biocontainers/mulled-v2-2fe9a8ce513c91df34b43a6610df94c3a2eb3bd0:47e7d40834619419f202394563267d74cef857be-0' }" + 'https://depot.galaxyproject.org/singularity/mulled-v2-2fe9a8ce513c91df34b43a6610df94c3a2eb3bd0:da1134ad604a59a6f439bdcc3f6df690eba47e9a-0': + 'biocontainers/mulled-v2-2fe9a8ce513c91df34b43a6610df94c3a2eb3bd0:da1134ad604a59a6f439bdcc3f6df690eba47e9a-0' }" input: tuple val(meta), path(fasta) @@ -28,6 +28,7 @@ process PYRODIGAL { pigz -cdf ${fasta} > pigz_fasta.fna pyrodigal \\ + -j ${task.cpus} \\ $args \\ -i pigz_fasta.fna \\ -f $output_format \\ diff --git a/modules/nf-core/pyrodigal/meta.yml b/modules/nf-core/pyrodigal/meta.yml index 0967606f..d8394d07 100644 --- a/modules/nf-core/pyrodigal/meta.yml +++ b/modules/nf-core/pyrodigal/meta.yml @@ -1,5 +1,6 @@ name: "pyrodigal" -description: Pyrodigal is a Python module that provides bindings to Prodigal, a fast, reliable protein-coding gene prediction for prokaryotic genomes. +description: Pyrodigal is a Python module that provides bindings to Prodigal, a fast, + reliable protein-coding gene prediction for prokaryotic genomes. keywords: - sort - annotation @@ -7,52 +8,75 @@ keywords: - prokaryote tools: - "pyrodigal": - description: "Pyrodigal is a Python module that provides bindings to Prodigal (ORF finder for microbial sequences) using Cython." + description: "Pyrodigal is a Python module that provides bindings to Prodigal + (ORF finder for microbial sequences) using Cython." homepage: "https://pyrodigal.readthedocs.org/" documentation: "https://pyrodigal.readthedocs.org/" tool_dev_url: "https://github.com/althonos/pyrodigal/" doi: "10.21105/joss.04296" licence: ["GPL v3"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: FASTA file - pattern: "*.{fasta.gz,fa.gz,fna.gz}" - - output_format: - type: string - description: Output format - pattern: "{gbk,gff}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: FASTA file + pattern: "*.{fasta.gz,fa.gz,fna.gz}" + - - output_format: + type: string + description: Output format + pattern: "{gbk,gff}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - annotations: - type: file - description: Gene annotations. The file format is specified via input channel "output_format". - pattern: "*.{gbk,gff}.gz" - - faa: - type: file - description: protein translations file - pattern: "*.{faa.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
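With the new `-j ${task.cpus}` flag, Pyrodigal's thread count follows whatever the `process_medium` label (or a user override) assigns to the task. A standard per-run override via the usual nf-core custom-config mechanism:

    // custom.config sketch: raise the CPU allocation that the module
    // forwards to `pyrodigal -j`.
    process {
        withName: 'PYRODIGAL' {
            cpus = 8
        }
    }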
[ id:'test', single_end:false ] + - "*.${output_format}.gz": + type: file + description: Gene annotations. The file format is specified via input channel + "output_format". + pattern: "*.{gbk,gff}.gz" - fna: - type: file - description: nucleotide sequences file - pattern: "*.{fna.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fna.gz": + type: file + description: nucleotide sequences file + pattern: "*.{fna.gz}" + - faa: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.faa.gz": + type: file + description: protein translations file + pattern: "*.{faa.gz}" - score: - type: file - description: all potential genes (with scores) - pattern: "*.{score.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.score.gz": + type: file + description: all potential genes (with scores) + pattern: "*.{score.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@louperelo" maintainers: diff --git a/modules/nf-core/pyrodigal/tests/main.nf.test.snap b/modules/nf-core/pyrodigal/tests/main.nf.test.snap index 827fdaaa..3d56b9f1 100644 --- a/modules/nf-core/pyrodigal/tests/main.nf.test.snap +++ b/modules/nf-core/pyrodigal/tests/main.nf.test.snap @@ -5,21 +5,21 @@ "test.fna.gz", "test.faa.gz", "test.score.gz", - "versions.yml:md5,4aab54554829148e01cc0dc7bf6cb5d3" + "versions.yml:md5,296cc4ed71c8eb16bbc6978fe6299b77" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-03-18T15:42:12.012112014" + "timestamp": "2024-12-02T15:17:12.218638993" }, "pyrodigal - sarscov2 - gbk": { "content": [ [ " CDS 310..13476", " /codon_start=1", - " /inference=\"ab initio prediction:pyrodigal:3.3.0\"", + " /inference=\"ab initio prediction:pyrodigal:3.6.3\"", " /locus_tag=\"MT192765.1_1\"", " /transl_table=11", " /translation=\"MPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGVLP", @@ -51,18 +51,18 @@ "id": "test", "single_end": false }, - "test.score.gz:md5,c0703a9e662ae0b21c7bbb082ef3fb5f" + "test.score.gz:md5,63e6975e705be1fe749eb54bd4ea478e" ] ], [ - "versions.yml:md5,4aab54554829148e01cc0dc7bf6cb5d3" + "versions.yml:md5,296cc4ed71c8eb16bbc6978fe6299b77" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-07-30T06:09:40.289778252" + "timestamp": "2024-12-02T15:17:01.228814939" }, "pyrodigal - sarscov2 - gff": { "content": [ @@ -73,7 +73,7 @@ "id": "test", "single_end": false }, - "test.gff.gz:md5,8fcd2d93131cf9fb0c82b81db059ad27" + "test.gff.gz:md5,898c1e24e71fa108981597b8bb32110f" ] ], "1": [ @@ -100,11 +100,11 @@ "id": "test", "single_end": false }, - "test.score.gz:md5,c0703a9e662ae0b21c7bbb082ef3fb5f" + "test.score.gz:md5,63e6975e705be1fe749eb54bd4ea478e" ] ], "4": [ - "versions.yml:md5,4aab54554829148e01cc0dc7bf6cb5d3" + "versions.yml:md5,296cc4ed71c8eb16bbc6978fe6299b77" ], "annotations": [ [ @@ -112,7 +112,7 @@ "id": "test", "single_end": false }, - "test.gff.gz:md5,8fcd2d93131cf9fb0c82b81db059ad27" + "test.gff.gz:md5,898c1e24e71fa108981597b8bb32110f" ] ], "faa": [ @@ -139,19 +139,19 @@ "id": "test", "single_end": false }, - "test.score.gz:md5,c0703a9e662ae0b21c7bbb082ef3fb5f" + "test.score.gz:md5,63e6975e705be1fe749eb54bd4ea478e" ] ], "versions": [ - 
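The annotations entry above keys its glob on `${output_format}`, meaning the same channel delivers `*.gff.gz` or `*.gbk.gz` depending on the value passed as the second input. A usage sketch consistent with the meta.yml (channel name taken from it, the rest assumed):

    // The format string chosen at call time decides the annotations file name.
    PYRODIGAL( ch_contigs, 'gff' )
    PYRODIGAL.out.annotations.view()   // [ [id:'test', ...], test.gff.gz ]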
"versions.yml:md5,4aab54554829148e01cc0dc7bf6cb5d3" + "versions.yml:md5,296cc4ed71c8eb16bbc6978fe6299b77" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-03-18T15:41:55.822235843" + "timestamp": "2024-12-02T15:16:49.907998584" }, "pyrodigal - sarscov2 - gbk - stub": { "content": [ @@ -159,13 +159,13 @@ "test.fna.gz", "test.faa.gz", "test.score.gz", - "versions.yml:md5,4aab54554829148e01cc0dc7bf6cb5d3" + "versions.yml:md5,296cc4ed71c8eb16bbc6978fe6299b77" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-03-18T15:42:19.81157751" + "timestamp": "2024-12-02T15:17:22.681680508" } } \ No newline at end of file diff --git a/modules/nf-core/rgi/cardannotation/environment.yml b/modules/nf-core/rgi/cardannotation/environment.yml index f1c5872a..a3169324 100644 --- a/modules/nf-core/rgi/cardannotation/environment.yml +++ b/modules/nf-core/rgi/cardannotation/environment.yml @@ -1,7 +1,7 @@ -name: rgi_cardannotation +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::rgi=6.0.3 diff --git a/modules/nf-core/rgi/cardannotation/meta.yml b/modules/nf-core/rgi/cardannotation/meta.yml index 97e6911d..8aff020f 100644 --- a/modules/nf-core/rgi/cardannotation/meta.yml +++ b/modules/nf-core/rgi/cardannotation/meta.yml @@ -1,37 +1,46 @@ name: rgi_cardannotation -description: Preprocess the CARD database for RGI to predict antibiotic resistance from protein or nucleotide data +description: Preprocess the CARD database for RGI to predict antibiotic resistance + from protein or nucleotide data keywords: - bacteria - fasta - antibiotic resistance tools: - rgi: - description: This module preprocesses the downloaded Comprehensive Antibiotic Resistance Database (CARD) which can then be used as input for RGI. + description: This module preprocesses the downloaded Comprehensive Antibiotic + Resistance Database (CARD) which can then be used as input for RGI. 
homepage: https://card.mcmaster.ca documentation: https://github.com/arpcard/rgi tool_dev_url: https://github.com/arpcard/rgi doi: "10.1093/nar/gkz935" licence: ["https://card.mcmaster.ca/about"] + identifier: "" input: - - card: - type: directory - description: Directory containing the CARD database - pattern: "*/" + - - card: + type: directory + description: Directory containing the CARD database + pattern: "*/" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - db: - type: directory - description: Directory containing the processed CARD database files - pattern: "*/" + - card_database_processed: + type: directory + description: Directory containing the processed CARD database files + pattern: "*/" - tool_version: - type: string - description: The version of the tool in string format (useful for downstream tools such as hAMRronization) + - RGI_VERSION: + type: string + description: The version of the tool in string format (useful for downstream + tools such as hAMRronization) - db_version: - type: string - description: The version of the used database in string format (useful for downstream tools such as hAMRronization) + - DB_VERSION: + type: string + description: The version of the used database in string format (useful for downstream + tools such as hAMRronization) + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" - "@jfy133" diff --git a/modules/nf-core/rgi/cardannotation/tests/main.nf.test.snap b/modules/nf-core/rgi/cardannotation/tests/main.nf.test.snap index 5d58124d..74a2f428 100644 --- a/modules/nf-core/rgi/cardannotation/tests/main.nf.test.snap +++ b/modules/nf-core/rgi/cardannotation/tests/main.nf.test.snap @@ -46,73 +46,73 @@ "0": [ [ "CARD-Download-README.txt:md5,ca330e1d89e3a97ac6f50c86a8ca5c34", - "aro_categories.tsv:md5,ba2f33c43b199cd62ae5663125ce316e", - "aro_categories_index.tsv:md5,39f995f2356b6a0cb5fd34e3c6ffc8e1", - "aro_index.tsv:md5,b7250ed3208c8497ec2371527a689eeb", - "card.json:md5,e2cb53b1706a602d5265d2284a1fcdd5", - "card_database_v3.2.9.fasta:md5,0839d4447860694782a5db5cd6eae085", - "card_database_v3.2.9_all.fasta:md5,5295875faf06bef62ea954fef40958c3", - "nucleotide_fasta_protein_homolog_model.fasta:md5,ebcd48a6c9e14f339ffd9d2673eed803", + "aro_categories.tsv:md5,cdefc6d0169bc7a077020022be68e38b", + "aro_categories_index.tsv:md5,f99f2fed0cf357c7c3e7e39e4b880ca2", + "aro_index.tsv:md5,3052f507daff81356f4e985025928217", + "card.json:md5,c9550768ded14c01a56c98e3c4931176", + "card_database_v3.3.0.fasta:md5,b3fd50f7946aed8009c131a3c1454728", + "card_database_v3.3.0_all.fasta:md5,81ffb872759695abd1023c0b5f8fe0d5", + "nucleotide_fasta_protein_homolog_model.fasta:md5,93fcfd413dda3056612f725d5bc06356", "nucleotide_fasta_protein_knockout_model.fasta:md5,ff476b358ef70da53acf4602568a9b9b", "nucleotide_fasta_protein_overexpression_model.fasta:md5,68937e587c880153400fa8203f6a90d5", - "nucleotide_fasta_protein_variant_model.fasta:md5,1ff9cbaf0d640e2084f13751309f8176", - "nucleotide_fasta_rRNA_gene_variant_model.fasta:md5,b88fbe1d6de44b2ff2819ee63d001d75", - "protein_fasta_protein_homolog_model.fasta:md5,130a0947c60d18ef2e7d0ab886f80af3", + "nucleotide_fasta_protein_variant_model.fasta:md5,58a4644e05df59af7a918f25b61e5a22", + "nucleotide_fasta_rRNA_gene_variant_model.fasta:md5,bd53f46d630f652c9f6b7584c2126e1f", + "protein_fasta_protein_homolog_model.fasta:md5,63a89932339a665c390cebd50627f19b", 
"protein_fasta_protein_knockout_model.fasta:md5,6b259399e3eae3f23eaa421bbba6ba25", "protein_fasta_protein_overexpression_model.fasta:md5,758b753b821789147cdd795c654940ad", - "protein_fasta_protein_variant_model.fasta:md5,ec46ea3d9dc7ab01ec22cf265e410c88", - "shortname_antibiotics.tsv:md5,9d20abb9f6d37ed0cecc1573867ca49a", - "shortname_pathogens.tsv:md5,ae267113de686bc8f58eab5845cc343b", - "snps.txt:md5,ee6dfbe7a65f3ffdb6968822c47e4550" + "protein_fasta_protein_variant_model.fasta:md5,7fb7bbf0001837a59504d406ece90807", + "shortname_antibiotics.tsv:md5,86eaefabf930b91bf08d3630abdd0a3b", + "shortname_pathogens.tsv:md5,4a69150eeec95693727f0cc178c0770a", + "snps.txt:md5,2f7a6bea480a7e3a6fc7f7f763c4b3fe" ] ], "1": [ "6.0.3" ], "2": [ - "3.2.9" + "3.3.0" ], "3": [ - "versions.yml:md5,43f331ec71ec01a1bae10e30f4ce4f26" + "versions.yml:md5,51bd8e4be5e532c5bdcfbb67c06dd808" ], "db": [ [ "CARD-Download-README.txt:md5,ca330e1d89e3a97ac6f50c86a8ca5c34", - "aro_categories.tsv:md5,ba2f33c43b199cd62ae5663125ce316e", - "aro_categories_index.tsv:md5,39f995f2356b6a0cb5fd34e3c6ffc8e1", - "aro_index.tsv:md5,b7250ed3208c8497ec2371527a689eeb", - "card.json:md5,e2cb53b1706a602d5265d2284a1fcdd5", - "card_database_v3.2.9.fasta:md5,0839d4447860694782a5db5cd6eae085", - "card_database_v3.2.9_all.fasta:md5,5295875faf06bef62ea954fef40958c3", - "nucleotide_fasta_protein_homolog_model.fasta:md5,ebcd48a6c9e14f339ffd9d2673eed803", + "aro_categories.tsv:md5,cdefc6d0169bc7a077020022be68e38b", + "aro_categories_index.tsv:md5,f99f2fed0cf357c7c3e7e39e4b880ca2", + "aro_index.tsv:md5,3052f507daff81356f4e985025928217", + "card.json:md5,c9550768ded14c01a56c98e3c4931176", + "card_database_v3.3.0.fasta:md5,b3fd50f7946aed8009c131a3c1454728", + "card_database_v3.3.0_all.fasta:md5,81ffb872759695abd1023c0b5f8fe0d5", + "nucleotide_fasta_protein_homolog_model.fasta:md5,93fcfd413dda3056612f725d5bc06356", "nucleotide_fasta_protein_knockout_model.fasta:md5,ff476b358ef70da53acf4602568a9b9b", "nucleotide_fasta_protein_overexpression_model.fasta:md5,68937e587c880153400fa8203f6a90d5", - "nucleotide_fasta_protein_variant_model.fasta:md5,1ff9cbaf0d640e2084f13751309f8176", - "nucleotide_fasta_rRNA_gene_variant_model.fasta:md5,b88fbe1d6de44b2ff2819ee63d001d75", - "protein_fasta_protein_homolog_model.fasta:md5,130a0947c60d18ef2e7d0ab886f80af3", + "nucleotide_fasta_protein_variant_model.fasta:md5,58a4644e05df59af7a918f25b61e5a22", + "nucleotide_fasta_rRNA_gene_variant_model.fasta:md5,bd53f46d630f652c9f6b7584c2126e1f", + "protein_fasta_protein_homolog_model.fasta:md5,63a89932339a665c390cebd50627f19b", "protein_fasta_protein_knockout_model.fasta:md5,6b259399e3eae3f23eaa421bbba6ba25", "protein_fasta_protein_overexpression_model.fasta:md5,758b753b821789147cdd795c654940ad", - "protein_fasta_protein_variant_model.fasta:md5,ec46ea3d9dc7ab01ec22cf265e410c88", - "shortname_antibiotics.tsv:md5,9d20abb9f6d37ed0cecc1573867ca49a", - "shortname_pathogens.tsv:md5,ae267113de686bc8f58eab5845cc343b", - "snps.txt:md5,ee6dfbe7a65f3ffdb6968822c47e4550" + "protein_fasta_protein_variant_model.fasta:md5,7fb7bbf0001837a59504d406ece90807", + "shortname_antibiotics.tsv:md5,86eaefabf930b91bf08d3630abdd0a3b", + "shortname_pathogens.tsv:md5,4a69150eeec95693727f0cc178c0770a", + "snps.txt:md5,2f7a6bea480a7e3a6fc7f7f763c4b3fe" ] ], "db_version": [ - "3.2.9" + "3.3.0" ], "tool_version": [ "6.0.3" ], "versions": [ - "versions.yml:md5,43f331ec71ec01a1bae10e30f4ce4f26" + "versions.yml:md5,51bd8e4be5e532c5bdcfbb67c06dd808" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + 
"nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-02-19T23:33:06.962413561" + "timestamp": "2024-12-17T19:00:14.248138522" } } \ No newline at end of file diff --git a/modules/nf-core/rgi/main/environment.yml b/modules/nf-core/rgi/main/environment.yml index f229cc21..a3169324 100644 --- a/modules/nf-core/rgi/main/environment.yml +++ b/modules/nf-core/rgi/main/environment.yml @@ -1,7 +1,7 @@ -name: rgi_main +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::rgi=6.0.3 diff --git a/modules/nf-core/rgi/main/main.nf b/modules/nf-core/rgi/main/main.nf index ba05358a..287875fa 100644 --- a/modules/nf-core/rgi/main/main.nf +++ b/modules/nf-core/rgi/main/main.nf @@ -25,7 +25,7 @@ process RGI_MAIN { script: def args = task.ext.args ?: '' // This customizes the command: rgi load - def args2 = task.ext.args ?: '' // This customizes the command: rgi main + def args2 = task.ext.args2 ?: '' // This customizes the command: rgi main def prefix = task.ext.prefix ?: "${meta.id}" def load_wildcard = "" diff --git a/modules/nf-core/rgi/main/meta.yml b/modules/nf-core/rgi/main/meta.yml index 7e444c8b..9d9836c0 100644 --- a/modules/nf-core/rgi/main/meta.yml +++ b/modules/nf-core/rgi/main/meta.yml @@ -6,59 +6,86 @@ keywords: - antibiotic resistance tools: - rgi: - description: This tool provides a preliminary annotation of your DNA sequence(s) based upon the data available in The Comprehensive Antibiotic Resistance Database (CARD). Hits to genes tagged with Antibiotic Resistance ontology terms will be highlighted. As CARD expands to include more pathogens, genomes, plasmids, and ontology terms this tool will grow increasingly powerful in providing first-pass detection of antibiotic resistance associated genes. See license at CARD website + description: This tool provides a preliminary annotation of your DNA sequence(s) + based upon the data available in The Comprehensive Antibiotic Resistance Database + (CARD). Hits to genes tagged with Antibiotic Resistance ontology terms will + be highlighted. As CARD expands to include more pathogens, genomes, plasmids, + and ontology terms this tool will grow increasingly powerful in providing first-pass + detection of antibiotic resistance associated genes. See license at CARD website homepage: https://card.mcmaster.ca documentation: https://github.com/arpcard/rgi tool_dev_url: https://github.com/arpcard/rgi doi: "10.1093/nar/gkz935" licence: ["https://card.mcmaster.ca/about"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: Nucleotide or protein sequences in FASTA format - pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,faa,faa.gz}" - - card: - type: directory - description: Directory containing the CARD database. This is expected to be the unarchived but otherwise unaltered download folder (see RGI documentation for download instructions). - pattern: "*/" - - wildcard: - type: directory - description: Directory containing the WildCARD database (optional). This is expected to be the unarchived but otherwise unaltered download folder (see RGI documentation for download instructions). - pattern: "*/" - + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - fasta: + type: file + description: Nucleotide or protein sequences in FASTA format + pattern: "*.{fasta,fasta.gz,fa,fa.gz,fna,fna.gz,faa,faa.gz}" + - - card: + type: directory + description: Directory containing the CARD database. This is expected to be + the unarchived but otherwise unaltered download folder (see RGI documentation + for download instructions). + pattern: "*/" + - - wildcard: + type: directory + description: Directory containing the WildCARD database (optional). This is + expected to be the unarchived but otherwise unaltered download folder (see + RGI documentation for download instructions). + pattern: "*/" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - json: - type: file - description: JSON formatted file with RGI results - pattern: "*.{json}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: JSON formatted file with RGI results + pattern: "*.{json}" - tsv: - type: file - description: Tab-delimited file with RGI results - pattern: "*.{txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.txt": + type: file + description: Tab-delimited file with RGI results + pattern: "*.{txt}" - tmp: - type: directory - description: Directory containing various intermediate files - pattern: "temp/" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - temp/: + type: directory + description: Directory containing various intermediate files + pattern: "temp/" - tool_version: - type: string - description: The version of the tool in string format (useful for downstream tools such as hAMRronization) + - RGI_VERSION: + type: string + description: The version of the tool in string format (useful for downstream + tools such as hAMRronization) - db_version: - type: string - description: The version of the used database in string format (useful for downstream tools such as hAMRronization) + - DB_VERSION: + type: string + description: The version of the used database in string format (useful for downstream + tools such as hAMRronization) + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@rpetit3" - "@jfy133" diff --git a/modules/nf-core/rgi/main/tests/main.nf.test.snap b/modules/nf-core/rgi/main/tests/main.nf.test.snap index a8dc1d61..35b4f7cf 100644 --- a/modules/nf-core/rgi/main/tests/main.nf.test.snap +++ b/modules/nf-core/rgi/main/tests/main.nf.test.snap @@ -81,15 +81,15 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-02-19T22:51:36.047807514" + "timestamp": "2024-12-10T13:33:59.209306934" }, "rgi/main - haemophilus_influenzae - genome_fna_gz": { "content": [ [ - "versions.yml:md5,a9f89e3bebd538efa07bcbe9fe1ba37a" + "versions.yml:md5,306dec3569e66a74bff07184f2f801ec" ], [ [ @@ -131,13 +131,13 @@ "6.0.3" ], [ - "3.2.9" + "3.3.0" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.2" }, - "timestamp": "2024-02-19T22:51:14.372178941" + "timestamp": "2024-12-10T13:33:40.479165988" } } \ No newline at end of 
file diff --git a/modules/nf-core/seqkit/seq/environment.yml b/modules/nf-core/seqkit/seq/environment.yml index 74e0dd76..b26fb1eb 100644 --- a/modules/nf-core/seqkit/seq/environment.yml +++ b/modules/nf-core/seqkit/seq/environment.yml @@ -1,9 +1,7 @@ --- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json -name: "seqkit_seq" channels: - conda-forge - bioconda - - defaults dependencies: - - "bioconda::seqkit=2.8.1" + - bioconda::seqkit=2.9.0 diff --git a/modules/nf-core/seqkit/seq/main.nf b/modules/nf-core/seqkit/seq/main.nf index d7d38fc8..9d76da21 100644 --- a/modules/nf-core/seqkit/seq/main.nf +++ b/modules/nf-core/seqkit/seq/main.nf @@ -5,8 +5,8 @@ process SEQKIT_SEQ { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/seqkit:2.8.1--h9ee0642_0': - 'biocontainers/seqkit:2.8.1--h9ee0642_0' }" + 'https://depot.galaxyproject.org/singularity/seqkit:2.9.0--h9ee0642_0': + 'biocontainers/seqkit:2.9.0--h9ee0642_0' }" input: tuple val(meta), path(fastx) diff --git a/modules/nf-core/seqkit/seq/meta.yml b/modules/nf-core/seqkit/seq/meta.yml index 8d4e2b16..7d32aba5 100644 --- a/modules/nf-core/seqkit/seq/meta.yml +++ b/modules/nf-core/seqkit/seq/meta.yml @@ -1,7 +1,7 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json name: "seqkit_seq" -description: Transforms sequences (extract ID, filter by length, remove gaps, reverse complement...) +description: Transforms sequences (extract ID, filter by length, remove gaps, reverse + complement...) keywords: - genomics - fasta @@ -18,30 +18,33 @@ tools: tool_dev_url: "https://github.com/shenwei356/seqkit" doi: "10.1371/journal.pone.0163962" licence: ["MIT"] + identifier: biotools:seqkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'sample1' ]` - - fastx: - type: file - description: Input fasta/fastq file - pattern: "*.{fsa,fas,fa,fasta,fastq,fq,fsa.gz,fas.gz,fa.gz,fasta.gz,fastq.gz,fq.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1' ]` + - fastx: + type: file + description: Input fasta/fastq file + pattern: "*.{fsa,fas,fa,fasta,fastq,fq,fsa.gz,fas.gz,fa.gz,fasta.gz,fastq.gz,fq.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'sample1' ]` - fastx: - type: file - description: Output fasta/fastq file - pattern: "*.{fasta,fasta.gz,fastq,fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1' ]` + - ${prefix}.*: + type: file + description: Output fasta/fastq file + pattern: "*.{fasta,fasta.gz,fastq,fastq.gz}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GallVp" maintainers: diff --git a/modules/nf-core/seqkit/seq/tests/main.nf.test.snap b/modules/nf-core/seqkit/seq/tests/main.nf.test.snap index e6910966..68171935 100644 --- a/modules/nf-core/seqkit/seq/tests/main.nf.test.snap +++ b/modules/nf-core/seqkit/seq/tests/main.nf.test.snap @@ -11,7 +11,7 @@ ] ], "1": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ], "fastx": [ [ @@ -22,15 +22,15 @@ ] ], "versions": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-05-08T08:52:18.220051903" + "timestamp": "2025-01-15T15:13:34.513457" }, "sarscov2-test_1_fastq_gz": { "content": [ @@ -44,7 +44,7 @@ ] ], "1": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ], "fastx": [ [ @@ -55,15 +55,15 @@ ] ], "versions": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-05-08T08:51:55.607826581" + "timestamp": "2025-01-15T15:13:27.316329" }, "sarscov2-genome_fasta": { "content": [ @@ -77,7 +77,7 @@ ] ], "1": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ], "fastx": [ [ @@ -88,15 +88,15 @@ ] ], "versions": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-05-08T08:51:27.717072933" + "timestamp": "2025-01-15T15:13:18.463038" }, "sarscov2-genome_fasta_gz": { "content": [ @@ -110,7 +110,7 @@ ] ], "1": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ], "fastx": [ [ @@ -121,14 +121,14 @@ ] ], "versions": [ - "versions.yml:md5,34894c4efa5e10a923e78975a3d260dd" + "versions.yml:md5,eeb475e557ef671d4b58e11f82d2448e" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-05-08T08:51:37.917560104" + "timestamp": "2025-01-15T15:13:22.960973" } } \ No newline at end of file diff --git a/modules/nf-core/tabix/bgzip/environment.yml b/modules/nf-core/tabix/bgzip/environment.yml index 56cc0fb1..fe48f542 100644 --- a/modules/nf-core/tabix/bgzip/environment.yml +++ b/modules/nf-core/tabix/bgzip/environment.yml @@ -1,8 +1,9 @@ -name: tabix_bgzip +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults + dependencies: - - bioconda::tabix=1.11 - bioconda::htslib=1.20 + - bioconda::tabix=1.11 diff --git a/modules/nf-core/tabix/bgzip/meta.yml b/modules/nf-core/tabix/bgzip/meta.yml index 621d49ea..131e92cf 100644 --- a/modules/nf-core/tabix/bgzip/meta.yml +++ 
b/modules/nf-core/tabix/bgzip/meta.yml @@ -13,33 +13,42 @@ tools: documentation: http://www.htslib.org/doc/bgzip.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:tabix input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: file to compress or to decompress + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: file to compress or to decompress output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - output: - type: file - description: Output compressed/decompressed file - pattern: "*." + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${output}: + type: file + description: Output compressed/decompressed file + pattern: "*." - gzi: - type: file - description: Optional gzip index file for compressed inputs - pattern: "*.gzi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${output}.gzi: + type: file + description: Optional gzip index file for compressed inputs + pattern: "*.gzi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/untar/environment.yml b/modules/nf-core/untar/environment.yml index 4f498244..9b926b1f 100644 --- a/modules/nf-core/untar/environment.yml +++ b/modules/nf-core/untar/environment.yml @@ -1,9 +1,12 @@ -name: untar +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: + - conda-forge::coreutils=9.5 - conda-forge::grep=3.11 + - conda-forge::gzip=1.13 + - conda-forge::lbzip2=2.5 - conda-forge::sed=4.8 - conda-forge::tar=1.34 diff --git a/modules/nf-core/untar/main.nf b/modules/nf-core/untar/main.nf index c651bdad..e712ebe6 100644 --- a/modules/nf-core/untar/main.nf +++ b/modules/nf-core/untar/main.nf @@ -1,46 +1,46 @@ process UNTAR { - tag "$archive" + tag "${archive}" label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ubuntu:20.04' : - 'nf-core/ubuntu:20.04' }" + container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/52/52ccce28d2ab928ab862e25aae26314d69c8e38bd41ca9431c67ef05221348aa/data' + : 'community.wave.seqera.io/library/coreutils_grep_gzip_lbzip2_pruned:838ba80435a629f8'}" input: tuple val(meta), path(archive) output: - tuple val(meta), path("$prefix"), emit: untar - path "versions.yml" , emit: versions + tuple val(meta), path("${prefix}"), emit: untar + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when script: - def args = task.ext.args ?: '' + def args = task.ext.args ?: '' def args2 = task.ext.args2 ?: '' - prefix = task.ext.prefix ?: ( meta.id ? 
"${meta.id}" : archive.baseName.toString().replaceFirst(/\.tar$/, "")) + prefix = task.ext.prefix ?: (meta.id ? "${meta.id}" : archive.baseName.toString().replaceFirst(/\.tar$/, "")) """ - mkdir $prefix + mkdir ${prefix} ## Ensures --strip-components only applied when top level of tar contents is a directory ## If just files or multiple directories, place all in prefix if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then tar \\ - -C $prefix --strip-components 1 \\ + -C ${prefix} --strip-components 1 \\ -xavf \\ - $args \\ - $archive \\ - $args2 + ${args} \\ + ${archive} \\ + ${args2} else tar \\ - -C $prefix \\ + -C ${prefix} \\ -xavf \\ - $args \\ - $archive \\ - $args2 + ${args} \\ + ${archive} \\ + ${args2} fi cat <<-END_VERSIONS > versions.yml @@ -50,7 +50,7 @@ process UNTAR { """ stub: - prefix = task.ext.prefix ?: ( meta.id ? "${meta.id}" : archive.toString().replaceFirst(/\.[^\.]+(.gz)?$/, "")) + prefix = task.ext.prefix ?: (meta.id ? "${meta.id}" : archive.toString().replaceFirst(/\.[^\.]+(.gz)?$/, "")) """ mkdir ${prefix} ## Dry-run untaring the archive to get the files and place all in prefix diff --git a/modules/nf-core/untar/meta.yml b/modules/nf-core/untar/meta.yml index a9a2110f..3a37bb35 100644 --- a/modules/nf-core/untar/meta.yml +++ b/modules/nf-core/untar/meta.yml @@ -10,30 +10,36 @@ tools: Extract tar.gz files. documentation: https://www.gnu.org/software/tar/manual/ licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - archive: - type: file - description: File to be untar - pattern: "*.{tar}.{gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - archive: + type: file + description: File to be untar + pattern: "*.{tar}.{gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - untar: - type: directory - description: Directory containing contents of archive - pattern: "*/" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*/" + - ${prefix}: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*/" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/untar/untar.diff b/modules/nf-core/untar/untar.diff deleted file mode 100644 index 457dd66d..00000000 --- a/modules/nf-core/untar/untar.diff +++ /dev/null @@ -1,16 +0,0 @@ -Changes in module 'nf-core/untar' ---- modules/nf-core/untar/main.nf -+++ modules/nf-core/untar/main.nf -@@ -4,8 +4,8 @@ - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
-- 'https://depot.galaxyproject.org/singularity/ubuntu:22.04' : -- 'nf-core/ubuntu:22.04' }" -+ 'https://depot.galaxyproject.org/singularity/ubuntu:20.04' : -+ 'nf-core/ubuntu:20.04' }" - - input: - tuple val(meta), path(archive) - -************************************************************ diff --git a/nextflow.config b/nextflow.config index dea546e0..2ea2baf6 100644 --- a/nextflow.config +++ b/nextflow.config @@ -10,358 +10,347 @@ params { // Input options - input = null + input = null // MultiQC options - multiqc_config = null - multiqc_title = null - multiqc_logo = null - max_multiqc_email_size = '25.MB' - multiqc_methods_description = null + multiqc_config = null + multiqc_title = null + multiqc_logo = null + max_multiqc_email_size = '25.MB' + multiqc_methods_description = null // Boilerplate options - outdir = null - publish_dir_mode = 'copy' - email = null - email_on_fail = null - plaintext_email = false - monochrome_logs = false - hook_url = null - help = false - version = false - pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' - - // To stop the random warning coming from nf-validation, remove on upgrade to nf-schema - monochromeLogs = null + outdir = null + publish_dir_mode = 'copy' + email = null + email_on_fail = null + plaintext_email = false + monochrome_logs = false + hook_url = null + help = false + help_full = false + show_hidden = false + version = false + pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' + trace_report_suffix = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + // Config options // Taxonomy classification options - run_taxa_classification = false - taxa_classification_tool = 'mmseqs2' - - taxa_classification_mmseqs_db = null - taxa_classification_mmseqs_db_id = 'Kalamari' - taxa_classification_mmseqs_db_savetmp = false - - taxa_classification_mmseqs_taxonomy_savetmp = false - taxa_classification_mmseqs_taxonomy_searchtype = 2 - taxa_classification_mmseqs_taxonomy_lcaranks = 'kingdom,phylum,class,order,family,genus,species' - taxa_classification_mmseqs_taxonomy_taxlineage = 1 - taxa_classification_mmseqs_taxonomy_sensitivity = '5.0' - taxa_classification_mmseqs_taxonomy_orffilters = '2.0' - taxa_classification_mmseqs_taxonomy_lcamode = 3 - taxa_classification_mmseqs_taxonomy_votemode = 1 + run_taxa_classification = false + taxa_classification_tool = 'mmseqs2' + taxa_classification_mmseqs_compressed = 0 + + taxa_classification_mmseqs_db = null + taxa_classification_mmseqs_db_id = 'Kalamari' + taxa_classification_mmseqs_db_savetmp = false + + taxa_classification_mmseqs_taxonomy_savetmp = false + taxa_classification_mmseqs_taxonomy_searchtype = 2 + taxa_classification_mmseqs_taxonomy_lcaranks = 'kingdom,phylum,class,order,family,genus,species' + taxa_classification_mmseqs_taxonomy_taxlineage = 1 + taxa_classification_mmseqs_taxonomy_sensitivity = 5.0 + taxa_classification_mmseqs_taxonomy_orffilters = 2.0 + taxa_classification_mmseqs_taxonomy_lcamode = 3 + taxa_classification_mmseqs_taxonomy_votemode = 1 // Annotation options - annotation_tool = 'pyrodigal' - save_annotations = false - - annotation_prodigal_singlemode = false - annotation_prodigal_closed = false - annotation_prodigal_transtable = 11 - annotation_prodigal_forcenonsd = false - - annotation_pyrodigal_singlemode = false - annotation_pyrodigal_closed = false - annotation_pyrodigal_transtable = 11 - annotation_pyrodigal_forcenonsd = false - - annotation_bakta_db = null - annotation_bakta_db_downloadtype = 'full' - 
annotation_bakta_singlemode = false - annotation_bakta_mincontiglen = 1 - annotation_bakta_translationtable = 11 - annotation_bakta_gram = '?' - annotation_bakta_complete = false - annotation_bakta_renamecontigheaders = false - annotation_bakta_compliant = false - annotation_bakta_trna = false - annotation_bakta_tmrna = false - annotation_bakta_rrna = false - annotation_bakta_ncrna = false - annotation_bakta_ncrnaregion = false - annotation_bakta_crispr = false - annotation_bakta_skipcds = false - annotation_bakta_pseudo = false - annotation_bakta_skipsorf = false - annotation_bakta_gap = false - annotation_bakta_ori = false - annotation_bakta_activate_plot = false - - annotation_prokka_singlemode = false - annotation_prokka_rawproduct = false - annotation_prokka_kingdom = 'Bacteria' - annotation_prokka_gcode = 11 - annotation_prokka_cdsrnaolap = false - annotation_prokka_rnammer = false - annotation_prokka_mincontiglen = 1 - annotation_prokka_evalue = 0.000001 - annotation_prokka_coverage = 80 - annotation_prokka_compliant = true - annotation_prokka_addgenes = false - annotation_prokka_retaincontigheaders = false + annotation_tool = 'pyrodigal' + save_annotations = false + + annotation_prodigal_singlemode = false + annotation_prodigal_closed = false + annotation_prodigal_transtable = 11 + annotation_prodigal_forcenonsd = false + + annotation_pyrodigal_singlemode = false + annotation_pyrodigal_closed = false + annotation_pyrodigal_transtable = 11 + annotation_pyrodigal_forcenonsd = false + annotation_pyrodigal_usespecialstopcharacter = false + + annotation_bakta_db = null + annotation_bakta_db_downloadtype = 'full' + annotation_bakta_singlemode = false + annotation_bakta_mincontiglen = 1 + annotation_bakta_translationtable = 11 + annotation_bakta_gram = '?' 
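The annotation parameters here are grouped per tool, with `annotation_tool = 'pyrodigal'` selecting which block applies. A run-config sketch for switching to Bakta (the database path is a placeholder, not a shipped default):

    params {
        annotation_tool     = 'bakta'
        annotation_bakta_db = '/path/to/bakta/db'   // placeholder path
    }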
+ annotation_bakta_complete = false + annotation_bakta_renamecontigheaders = false + annotation_bakta_compliant = false + annotation_bakta_trna = false + annotation_bakta_tmrna = false + annotation_bakta_rrna = false + annotation_bakta_ncrna = false + annotation_bakta_ncrnaregion = false + annotation_bakta_crispr = false + annotation_bakta_skipcds = false + annotation_bakta_pseudo = false + annotation_bakta_skipsorf = false + annotation_bakta_gap = false + annotation_bakta_ori = false + annotation_bakta_activate_plot = false + annotation_bakta_hmms = null + + annotation_prokka_singlemode = false + annotation_prokka_rawproduct = false + annotation_prokka_kingdom = 'Bacteria' + annotation_prokka_gcode = 11 + annotation_prokka_cdsrnaolap = false + annotation_prokka_rnammer = false + annotation_prokka_mincontiglen = 1 + annotation_prokka_evalue = 0.000001 + annotation_prokka_coverage = 80 + annotation_prokka_compliant = true + annotation_prokka_addgenes = false + annotation_prokka_retaincontigheaders = false + + // Protein annotation options + run_protein_annotation = false + protein_annotation_tool = 'InterProScan' + protein_annotation_interproscan_db = null + protein_annotation_interproscan_db_url = 'https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0/interproscan-5.72-103.0-64-bit.tar.gz' + protein_annotation_interproscan_applications = 'PANTHER,ProSiteProfiles,ProSitePatterns,Pfam' + protein_annotation_interproscan_enableprecalc = false // Database downloading options - save_db = false + save_db = false // AMP options - run_amp_screening = false - - amp_skip_amplify = false - - amp_skip_macrel = false - - amp_skip_ampir = false - amp_ampir_model = 'precursor' - amp_ampir_minlength = 10 - - amp_run_hmmsearch = false - amp_hmmsearch_models = null - amp_hmmsearch_savealignments = false - amp_hmmsearch_savetargets = false - amp_hmmsearch_savedomains = false - - amp_ampcombi_db = null - amp_ampcombi_parsetables_cutoff = 0.6 - amp_ampcombi_parsetables_ampir = '.ampir.tsv' - amp_ampcombi_parsetables_amplify = '.amplify.tsv' - amp_ampcombi_parsetables_macrel = '.macrel.prediction' - amp_ampcombi_parsetables_hmmsearch = '.hmmer_hmmsearch.txt' - amp_ampcombi_parsetables_aalength = 100 - amp_ampcombi_parsetables_dbevalue = 5 - amp_ampcombi_parsetables_hmmevalue = 0.06 - amp_ampcombi_parsetables_windowstopcodon = 60 - amp_ampcombi_parsetables_windowtransport = 11 - amp_ampcombi_parsetables_removehitswostopcodons = false - amp_ampcombi_cluster_covmode = 0 - amp_ampcombi_cluster_mode = 1 - amp_ampcombi_cluster_coverage = 0.8 - amp_ampcombi_cluster_seqid = 0.4 - amp_ampcombi_cluster_sensitivity = 4.0 - amp_ampcombi_cluster_removesingletons = false - amp_ampcombi_cluster_minmembers = 0 + run_amp_screening = false + + amp_skip_amplify = false + + amp_skip_macrel = false + + amp_skip_ampir = false + amp_ampir_model = 'precursor' + amp_ampir_minlength = 10 + + amp_run_hmmsearch = false + amp_hmmsearch_models = null + amp_hmmsearch_savealignments = false + amp_hmmsearch_savetargets = false + amp_hmmsearch_savedomains = false + + amp_ampcombi_db_id = 'DRAMP' + amp_ampcombi_db = null + amp_ampcombi_parsetables_cutoff = 0.6 + amp_ampcombi_parsetables_ampir = '.ampir.tsv' + amp_ampcombi_parsetables_amplify = '.amplify.tsv' + amp_ampcombi_parsetables_macrel = '.macrel.prediction' + amp_ampcombi_parsetables_hmmsearch = '.hmmer_hmmsearch.txt' + amp_ampcombi_parsetables_aalength = 120 + amp_ampcombi_parsetables_dbevalue = 5 + amp_ampcombi_parsetables_hmmevalue = 0.06 + 
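The new protein-annotation block above defaults to InterProScan with a pinned database URL and a four-application list. Enabling it in a run config, mirroring those defaults (the trimmed application list is shown purely as an example):

    params {
        run_protein_annotation                       = true
        protein_annotation_interproscan_applications = 'Pfam,PANTHER'
    }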
amp_ampcombi_parsetables_windowstopcodon = 60 + amp_ampcombi_parsetables_windowtransport = 11 + amp_ampcombi_parsetables_removehitswostopcodons = false + amp_ampcombi_cluster_covmode = 0 + amp_ampcombi_cluster_mode = 1 + amp_ampcombi_cluster_coverage = 0.8 + amp_ampcombi_cluster_seqid = 0.4 + amp_ampcombi_cluster_sensitivity = 4.0 + amp_ampcombi_cluster_removesingletons = false + amp_ampcombi_cluster_minmembers = 0 // ARG options - run_arg_screening = false - - arg_skip_fargene = false - arg_fargene_hmmmodel = 'class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme' - arg_fargene_savetmpfiles = false - arg_fargene_minorflength = 90 - arg_fargene_score = null - arg_fargene_translationformat = 'pearson' - arg_fargene_orffinder = false - - arg_skip_rgi = false - arg_rgi_db = null - arg_rgi_savejson = false - arg_rgi_savetmpfiles = false - arg_rgi_alignmenttool = 'BLAST' - arg_rgi_includeloose = false - arg_rgi_includenudge = false - arg_rgi_lowquality = false - arg_rgi_data = 'NA' - arg_rgi_split_prodigal_jobs = true - - arg_skip_amrfinderplus = false - arg_amrfinderplus_db = null - arg_amrfinderplus_identmin = -1 - arg_amrfinderplus_coveragemin = 0.5 - arg_amrfinderplus_translationtable = 11 - arg_amrfinderplus_plus = false - arg_amrfinderplus_name = false - - arg_skip_deeparg = false - arg_deeparg_db = null - arg_deeparg_db_version = 2 // Make sure to update on module version bump! - arg_deeparg_model = 'LS' - arg_deeparg_minprob = 0.8 - arg_deeparg_alignmentidentity = 50 - arg_deeparg_alignmentevalue = 1e-10 - arg_deeparg_alignmentoverlap = 0.8 - arg_deeparg_numalignmentsperentry = 1000 - - arg_skip_abricate = false - arg_abricate_db_id = 'ncbi' - arg_abricate_db = null - arg_abricate_minid = 80 - arg_abricate_mincov = 80 - - arg_hamronization_summarizeformat = 'tsv' - - arg_skip_argnorm = false + run_arg_screening = false + + arg_skip_fargene = false + arg_fargene_hmmmodel = 'class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme' + arg_fargene_savetmpfiles = false + arg_fargene_minorflength = 90 + arg_fargene_score = null + arg_fargene_translationformat = 'pearson' + arg_fargene_orffinder = false + + arg_skip_rgi = false + arg_rgi_db = null + arg_rgi_savejson = false + arg_rgi_savetmpfiles = false + arg_rgi_alignmenttool = 'BLAST' + arg_rgi_includeloose = false + arg_rgi_includenudge = false + arg_rgi_lowquality = false + arg_rgi_data = 'NA' + arg_rgi_split_prodigal_jobs = true + + arg_skip_amrfinderplus = false + arg_amrfinderplus_db = null + arg_amrfinderplus_identmin = -1 + arg_amrfinderplus_coveragemin = 0.5 + arg_amrfinderplus_translationtable = 11 + arg_amrfinderplus_plus = false + arg_amrfinderplus_name = false + + arg_skip_deeparg = false + arg_deeparg_db = null + arg_deeparg_db_version = 2 // Make sure to update on module version bump! 
+ arg_deeparg_model = 'LS' + arg_deeparg_minprob = 0.8 + arg_deeparg_alignmentidentity = 50 + arg_deeparg_alignmentevalue = 1e-10 + arg_deeparg_alignmentoverlap = 0.8 + arg_deeparg_numalignmentsperentry = 1000 + + arg_skip_abricate = false + arg_abricate_db_id = 'ncbi' + arg_abricate_db = null + arg_abricate_minid = 80 + arg_abricate_mincov = 80 + + arg_hamronization_summarizeformat = 'tsv' + + arg_skip_argnorm = false // BGC options - run_bgc_screening = false - - bgc_mincontiglength = 3000 - bgc_savefilteredcontigs = false - - bgc_skip_antismash = false - bgc_antismash_db = null - bgc_antismash_installdir = null - bgc_antismash_cbgeneral = false - bgc_antismash_cbknownclusters = false - bgc_antismash_cbsubclusters = false - bgc_antismash_smcogtrees = false - bgc_antismash_ccmibig = false - bgc_antismash_contigminlength = 3000 - bgc_antismash_hmmdetectionstrictness = 'relaxed' - bgc_antismash_pfam2go = false - bgc_antismash_rre = false - bgc_antismash_taxon = 'bacteria' - bgc_antismash_tfbs = false - - bgc_skip_deepbgc = false - bgc_deepbgc_db = null - bgc_deepbgc_score = 0.5 - bgc_deepbgc_prodigalsinglemode = false - bgc_deepbgc_mergemaxproteingap = 0 - bgc_deepbgc_mergemaxnuclgap = 0 - bgc_deepbgc_minnucl = 1 - bgc_deepbgc_minproteins = 1 - bgc_deepbgc_mindomains = 1 - bgc_deepbgc_minbiodomains = 0 - bgc_deepbgc_classifierscore = 0.5 - - bgc_skip_gecco = false - bgc_gecco_cds = 3 - bgc_gecco_threshold = 0.8 - bgc_gecco_pfilter = 0.000000001 - bgc_gecco_edgedistance = 0 - bgc_gecco_mask = false - - bgc_run_hmmsearch = false - bgc_hmmsearch_models = null - bgc_hmmsearch_savealignments = false - bgc_hmmsearch_savetargets = false - bgc_hmmsearch_savedomains = false + run_bgc_screening = false + + bgc_mincontiglength = 3000 + bgc_savefilteredcontigs = false + + bgc_skip_antismash = false + bgc_antismash_db = null + bgc_antismash_installdir = null + bgc_antismash_cbgeneral = false + bgc_antismash_cbknownclusters = false + bgc_antismash_cbsubclusters = false + bgc_antismash_smcogtrees = false + bgc_antismash_ccmibig = false + bgc_antismash_contigminlength = 3000 + bgc_antismash_hmmdetectionstrictness = 'relaxed' + bgc_antismash_pfam2go = false + bgc_antismash_rre = false + bgc_antismash_taxon = 'bacteria' + bgc_antismash_tfbs = false + + bgc_skip_deepbgc = false + bgc_deepbgc_db = null + bgc_deepbgc_score = 0.5 + bgc_deepbgc_prodigalsinglemode = false + bgc_deepbgc_mergemaxproteingap = 0 + bgc_deepbgc_mergemaxnuclgap = 0 + bgc_deepbgc_minnucl = 1 + bgc_deepbgc_minproteins = 1 + bgc_deepbgc_mindomains = 1 + bgc_deepbgc_minbiodomains = 0 + bgc_deepbgc_classifierscore = 0.5 + + bgc_skip_gecco = false + bgc_gecco_cds = 3 + bgc_gecco_threshold = 0.8 + bgc_gecco_pfilter = 0.000000001 + bgc_gecco_edgedistance = 0 + bgc_gecco_mask = false + + bgc_run_hmmsearch = false + bgc_hmmsearch_models = null + bgc_hmmsearch_savealignments = false + bgc_hmmsearch_savetargets = false + bgc_hmmsearch_savedomains = false // Config options - config_profile_name = null - config_profile_description = null - custom_config_version = 'master' - custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" - config_profile_contact = null - config_profile_url = null - - // Max resource options - // Defaults only, expecting to be overwritten - max_memory = '128.GB' - max_cpus = 16 - max_time = '240.h' + config_profile_name = null + config_profile_description = null - // Schema validation default options - validationFailUnrecognisedParams = false - validationLenientMode = false - 
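With `max_memory`, `max_cpus` and `max_time` removed above, per-run capping presumably moves to Nextflow's built-in `resourceLimits` directive (available since 24.04); this sketch reproduces the old defaults under that assumption:

    process {
        resourceLimits = [ cpus: 16, memory: 128.GB, time: 240.h ]
    }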
validationSchemaIgnoreParams = 'genomes,igenomes_base,fasta,monochromeLogs' - validationShowHiddenParams = false - validate_params = true + custom_config_version = 'master' + custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" + config_profile_contact = null + config_profile_url = null + // Schema validation default options + validate_params = true } // Load base.config by default for all pipelines includeConfig 'conf/base.config' -// Load nf-core custom profiles from different Institutions -try { - includeConfig "${params.custom_config_base}/nfcore_custom.config" -} catch (Exception e) { - System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config") -} - -// Load nf-core/funcscan custom profiles from different institutions. -try { - includeConfig "${params.custom_config_base}/pipeline/funcscan.config" -} catch (Exception e) { - System.err.println("WARNING: Could not load nf-core/config/funcscan profiles: ${params.custom_config_base}/pipeline/funcscan.config") -} - profiles { debug { - dumpHashes = true - process.beforeScript = 'echo $HOSTNAME' - cleanup = false - nextflow.enable.configProcessNamesValidation = true + dumpHashes = true + process.beforeScript = 'echo $HOSTNAME' + cleanup = false + nextflow.enable.configProcessNamesValidation = true } conda { - conda.enabled = true - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - conda.channels = ['conda-forge', 'bioconda', 'defaults'] - apptainer.enabled = false + conda.enabled = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + conda.channels = ['conda-forge', 'bioconda'] + apptainer.enabled = false } mamba { - conda.enabled = true - conda.useMamba = true - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + conda.enabled = true + conda.useMamba = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } docker { - docker.enabled = true - conda.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false - docker.runOptions = '-u $(id -u):$(id -g)' + docker.enabled = true + conda.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + docker.runOptions = '-u $(id -u):$(id -g)' } arm { - docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' + docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { - singularity.enabled = true - singularity.autoMounts = true - conda.enabled = false - docker.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + singularity.enabled = true + singularity.autoMounts = true + conda.enabled = false + docker.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } podman { - podman.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - shifter.enabled = false - charliecloud.enabled = false 
- apptainer.enabled = false + podman.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } shifter { - shifter.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + shifter.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } charliecloud { - charliecloud.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - apptainer.enabled = false + charliecloud.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + apptainer.enabled = false } apptainer { - apptainer.enabled = true - apptainer.autoMounts = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false + apptainer.enabled = true + apptainer.autoMounts = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false } wave { apptainer.ociAutoPull = true @@ -371,37 +360,72 @@ profiles { wave.strategy = 'conda,container' } gitpod { - executor.name = 'local' - executor.cpus = 4 - executor.memory = 8.GB + executor.name = 'local' + executor.cpus = 4 + executor.memory = 8.GB + process { + resourceLimits = [ + memory: 8.GB, + cpus: 4, + time: 1.h, + ] + } + } + test { + includeConfig 'conf/test.config' + } + test_bakta { + includeConfig 'conf/test_bakta.config' + } + test_prokka { + includeConfig 'conf/test_prokka.config' + } + test_bgc_bakta { + includeConfig 'conf/test_bgc_bakta.config' + } + test_bgc_prokka { + includeConfig 'conf/test_bgc_prokka.config' + } + test_bgc_pyrodigal { + includeConfig 'conf/test_bgc_pyrodigal.config' + } + test_taxonomy_bakta { + includeConfig 'conf/test_taxonomy_bakta.config' + } + test_taxonomy_prokka { + includeConfig 'conf/test_taxonomy_prokka.config' + } + test_taxonomy_pyrodigal { + includeConfig 'conf/test_taxonomy_pyrodigal.config' + } + test_full { + includeConfig 'conf/test_full.config' + } + test_minimal { + includeConfig 'conf/test_minimal.config' + } + test_preannotated { + includeConfig 'conf/test_preannotated.config' + } + test_preannotated_bgc { + includeConfig 'conf/test_preannotated_bgc.config' } - test { includeConfig 'conf/test.config' } - test_bakta { includeConfig 'conf/test_bakta.config' } - test_prokka { includeConfig 'conf/test_prokka.config' } - test_bgc_bakta { includeConfig 'conf/test_bgc_bakta.config' } - test_bgc_prokka { includeConfig 'conf/test_bgc_prokka.config' } - test_bgc_pyrodigal { includeConfig 'conf/test_bgc_pyrodigal.config' } - test_taxonomy_bakta { includeConfig 'conf/test_taxonomy_bakta.config' } - test_taxonomy_prokka { includeConfig 'conf/test_taxonomy_prokka.config' } - test_taxonomy_pyrodigal { includeConfig 'conf/test_taxonomy_pyrodigal.config' } - test_full { includeConfig 'conf/test_full.config' } - test_nothing { includeConfig 'conf/test_nothing.config' } - test_preannotated { includeConfig 'conf/test_preannotated.config' } - test_preannotated_bgc { includeConfig 'conf/test_preannotated_bgc.config' } } -// 
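The `gitpod` profile above caps its resources with Nextflow's `resourceLimits` directive, which is also what replaces the `check_max()` helper and the `--max_cpus`/`--max_memory`/`--max_time` parameters removed further down in this diff. For reference, a minimal sketch of a user-side custom config under the same convention (the file name and limit values are hypothetical; adjust them to your cluster):

```groovy
// my_limits.config -- hypothetical site-wide caps, passed at runtime with `-c my_limits.config`.
// Unlike the removed check_max() pattern, resourceLimits is enforced by Nextflow itself
// (>= 24.04), so tasks retried with increased resources can never exceed these ceilings.
process {
    resourceLimits = [
        cpus: 32,
        memory: 256.GB,
        time: 72.h,
    ]
}
```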
Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile -// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled +// Load nf-core custom profiles from different Institutions +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" + +// Load nf-core/funcscan custom profiles from different institutions. +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/funcscan.config" : "/dev/null" + +// Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile +// Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled // Set to your registry if you have a mirror of containers -apptainer.registry = 'quay.io' -docker.registry = 'quay.io' -podman.registry = 'quay.io' +apptainer.registry = 'quay.io' +docker.registry = 'quay.io' +podman.registry = 'quay.io' singularity.registry = 'quay.io' - -// Nextflow plugins -plugins { - id 'nf-validation@1.1.3' // Validation of pipeline parameters and creation of an input channel from a sample sheet -} +charliecloud.registry = 'quay.io' // Export these variables to prevent local Python/R libraries from conflicting with those in the container // The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container. @@ -414,73 +438,135 @@ env { JULIA_DEPOT_PATH = "/usr/local/share/julia" } -// Capture exit codes from upstream processes when piping -process.shell = ['/bin/bash', '-euo', 'pipefail'] +// Set bash options +process.shell = [ + "bash", + "-C", // No clobber - prevent output redirection from overwriting files. + "-e", // Exit if a tool returns a non-zero status/exit code + "-u", // Treat unset variables and parameters as an error + "-o", // Returns the status of the last command to exit.. + "pipefail" // ..with a non-zero status or zero if all successfully execute +] // Disable process selector warnings by default. Use debug profile to enable warnings. nextflow.enable.configProcessNamesValidation = false -def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') timeline { enabled = true - file = "${params.outdir}/pipeline_info/execution_timeline_${trace_timestamp}.html" + file = "${params.outdir}/pipeline_info/execution_timeline_${params.trace_report_suffix}.html" } report { enabled = true - file = "${params.outdir}/pipeline_info/execution_report_${trace_timestamp}.html" + file = "${params.outdir}/pipeline_info/execution_report_${params.trace_report_suffix}.html" } trace { enabled = true - file = "${params.outdir}/pipeline_info/execution_trace_${trace_timestamp}.txt" + file = "${params.outdir}/pipeline_info/execution_trace_${params.trace_report_suffix}.txt" } dag { enabled = true - file = "${params.outdir}/pipeline_info/pipeline_dag_${trace_timestamp}.html" + file = "${params.outdir}/pipeline_info/pipeline_dag_${params.trace_report_suffix}.html" } manifest { name = 'nf-core/funcscan' author = """Jasmin Frangenberg, Anan Ibrahim, Louisa Perelo, Moritz E. Beber, James A. 
Fellows Yates""" + // The author field is deprecated from Nextflow version 24.10.0, use contributors instead + contributors = [ + [ + name: 'Jasmin Frangenberg', + affiliation: 'Leibniz Institute for Natural Product Research and Infection Biology Hans Knöll Institute, Jena, Germany', + email: 'jasmin.frangenberg@leibniz-hki.de', + github: 'https://github.com/jasmezz', + contribution: ['author', 'maintainer'], + orcid: 'https://orcid.org/0009-0004-5961-4709', + ], + [ + name: 'Anan Ibrahim', + affiliation: 'Leibniz Institute for Natural Product Research and Infection Biology Hans Knöll Institute, Jena, Germany', + email: '', + github: 'https://github.com/darcy220606', + contribution: ['author', 'maintainer'], + orcid: 'https://orcid.org/0000-0003-3719-901X', + ], + [ + name: 'Louisa Perelo', + affiliation: 'Quantitative Biology Center (QBiC), University of Tübingen, Tübingen, Germany', + email: '', + github: 'https://github.com/louperelo', + contribution: ['author', 'contributor'], + orcid: 'https://orcid.org/0000-0002-7307-062X', + ], + [ + name: 'Moritz E. Beber', + affiliation: 'Unseen Bio ApS, Copenhagen, Denmark', + email: '', + github: 'Midnighter', + contribution: ['author', 'contributor'], + orcid: 'https://orcid.org/0000-0003-2406-1978', + ], + [ + name: 'James A. Fellows Yates', + affiliation: 'Leibniz Institute for Natural Product Research and Infection Biology Hans Knöll Institute, Jena, Germany; Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany', + email: '', + github: 'https://github.com/jfy133', + contribution: ['author', 'maintainer'], + orcid: 'https://orcid.org/0000-0001-5585-6277', + ], + [ + name: 'Vedanth Ramji', + affiliation: 'Queensland University of Technology, Brisbane, Australia', + email: '', + github: 'https://github.com/Vedanth-Ramji', + contribution: ['contributor'], + orcid: 'https://orcid.org/0009-0001-5311-7611', + ], + ] homePage = 'https://github.com/nf-core/funcscan' description = """Pipeline for screening for functional components of assembled contigs""" mainScript = 'main.nf' - nextflowVersion = '!>=23.04.0' - version = '2.0.0' + defaultBranch = 'master' + nextflowVersion = '!>=24.04.2' + version = '2.1.0' doi = '10.5281/zenodo.7643099' } -// Load modules.config for DSL2 module specific options -includeConfig 'conf/modules.config' +// Nextflow plugins +plugins { + id 'nf-schema@2.3.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet +} -// Function to ensure that resource requirements don't go beyond -// a maximum limit -def check_max(obj, type) { - if (type == 'memory') { - try { - if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1) - return params.max_memory as nextflow.util.MemoryUnit - else - return obj - } catch (all) { - println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj" - return obj - } - } else if (type == 'time') { - try { - if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1) - return params.max_time as nextflow.util.Duration - else - return obj - } catch (all) { - println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj" - return obj - } - } else if (type == 'cpus') { - try { - return Math.min( obj, params.max_cpus as int ) - } catch (all) { - println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! 
Using default value: $obj" - return obj - } +validation { + defaultIgnoreParams = ["genomes"] + monochromeLogs = params.monochrome_logs + help { + enabled = true + command = "nextflow run nf-core/funcscan -profile --input samplesheet.csv --outdir " + fullParameter = "help_full" + showHiddenParameter = "show_hidden" + beforeText = """ +-\033[2m----------------------------------------------------\033[0m- + \033[0;32m,--.\033[0;30m/\033[0;32m,-.\033[0m +\033[0;34m ___ __ __ __ ___ \033[0;32m/,-._.--~\'\033[0m +\033[0;34m |\\ | |__ __ / ` / \\ |__) |__ \033[0;33m} {\033[0m +\033[0;34m | \\| | \\__, \\__/ | \\ |___ \033[0;32m\\`-._,-`-,\033[0m + \033[0;32m`._,._,\'\033[0m +\033[0;35m nf-core/funcscan ${manifest.version}\033[0m +-\033[2m----------------------------------------------------\033[0m- +""" + afterText = """${manifest.doi ? "\n* The pipeline\n" : ""}${manifest.doi.tokenize(",").collect { " https://doi.org/${it.trim().replace('https://doi.org/', '')}" }.join("\n")}${manifest.doi ? "\n" : ""} +* The nf-core framework + https://doi.org/10.1038/s41587-020-0439-x + +* Software dependencies + https://github.com/nf-core/funcscan/blob/master/CITATIONS.md +""" + } + summary { + beforeText = validation.help.beforeText + afterText = validation.help.afterText } } + +// Load modules.config for DSL2 module specific options +includeConfig 'conf/modules.config' diff --git a/nextflow_schema.json b/nextflow_schema.json index ca01d496..8558748d 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/funcscan/master/nextflow_schema.json", "title": "nf-core/funcscan pipeline parameters", "description": "Pipeline for screening for functional components of assembled contigs", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -66,7 +66,7 @@ "fa_icon": "fas fa-check-circle" } }, - "fa_icon": "fa fa-list-ol" + "fa_icon": "fa fa-list" }, "taxonomic_classification_general_options": { "title": "Taxonomic classification: general options", @@ -85,7 +85,16 @@ "default": "mmseqs2", "help_text": "This flag specifies which tool for taxonomic classification should be activated. At the moment only 'MMseqs2' is incorporated in the pipeline.", "description": "Specifies the tool used for taxonomic classification.", - "fa_icon": "fas fa-tools" + "fa_icon": "fas fa-tools", + "enum": ["mmseqs2"] + }, + "taxa_classification_mmseqs_compressed": { + "type": "integer", + "default": 0, + "enum": [0, 1], + "help_text": "To compress MMseqs2 output files, choose `1`, otherwise leave to `0`. Compressing output files can lead to errors when the output is actually empty. In that case, just leave this parameter to its default value. 
More details can be found in the [documentation (GitHub)](https://github.com/soedinglab/MMseqs2?tab=readme-ov-file#memory-requirements).\n\n> Modifies tool parameter(s):\n> - mmseqs createdb --compressed <0|1>\n> - mmseqs createtsv --compressed <0|1>\n> - mmseqs databases --compressed <0|1>\n> - mmseqs taxonomy --compressed <0|1>", + "description": "If MMseqs2 is chosen as the taxonomic classification tool: Specifies if the output of all MMseqs2 subcommands shall be compressed.", + "fa_icon": "fas fa-file-archive" + } }, "fa_icon": "fas fa-tag" @@ -152,15 +161,15 @@ "fa_icon": "fab fa-audible" }, "taxa_classification_mmseqs_taxonomy_sensitivity": { - "type": "string", - "default": "5.0", - "help_text": "This flag specifies the speed and sensitivity of the taxonomic search. It stands for how many kmers should be produced during the preliminary seeding stage. A very fast search requires a low value e.g. '1.0' and a a very sensitive search requires e.g. '7.0'. More details can be found in the [documentation](https://mmseqs.com/latest/userguide.pdf).\n\n> Modifies tool parameter(s):\n> - mmseqs taxonomy: `--s`", + "type": "number", + "default": 5.0, + "help_text": "This flag specifies the speed and sensitivity of the taxonomic search. It stands for how many kmers should be produced during the preliminary seeding stage. A very fast search requires a low value e.g. '1.0' and a very sensitive search requires e.g. '7.0'. More details can be found in the [documentation](https://mmseqs.com/latest/userguide.pdf).\n\n> Modifies tool parameter(s):\n> - mmseqs taxonomy: `-s`", "description": "Specify the speed and sensitivity for taxonomy assignment.", "fa_icon": "fas fa-history" }, "taxa_classification_mmseqs_taxonomy_orffilters": { - "type": "string", - "default": "2.0", + "type": "number", + "default": 2.0, "help_text": "This flag specifies the sensitivity used for prefiltering the query ORF. Before the taxonomy-assigning step, MMseqs2 searches the predicted ORFs against the provided database. This value influences the speed with which the search is carried out. More details can be found in the [documentation](https://mmseqs.com/latest/userguide.pdf).\n\n> Modifies tool parameter(s):\n> - mmseqs taxonomy: `--orf-filter-s`", "description": "Specify the ORF search sensitivity in the prefilter step.", "fa_icon": "fas fa-history" @@ -344,6 +353,12 @@ "fa_icon": "fas fa-chart-pie", "description": "Activate generation of circular genome plots.", "help_text": "Activate this flag to generate genome plots (might be memory-intensive).\n\n> Modifies tool parameter(s):\n> - BAKTA: `--skip-plot`" + }, + "annotation_bakta_hmms": { + "type": "string", + "description": "Supply the path to an HMM file of trusted hidden Markov models in HMMER format for CDS annotation.", + "help_text": "Bakta accepts user-provided trusted HMMs via `--hmms` in HMMER's text format. If set, Bakta will adhere to the *trusted cutoff* specified in the HMM header. In addition, a max. evalue threshold of 1e-6 is applied.
For more info please refer to the BAKTA [documentation](https://github.com/oschwengers/bakta).\n\n> Modifies tool parameter(s):\n> - BAKTA: `--hmms`", + "fa_icon": "fa-regular fa-square-check" + } }, "fa_icon": "fas fa-file-signature" @@ -504,8 +519,64 @@ "fa_icon": "fas fa-barcode", "description": "Forces Pyrodigal to scan for motifs.", "help_text": "Forces Pyrodigal to a full scan for motifs rather than activating the Shine-Dalgarno RBS finder, the default scanner for Pyrodigal to train for motifs.\n\nFor more information check the Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `-n`" + }, + "annotation_pyrodigal_usespecialstopcharacter": { + "type": "boolean", + "fa_icon": "fa fa-star-of-life", + "description": "This forces Pyrodigal to append asterisks (`*`) as stop codon indicators. Do not use when running the AMP workflow.", + "help_text": "Some downstream tools like AMPlify cannot process sequences containing non-sequence characters like the stop codon indicator `*`. Thus, this flag is deactivated by default. Activate this flag to revert the behaviour and have Pyrodigal append `*` as stop codon indicator to annotated sequences.\n\nFor more information check the Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `--no-stop-codon`" + } + }, + "fa_icon": "fas fa-file-signature" + }, + "protein_annotation": { + "title": "Protein Annotation: INTERPROSCAN", + "type": "object", + "description": "Functionally annotates all annotated coding regions.", + "default": "", + "properties": { + "run_protein_annotation": { + "type": "boolean", + "description": "Activates the functional annotation of annotated coding regions to provide more information about the identified coding regions.", + "help_text": "Activates functional annotation of the predicted coding regions." + }, + "protein_annotation_tool": { + "type": "string", + "default": "InterProScan", + "help_text": "This flag specifies which tool for protein annotation should be activated.\nAt the moment only [InterProScan](https://github.com/ebi-pf-team/interproscan) is incorporated in the pipeline. This annotates the locus tags to protein and domain levels according to the InterPro databases.\n\nMore details can be found in the tool [documentation](https://interproscan-docs.readthedocs.io/en/latest/index.html).", + "description": "Specifies the tool used for further protein annotation.", + "fa_icon": "fas fa-tools", + "enum": ["InterProScan"] + }, + "protein_annotation_interproscan_db_url": { + "type": "string", + "default": "https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0/interproscan-5.72-103.0-64-bit.tar.gz", + "help_text": "This allows the user to change the InterProScan database version that the pipeline will download automatically. To instead use a pre-downloaded database, please supply its path to `--protein_annotation_interproscan_db`. Changing this URL allows for the use of the latest database release. By default this is set to `https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.72-103.0/interproscan-5.72-103.0-64-bit.tar.gz`.", + "description": "Change the database version used for annotation.", + "fa_icon": "fas fa-database" + }, + "protein_annotation_interproscan_db": { + "type": "string", + "help_text": "Use this to supply the path to a pre-downloaded InterProScan database.
This can be any unzipped InterProScan version.\n\nFor more details on where to find different InterProScan databases see tool [documentation](https://interproscan-docs.readthedocs.io/en/latest/UserDocs.html#obtaining-a-copy-of-interproscan).\n", + "description": "Path to pre-downloaded InterProScan database.", + "fa_icon": "fas fa-database" + }, + "protein_annotation_interproscan_applications": { + "type": "string", + "default": "PANTHER,ProSiteProfiles,ProSitePatterns,Pfam", + "pattern": "^\\w+(,\\w+)*", + "help_text": "A comma-separated string specifying the database(s) to be used to annotate the coding regions identified during the contig annotation workflow of the pipeline. By default these include `PANTHER,ProSiteProfiles,ProSitePatterns,Pfam`.\n- PANTHER (Protein ANalysis THrough Evolutionary Relationships): genes classified by their functions, using published scientific experimental evidence and evolutionary relationships.\n- PROSITE: protein domains, families, functional sites and specific patterns and profiles to identify them.\n- PFAM: protein families, represented by multiple sequence alignments and hidden Markov models (HMMs).\n\nThese databases were chosen based on the AMP workflow; only with these databases do we guarantee the integration of the results into the AMPcombi final summary.\n\nNOTE: Currently, no integration of the results is implemented for the BGC and the ARG final summary tables.\n\nFor more information about all possible databases see the tool [documentation](https://interproscan-docs.readthedocs.io/en/v5/HowToRun.html).\n\n> Modifies tool parameter(s):\n> - InterProScan: `--applications`", + "description": "Assigns the database(s) to be used to annotate the coding regions.", + "fa_icon": "fas fa-database" + }, + "protein_annotation_interproscan_enableprecalc": { + "type": "boolean", + "help_text": "This increases the speed of functional annotation with InterProScan by looking up pre-calculated matches for sequences already present in UniProtKB, so that only unique query sequences need to be annotated from scratch. By default this is turned off.\n\nFor more information about this flag see the tool [documentation](https://interproscan-docs.readthedocs.io/en/latest/HowToRun.html).\n\n> Modifies tool parameter(s):\n> - InterProScan: `--disable-precalc`", + "description": "Enables the use of the pre-calculated match lookup.", + "fa_icon": "fas fa-clock" } }, + "help_text": "This subworkflow adds additional protein annotations to all annotated coding regions. Currently, only annotation with InterProScan is integrated in the subworkflow.", "fa_icon": "fas fa-file-signature" }, "database_downloading_options": { @@ -535,7 +606,7 @@ "fa_icon": "fas fa-ban" } }, - "fa_icon": "fa fa-plus-square" + "fa_icon": "fas fa-plus" }, "amp_ampir": { "title": "AMP: ampir", @@ -564,7 +635,7 @@ "fa_icon": "fas fa-ruler-horizontal" } }, - "fa_icon": "fa fa-plus-square" + "fa_icon": "fas fa-plus" }, "amp_hmmsearch": { "title": "AMP: hmmsearch", @@ -603,7 +674,7 @@ "fa_icon": "fas fa-save" } }, - "fa_icon": "fa fa-plus-square", + "fa_icon": "fas fa-plus", "help_text": "HMMER/hmmsearch is used for searching sequence databases for sequence homologs, and for making sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).
`hmmsearch` is used to search one or more profiles against a sequence database.\n\nFor more information check the HMMER [documentation](http://hmmer.org/).\n\n" }, "amp_macrel": { @@ -618,7 +689,7 @@ "fa_icon": "fas fa-ban" } }, - "fa_icon": "fa fa-plus-square" + "fa_icon": "fas fa-plus" }, "amp_ampcombi2_parsetables": { "title": "AMP: ampcombi2 parsetables", @@ -626,11 +697,18 @@ "description": "Antimicrobial peptides parsing, filtering, and annotating submodule of AMPcombi2. More info: https://github.com/Darcy220606/AMPcombi", "default": "", "properties": { + "amp_ampcombi_db_id": { + "type": "string", + "description": "The name of the database used to classify the AMPs.", + "help_text": "AMPcombi can use three different AMP databases to classify the recovered AMPs. These can be:\n\n- [DRAMP database](http://dramp.cpu-bioinfor.org/downloads/): Only general AMPs are downloaded and filtered to remove any entry that has an instance of non-amino-acid residues in their sequence.\n\n- [APD](https://aps.unmc.edu/): Only experimentally validated AMPs are present.\n\n- [UniRef100](https://academic.oup.com/bioinformatics/article/23/10/1282/197795): Combines a more general protein dataset including curated and non-curated AMPs. Helpful for identifying the clusters to remove any potential false positives. Beware: if the thresholds for AMPcombi are not strict enough, alignment with this database can take a long time.\n\nBy default this is set to 'DRAMP'. Other valid options include 'APD' or 'UniRef100'.\n\nFor more information check the AMPcombi [documentation](https://ampcombi.readthedocs.io/en/main/usage.html#parse-tables).", + "fa_icon": "fas fa-address-book", + "default": "DRAMP", + "enum": ["DRAMP", "APD", "UniRef100"] + }, "amp_ampcombi_db": { "type": "string", - "description": "Path to AMPcombi reference database directory (DRAMP).", - "help_text": "AMPcombi uses the 'general AMPs' dataset of the [DRAMP database](http://dramp.cpu-bioinfor.org/downloads/) for taxonomic classification. If you have a local version of it, you can provide the path to the directory(!) that contains the following reference database files:\n1. fasta file with `.fasta` file extension\n2. the corresponding table with with functional and taxonomic classifications in `.tsv` file extension.\n\nThe contents of the directory should have files such as `*.dmnd` and `*.fasta` in the top level.\n\nFor more information check the AMPcombi [documentation](https://github.com/Darcy220606/AMPcombi).", - "fa_icon": "fas fa-address-book" + "description": "The path to the folder containing the reference database files.", + "help_text": "The path to the folder containing the reference database files (`*.fasta` and `*.tsv`): a fasta file and the corresponding table with structural, functional and, if reported, taxonomic classifications. AMPcombi will then generate the corresponding `mmseqs2` directory, in which all binary files are prepared for the downstream alignment of the recovered AMPs with [MMseqs2](https://github.com/soedinglab/MMseqs2).
These can also be provided by the user by setting up an MMseqs2-compatible database using `mmseqs createdb *.fasta` in a directory called `mmseqs2`.\n\nExample file structure for the reference database supplied by the user:\n\n```bash\namp_DRAMP_database/\n\u251c\u2500\u2500 general_amps_2024_11_13.fasta\n\u251c\u2500\u2500 general_amps_2024_11_13.txt\n\u2514\u2500\u2500 mmseqs2\n \u251c\u2500\u2500 ref_DB\n \u251c\u2500\u2500 ref_DB.dbtype\n \u251c\u2500\u2500 ref_DB_h\n \u251c\u2500\u2500 ref_DB_h.dbtype\n \u251c\u2500\u2500 ref_DB_h.index\n \u251c\u2500\u2500 ref_DB.index\n \u251c\u2500\u2500 ref_DB.lookup\n \u2514\u2500\u2500 ref_DB.source\n```\n\nFor more information check the AMPcombi [documentation](https://ampcombi.readthedocs.io/en/main/usage.html#parse-tables)." }, "amp_ampcombi_parsetables_cutoff": { "type": "number", @@ -641,7 +719,7 @@ }, "amp_ampcombi_parsetables_aalength": { "type": "integer", - "default": 100, + "default": 120, "description": "Filter out all amino acid fragments shorter than this number.", "help_text": "Any AMP hit that does not satisfy this length cut-off will be removed from the final AMPcombi2 summary table.\n\n> Modifies tool parameter(s):\n> - AMPCOMBI: `--aminoacid_length`", "fa_icon": "fas fa-ruler-horizontal" @@ -709,7 +787,7 @@ "fa_icon": "fas fa-address-card" } }, - "fa_icon": "fa fa-plus-square" + "fa_icon": "fas fa-plus" }, "amp_ampcombi2_cluster": { "title": "AMP: ampcombi2 cluster", @@ -766,7 +844,7 @@ "fa_icon": "fas fa-book-dead" } }, - "fa_icon": "fa fa-plus-square" + "fa_icon": "fas fa-plus" }, "arg_amrfinderplus": { "title": "ARG: AMRFinderPlus", @@ -840,7 +918,7 @@ "type": "string", "fa_icon": "fas fa-database", "description": "Specify the path to the DeepARG database.", "help_text": "Specify the path to a local version of the DeepARG database (see the pipelines' usage [documentation](https://nf-co.re/funcscan/dev/docs/usage#databases-and-reference-files)).\n\nThe contents of the directory should include directories such as `database`, `moderl`, and files such as `deeparg.gz` etc. in the top level.\n\nIf no input is given, the module will download the database for you, however this is not recommended, as the database is large and this will take time.\n\n> Modifies tool parameter(s):\n> - DeepARG: `--data-path`" + "help_text": "Specify the path to a local version of the DeepARG database (see the pipelines' usage [documentation](usage#databases-and-reference-files)).\n\nThe contents of the directory should include directories such as `database`, `model`, and files such as `deeparg.gz` etc. in the top level.\n\nIf no input is given, the module will download the database for you; however, this is not recommended, as the database is large and this will take time.\n\n> Modifies tool parameter(s):\n> - DeepARG: `--data-path`" }, "arg_deeparg_db_version": { "type": "integer", @@ -1106,7 +1184,7 @@ "type": "object", "description": "These parameters influence general BGC settings like minimum input sequence length.", "default": "", - "fa_icon": "fa fa-sliders", + "fa_icon": "fas fa-angle-double-right", "properties": { "bgc_mincontiglength": { "type": "integer", @@ -1192,14 +1270,12 @@ }, "bgc_antismash_pfam2go": { "type": "boolean", - "default": false, "description": "Run Pfam to Gene Ontology mapping module.", "help_text": "This maps the proteins to Pfam database to annotate BGC modules with functional information based on the protein families they contain.
For more information see the antiSMASH [documentation](https://docs.antismash.secondarymetabolites.org/).\n\n> Modifies tool parameter(s):\n> - antiSMASH: `--pfam2go`", "fa_icon": "fas fa-search" }, "bgc_antismash_rre": { "type": "boolean", - "default": false, "description": "Run RREFinder precision mode on all RiPP gene clusters.", "help_text": "This enables the prediction of regulatory elements on the BGC that help in the control of protein expression. For more information see the antiSMASH [documentation](https://docs.antismash.secondarymetabolites.org/).\n\n> Modifies tool parameter(s):\n> - antiSMASH: `--rre`", "fa_icon": "fas fa-search" @@ -1214,13 +1290,12 @@ }, "bgc_antismash_tfbs": { "type": "boolean", - "default": false, "description": "Run TFBS finder on all gene clusters.", "help_text": "This enables the prediction of transcription factor binding sites which control the gene expression. For more information see the antiSMASH [documentation](https://docs.antismash.secondarymetabolites.org/).\n\n> Modifies tool parameter(s):\n> - antiSMASH: `--tfbs`", "fa_icon": "fas fa-search" } }, - "fa_icon": "fa fa-sliders" + "fa_icon": "fas fa-angle-double-right" }, "bgc_deepbgc": { "title": "BGC: DeepBGC", @@ -1302,7 +1377,7 @@ "help_text": "DeepBGC classification score threshold for assigning classes to BGCs.\n\nFor more information see the DeepBGC [documentation](https://github.com/Merck/deepbgc).\n\n> Modifies tool parameter(s)\n> - DeepBGC: `--classifier-score`" } }, - "fa_icon": "fa fa-sliders" + "fa_icon": "fas fa-angle-double-right" }, "bgc_gecco": { "title": "BGC: GECCO", @@ -1350,7 +1425,7 @@ "fa_icon": "fas fa-ruler-horizontal" } }, - "fa_icon": "fa fa-sliders" + "fa_icon": "fas fa-angle-double-right" }, "bgc_hmmsearch": { "title": "BGC: hmmsearch", @@ -1389,7 +1464,7 @@ "fa_icon": "fas fa-save" } }, - "fa_icon": "fa fa-sliders" + "fa_icon": "fas fa-angle-double-right" }, "institutional_config_options": { "title": "Institutional config options", @@ -1439,41 +1514,6 @@ } } }, - "max_job_request_options": { - "title": "Max job request options", - "type": "object", - "fa_icon": "fab fa-acquisitions-incorporated", - "description": "Set the top limit for requested resources for any single job.", - "help_text": "If you are running on a smaller system, a pipeline step requesting more resources than are available may cause the Nextflow to stop the run with an error. These options allow you to cap the maximum resources requested by any single job so that the pipeline will run on your system.\n\nNote that you can not _increase_ the resources requested by any job using these options. For that you will need your own configuration file. See [the nf-core website](https://nf-co.re/usage/configuration) for details.", - "properties": { - "max_cpus": { - "type": "integer", - "description": "Maximum number of CPUs that can be requested for any single job.", - "default": 16, - "fa_icon": "fas fa-microchip", - "hidden": true, - "help_text": "Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. `--max_cpus 1`" - }, - "max_memory": { - "type": "string", - "description": "Maximum amount of memory that can be requested for any single job.", - "default": "128.GB", - "fa_icon": "fas fa-memory", - "pattern": "^\\d+(\\.\\d+)?\\.?\\s*(K|M|G|T)?B$", - "hidden": true, - "help_text": "Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. 
`--max_memory '8.GB'`" - }, - "max_time": { - "type": "string", - "description": "Maximum amount of time that can be requested for any single job.", - "default": "240.h", - "fa_icon": "far fa-clock", - "pattern": "^(\\d+\\.?\\s*(s|m|h|d|day)\\s*)+$", - "hidden": true, - "help_text": "Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. `--max_time '2.h'`" - } - } - }, "generic_options": { "title": "Generic options", "type": "object", @@ -1481,12 +1521,6 @@ "description": "Less common options for the pipeline, typically set in a config file.", "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.", "properties": { - "help": { - "type": "boolean", - "description": "Display help text.", - "fa_icon": "fas fa-question-circle", - "hidden": true - }, "version": { "type": "boolean", "description": "Display version and exit.", @@ -1562,133 +1596,118 @@ "fa_icon": "fas fa-check-square", "hidden": true }, - "validationShowHiddenParams": { - "type": "boolean", - "fa_icon": "far fa-eye-slash", - "description": "Show all params when using `--help`", - "hidden": true, - "help_text": "By default, parameters set as _hidden_ in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters." - }, - "validationFailUnrecognisedParams": { - "type": "boolean", - "fa_icon": "far fa-check-circle", - "description": "Validation of parameters fails when an unrecognised parameter is found.", - "hidden": true, - "help_text": "By default, when an unrecognised parameter is found, it returns a warinig." - }, - "validationLenientMode": { - "type": "boolean", - "fa_icon": "far fa-check-circle", - "description": "Validation of parameters in lenient more.", - "hidden": true, - "help_text": "Allows string values that are parseable as numbers or booleans. For further information see [JSONSchema docs](https://github.com/everit-org/json-schema#lenient-mode)." - }, "pipelines_testdata_base_path": { "type": "string", "fa_icon": "far fa-check-circle", "description": "Base URL or local path to location of pipeline test dataset files", "default": "https://raw.githubusercontent.com/nf-core/test-datasets/", "hidden": true + }, + "trace_report_suffix": { + "type": "string", + "fa_icon": "far fa-calendar", + "description": "Suffix to add to the trace report filename.
Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.", + "hidden": true } } } }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/screening_type_activation" + "$ref": "#/$defs/screening_type_activation" }, { - "$ref": "#/definitions/taxonomic_classification_general_options" + "$ref": "#/$defs/taxonomic_classification_general_options" }, { - "$ref": "#/definitions/taxonomic_classification_mmseqs2_databases" + "$ref": "#/$defs/taxonomic_classification_mmseqs2_databases" }, { - "$ref": "#/definitions/taxonomic_classification_mmseqs2_taxonomy" + "$ref": "#/$defs/taxonomic_classification_mmseqs2_taxonomy" }, { - "$ref": "#/definitions/annotation_general_options" + "$ref": "#/$defs/annotation_general_options" }, { - "$ref": "#/definitions/annotation_bakta" + "$ref": "#/$defs/annotation_bakta" }, { - "$ref": "#/definitions/annotation_prokka" + "$ref": "#/$defs/annotation_prokka" }, { - "$ref": "#/definitions/annotation_prodigal" + "$ref": "#/$defs/annotation_prodigal" }, { - "$ref": "#/definitions/annotation_pyrodigal" + "$ref": "#/$defs/annotation_pyrodigal" }, { - "$ref": "#/definitions/database_downloading_options" + "$ref": "#/$defs/protein_annotation" }, { - "$ref": "#/definitions/amp_amplify" + "$ref": "#/$defs/database_downloading_options" }, { - "$ref": "#/definitions/amp_ampir" + "$ref": "#/$defs/amp_amplify" }, { - "$ref": "#/definitions/amp_hmmsearch" + "$ref": "#/$defs/amp_ampir" }, { - "$ref": "#/definitions/amp_macrel" + "$ref": "#/$defs/amp_hmmsearch" }, { - "$ref": "#/definitions/amp_ampcombi2_parsetables" + "$ref": "#/$defs/amp_macrel" }, { - "$ref": "#/definitions/amp_ampcombi2_cluster" + "$ref": "#/$defs/amp_ampcombi2_parsetables" }, { - "$ref": "#/definitions/arg_amrfinderplus" + "$ref": "#/$defs/amp_ampcombi2_cluster" }, { - "$ref": "#/definitions/arg_deeparg" + "$ref": "#/$defs/arg_amrfinderplus" }, { - "$ref": "#/definitions/arg_fargene" + "$ref": "#/$defs/arg_deeparg" }, { - "$ref": "#/definitions/arg_rgi" + "$ref": "#/$defs/arg_fargene" }, { - "$ref": "#/definitions/arg_abricate" + "$ref": "#/$defs/arg_rgi" }, { - "$ref": "#/definitions/arg_hamronization" + "$ref": "#/$defs/arg_abricate" }, { - "$ref": "#/definitions/arg_argnorm" + "$ref": "#/$defs/arg_hamronization" }, { - "$ref": "#/definitions/bgc_general_options" + "$ref": "#/$defs/arg_argnorm" }, { - "$ref": "#/definitions/bgc_antismash" + "$ref": "#/$defs/bgc_general_options" }, { - "$ref": "#/definitions/bgc_deepbgc" + "$ref": "#/$defs/bgc_antismash" }, { - "$ref": "#/definitions/bgc_gecco" + "$ref": "#/$defs/bgc_deepbgc" }, { - "$ref": "#/definitions/bgc_hmmsearch" + "$ref": "#/$defs/bgc_gecco" }, { - "$ref": "#/definitions/institutional_config_options" + "$ref": "#/$defs/bgc_hmmsearch" }, { - "$ref": "#/definitions/max_job_request_options" + "$ref": "#/$defs/institutional_config_options" }, { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" } ] } diff --git a/ro-crate-metadata.json b/ro-crate-metadata.json new file mode 100644 index 00000000..0134e1b0 --- /dev/null +++ b/ro-crate-metadata.json @@ -0,0 +1,373 @@ +{ + "@context": [ + "https://w3id.org/ro/crate/1.1/context", + { + "GithubService": "https://w3id.org/ro/terms/test#GithubService", + "JenkinsService": "https://w3id.org/ro/terms/test#JenkinsService", + "PlanemoEngine": "https://w3id.org/ro/terms/test#PlanemoEngine", + "TestDefinition": "https://w3id.org/ro/terms/test#TestDefinition", + "TestInstance": 
"https://w3id.org/ro/terms/test#TestInstance", + "TestService": "https://w3id.org/ro/terms/test#TestService", + "TestSuite": "https://w3id.org/ro/terms/test#TestSuite", + "TravisService": "https://w3id.org/ro/terms/test#TravisService", + "definition": "https://w3id.org/ro/terms/test#definition", + "engineVersion": "https://w3id.org/ro/terms/test#engineVersion", + "instance": "https://w3id.org/ro/terms/test#instance", + "resource": "https://w3id.org/ro/terms/test#resource", + "runsOn": "https://w3id.org/ro/terms/test#runsOn" + } + ], + "@graph": [ + { + "@id": "./", + "@type": "Dataset", + "creativeWorkStatus": "Stable", + "datePublished": "2025-02-13T18:40:39+00:00", + "description": "

\"nf-core/funcscan\"

\n\n[![GitHub Actions CI Status](https://github.com/nf-core/funcscan/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-core/funcscan/actions/workflows/ci.yml)\n[![GitHub Actions Linting Status](https://github.com/nf-core/funcscan/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/funcscan/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/funcscan/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.7643099-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.7643099)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/)\n[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)\n[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/funcscan)\n\n[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23funcscan-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/funcscan)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)\n\n## Introduction\n\n**nf-core/funcscan** is a bioinformatics best-practice analysis pipeline for the screening of nucleotide sequences such as assembled contigs for functional genes. It currently features mining for antimicrobial peptides, antibiotic resistance genes and biosynthetic gene clusters.\n\nThe pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!\n\nOn release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. 
The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/funcscan/results).\n\nThe nf-core/funcscan AWS full test dataset are contigs generated by the MGnify service from the ENA. We used contigs generated from assemblies of chicken cecum shotgun metagenomes (study accession: MGYS00005631).\n\n## Pipeline summary\n\n1. Quality control of input sequences with [`SeqKit`](https://bioinf.shenwei.me/seqkit/)\n2. Taxonomic classification of contigs of **prokaryotic origin** with [`MMseqs2`](https://github.com/soedinglab/MMseqs2)\n3. Annotation of assembled prokaryotic contigs with [`Prodigal`](https://github.com/hyattpd/Prodigal), [`Pyrodigal`](https://github.com/althonos/pyrodigal), [`Prokka`](https://github.com/tseemann/prokka), or [`Bakta`](https://github.com/oschwengers/bakta)\n4. Annotation of coding sequences from 3. to obtain general protein families and domains with [`InterProScan`](https://github.com/ebi-pf-team/interproscan)\n5. Screening contigs for antimicrobial peptide-like sequences with [`ampir`](https://cran.r-project.org/web/packages/ampir/index.html), [`Macrel`](https://github.com/BigDataBiology/macrel), [`HMMER`](http://hmmer.org/), [`AMPlify`](https://github.com/bcgsc/AMPlify)\n6. Screening contigs for antibiotic resistant gene-like sequences with [`ABRicate`](https://github.com/tseemann/abricate), [`AMRFinderPlus`](https://github.com/ncbi/amr), [`fARGene`](https://github.com/fannyhb/fargene), [`RGI`](https://card.mcmaster.ca/analyze/rgi), [`DeepARG`](https://bench.cs.vt.edu/deeparg). [`argNorm`](https://github.com/BigDataBiology/argNorm) is used to map the outputs of `DeepARG`, `AMRFinderPlus`, and `ABRicate` to the [`Antibiotic Resistance Ontology`](https://www.ebi.ac.uk/ols4/ontologies/aro) for consistent ARG classification terms.\n7. Screening contigs for biosynthetic gene cluster-like sequences with [`antiSMASH`](https://antismash.secondarymetabolites.org), [`DeepBGC`](https://github.com/Merck/deepbgc), [`GECCO`](https://gecco.embl.de/), [`HMMER`](http://hmmer.org/)\n8. Creating aggregated reports for all samples across the workflows with [`AMPcombi`](https://github.com/Darcy220606/AMPcombi) for AMPs, [`hAMRonization`](https://github.com/pha4ge/hAMRonization) for ARGs, and [`comBGC`](https://raw.githubusercontent.com/nf-core/funcscan/master/bin/comBGC.py) for BGCs\n9. Software version and methods text reporting with [`MultiQC`](http://multiqc.info/)\n\n![funcscan metro workflow](docs/images/funcscan_metro_workflow.png)\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.\n\nFirst, prepare a samplesheet with your input data that looks as follows:\n\n`samplesheet.csv`:\n\n```csv\nsample,fasta\nCONTROL_REP1,AEG588A1_001.fasta\nCONTROL_REP2,AEG588A1_002.fasta\nCONTROL_REP3,AEG588A1_003.fasta\n```\n\nEach row represents a (multi-)fasta file of assembled contig sequences.\n\nNow, you can run the pipeline using:\n\n```bash\nnextflow run nf-core/funcscan \\\n -profile \\\n --input samplesheet.csv \\\n --outdir \\\n --run_amp_screening \\\n --run_arg_screening \\\n --run_bgc_screening\n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).\n\nFor more details and further functionality, please refer to the [usage documentation](https://nf-co.re/funcscan/usage) and the [parameter documentation](https://nf-co.re/funcscan/parameters).\n\n## Pipeline output\n\nTo see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/funcscan/results) tab on the nf-core website pipeline page.\nFor more details about the output files and reports, please refer to the\n[output documentation](https://nf-co.re/funcscan/output).\n\n## Credits\n\nnf-core/funcscan was originally written by Jasmin Frangenberg, Anan Ibrahim, Louisa Perelo, Moritz E. Beber, James A. Fellows Yates.\n\nWe thank the following people for their extensive assistance in the development of this pipeline:\n\nAdam Talbot, Alexandru Mizeranschi, Hugo Tavares, J\u00falia Mir Pedrol, Martin Klapper, Mehrdad Jaberi, Robert Syme, Rosa Herbst, Vedanth Ramji, @Microbion.\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).\n\nFor further information or help, don't hesitate to get in touch on the [Slack `#funcscan` channel](https://nfcore.slack.com/channels/funcscan) (you can join with [this invite](https://nf-co.re/join/slack)).\n\n## Citations\n\nIf you use nf-core/funcscan for your analysis, please cite it using the following doi: [10.5281/zenodo.7643099](https://doi.org/10.5281/zenodo.7643099)\n\nAn extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nYou can cite the `nf-core` publication as follows:\n\n> **The nf-core framework for community-curated bioinformatics pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. 
doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", + "hasPart": [ + { + "@id": "main.nf" + }, + { + "@id": "assets/" + }, + { + "@id": "bin/" + }, + { + "@id": "conf/" + }, + { + "@id": "docs/" + }, + { + "@id": "docs/images/" + }, + { + "@id": "modules/" + }, + { + "@id": "modules/local/" + }, + { + "@id": "modules/nf-core/" + }, + { + "@id": "workflows/" + }, + { + "@id": "subworkflows/" + }, + { + "@id": "nextflow.config" + }, + { + "@id": "README.md" + }, + { + "@id": "nextflow_schema.json" + }, + { + "@id": "CHANGELOG.md" + }, + { + "@id": "LICENSE" + }, + { + "@id": "CODE_OF_CONDUCT.md" + }, + { + "@id": "CITATIONS.md" + }, + { + "@id": "modules.json" + }, + { + "@id": "docs/usage.md" + }, + { + "@id": "docs/output.md" + }, + { + "@id": ".nf-core.yml" + }, + { + "@id": ".pre-commit-config.yaml" + }, + { + "@id": ".prettierignore" + } + ], + "isBasedOn": "https://github.com/nf-core/funcscan", + "license": "MIT", + "mainEntity": { + "@id": "main.nf" + }, + "mentions": [ + { + "@id": "#cac746e8-fa6f-43e9-9c1a-502818d58a70" + } + ], + "name": "nf-core/funcscan" + }, + { + "@id": "ro-crate-metadata.json", + "@type": "CreativeWork", + "about": { + "@id": "./" + }, + "conformsTo": [ + { + "@id": "https://w3id.org/ro/crate/1.1" + }, + { + "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0" + } + ] + }, + { + "@id": "main.nf", + "@type": [ + "File", + "SoftwareSourceCode", + "ComputationalWorkflow" + ], + "creator": [ + { + "@id": "#jfy133@gmail.com" + }, + { + "@id": "https://orcid.org/0009-0004-5961-4709" + }, + { + "@id": "#jfy133@gmail.com" + } + ], + "dateCreated": "", + "dateModified": "2025-02-13T19:40:39Z", + "dct:conformsTo": "https://bioschemas.org/profiles/ComputationalWorkflow/1.0-RELEASE/", + "keywords": [ + "nf-core", + "nextflow", + "amp", + "amr", + "antibiotic-resistance", + "antimicrobial-peptides", + "antimicrobial-resistance-genes", + "arg", + "assembly", + "bgc", + "biosynthetic-gene-clusters", + "contigs", + "function", + "metagenomics", + "natural-products", + "screening", + "secondary-metabolites" + ], + "license": [ + "MIT" + ], + "maintainer": [ + { + "@id": "#jfy133@gmail.com" + }, + { + "@id": "https://orcid.org/0009-0004-5961-4709" + } + ], + "name": [ + "nf-core/funcscan" + ], + "programmingLanguage": { + "@id": "https://w3id.org/workflowhub/workflow-ro-crate#nextflow" + }, + "sdPublisher": { + "@id": "https://nf-co.re/" + }, + "url": [ + "https://github.com/nf-core/funcscan", + "https://nf-co.re/funcscan/2.1.0/" + ], + "version": [ + "2.1.0" + ] + }, + { + "@id": "https://w3id.org/workflowhub/workflow-ro-crate#nextflow", + "@type": "ComputerLanguage", + "identifier": { + "@id": "https://www.nextflow.io/" + }, + "name": "Nextflow", + "url": { + "@id": "https://www.nextflow.io/" + }, + "version": "!>=24.04.2" + }, + { + "@id": "#cac746e8-fa6f-43e9-9c1a-502818d58a70", + "@type": "TestSuite", + "instance": [ + { + "@id": "#1e3852e9-78ed-464a-94a3-b3406cd7c32f" + } + ], + "mainEntity": { + "@id": "main.nf" + }, + "name": "Test suite for nf-core/funcscan" + }, + { + "@id": "#1e3852e9-78ed-464a-94a3-b3406cd7c32f", + "@type": "TestInstance", + "name": "GitHub Actions workflow for testing nf-core/funcscan", + "resource": "repos/nf-core/funcscan/actions/workflows/ci.yml", + "runsOn": { + "@id": "https://w3id.org/ro/terms/test#GithubService" + }, + "url": "https://api.github.com" + }, + { + "@id": "https://w3id.org/ro/terms/test#GithubService", + "@type": "TestService", + "name": "Github Actions", + "url": { + "@id": 
"https://github.com" + } + }, + { + "@id": "assets/", + "@type": "Dataset", + "description": "Additional files" + }, + { + "@id": "bin/", + "@type": "Dataset", + "description": "Scripts that must be callable from a pipeline process" + }, + { + "@id": "conf/", + "@type": "Dataset", + "description": "Configuration files" + }, + { + "@id": "docs/", + "@type": "Dataset", + "description": "Markdown files for documenting the pipeline" + }, + { + "@id": "docs/images/", + "@type": "Dataset", + "description": "Images for the documentation files" + }, + { + "@id": "modules/", + "@type": "Dataset", + "description": "Modules used by the pipeline" + }, + { + "@id": "modules/local/", + "@type": "Dataset", + "description": "Pipeline-specific modules" + }, + { + "@id": "modules/nf-core/", + "@type": "Dataset", + "description": "nf-core modules" + }, + { + "@id": "workflows/", + "@type": "Dataset", + "description": "Main pipeline workflows to be executed in main.nf" + }, + { + "@id": "subworkflows/", + "@type": "Dataset", + "description": "Smaller subworkflows" + }, + { + "@id": "nextflow.config", + "@type": "File", + "description": "Main Nextflow configuration file" + }, + { + "@id": "README.md", + "@type": "File", + "description": "Basic pipeline usage information" + }, + { + "@id": "nextflow_schema.json", + "@type": "File", + "description": "JSON schema for pipeline parameter specification" + }, + { + "@id": "CHANGELOG.md", + "@type": "File", + "description": "Information on changes made to the pipeline" + }, + { + "@id": "LICENSE", + "@type": "File", + "description": "The license - should be MIT" + }, + { + "@id": "CODE_OF_CONDUCT.md", + "@type": "File", + "description": "The nf-core code of conduct" + }, + { + "@id": "CITATIONS.md", + "@type": "File", + "description": "Citations needed when using the pipeline" + }, + { + "@id": "modules.json", + "@type": "File", + "description": "Version information for modules from nf-core/modules" + }, + { + "@id": "docs/usage.md", + "@type": "File", + "description": "Usage documentation" + }, + { + "@id": "docs/output.md", + "@type": "File", + "description": "Output documentation" + }, + { + "@id": ".nf-core.yml", + "@type": "File", + "description": "nf-core configuration file, configuring template features and linting rules" + }, + { + "@id": ".pre-commit-config.yaml", + "@type": "File", + "description": "Configuration file for pre-commit hooks" + }, + { + "@id": ".prettierignore", + "@type": "File", + "description": "Ignore file for prettier" + }, + { + "@id": "https://nf-co.re/", + "@type": "Organization", + "name": "nf-core", + "url": "https://nf-co.re/" + }, + { + "@id": "#jfy133@gmail.com", + "@type": "Person", + "email": "jfy133@gmail.com", + "name": "James Fellows Yates" + }, + { + "@id": "https://orcid.org/0009-0004-5961-4709", + "@type": "Person", + "email": "73216762+jasmezz@users.noreply.github.com", + "name": "Jasmin Frangenberg" + } + ] +} \ No newline at end of file diff --git a/subworkflows/local/amp.nf b/subworkflows/local/amp.nf index 88f75393..0b97a7d9 100644 --- a/subworkflows/local/amp.nf +++ b/subworkflows/local/amp.nf @@ -6,7 +6,7 @@ include { MACREL_CONTIGS } from '.. 
include { HMMER_HMMSEARCH as AMP_HMMER_HMMSEARCH } from '../../modules/nf-core/hmmer/hmmsearch/main' include { AMPLIFY_PREDICT } from '../../modules/nf-core/amplify/predict/main' include { AMPIR } from '../../modules/nf-core/ampir/main' -include { DRAMP_DOWNLOAD } from '../../modules/local/dramp_download' +include { AMP_DATABASE_DOWNLOAD } from '../../modules/local/amp_database_download' include { AMPCOMBI2_PARSETABLES } from '../../modules/nf-core/ampcombi2/parsetables' include { AMPCOMBI2_COMPLETE } from '../../modules/nf-core/ampcombi2/complete' include { AMPCOMBI2_CLUSTER } from '../../modules/nf-core/ampcombi2/cluster' @@ -17,18 +17,18 @@ include { MERGE_TAXONOMY_AMPCOMBI } from '.. workflow AMP { take: - fastas // tuple val(meta), path(contigs) - faas // tuple val(meta), path(PROKKA/PRODIGAL.out.faa) - tsvs // tuple val(meta), path(MMSEQS_CREATETSV.out.tsv) - gbks // tuple val(meta), path(ANNOTATION_ANNOTATION_TOOL.out.gbk) + fastas // tuple val(meta), path(contigs) + faas // tuple val(meta), path(PROKKA/PRODIGAL.out.faa) + tsvs // tuple val(meta), path(MMSEQS_CREATETSV.out.tsv) + gbks // tuple val(meta), path(ANNOTATION_ANNOTATION_TOOL.out.gbk) + tsvs_interpro // tuple val(meta), path(INTERPROSCAN.out.tsv) main: ch_versions = Channel.empty() ch_ampresults_for_ampcombi = Channel.empty() - ch_ampcombi_summaries = Channel.empty() ch_macrel_faa = Channel.empty() - ch_ampcombi_complete = Channel.empty() - ch_ampcombi_for_cluster = Channel.empty() + ch_ampcombi_summaries = Channel.empty() + ch_ampcombi_complete = null // When adding new tool that requires FAA, make sure to update conditions // in funcscan.nf around annotation and AMP subworkflow execution @@ -38,6 +38,7 @@ workflow AMP { ch_faa_for_ampir = faas ch_faa_for_ampcombi = faas ch_gbk_for_ampcombi = gbks + ch_interpro_for_ampcombi = tsvs_interpro // AMPLIFY if ( !params.amp_skip_amplify ) { @@ -104,30 +105,40 @@ workflow AMP { .groupTuple() .join( ch_faa_for_ampcombi ) .join( ch_gbk_for_ampcombi ) + .join( ch_interpro_for_ampcombi ) .multiMap{ input: [ it[0], it[1] ] faa: it[2] gbk: it[3] + interpro: it[4] } + // AMPCOMBI2::PARSETABLES if ( params.amp_ampcombi_db != null ) { - AMPCOMBI2_PARSETABLES ( ch_input_for_ampcombi.input, ch_input_for_ampcombi.faa, ch_input_for_ampcombi.gbk, params.amp_ampcombi_db ) - } else { - DRAMP_DOWNLOAD() - ch_versions = ch_versions.mix( DRAMP_DOWNLOAD.out.versions ) - ch_ampcombi_input_db = DRAMP_DOWNLOAD.out.db - AMPCOMBI2_PARSETABLES ( ch_input_for_ampcombi.input, ch_input_for_ampcombi.faa, ch_input_for_ampcombi.gbk, ch_ampcombi_input_db ) - } + AMPCOMBI2_PARSETABLES ( ch_input_for_ampcombi.input, ch_input_for_ampcombi.faa, ch_input_for_ampcombi.gbk, params.amp_ampcombi_db_id, params.amp_ampcombi_db, ch_input_for_ampcombi.interpro ) + } else { + AMP_DATABASE_DOWNLOAD( params.amp_ampcombi_db_id ) + ch_versions = ch_versions.mix( AMP_DATABASE_DOWNLOAD.out.versions ) + ch_ampcombi_input_db = AMP_DATABASE_DOWNLOAD.out.db + AMPCOMBI2_PARSETABLES ( ch_input_for_ampcombi.input, ch_input_for_ampcombi.faa, ch_input_for_ampcombi.gbk, params.amp_ampcombi_db_id, ch_ampcombi_input_db, ch_input_for_ampcombi.interpro ) + } ch_versions = ch_versions.mix( AMPCOMBI2_PARSETABLES.out.versions ) ch_ampcombi_summaries = AMPCOMBI2_PARSETABLES.out.tsv.map{ it[1] }.collect() - AMPCOMBI2_COMPLETE ( ch_ampcombi_summaries ) - ch_versions = ch_versions.mix( AMPCOMBI2_COMPLETE.out.versions ) + // AMPCOMBI2::COMPLETE + ch_summary_count = ch_ampcombi_summaries.map { it.size() }.sum() - ch_ampcombi_complete =
AMPCOMBI2_COMPLETE.out.tsv + if ( ch_summary_count == 0 || ch_summary_count == 1 ) { + log.warn("[nf-core/funcscan] AMPCOMBI2: ${ch_summary_count} file(s) passed. Skipping AMPCOMBI2_COMPLETE, AMPCOMBI2_CLUSTER, and TAXONOMY MERGING steps.") + } else { + AMPCOMBI2_COMPLETE(ch_ampcombi_summaries) + ch_versions = ch_versions.mix( AMPCOMBI2_COMPLETE.out.versions ) + ch_ampcombi_complete = AMPCOMBI2_COMPLETE.out.tsv .filter { file -> file.countLines() > 1 } + } + // AMPCOMBI2::CLUSTER if ( ch_ampcombi_complete != null ) { AMPCOMBI2_CLUSTER ( ch_ampcombi_complete ) ch_versions = ch_versions.mix( AMPCOMBI2_CLUSTER.out.versions ) diff --git a/subworkflows/local/annotation.nf b/subworkflows/local/annotation.nf index c1c8e332..a59fe561 100644 --- a/subworkflows/local/annotation.nf +++ b/subworkflows/local/annotation.nf @@ -19,72 +19,75 @@ workflow ANNOTATION { fasta // tuple val(meta), path(contigs) main: - ch_versions = Channel.empty() + ch_versions = Channel.empty() ch_multiqc_files = Channel.empty() - if ( params.annotation_tool == "pyrodigal" || ( params.annotation_tool == "prodigal" && params.run_bgc_screening == true && ( !params.bgc_skip_antismash || !params.bgc_skip_deepbgc || !params.bgc_skip_gecco ) ) || ( params.annotation_tool == "prodigal" && params.run_amp_screening == true ) ) { // Need to use Pyrodigal for most BGC tools and AMPcombi because Prodigal GBK annotation format is incompatible with them. + if (params.annotation_tool == "pyrodigal" || (params.annotation_tool == "prodigal" && params.run_bgc_screening == true && (!params.bgc_skip_antismash || !params.bgc_skip_deepbgc || !params.bgc_skip_gecco)) || (params.annotation_tool == "prodigal" && params.run_amp_screening == true)) { + // Need to use Pyrodigal for most BGC tools and AMPcombi because Prodigal GBK annotation format is incompatible with them. - if ( params.annotation_tool == "prodigal" && params.run_bgc_screening == true && ( !params.bgc_skip_antismash || !params.bgc_skip_deepbgc || !params.bgc_skip_gecco ) ) { - log.warn("[nf-core/funcscan] Switching annotation tool to: Pyrodigal. This is because Prodigal annotations (in GBK format) are incompatible with antiSMASH, DeepBGC, and GECCO. If you specifically wish to run Prodigal instead, please skip antiSMASH, DeepBGC, and GECCO or provide a pre-annotated GBK file in the samplesheet.") - } else if ( params.annotation_tool == "prodigal" && params.run_amp_screening == true ) { - log.warn("[nf-core/funcscan] Switching annotation tool to: Pyrodigal. This is because Prodigal annotations (in GBK format) are incompatible with AMPcombi. 
If you specifically wish to run Prodigal instead, please skip AMP workflow or provide a pre-annotated GBK file in the samplesheet.") - } - - PYRODIGAL ( fasta, "gbk" ) - GUNZIP_PYRODIGAL_FAA ( PYRODIGAL.out.faa ) - GUNZIP_PYRODIGAL_FNA ( PYRODIGAL.out.fna) - GUNZIP_PYRODIGAL_GBK ( PYRODIGAL.out.annotations ) - ch_versions = ch_versions.mix(PYRODIGAL.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_FAA.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_FNA.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_GBK.out.versions) - ch_annotation_faa = GUNZIP_PYRODIGAL_FAA.out.gunzip - ch_annotation_fna = GUNZIP_PYRODIGAL_FNA.out.gunzip - ch_annotation_gbk = GUNZIP_PYRODIGAL_GBK.out.gunzip - - } else if ( params.annotation_tool == "prodigal" ) { - - PRODIGAL ( fasta, "gbk" ) - GUNZIP_PRODIGAL_FAA ( PRODIGAL.out.amino_acid_fasta ) - GUNZIP_PRODIGAL_FNA ( PRODIGAL.out.nucleotide_fasta) - GUNZIP_PRODIGAL_GBK ( PRODIGAL.out.gene_annotations ) - ch_versions = ch_versions.mix(PRODIGAL.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_FAA.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_FNA.out.versions) - ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_GBK.out.versions) - ch_annotation_faa = GUNZIP_PRODIGAL_FAA.out.gunzip - ch_annotation_fna = GUNZIP_PRODIGAL_FNA.out.gunzip - ch_annotation_gbk = GUNZIP_PRODIGAL_GBK.out.gunzip - - } else if ( params.annotation_tool == "prokka" ) { + if (params.annotation_tool == "prodigal" && params.run_bgc_screening == true && (!params.bgc_skip_antismash || !params.bgc_skip_deepbgc || !params.bgc_skip_gecco)) { + log.warn("[nf-core/funcscan] Switching annotation tool to: Pyrodigal. This is because Prodigal annotations (in GBK format) are incompatible with antiSMASH, DeepBGC, and GECCO. If you specifically wish to run Prodigal instead, please skip antiSMASH, DeepBGC, and GECCO or provide a pre-annotated GBK file in the samplesheet.") + } + else if (params.annotation_tool == "prodigal" && params.run_amp_screening == true) { + log.warn("[nf-core/funcscan] Switching annotation tool to: Pyrodigal. This is because Prodigal annotations (in GBK format) are incompatible with AMPcombi. 
If you specifically wish to run Prodigal instead, please skip AMP workflow or provide a pre-annotated GBK file in the samplesheet.") + } - PROKKA ( fasta, [], [] ) - ch_versions = ch_versions.mix(PROKKA.out.versions) - ch_multiqc_files = PROKKA.out.txt.collect{it[1]}.ifEmpty([]) - ch_annotation_faa = PROKKA.out.faa - ch_annotation_fna = PROKKA.out.fna - ch_annotation_gbk = PROKKA.out.gbk + PYRODIGAL(fasta, "gbk") + GUNZIP_PYRODIGAL_FAA(PYRODIGAL.out.faa) + GUNZIP_PYRODIGAL_FNA(PYRODIGAL.out.fna) + GUNZIP_PYRODIGAL_GBK(PYRODIGAL.out.annotations) + ch_versions = ch_versions.mix(PYRODIGAL.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_FAA.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_FNA.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PYRODIGAL_GBK.out.versions) + ch_annotation_faa = GUNZIP_PYRODIGAL_FAA.out.gunzip + ch_annotation_fna = GUNZIP_PYRODIGAL_FNA.out.gunzip + ch_annotation_gbk = GUNZIP_PYRODIGAL_GBK.out.gunzip + } + else if (params.annotation_tool == "prodigal") { - } else if ( params.annotation_tool == "bakta" ) { + PRODIGAL(fasta, "gbk") + GUNZIP_PRODIGAL_FAA(PRODIGAL.out.amino_acid_fasta) + GUNZIP_PRODIGAL_FNA(PRODIGAL.out.nucleotide_fasta) + GUNZIP_PRODIGAL_GBK(PRODIGAL.out.gene_annotations) + ch_versions = ch_versions.mix(PRODIGAL.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_FAA.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_FNA.out.versions) + ch_versions = ch_versions.mix(GUNZIP_PRODIGAL_GBK.out.versions) + ch_annotation_faa = GUNZIP_PRODIGAL_FAA.out.gunzip + ch_annotation_fna = GUNZIP_PRODIGAL_FNA.out.gunzip + ch_annotation_gbk = GUNZIP_PRODIGAL_GBK.out.gunzip + } + else if (params.annotation_tool == "prokka") { - // BAKTA prepare download - if ( params.annotation_bakta_db ) { - ch_bakta_db = Channel - .fromPath( params.annotation_bakta_db ) - .first() - } else { - BAKTA_BAKTADBDOWNLOAD ( ) - ch_versions = ch_versions.mix( BAKTA_BAKTADBDOWNLOAD.out.versions ) - ch_bakta_db = ( BAKTA_BAKTADBDOWNLOAD.out.db ) - } + PROKKA(fasta, [], []) + ch_versions = ch_versions.mix(PROKKA.out.versions) + ch_multiqc_files = PROKKA.out.txt.collect { it[1] }.ifEmpty([]) + ch_annotation_faa = PROKKA.out.faa + ch_annotation_fna = PROKKA.out.fna + ch_annotation_gbk = PROKKA.out.gbk + } + else if (params.annotation_tool == "bakta") { - BAKTA_BAKTA ( fasta, ch_bakta_db, [], [] ) - ch_versions = ch_versions.mix(BAKTA_BAKTA.out.versions) - ch_multiqc_files = BAKTA_BAKTA.out.txt.collect{it[1]}.ifEmpty([]) - ch_annotation_faa = BAKTA_BAKTA.out.faa - ch_annotation_fna = BAKTA_BAKTA.out.fna - ch_annotation_gbk = BAKTA_BAKTA.out.gbff + // BAKTA prepare download + if (params.annotation_bakta_db) { + ch_bakta_db = Channel + .fromPath(params.annotation_bakta_db, checkIfExists: true) + .first() } + else { + BAKTA_BAKTADBDOWNLOAD() + ch_versions = ch_versions.mix(BAKTA_BAKTADBDOWNLOAD.out.versions) + ch_bakta_db = BAKTA_BAKTADBDOWNLOAD.out.db + } + + BAKTA_BAKTA(fasta, ch_bakta_db, [], []) + ch_versions = ch_versions.mix(BAKTA_BAKTA.out.versions) + ch_multiqc_files = BAKTA_BAKTA.out.txt.collect { it[1] }.ifEmpty([]) + ch_annotation_faa = BAKTA_BAKTA.out.faa + ch_annotation_fna = BAKTA_BAKTA.out.fna + ch_annotation_gbk = BAKTA_BAKTA.out.gbff + } emit: versions = ch_versions diff --git a/subworkflows/local/arg.nf b/subworkflows/local/arg.nf index 81dffb72..182f9648 100644 --- a/subworkflows/local/arg.nf +++ b/subworkflows/local/arg.nf @@ -2,26 +2,26 @@ Run ARG screening tools */ -include { ABRICATE_RUN } from 
'../../modules/nf-core/abricate/run/main' -include { AMRFINDERPLUS_UPDATE } from '../../modules/nf-core/amrfinderplus/update/main' -include { AMRFINDERPLUS_RUN } from '../../modules/nf-core/amrfinderplus/run/main' -include { DEEPARG_DOWNLOADDATA } from '../../modules/nf-core/deeparg/downloaddata/main' -include { DEEPARG_PREDICT } from '../../modules/nf-core/deeparg/predict/main' -include { FARGENE } from '../../modules/nf-core/fargene/main' -include { RGI_CARDANNOTATION } from '../../modules/nf-core/rgi/cardannotation/main' -include { RGI_MAIN } from '../../modules/nf-core/rgi/main/main' -include { UNTAR as UNTAR_CARD } from '../../modules/nf-core/untar/main' -include { TABIX_BGZIP as ARG_TABIX_BGZIP } from '../../modules/nf-core/tabix/bgzip/main' -include { MERGE_TAXONOMY_HAMRONIZATION } from '../../modules/local/merge_taxonomy_hamronization' -include { HAMRONIZATION_RGI } from '../../modules/nf-core/hamronization/rgi/main' -include { HAMRONIZATION_FARGENE } from '../../modules/nf-core/hamronization/fargene/main' -include { HAMRONIZATION_SUMMARIZE } from '../../modules/nf-core/hamronization/summarize/main' -include { HAMRONIZATION_ABRICATE } from '../../modules/nf-core/hamronization/abricate/main' -include { HAMRONIZATION_DEEPARG } from '../../modules/nf-core/hamronization/deeparg/main' -include { HAMRONIZATION_AMRFINDERPLUS } from '../../modules/nf-core/hamronization/amrfinderplus/main' -include { ARGNORM as ARGNORM_DEEPARG } from '../../modules/nf-core/argnorm/main' -include { ARGNORM as ARGNORM_ABRICATE } from '../../modules/nf-core/argnorm/main' -include { ARGNORM as ARGNORM_AMRFINDERPLUS } from '../../modules/nf-core/argnorm/main' +include { ABRICATE_RUN } from '../../modules/nf-core/abricate/run/main' +include { AMRFINDERPLUS_UPDATE } from '../../modules/nf-core/amrfinderplus/update/main' +include { AMRFINDERPLUS_RUN } from '../../modules/nf-core/amrfinderplus/run/main' +include { DEEPARG_DOWNLOADDATA } from '../../modules/nf-core/deeparg/downloaddata/main' +include { DEEPARG_PREDICT } from '../../modules/nf-core/deeparg/predict/main' +include { FARGENE } from '../../modules/nf-core/fargene/main' +include { RGI_CARDANNOTATION } from '../../modules/nf-core/rgi/cardannotation/main' +include { RGI_MAIN } from '../../modules/nf-core/rgi/main/main' +include { UNTAR as UNTAR_CARD } from '../../modules/nf-core/untar/main' +include { TABIX_BGZIP as ARG_TABIX_BGZIP } from '../../modules/nf-core/tabix/bgzip/main' +include { MERGE_TAXONOMY_HAMRONIZATION } from '../../modules/local/merge_taxonomy_hamronization' +include { HAMRONIZATION_RGI } from '../../modules/nf-core/hamronization/rgi/main' +include { HAMRONIZATION_FARGENE } from '../../modules/nf-core/hamronization/fargene/main' +include { HAMRONIZATION_SUMMARIZE } from '../../modules/nf-core/hamronization/summarize/main' +include { HAMRONIZATION_ABRICATE } from '../../modules/nf-core/hamronization/abricate/main' +include { HAMRONIZATION_DEEPARG } from '../../modules/nf-core/hamronization/deeparg/main' +include { HAMRONIZATION_AMRFINDERPLUS } from '../../modules/nf-core/hamronization/amrfinderplus/main' +include { ARGNORM as ARGNORM_DEEPARG } from '../../modules/nf-core/argnorm/main' +include { ARGNORM as ARGNORM_ABRICATE } from '../../modules/nf-core/argnorm/main' +include { ARGNORM as ARGNORM_AMRFINDERPLUS } from '../../modules/nf-core/argnorm/main' workflow ARG { take: @@ -36,174 +36,179 @@ workflow ARG { ch_input_to_hamronization_summarize = Channel.empty() // AMRfinderplus run - // Prepare channel for database - if ( 
!params.arg_skip_amrfinderplus && params.arg_amrfinderplus_db ) { + // Prepare channel for database + if (!params.arg_skip_amrfinderplus && params.arg_amrfinderplus_db) { ch_amrfinderplus_db = Channel - .fromPath( params.arg_amrfinderplus_db ) + .fromPath(params.arg_amrfinderplus_db, checkIfExists: true) .first() - } else if ( !params.arg_skip_amrfinderplus && !params.arg_amrfinderplus_db ) { - AMRFINDERPLUS_UPDATE( ) - ch_versions = ch_versions.mix( AMRFINDERPLUS_UPDATE.out.versions ) + } + else if (!params.arg_skip_amrfinderplus && !params.arg_amrfinderplus_db) { + AMRFINDERPLUS_UPDATE() + ch_versions = ch_versions.mix(AMRFINDERPLUS_UPDATE.out.versions) ch_amrfinderplus_db = AMRFINDERPLUS_UPDATE.out.db } - if ( !params.arg_skip_amrfinderplus ) { - AMRFINDERPLUS_RUN ( fastas, ch_amrfinderplus_db ) - ch_versions = ch_versions.mix( AMRFINDERPLUS_RUN.out.versions ) - - // Reporting - HAMRONIZATION_AMRFINDERPLUS ( AMRFINDERPLUS_RUN.out.report, 'tsv', AMRFINDERPLUS_RUN.out.tool_version, AMRFINDERPLUS_RUN.out.db_version ) - ch_versions = ch_versions.mix( HAMRONIZATION_AMRFINDERPLUS.out.versions ) - ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix( HAMRONIZATION_AMRFINDERPLUS.out.tsv ) + if (!params.arg_skip_amrfinderplus) { + AMRFINDERPLUS_RUN(fastas, ch_amrfinderplus_db) + ch_versions = ch_versions.mix(AMRFINDERPLUS_RUN.out.versions) - if ( !params.arg_skip_argnorm ) { - ch_input_to_argnorm_amrfinderplus = HAMRONIZATION_AMRFINDERPLUS.out.tsv.filter{ meta, file -> !file.isEmpty() } - ARGNORM_AMRFINDERPLUS ( ch_input_to_argnorm_amrfinderplus, 'amrfinderplus', 'ncbi' ) - ch_versions = ch_versions.mix( ARGNORM_AMRFINDERPLUS.out.versions ) + // Reporting + HAMRONIZATION_AMRFINDERPLUS(AMRFINDERPLUS_RUN.out.report, 'tsv', AMRFINDERPLUS_RUN.out.tool_version, AMRFINDERPLUS_RUN.out.db_version) + ch_versions = ch_versions.mix(HAMRONIZATION_AMRFINDERPLUS.out.versions) + ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix(HAMRONIZATION_AMRFINDERPLUS.out.tsv) + + if (!params.arg_skip_argnorm) { + ch_input_to_argnorm_amrfinderplus = HAMRONIZATION_AMRFINDERPLUS.out.tsv.filter { meta, file -> !file.isEmpty() } + ARGNORM_AMRFINDERPLUS(ch_input_to_argnorm_amrfinderplus, 'amrfinderplus', 'ncbi') + ch_versions = ch_versions.mix(ARGNORM_AMRFINDERPLUS.out.versions) } } // fARGene run - if ( !params.arg_skip_fargene ) { - ch_fargene_classes = Channel.fromList( params.arg_fargene_hmmmodel.tokenize(',') ) + if (!params.arg_skip_fargene) { + ch_fargene_classes = Channel.fromList(params.arg_fargene_hmmmodel.tokenize(',')) ch_fargene_input = fastas - .combine( ch_fargene_classes ) - .map { - meta, fastas, hmm_class -> - def meta_new = meta.clone() - meta_new['hmm_class'] = hmm_class - [ meta_new, fastas, hmm_class ] - } - .multiMap { - fastas: [ it[0], it[1] ] - hmmclass: it[2] - } - - FARGENE ( ch_fargene_input.fastas, ch_fargene_input.hmmclass ) - ch_versions = ch_versions.mix( FARGENE.out.versions ) + .combine(ch_fargene_classes) + .map { meta, fastas, hmm_class -> + def meta_new = meta.clone() + meta_new['hmm_class'] = hmm_class + [meta_new, fastas, hmm_class] + } + .multiMap { + fastas: [it[0], it[1]] + hmmclass: it[2] + } + + FARGENE(ch_fargene_input.fastas, ch_fargene_input.hmmclass) + ch_versions = ch_versions.mix(FARGENE.out.versions) // Reporting // Note: currently hardcoding versions, has to be updated with every fARGene-update - HAMRONIZATION_FARGENE( FARGENE.out.hmm_genes.transpose(), 'tsv', '0.1', '0.1' ) - ch_versions = ch_versions.mix( 
HAMRONIZATION_FARGENE.out.versions ) - ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix( HAMRONIZATION_FARGENE.out.tsv ) + HAMRONIZATION_FARGENE(FARGENE.out.hmm_genes.transpose(), 'tsv', '0.1', '0.1') + ch_versions = ch_versions.mix(HAMRONIZATION_FARGENE.out.versions) + ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix(HAMRONIZATION_FARGENE.out.tsv) } // RGI run - if ( !params.arg_skip_rgi ) { + if (!params.arg_skip_rgi) { - if ( !params.arg_rgi_db ) { + if (!params.arg_rgi_db) { // Download and untar CARD - UNTAR_CARD ( [ [], file('https://card.mcmaster.ca/latest/data', checkIfExists: true) ] ) - ch_versions = ch_versions.mix( UNTAR_CARD.out.versions ) - rgi_db = UNTAR_CARD.out.untar.map{ it[1] } - - } else { + UNTAR_CARD([[], file('https://card.mcmaster.ca/latest/data', checkIfExists: true)]) + ch_versions = ch_versions.mix(UNTAR_CARD.out.versions) + rgi_db = UNTAR_CARD.out.untar.map { it[1] } + RGI_CARDANNOTATION(rgi_db) + card = RGI_CARDANNOTATION.out.db + ch_versions = ch_versions.mix(RGI_CARDANNOTATION.out.versions) + } + else { // Use user-supplied database - rgi_db = params.arg_rgi_db - + rgi_db = file(params.arg_rgi_db, checkIfExists: true) + if (!rgi_db.contains("card_database_processed")) { + RGI_CARDANNOTATION(rgi_db) + card = RGI_CARDANNOTATION.out.db + ch_versions = ch_versions.mix(RGI_CARDANNOTATION.out.versions) + } + else { + card = rgi_db + } } - RGI_CARDANNOTATION ( rgi_db ) - ch_versions = ch_versions.mix( RGI_CARDANNOTATION.out.versions ) - - RGI_MAIN ( fastas, RGI_CARDANNOTATION.out.db, [] ) - ch_versions = ch_versions.mix( RGI_MAIN.out.versions ) + RGI_MAIN(fastas, card, []) + ch_versions = ch_versions.mix(RGI_MAIN.out.versions) // Reporting - HAMRONIZATION_RGI ( RGI_MAIN.out.tsv, 'tsv', RGI_MAIN.out.tool_version, RGI_MAIN.out.db_version ) - ch_versions = ch_versions.mix( HAMRONIZATION_RGI.out.versions ) - ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix( HAMRONIZATION_RGI.out.tsv ) + HAMRONIZATION_RGI(RGI_MAIN.out.tsv, 'tsv', RGI_MAIN.out.tool_version, RGI_MAIN.out.db_version) + ch_versions = ch_versions.mix(HAMRONIZATION_RGI.out.versions) + ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix(HAMRONIZATION_RGI.out.tsv) } // DeepARG prepare download - if ( !params.arg_skip_deeparg && params.arg_deeparg_db ) { + if (!params.arg_skip_deeparg && params.arg_deeparg_db) { ch_deeparg_db = Channel - .fromPath( params.arg_deeparg_db ) + .fromPath(params.arg_deeparg_db, checkIfExists: true) .first() - } else if ( !params.arg_skip_deeparg && !params.arg_deeparg_db ) { - DEEPARG_DOWNLOADDATA( ) - ch_versions = ch_versions.mix( DEEPARG_DOWNLOADDATA.out.versions ) + } + else if (!params.arg_skip_deeparg && !params.arg_deeparg_db) { + DEEPARG_DOWNLOADDATA() + ch_versions = ch_versions.mix(DEEPARG_DOWNLOADDATA.out.versions) ch_deeparg_db = DEEPARG_DOWNLOADDATA.out.db } // DeepARG run - if ( !params.arg_skip_deeparg ) { + if (!params.arg_skip_deeparg) { annotations - .map { - it -> - def meta = it[0] - def anno = it[1] - def model = params.arg_deeparg_model + .map { it -> + def meta = it[0] + def anno = it[1] + def model = params.arg_deeparg_model - [ meta, anno, model ] - } - .set { ch_input_for_deeparg } + [meta, anno, model] + } + .set { ch_input_for_deeparg } - DEEPARG_PREDICT ( ch_input_for_deeparg, ch_deeparg_db ) - ch_versions = ch_versions.mix( DEEPARG_PREDICT.out.versions ) + DEEPARG_PREDICT(ch_input_for_deeparg, ch_deeparg_db) + ch_versions = 
ch_versions.mix(DEEPARG_PREDICT.out.versions) // Reporting // Note: currently hardcoding versions as unreported by DeepARG // Make sure to update on version bump. - ch_input_to_hamronization_deeparg = DEEPARG_PREDICT.out.arg.mix( DEEPARG_PREDICT.out.potential_arg ) - HAMRONIZATION_DEEPARG ( ch_input_to_hamronization_deeparg, 'tsv', '1.0.4', params.arg_deeparg_db_version ) - ch_versions = ch_versions.mix( HAMRONIZATION_DEEPARG.out.versions ) - ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix( HAMRONIZATION_DEEPARG.out.tsv ) - - if ( !params.arg_skip_argnorm ) { - ch_input_to_argnorm_deeparg = HAMRONIZATION_DEEPARG.out.tsv.filter{ meta, file -> !file.isEmpty() } - ARGNORM_DEEPARG ( ch_input_to_argnorm_deeparg, 'deeparg', 'deeparg' ) - ch_versions = ch_versions.mix( ARGNORM_DEEPARG.out.versions ) + ch_input_to_hamronization_deeparg = DEEPARG_PREDICT.out.arg.mix(DEEPARG_PREDICT.out.potential_arg) + HAMRONIZATION_DEEPARG(ch_input_to_hamronization_deeparg, 'tsv', '1.0.4', params.arg_deeparg_db_version) + ch_versions = ch_versions.mix(HAMRONIZATION_DEEPARG.out.versions) + ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix(HAMRONIZATION_DEEPARG.out.tsv) + + if (!params.arg_skip_argnorm) { + ch_input_to_argnorm_deeparg = HAMRONIZATION_DEEPARG.out.tsv.filter { meta, file -> !file.isEmpty() } + ARGNORM_DEEPARG(ch_input_to_argnorm_deeparg, 'deeparg', 'deeparg') + ch_versions = ch_versions.mix(ARGNORM_DEEPARG.out.versions) } } // ABRicate run - if ( !params.arg_skip_abricate ) { + if (!params.arg_skip_abricate) { abricate_dbdir = params.arg_abricate_db ? file(params.arg_abricate_db, checkIfExists: true) : [] - ABRICATE_RUN ( fastas, abricate_dbdir ) - ch_versions = ch_versions.mix( ABRICATE_RUN.out.versions ) - - HAMRONIZATION_ABRICATE ( ABRICATE_RUN.out.report, 'tsv', '1.0.1', '2021-Mar-27' ) - ch_versions = ch_versions.mix( HAMRONIZATION_ABRICATE.out.versions ) - ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix( HAMRONIZATION_ABRICATE.out.tsv ) - - if ( ( params.arg_abricate_db_id == 'ncbi' || - params.arg_abricate_db_id == 'resfinder' || - params.arg_abricate_db_id == 'argannot' || - params.arg_abricate_db_id == 'megares') && !params.arg_skip_argnorm ) { - ch_input_to_argnorm_abricate = HAMRONIZATION_ABRICATE.out.tsv.filter{ meta, file -> !file.isEmpty() } - ARGNORM_ABRICATE ( ch_input_to_argnorm_abricate, 'abricate', params.arg_abricate_db_id ) - ch_versions = ch_versions.mix( ARGNORM_ABRICATE.out.versions ) + ABRICATE_RUN(fastas, abricate_dbdir) + ch_versions = ch_versions.mix(ABRICATE_RUN.out.versions) + + HAMRONIZATION_ABRICATE(ABRICATE_RUN.out.report, 'tsv', '1.0.1', '2021-Mar-27') + ch_versions = ch_versions.mix(HAMRONIZATION_ABRICATE.out.versions) + ch_input_to_hamronization_summarize = ch_input_to_hamronization_summarize.mix(HAMRONIZATION_ABRICATE.out.tsv) + + if ((params.arg_abricate_db_id == 'ncbi' || params.arg_abricate_db_id == 'resfinder' || params.arg_abricate_db_id == 'argannot' || params.arg_abricate_db_id == 'megares') && !params.arg_skip_argnorm) { + ch_input_to_argnorm_abricate = HAMRONIZATION_ABRICATE.out.tsv.filter { meta, file -> !file.isEmpty() } + ARGNORM_ABRICATE(ch_input_to_argnorm_abricate, 'abricate', params.arg_abricate_db_id) + ch_versions = ch_versions.mix(ARGNORM_ABRICATE.out.versions) } } ch_input_to_hamronization_summarize - .map{ + .map { it[1] } .collect() .set { ch_input_for_hamronization_summarize } - HAMRONIZATION_SUMMARIZE( ch_input_for_hamronization_summarize, 
params.arg_hamronization_summarizeformat ) - ch_versions = ch_versions.mix( HAMRONIZATION_SUMMARIZE.out.versions ) + HAMRONIZATION_SUMMARIZE(ch_input_for_hamronization_summarize, params.arg_hamronization_summarizeformat) + ch_versions = ch_versions.mix(HAMRONIZATION_SUMMARIZE.out.versions) // MERGE_TAXONOMY - if ( params.run_taxa_classification ) { + if (params.run_taxa_classification) { - ch_mmseqs_taxonomy_list = tsvs.map{ it[1] }.collect() - MERGE_TAXONOMY_HAMRONIZATION( HAMRONIZATION_SUMMARIZE.out.tsv, ch_mmseqs_taxonomy_list ) - ch_versions = ch_versions.mix( MERGE_TAXONOMY_HAMRONIZATION.out.versions ) + ch_mmseqs_taxonomy_list = tsvs.map { it[1] }.collect() + MERGE_TAXONOMY_HAMRONIZATION(HAMRONIZATION_SUMMARIZE.out.tsv, ch_mmseqs_taxonomy_list) + ch_versions = ch_versions.mix(MERGE_TAXONOMY_HAMRONIZATION.out.versions) - ch_tabix_input = Channel.of( [ 'id':'hamronization_combined_report' ] ) + ch_tabix_input = Channel + .of(['id': 'hamronization_combined_report']) .combine(MERGE_TAXONOMY_HAMRONIZATION.out.tsv) - ARG_TABIX_BGZIP( ch_tabix_input ) - ch_versions = ch_versions.mix( ARG_TABIX_BGZIP.out.versions ) + ARG_TABIX_BGZIP(ch_tabix_input) + ch_versions = ch_versions.mix(ARG_TABIX_BGZIP.out.versions) } emit: diff --git a/subworkflows/local/bgc.nf b/subworkflows/local/bgc.nf index 0130205d..25b21daa 100644 --- a/subworkflows/local/bgc.nf +++ b/subworkflows/local/bgc.nf @@ -16,7 +16,6 @@ include { TABIX_BGZIP as BGC_TABIX_BGZIP } from '../../modules/nf-core include { MERGE_TAXONOMY_COMBGC } from '../../modules/local/merge_taxonomy_combgc' workflow BGC { - take: fastas // tuple val(meta), path(PREPPED_INPUT.out.fna) faas // tuple val(meta), path(.out.faa) @@ -24,7 +23,7 @@ workflow BGC { tsvs // tuple val(meta), path(MMSEQS_CREATETSV.out.tsv) main: - ch_versions = Channel.empty() + ch_versions = Channel.empty() ch_bgcresults_for_combgc = Channel.empty() // When adding new tool that requires FAA, make sure to update conditions @@ -33,168 +32,173 @@ workflow BGC { ch_faa_for_bgc_hmmsearch = faas // ANTISMASH - if ( !params.bgc_skip_antismash ) { + if (!params.bgc_skip_antismash) { // Check whether user supplies database and/or antismash directory. If not, obtain them via the module antismashlite/antismashlitedownloaddatabases. // Important for future maintenance: For CI tests, only the "else" option below is used. Both options should be tested locally whenever the antiSMASH module gets updated. 
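Note on the branch below: the user-supplied antiSMASH route only applies when both --bgc_antismash_db and --bgc_antismash_installdir are set together, and validateInputParameters() later in this diff rejects the pair if the two directory basenames collide. A minimal sketch of a custom config that satisfies both checks; the paths are illustrative, only the parameter names come from this diff:

    // custom.config (hypothetical local paths)
    params {
        bgc_antismash_db         = '/data/refs/antismash_db'   // antiSMASH database directory
        bgc_antismash_installdir = '/data/refs/antismash_dir'  // antiSMASH installation directory
    }

The two basenames differ (antismash_db vs antismash_dir), matching the name-collision check; supplying only one of the two parameters under Docker or Singularity trips the paired-directory error instead.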
- if ( params.bgc_antismash_db && params.bgc_antismash_installdir ) { + if (params.bgc_antismash_db && params.bgc_antismash_installdir) { ch_antismash_databases = Channel - .fromPath( params.bgc_antismash_db ) + .fromPath(params.bgc_antismash_db, checkIfExists: true) .first() ch_antismash_directory = Channel - .fromPath( params.bgc_antismash_installdir ) + .fromPath(params.bgc_antismash_installdir, checkIfExists: true) .first() - - } else if ( params.bgc_antismash_db && ( session.config.conda && session.config.conda.enabled ) ) { + } + else if (params.bgc_antismash_db && (session.config.conda && session.config.conda.enabled)) { ch_antismash_databases = Channel - .fromPath( params.bgc_antismash_db ) + .fromPath(params.bgc_antismash_db, checkIfExists: true) .first() ch_antismash_directory = [] - - } else { + } + else { // May need to update on each new version of antismash-lite due to changes to scripts inside these tars ch_css_for_antismash = "https://github.com/nf-core/test-datasets/raw/724737e23a53085129cd5e015acafbf7067822ca/data/delete_me/antismash/css.tar.gz" ch_detection_for_antismash = "https://github.com/nf-core/test-datasets/raw/c3174c50bf654e477bf329dbaf72acc8345f9b7a/data/delete_me/antismash/detection.tar.gz" ch_modules_for_antismash = "https://github.com/nf-core/test-datasets/raw/c3174c50bf654e477bf329dbaf72acc8345f9b7a/data/delete_me/antismash/modules.tar.gz" - UNTAR_CSS ( [ [], ch_css_for_antismash ] ) - ch_versions = ch_versions.mix( UNTAR_CSS.out.versions ) + UNTAR_CSS([[], ch_css_for_antismash]) + ch_versions = ch_versions.mix(UNTAR_CSS.out.versions) - UNTAR_DETECTION ( [ [], ch_detection_for_antismash ] ) - ch_versions = ch_versions.mix( UNTAR_DETECTION.out.versions ) + UNTAR_DETECTION([[], ch_detection_for_antismash]) + ch_versions = ch_versions.mix(UNTAR_DETECTION.out.versions) - UNTAR_MODULES ( [ [], ch_modules_for_antismash ] ) - ch_versions = ch_versions.mix( UNTAR_MODULES.out.versions ) + UNTAR_MODULES([[], ch_modules_for_antismash]) + ch_versions = ch_versions.mix(UNTAR_MODULES.out.versions) - ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES ( UNTAR_CSS.out.untar.map{ it[1] }, UNTAR_DETECTION.out.untar.map{ it[1] }, UNTAR_MODULES.out.untar.map{ it[1] } ) - ch_versions = ch_versions.mix( ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES.out.versions ) + ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES(UNTAR_CSS.out.untar.map { it[1] }, UNTAR_DETECTION.out.untar.map { it[1] }, UNTAR_MODULES.out.untar.map { it[1] }) + ch_versions = ch_versions.mix(ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES.out.versions) ch_antismash_databases = ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES.out.database ch_antismash_directory = ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES.out.antismash_dir } - ANTISMASH_ANTISMASHLITE ( gbks, ch_antismash_databases, ch_antismash_directory, [] ) + ANTISMASH_ANTISMASHLITE(gbks, ch_antismash_databases, ch_antismash_directory, []) - ch_versions = ch_versions.mix( ANTISMASH_ANTISMASHLITE.out.versions ) + ch_versions = ch_versions.mix(ANTISMASH_ANTISMASHLITE.out.versions) ch_antismashresults = ANTISMASH_ANTISMASHLITE.out.knownclusterblast_dir - .mix( ANTISMASH_ANTISMASHLITE.out.gbk_input ) - .groupTuple() - .map{ - meta, files -> - [ meta, files.flatten() ] - } + .mix(ANTISMASH_ANTISMASHLITE.out.gbk_input) + .groupTuple() + .map { meta, files -> + [meta, files.flatten()] + } // Filter out samples with no BGC hits ch_antismashresults_for_combgc = ch_antismashresults .join(fastas, remainder: false) .join(ANTISMASH_ANTISMASHLITE.out.gbk_results, remainder: false) - .map { - meta, 
gbk_input, fasta, gbk_results -> - [ meta, gbk_input ] + .map { meta, gbk_input, fasta, gbk_results -> + [meta, gbk_input] } - ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix( ch_antismashresults_for_combgc ) + ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix(ch_antismashresults_for_combgc) } // DEEPBGC - if ( !params.bgc_skip_deepbgc ) { - if ( params.bgc_deepbgc_db ) { + if (!params.bgc_skip_deepbgc) { + if (params.bgc_deepbgc_db) { ch_deepbgc_database = Channel - .fromPath( params.bgc_deepbgc_db ) + .fromPath(params.bgc_deepbgc_db, checkIfExists: true) .first() - } else { + } + else { DEEPBGC_DOWNLOAD() ch_deepbgc_database = DEEPBGC_DOWNLOAD.out.db - ch_versions = ch_versions.mix( DEEPBGC_DOWNLOAD.out.versions ) + ch_versions = ch_versions.mix(DEEPBGC_DOWNLOAD.out.versions) } - DEEPBGC_PIPELINE ( gbks, ch_deepbgc_database ) - ch_versions = ch_versions.mix( DEEPBGC_PIPELINE.out.versions ) - ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix( DEEPBGC_PIPELINE.out.bgc_tsv ) + DEEPBGC_PIPELINE(gbks, ch_deepbgc_database) + ch_versions = ch_versions.mix(DEEPBGC_PIPELINE.out.versions) + ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix(DEEPBGC_PIPELINE.out.bgc_tsv) } // GECCO - if ( !params.bgc_skip_gecco ) { - ch_gecco_input = gbks.groupTuple() - .multiMap { - fastas: [ it[0], it[1], [] ] - } - - GECCO_RUN ( ch_gecco_input, [] ) - ch_versions = ch_versions.mix( GECCO_RUN.out.versions ) + if (!params.bgc_skip_gecco) { + ch_gecco_input = gbks + .groupTuple() + .multiMap { + fastas: [it[0], it[1], []] + } + + GECCO_RUN(ch_gecco_input, []) + ch_versions = ch_versions.mix(GECCO_RUN.out.versions) ch_geccoresults_for_combgc = GECCO_RUN.out.gbk - .mix( GECCO_RUN.out.clusters ) + .mix(GECCO_RUN.out.clusters) .groupTuple() - .map{ - meta, files -> - [ meta, files.flatten() ] + .map { meta, files -> + [meta, files.flatten()] } - ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix( ch_geccoresults_for_combgc ) + ch_bgcresults_for_combgc = ch_bgcresults_for_combgc.mix(ch_geccoresults_for_combgc) } // HMMSEARCH - if ( params.bgc_run_hmmsearch ) { - if ( params.bgc_hmmsearch_models ) { ch_bgc_hmm_models = Channel.fromPath( params.bgc_hmmsearch_models, checkIfExists: true ) } else { error('[nf-core/funcscan] error: hmm model files not found for --bgc_hmmsearch_models! Please check input.') } + if (params.bgc_run_hmmsearch) { + if (params.bgc_hmmsearch_models) { + ch_bgc_hmm_models = Channel.fromPath(params.bgc_hmmsearch_models, checkIfExists: true) + } + else { + error('[nf-core/funcscan] error: hmm model files not found for --bgc_hmmsearch_models! Please check input.') + } - ch_bgc_hmm_models_meta = ch_bgc_hmm_models - .map { - file -> - def meta = [:] - meta['id'] = file.extension == 'gz' ? file.name - '.hmm.gz' : file.name - '.hmm' + ch_bgc_hmm_models_meta = ch_bgc_hmm_models.map { file -> + def meta = [:] + meta['id'] = file.extension == 'gz' ? 
file.name - '.hmm.gz' : file.name - '.hmm' - [ meta, file ] - } + [meta, file] + } - ch_in_for_bgc_hmmsearch = ch_faa_for_bgc_hmmsearch.combine(ch_bgc_hmm_models_meta) - .map { - meta_faa, faa, meta_hmm, hmm -> - def meta_new = [:] - meta_new['id'] = meta_faa['id'] - meta_new['hmm_id'] = meta_hmm['id'] - [ meta_new, hmm, faa, params.bgc_hmmsearch_savealignments, params.bgc_hmmsearch_savetargets, params.bgc_hmmsearch_savedomains ] + ch_in_for_bgc_hmmsearch = ch_faa_for_bgc_hmmsearch + .combine(ch_bgc_hmm_models_meta) + .map { meta_faa, faa, meta_hmm, hmm -> + def meta_new = [:] + meta_new['id'] = meta_faa['id'] + meta_new['hmm_id'] = meta_hmm['id'] + [meta_new, hmm, faa, params.bgc_hmmsearch_savealignments, params.bgc_hmmsearch_savetargets, params.bgc_hmmsearch_savedomains] } - BGC_HMMER_HMMSEARCH ( ch_in_for_bgc_hmmsearch ) - ch_versions = ch_versions.mix( BGC_HMMER_HMMSEARCH.out.versions ) + BGC_HMMER_HMMSEARCH(ch_in_for_bgc_hmmsearch) + ch_versions = ch_versions.mix(BGC_HMMER_HMMSEARCH.out.versions) } // COMBGC ch_bgcresults_for_combgc .join(fastas, remainder: true) - .filter { - meta, bgcfile, fasta -> - if ( !bgcfile ) { log.warn("[nf-core/funcscan] BGC workflow: No hits found by BGC tools; comBGC summary tool will not be run for sample: ${meta.id}") } - return [meta, bgcfile, fasta] + .filter { meta, bgcfile, fasta -> + if (!bgcfile) { + log.warn("[nf-core/funcscan] BGC workflow: No hits found by BGC tools; comBGC summary tool will not be run for sample: ${meta.id}") + } + return [meta, bgcfile, fasta] } - COMBGC ( ch_bgcresults_for_combgc ) - ch_versions = ch_versions.mix( COMBGC.out.versions ) + COMBGC(ch_bgcresults_for_combgc) + ch_versions = ch_versions.mix(COMBGC.out.versions) // COMBGC concatenation - if ( !params.run_taxa_classification ) { - ch_combgc_summaries = COMBGC.out.tsv.map{ it[1] }.collectFile( name: 'combgc_complete_summary.tsv', storeDir: "${params.outdir}/reports/combgc", keepHeader:true ) - } else { - ch_combgc_summaries = COMBGC.out.tsv.map{ it[1] }.collectFile( name: 'combgc_complete_summary.tsv', keepHeader:true ) + if (!params.run_taxa_classification) { + ch_combgc_summaries = COMBGC.out.tsv.map { it[1] }.collectFile(name: 'combgc_complete_summary.tsv', storeDir: "${params.outdir}/reports/combgc", keepHeader: true) + } + else { + ch_combgc_summaries = COMBGC.out.tsv.map { it[1] }.collectFile(name: 'combgc_complete_summary.tsv', keepHeader: true) } // MERGE_TAXONOMY - if ( params.run_taxa_classification ) { + if (params.run_taxa_classification) { - ch_mmseqs_taxonomy_list = tsvs.map{ it[1] }.collect() - MERGE_TAXONOMY_COMBGC( ch_combgc_summaries, ch_mmseqs_taxonomy_list ) - ch_versions = ch_versions.mix( MERGE_TAXONOMY_COMBGC.out.versions ) + ch_mmseqs_taxonomy_list = tsvs.map { it[1] }.collect() + MERGE_TAXONOMY_COMBGC(ch_combgc_summaries, ch_mmseqs_taxonomy_list) + ch_versions = ch_versions.mix(MERGE_TAXONOMY_COMBGC.out.versions) - ch_tabix_input = Channel.of( [ 'id':'combgc_complete_summary_taxonomy' ] ) + ch_tabix_input = Channel + .of(['id': 'combgc_complete_summary_taxonomy']) .combine(MERGE_TAXONOMY_COMBGC.out.tsv) - BGC_TABIX_BGZIP( ch_tabix_input ) - ch_versions = ch_versions.mix( BGC_TABIX_BGZIP.out.versions ) + BGC_TABIX_BGZIP(ch_tabix_input) + ch_versions = ch_versions.mix(BGC_TABIX_BGZIP.out.versions) } emit: diff --git a/subworkflows/local/protein_annotation.nf b/subworkflows/local/protein_annotation.nf new file mode 100644 index 00000000..b73ada6c --- /dev/null +++ b/subworkflows/local/protein_annotation.nf @@ -0,0 +1,55 @@ +/* + RUN FUNCTIONAL 
CLASSIFICATION +*/ + +include { INTERPROSCAN_DATABASE } from '../../modules/local/interproscan_download' +include { INTERPROSCAN } from '../../modules/nf-core/interproscan/main' + +workflow PROTEIN_ANNOTATION { + take: + faas // tuple val(meta), path(PROKKA/PRODIGAL.out.faa) + + main: + ch_versions = Channel.empty() + ch_interproscan_tsv = Channel.empty() + ch_interproscan_db = Channel.empty() + ch_interproscan_tsv_modified = Channel.empty() + + ch_faa_for_interproscan = faas + + if ( params.protein_annotation_tool == 'InterProScan') { + + if ( params.protein_annotation_interproscan_db != null ) { + ch_interproscan_db = Channel + .fromPath( params.protein_annotation_interproscan_db ) + .first() + } else { + INTERPROSCAN_DATABASE ( params.protein_annotation_interproscan_db_url ) + ch_versions = ch_versions.mix( INTERPROSCAN_DATABASE.out.versions ) + ch_interproscan_db = ( INTERPROSCAN_DATABASE.out.db ) + } + + INTERPROSCAN( ch_faa_for_interproscan, ch_interproscan_db ) + ch_versions = ch_versions.mix( INTERPROSCAN.out.versions ) + ch_interproscan_tsv = ch_interproscan_tsv.mix( INTERPROSCAN.out.tsv ) + + // Current INTERPROSCAN version 5.72-103.0 only includes 13 columns and not 15 which ampcombi expects, so we added them here + ch_interproscan_tsv_modified = INTERPROSCAN.out.tsv + .map { meta, tsv_path -> + def modified_tsv_path = "${workflow.workDir}/tmp/${meta.id}_interproscan.faa.tsv" + + def modified_tsv_content = new File(tsv_path.toString()) + .readLines() + .collect { line -> (line.split('\t') + ['NA', 'NA']).join('\t') } + + new File(modified_tsv_path).text = modified_tsv_content.join('\n') + [meta, file(modified_tsv_path)] + } + + ch_versions = ch_versions.mix(INTERPROSCAN.out.versions) + } + + emit: + versions = ch_versions + tsv = ch_interproscan_tsv_modified // channel: [ val(meta), tsv ] +} diff --git a/subworkflows/local/taxa_class.nf b/subworkflows/local/taxa_class.nf index d76e1dff..0bf67312 100644 --- a/subworkflows/local/taxa_class.nf +++ b/subworkflows/local/taxa_class.nf @@ -12,47 +12,48 @@ workflow TAXA_CLASS { contigs // tuple val(meta), path(contigs) main: - ch_versions = Channel.empty() - ch_mmseqs_db = Channel.empty() - ch_taxonomy_querydb = Channel.empty() + ch_versions = Channel.empty() + ch_mmseqs_db = Channel.empty() + ch_taxonomy_querydb = Channel.empty() ch_taxonomy_querydb_taxdb = Channel.empty() - ch_taxonomy_tsv = Channel.empty() + ch_taxonomy_tsv = Channel.empty() - if ( params.taxa_classification_tool == 'mmseqs2') { + if (params.taxa_classification_tool == 'mmseqs2') { // Download the ref db if not supplied by user // MMSEQS_DATABASE - if ( params.taxa_classification_mmseqs_db != null ) { + if (params.taxa_classification_mmseqs_db != null) { ch_mmseqs_db = Channel - .fromPath( params.taxa_classification_mmseqs_db ) + .fromPath(params.taxa_classification_mmseqs_db, checkIfExists: true) .first() - } else { - MMSEQS_DATABASES ( params.taxa_classification_mmseqs_db_id ) - ch_versions = ch_versions.mix( MMSEQS_DATABASES.out.versions ) - ch_mmseqs_db = ( MMSEQS_DATABASES.out.database ) + } + else { + MMSEQS_DATABASES(params.taxa_classification_mmseqs_db_id) + ch_versions = ch_versions.mix(MMSEQS_DATABASES.out.versions) + ch_mmseqs_db = MMSEQS_DATABASES.out.database } // Create db for query contigs, assign taxonomy and convert to table format // MMSEQS_CREATEDB - MMSEQS_CREATEDB ( contigs ) - ch_versions = ch_versions.mix( MMSEQS_CREATEDB.out.versions ) + MMSEQS_CREATEDB(contigs) + ch_versions = ch_versions.mix(MMSEQS_CREATEDB.out.versions) // MMSEQS_TAXONOMY - 
MMSEQS_TAXONOMY ( MMSEQS_CREATEDB.out.db, ch_mmseqs_db ) - ch_versions = ch_versions.mix( MMSEQS_TAXONOMY.out.versions ) + MMSEQS_TAXONOMY(MMSEQS_CREATEDB.out.db, ch_mmseqs_db) + ch_versions = ch_versions.mix(MMSEQS_TAXONOMY.out.versions) ch_taxonomy_querydb_taxdb = MMSEQS_TAXONOMY.out.db_taxonomy // Join together to ensure in sync ch_taxonomy_input_for_createtsv = MMSEQS_CREATEDB.out.db - .join(MMSEQS_TAXONOMY.out.db_taxonomy) - .multiMap { meta, db, db_taxonomy -> - db: [ meta,db ] - taxdb: [ meta, db_taxonomy ] - } + .join(MMSEQS_TAXONOMY.out.db_taxonomy) + .multiMap { meta, db, db_taxonomy -> + db: [meta, db] + taxdb: [meta, db_taxonomy] + } // MMSEQS_CREATETSV - MMSEQS_CREATETSV ( ch_taxonomy_input_for_createtsv.taxdb, [[:],[]], ch_taxonomy_input_for_createtsv.db ) - ch_versions = ch_versions.mix( MMSEQS_CREATETSV.out.versions ) + MMSEQS_CREATETSV(ch_taxonomy_input_for_createtsv.taxdb, [[:], []], ch_taxonomy_input_for_createtsv.db) + ch_versions = ch_versions.mix(MMSEQS_CREATETSV.out.versions) ch_taxonomy_tsv = MMSEQS_CREATETSV.out.tsv } diff --git a/subworkflows/local/utils_nfcore_funcscan_pipeline/main.nf b/subworkflows/local/utils_nfcore_funcscan_pipeline/main.nf index 0d4b7afb..27dce6cb 100644 --- a/subworkflows/local/utils_nfcore_funcscan_pipeline/main.nf +++ b/subworkflows/local/utils_nfcore_funcscan_pipeline/main.nf @@ -8,29 +8,24 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { UTILS_NFVALIDATION_PLUGIN } from '../../nf-core/utils_nfvalidation_plugin' -include { paramsSummaryMap } from 'plugin/nf-validation' -include { fromSamplesheet } from 'plugin/nf-validation' -include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' -include { completionEmail } from '../../nf-core/utils_nfcore_pipeline' -include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' -include { dashedLine } from '../../nf-core/utils_nfcore_pipeline' -include { nfCoreLogo } from '../../nf-core/utils_nfcore_pipeline' -include { imNotification } from '../../nf-core/utils_nfcore_pipeline' -include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' -include { workflowCitation } from '../../nf-core/utils_nfcore_pipeline' +include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' +include { paramsSummaryMap } from 'plugin/nf-schema' +include { samplesheetToList } from 'plugin/nf-schema' +include { completionEmail } from '../../nf-core/utils_nfcore_pipeline' +include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' +include { imNotification } from '../../nf-core/utils_nfcore_pipeline' +include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' +include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW TO INITIALISE PIPELINE -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_INITIALISATION { - take: version // boolean: Display version and exit - help // boolean: Display help text validate_params // boolean: Boolean whether to validate parameters against the schema at runtime monochrome_logs // boolean: Do not use coloured log outputs nextflow_cli_args // array: List of positional nextflow CLI 
args @@ -44,44 +39,39 @@ workflow PIPELINE_INITIALISATION { // // Print version and exit if required and dump pipeline parameters to JSON file // - UTILS_NEXTFLOW_PIPELINE ( + UTILS_NEXTFLOW_PIPELINE( version, true, outdir, - workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1 + workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1, ) // // Validate parameters and generate parameter summary to stdout // - pre_help_text = nfCoreLogo(monochrome_logs) - post_help_text = '\n' + workflowCitation() + '\n' + dashedLine(monochrome_logs) - def String workflow_command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " - UTILS_NFVALIDATION_PLUGIN ( - help, - workflow_command, - pre_help_text, - post_help_text, + UTILS_NFSCHEMA_PLUGIN( + workflow, validate_params, - "nextflow_schema.json" + null, ) // // Check config provided to the pipeline // - UTILS_NFCORE_PIPELINE ( + UTILS_NFCORE_PIPELINE( nextflow_cli_args ) + // // Custom validation for pipeline parameters // validateInputParameters() - // // Create channel from input file provided through params.input // + Channel - .fromSamplesheet("input") + .fromList(samplesheetToList(input, "${projectDir}/assets/schema_input.json")) .set { ch_samplesheet } emit: @@ -90,13 +80,12 @@ workflow PIPELINE_INITIALISATION { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW FOR PIPELINE COMPLETION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_COMPLETION { - take: email // string: email address email_on_fail // string: email address sent on pipeline failure @@ -107,53 +96,62 @@ workflow PIPELINE_COMPLETION { multiqc_report // string: Path to MultiQC report main: - summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + def multiqc_reports = multiqc_report.toList() // // Completion email and summary // workflow.onComplete { if (email || email_on_fail) { - completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs, multiqc_report.toList()) + completionEmail( + summary_params, + email, + email_on_fail, + plaintext_email, + outdir, + monochrome_logs, + multiqc_reports.getVal(), + ) } completionSummary(monochrome_logs) - if (hook_url) { imNotification(summary_params, hook_url) } } workflow.onError { - log.error "Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting" + log.error("Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting") } } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ + // // Check and validate pipeline parameters // def validateInputParameters() { // Validate antiSMASH inputs for containers // 1. 
Make sure that either both or none of the antiSMASH directories are supplied - if ( ['docker', 'singularity'].contains(workflow.containerEngine) && ( ( params.run_bgc_screening && !params.bgc_antismash_db && params.bgc_antismash_installdir && !params.bgc_skip_antismash) || ( params.run_bgc_screening && params.bgc_antismash_db && !params.bgc_antismash_installdir && !params.bgc_skip_antismash ) ) ) + if (['docker', 'singularity'].contains(workflow.containerEngine) && ((params.run_bgc_screening && !params.bgc_antismash_db && params.bgc_antismash_installdir && !params.bgc_skip_antismash) || (params.run_bgc_screening && params.bgc_antismash_db && !params.bgc_antismash_installdir && !params.bgc_skip_antismash))) { error("[nf-core/funcscan] ERROR: You supplied either the antiSMASH database or its installation directory, but not both. Please either supply both directories or none (letting the pipeline download them instead).") - - // 2. If both are supplied: Exit if we have a name collision error - else if ( ['docker', 'singularity'].contains(workflow.containerEngine) && ( params.run_bgc_screening && params.bgc_antismash_db && params.bgc_antismash_installdir && !params.bgc_skip_antismash ) ) { + } + else if (['docker', 'singularity'].contains(workflow.containerEngine) && (params.run_bgc_screening && params.bgc_antismash_db && params.bgc_antismash_installdir && !params.bgc_skip_antismash)) { antismash_database_dir = new File(params.bgc_antismash_db) antismash_install_dir = new File(params.bgc_antismash_installdir) - if ( antismash_database_dir.name == antismash_install_dir.name ) error("[nf-core/funcscan] ERROR: Your supplied antiSMASH database and installation directories have identical names: \"" + antismash_install_dir.name + "\".\nPlease make sure to name them differently, for example:\n - Database directory: "+ antismash_database_dir.parent + "/antismash_db\n - Installation directory: " + antismash_install_dir.parent + "/antismash_dir") + if (antismash_database_dir.name == antismash_install_dir.name) { + error("[nf-core/funcscan] ERROR: Your supplied antiSMASH database and installation directories have identical names: " + antismash_install_dir.name + ".\nPlease make sure to name them differently, for example:\n - Database directory: " + antismash_database_dir.parent + "/antismash_db\n - Installation directory: " + antismash_install_dir.parent + "/antismash_dir") + } } // 3. Give warning if not using container system assuming conda - if ( params.run_bgc_screening && ( !params.bgc_antismash_db ) && !params.bgc_skip_antismash && ( session.config.conda && session.config.conda.enabled ) ) { - log.warn "[nf-core/funcscan] Running antiSMASH download database module, and detected conda has been enabled. Assuming using conda for pipeline run. Check config if this is not expected!" + if (params.run_bgc_screening && (!params.bgc_antismash_db) && !params.bgc_skip_antismash && (session.config.conda && session.config.conda.enabled)) { + log.warn("[nf-core/funcscan] Running antiSMASH download database module, and detected conda has been enabled. Assuming using conda for pipeline run. Check config if this is not expected!") } } @@ -164,37 +162,12 @@ def validateInputSamplesheet(input) { def (metas, fastas) = input[1..2] // Check that multiple runs of the same sample are of the same datatype i.e. 
single-end / paired-end - def endedness_ok = metas.collect{ it.single_end }.unique().size == 1 + def endedness_ok = metas.collect { meta -> meta.single_end }.unique().size == 1 if (!endedness_ok) { error("Please check input samplesheet -> Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end: ${metas[0].id}") } - return [ metas[0], fastas ] -} -// -// Get attribute from genome config file e.g. fasta -// -def getGenomeAttribute(attribute) { - if (params.genomes && params.genome && params.genomes.containsKey(params.genome)) { - if (params.genomes[ params.genome ].containsKey(attribute)) { - return params.genomes[ params.genome ][ attribute ] - } - } - return null -} - -// -// Exit pipeline if incorrect --genome key provided -// -def genomeExistsError() { - if (params.genomes && params.genome && !params.genomes.containsKey(params.genome)) { - def error_string = "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" + - " Genome '${params.genome}' not found in any config files provided to the pipeline.\n" + - " Currently, the available genome keys are:\n" + - " ${params.genomes.keySet().join(", ")}\n" + - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" - error(error_string) - } + return [metas[0], fastas] } // @@ -205,46 +178,46 @@ def toolCitationText() { // Uncomment function in methodsDescriptionText to render in MultiQC report def preprocessing_text = "The pipeline used the following tools: preprocessing included SeqKit2 (Shen et al. 2024)." - def annotation_text = [ - "Annotation was carried out with:", - params.annotation_tool == 'prodigal' ? "Prodigal (Hyatt et al. 2010)." : "", - params.annotation_tool == 'pyrodigal' ? "Pyrodigal (Larralde 2022)." : "", - params.annotation_tool == 'bakta' ? "BAKTA (Schwengers et al. 2021)." : "", - params.annotation_tool == 'prokka' ? "PROKKA (Seemann 2014)." : "", - ].join(' ').trim() - - def amp_text = [ - "The following antimicrobial peptide screening tools were used:", - !params.amp_skip_amplify ? "AMPlify (Li et al. 2022)," : "", - !params.amp_skip_macrel ? "Macrel (Santos-Júnior et al. 2020)," : "", - !params.amp_skip_ampir ? "ampir (Fingerhut et al. 2021)," : "", - params.amp_run_hmmsearch ? "HMMER (Eddy 2011)," : "", - ". The output from the antimicrobial peptide screening tools were standardised and summarised with AMPcombi (Ibrahim and Perelo 2023)." - ].join(' ').trim().replaceAll(", \\.", ".") - - def arg_text = [ - "The following antimicrobial resistance gene screening tools were used:", - !params.arg_skip_fargene ? "fARGene (Berglund et al. 2019)," : "", - !params.arg_skip_rgi ? "RGI (Alcock et al. 2020)," : "", - !params.arg_skip_amrfinderplus ? "AMRfinderplus (Feldgarden et al. 2021)," : "", - !params.arg_skip_deeparg ? "deepARG (Arango-Argoty 2018)," : "", - !params.arg_skip_abricate ? "ABRicate (Seemann 2020)," : "", - !params.arg_skip_argnorm ? ". The outputs from ARG screening tools were normalized to the antibiotic resistance ontology using argNorm (Perovic et al. 2024)," : "", - ". The output from the antimicrobial resistance gene screening tools were standardised and summarised with hAMRonization (Maguire et al. 2023)." - ].join(' ').trim().replaceAll(", +\\.", ".") - - def bgc_text = [ - "The following biosynthetic gene cluster screening tools were used:", - !params.bgc_skip_antismash ? "antiSMASH (Blin et al. 2021)," : "", - !params.bgc_skip_deepbgc ? "deepBGC (Hannigan et al. 2019)," : "", - !params.bgc_skip_gecco ? 
"GECCO (Carroll et al. 2021)," : "", - params.bgc_run_hmmsearch ? "HMMER (Eddy 2011)," : "", - ". The output from the biosynthetic gene cluster screening tools were standardised and summarised with comBGC (Frangenberg et al. 2023)." - ].join(' ').replaceAll(", +\\.", ".").trim() + def annotation_text = [ + "Annotation was carried out with:", + params.annotation_tool == 'prodigal' ? "Prodigal (Hyatt et al. 2010)." : "", + params.annotation_tool == 'pyrodigal' ? "Pyrodigal (Larralde 2022)." : "", + params.annotation_tool == 'bakta' ? "BAKTA (Schwengers et al. 2021)." : "", + params.annotation_tool == 'prokka' ? "PROKKA (Seemann 2014)." : "", + ].join(' ').trim() + + def amp_text = [ + "The following antimicrobial peptide screening tools were used:", + !params.amp_skip_amplify ? "AMPlify (Li et al. 2022)," : "", + !params.amp_skip_macrel ? "Macrel (Santos-Júnior et al. 2020)," : "", + !params.amp_skip_ampir ? "ampir (Fingerhut et al. 2021)," : "", + params.amp_run_hmmsearch ? "HMMER (Eddy 2011)," : "", + ". The output from the antimicrobial peptide screening tools were standardised and summarised with AMPcombi (Ibrahim and Perelo 2023).", + ].join(' ').trim().replaceAll(', .', ".") + + def arg_text = [ + "The following antimicrobial resistance gene screening tools were used:", + !params.arg_skip_fargene ? "fARGene (Berglund et al. 2019)," : "", + !params.arg_skip_rgi ? "RGI (Alcock et al. 2020)," : "", + !params.arg_skip_amrfinderplus ? "AMRfinderplus (Feldgarden et al. 2021)," : "", + !params.arg_skip_deeparg ? "deepARG (Arango-Argoty 2018)," : "", + !params.arg_skip_abricate ? "ABRicate (Seemann 2020)," : "", + !params.arg_skip_argnorm ? ". The outputs from ARG screening tools were normalized to the antibiotic resistance ontology using argNorm (Perovic et al. 2024)," : "", + ". The output from the antimicrobial resistance gene screening tools were standardised and summarised with hAMRonization (Maguire et al. 2023).", + ].join(' ').trim().replaceAll(', +.', ".") + + def bgc_text = [ + "The following biosynthetic gene cluster screening tools were used:", + !params.bgc_skip_antismash ? "antiSMASH (Blin et al. 2021)," : "", + !params.bgc_skip_deepbgc ? "deepBGC (Hannigan et al. 2019)," : "", + !params.bgc_skip_gecco ? "GECCO (Carroll et al. 2021)," : "", + params.bgc_run_hmmsearch ? "HMMER (Eddy 2011)," : "", + ". The output from the biosynthetic gene cluster screening tools were standardised and summarised with comBGC (Frangenberg et al. 2023).", + ].join(' ').replaceAll(', +.', ".").trim() def postprocessing_text = "Run statistics were reported using MultiQC (Ewels et al. 2016)." - def citation_text = [ + def citation_text = [ preprocessing_text, annotation_text, params.run_amp_screening ? amp_text : "", @@ -257,45 +230,46 @@ def toolCitationText() { } def toolBibliographyText() { - // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "
@@ -257,45 +230,46 @@ def toolCitationText() {
 }

 def toolBibliographyText() {
-    // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "<li>Author (2023) Pub name, Journal, DOI</li>" : "",
+    // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? '<li>Author (2023) Pub name, Journal, DOI</li>' : "",
     // Uncomment function in methodsDescriptionText to render in MultiQC report
-    def preprocessing_text = "<li>Shen, W., Sipos, B., & Zhao, L. (2024). SeqKit2: A Swiss army knife for sequence and alignment processing. iMeta, e191. https://doi.org/10.1002/imt2.191</li>"
-
-    def annotation_text = [
-        params.annotation_tool == 'prodigal' ? "<li>Hyatt, D., Chen, G. L., Locascio, P. F., Land, M. L., Larimer, F. W., & Hauser, L. J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC bioinformatics, 11, 119. DOI: 10.1186/1471-2105-11-119" : "",
-        params.annotation_tool == 'pyrodigal' ? "<li>Larralde, M. (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296. DOI: 10.21105/joss.04296</li>" : "",
-        params.annotation_tool == 'bakta' ? "<li>Schwengers, O., Jelonek, L., Dieckmann, M. A., Beyvers, S., Blom, J., & Goesmann, A. (2021). Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microbial Genomics, 7(11). DOI: 10.1099/mgen.0.000685</li>" : "",
-        params.annotation_tool == 'prokka' ? "<li>Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. DOI: 10.1093/bioinformatics/btu153</li>" : "",
-    ].join(' ').trim()
-
-    def amp_text = [
-        !params.amp_skip_amplify ? "<li>Li, C., Sutherland, D., Hammond, S. A., Yang, C., Taho, F., Bergman, L., Houston, S., Warren, R. L., Wong, T., Hoang, L., Cameron, C. E., Helbing, C. C., & Birol, I. (2022). AMPlify: attentive deep learning model for discovery of novel antimicrobial peptides effective against WHO priority pathogens. BMC genomics, 23(1), 77. DOI: 10.1186/s12864-022-08310-4</li>" : "",
-        !params.amp_skip_macrel ? "<li>Santos-Júnior, C. D., Pan, S., Zhao, X. M., & Coelho, L. P. (2020). Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ, 8, e10555. DOI: 10.7717/peerj.10555</li>" : "",
-        !params.amp_skip_ampir ? "<li>Fingerhut, L., Miller, D. J., Strugnell, J. M., Daly, N. L., & Cooke, I. R. (2021). ampir: an R package for fast genome-wide prediction of antimicrobial peptides. Bioinformatics (Oxford, England), 36(21), 5262–5263. DOI: 10.1093/bioinformatics/btaa653</li>" : "",
-        "<li>Ibrahim, A. & Perelo, L. (2023). Darcy220606/AMPcombi. DOI: 10.5281/zenodo.7639121</li>"
-    ].join(' ').trim().replaceAll(", \\.", ".")
-
-    def arg_text = [
-        !params.arg_skip_fargene ? "<li>Berglund, F., Österlund, T., Boulund, F., Marathe, N. P., Larsson, D., & Kristiansson, E. (2019). Identification and reconstruction of novel antibiotic resistance genes from metagenomes. Microbiome, 7(1), 52. DOI: 10.1186/s40168-019-0670-1</li>" : "",
-        !params.arg_skip_rgi ? "<li>Alcock, B. P., Raphenya, A. R., Lau, T., Tsang, K. K., Bouchard, M., Edalatmand, A., Huynh, W., Nguyen, A. V., Cheng, A. A., Liu, S., Min, S. Y., Miroshnichenko, A., Tran, H. K., Werfalli, R. E., Nasir, J. A., Oloni, M., Speicher, D. J., Florescu, A., Singh, B., Faltyn, M., … McArthur, A. G. (2020). CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic acids research, 48(D1), D517–D525. DOI: 10.1093/nar/gkz935</li>" : "",
-        !params.arg_skip_amrfinderplus ? "<li>Feldgarden, M., Brover, V., Gonzalez-Escalona, N., Frye, J. G., Haendiges, J., Haft, D. H., Hoffmann, M., Pettengill, J. B., Prasad, A. B., Tillman, G. E., Tyson, G. H., & Klimke, W. (2021). AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Scientific reports, 11(1), 12728. DOI: 10.1038/s41598-021-91456-0</li>" : "",
-        !params.arg_skip_deeparg ? "<li>Arango-Argoty, G., Garner, E., Pruden, A., Heath, L. S., Vikesland, P., & Zhang, L. (2018). DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data. Microbiome, 6(1), 23. DOI: 10.1186/s40168-018-0401-z" : "",
-        !params.arg_skip_abricate ? "<li>Seemann, T. (2020). ABRicate. Github https://github.com/tseemann/abricate.</li>" : "",
-        !params.arg_skip_argnorm ? "<li>Perovic, S. U., Ramji, V., Chong, H., Duan, Y., Maguire, F., Coelho, L. P. (2024). argNorm. DOI: .</li>" : "",
-        "<li>Public Health Alliance for Genomic Epidemiology (pha4ge). (2022). Parse multiple Antimicrobial Resistance Analysis Reports into a common data structure. Github. Retrieved October 5, 2022, from https://github.com/pha4ge/hAMRonization</li>"
-    ].join(' ').trim().replaceAll(", +\\.", ".")
-
-    def bgc_text = [
-        !params.bgc_skip_antismash ? "<li>Blin, K., Shaw, S., Kloosterman, A. M., Charlop-Powers, Z., van Wezel, G. P., Medema, M. H., & Weber, T. (2021). antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic acids research, 49(W1), W29–W35. DOI:</li>" : "",
-        !params.bgc_skip_deepbgc ? "<li>Hannigan, G. D., Prihoda, D., Palicka, A., Soukup, J., Klempir, O., Rampula, L., Durcak, J., Wurst, M., Kotowski, J., Chang, D., Wang, R., Piizzi, G., Temesi, G., Hazuda, D. J., Woelk, C. H., & Bitton, D. A. (2019). A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic acids research, 47(18), e110. DOI: 10.1093/nar/gkz654</li>" : "",
-        !params.bgc_skip_gecco ? "<li>Carroll, L. M. , Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv DOI: 0.1101/2021.05.03.442509</li>" : "",
-        "<li>Frangenberg, J. Fellows Yates, J. A., Ibrahim, A., Perelo, L., & Beber, M. E. (2023). nf-core/funcscan: 1.0.0 - German Rollmops - 2023-02-15. https://doi.org/10.5281/zenodo.7643100</li>"
-    ].join(' ').replaceAll(", +\\.", ".").trim()
-
-    def postprocessing_text = "<li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. https://doi.org/10.1093/bioinformatics/btw354</li>"
+    def preprocessing_text = '<li>Shen, W., Sipos, B., & Zhao, L. (2024). SeqKit2: A Swiss army knife for sequence and alignment processing. iMeta, e191. https://doi.org/10.1002/imt2.191</li>'
+
+    def annotation_text = [
+        params.annotation_tool == 'prodigal' ? '<li>Hyatt, D., Chen, G. L., Locascio, P. F., Land, M. L., Larimer, F. W., & Hauser, L. J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC bioinformatics, 11, 119. DOI: 10.1186/1471-2105-11-119</li>' : "",
+        params.annotation_tool == 'pyrodigal' ? '<li>Larralde, M. (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296. DOI: 10.21105/joss.04296</li>' : "",
+        params.annotation_tool == 'bakta' ? '<li>Schwengers, O., Jelonek, L., Dieckmann, M. A., Beyvers, S., Blom, J., & Goesmann, A. (2021). Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microbial Genomics, 7(11). DOI: 10.1099/mgen.0.000685</li>' : "",
+        params.annotation_tool == 'prokka' ? '<li>Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. DOI: 10.1093/bioinformatics/btu153</li>' : "",
+    ].join(' ').trim()
+
+    def amp_text = [
+        !params.amp_skip_amplify ? '<li>Li, C., Sutherland, D., Hammond, S. A., Yang, C., Taho, F., Bergman, L., Houston, S., Warren, R. L., Wong, T., Hoang, L., Cameron, C. E., Helbing, C. C., & Birol, I. (2022). AMPlify: attentive deep learning model for discovery of novel antimicrobial peptides effective against WHO priority pathogens. BMC genomics, 23(1), 77. DOI: 10.1186/s12864-022-08310-4</li>' : "",
+        !params.amp_skip_macrel ? '<li>Santos-Júnior, C. D., Pan, S., Zhao, X. M., & Coelho, L. P. (2020). Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ, 8, e10555. DOI: 10.7717/peerj.10555</li>' : "",
+        !params.amp_skip_ampir ? '<li>Fingerhut, L., Miller, D. J., Strugnell, J. M., Daly, N. L., & Cooke, I. R. (2021). ampir: an R package for fast genome-wide prediction of antimicrobial peptides. Bioinformatics (Oxford, England), 36(21), 5262–5263. DOI: 10.1093/bioinformatics/btaa653</li>' : "",
+        '<li>Ibrahim, A. & Perelo, L. (2023). Darcy220606/AMPcombi. DOI: 10.5281/zenodo.7639121</li>',
+    ].join(' ').trim().replaceAll(', \\.', ".")
+
+    def arg_text = [
+        !params.arg_skip_fargene ? '<li>Berglund, F., Österlund, T., Boulund, F., Marathe, N. P., Larsson, D., & Kristiansson, E. (2019). Identification and reconstruction of novel antibiotic resistance genes from metagenomes. Microbiome, 7(1), 52. DOI: 10.1186/s40168-019-0670-1</li>' : "",
+        !params.arg_skip_rgi ? '<li>Alcock, B. P., Raphenya, A. R., Lau, T., Tsang, K. K., Bouchard, M., Edalatmand, A., Huynh, W., Nguyen, A. V., Cheng, A. A., Liu, S., Min, S. Y., Miroshnichenko, A., Tran, H. K., Werfalli, R. E., Nasir, J. A., Oloni, M., Speicher, D. J., Florescu, A., Singh, B., Faltyn, M., … McArthur, A. G. (2020). CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic acids research, 48(D1), D517–D525. DOI: 10.1093/nar/gkz935</li>' : "",
+        !params.arg_skip_amrfinderplus ? '<li>Feldgarden, M., Brover, V., Gonzalez-Escalona, N., Frye, J. G., Haendiges, J., Haft, D. H., Hoffmann, M., Pettengill, J. B., Prasad, A. B., Tillman, G. E., Tyson, G. H., & Klimke, W. (2021). AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Scientific reports, 11(1), 12728. DOI: 10.1038/s41598-021-91456-0</li>' : "",
+        !params.arg_skip_deeparg ? '<li>Arango-Argoty, G., Garner, E., Pruden, A., Heath, L. S., Vikesland, P., & Zhang, L. (2018). DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data. Microbiome, 6(1), 23. DOI: 10.1186/s40168-018-0401-z' : "",
+        !params.arg_skip_abricate ? '<li>Seemann, T. (2020). ABRicate. Github https://github.com/tseemann/abricate.</li>' : "",
+        !params.arg_skip_argnorm ? '<li>Perovic, S. U., Ramji, V., Chong, H., Duan, Y., Maguire, F., Coelho, L. P. (2024). argNorm. DOI: .</li>' : "",
+        '<li>Public Health Alliance for Genomic Epidemiology (pha4ge). (2022). Parse multiple Antimicrobial Resistance Analysis Reports into a common data structure. Github. Retrieved October 5, 2022, from https://github.com/pha4ge/hAMRonization</li>',
+    ].join(' ').trim().replaceAll(', +\\.', ".")
+
+
+    def bgc_text = [
+        !params.bgc_skip_antismash ? '<li>Blin, K., Shaw, S., Kloosterman, A. M., Charlop-Powers, Z., van Wezel, G. P., Medema, M. H., & Weber, T. (2021). antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic acids research, 49(W1), W29–W35. DOI:</li>' : "",
+        !params.bgc_skip_deepbgc ? '<li>Hannigan, G. D., Prihoda, D., Palicka, A., Soukup, J., Klempir, O., Rampula, L., Durcak, J., Wurst, M., Kotowski, J., Chang, D., Wang, R., Piizzi, G., Temesi, G., Hazuda, D. J., Woelk, C. H., & Bitton, D. A. (2019). A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic acids research, 47(18), e110. DOI: 10.1093/nar/gkz654</li>' : "",
+        !params.bgc_skip_gecco ? '<li>Carroll, L. M. , Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv DOI: 10.1101/2021.05.03.442509</li>' : "",
+        '<li>Frangenberg, J. Fellows Yates, J. A., Ibrahim, A., Perelo, L., & Beber, M. E. (2023). nf-core/funcscan: 1.0.0 - German Rollmops - 2023-02-15. https://doi.org/10.5281/zenodo.7643100</li>',
+    ].join(' ').replaceAll(', +\\.', ".").trim()
+
+    def postprocessing_text = '<li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. https://doi.org/10.1093/bioinformatics/btw354</li>'

     // Special as reused in multiple subworkflows, and we don't want to cause duplicates
-    def hmmsearch_text = ( params.run_amp_screening && params.amp_run_hmmsearch ) || ( params.run_bgc_screening && params.bgc_run_hmmsearch ) ? "<li>Eddy S. R. (2011). Accelerated Profile HMM Searches. PLoS computational biology, 7(10), e1002195. DOI: 10.1371/journal.pcbi.1002195</li>" : ""
+    def hmmsearch_text = (params.run_amp_screening && params.amp_run_hmmsearch) || (params.run_bgc_screening && params.bgc_run_hmmsearch) ? '<li>Eddy S. R. (2011). Accelerated Profile HMM Searches. PLoS computational biology, 7(10), e1002195. DOI: 10.1371/journal.pcbi.1002195</li>' : ""

     def reference_text = [
         preprocessing_text,
@@ -311,7 +285,7 @@ def toolBibliographyText() {
 }

 def methodsDescriptionText(mqc_methods_yaml) {
-    // Convert to a named map so can be used as with familar NXF ${workflow} variable syntax in the MultiQC YML file
+    // Convert to a named map so can be used as with familiar NXF ${workflow} variable syntax in the MultiQC YML file
     def meta = [:]
     meta.workflow = workflow.toMap()
     meta["manifest_map"] = workflow.manifest.toMap()
@@ -322,23 +296,28 @@ def methodsDescriptionText(mqc_methods_yaml) {
     // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers
     // Removing ` ` since the manifest.doi is a string and not a proper list
         def temp_doi_ref = ""
-        String[] manifest_doi = meta.manifest_map.doi.tokenize(",")
-        for (String doi_ref: manifest_doi) temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), "
+        def manifest_doi = meta.manifest_map.doi.tokenize(",")
+        manifest_doi.each { doi_ref ->
+            temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), "
+        }
        meta["doi_text"] = temp_doi_ref.substring(0, temp_doi_ref.length() - 2)
-    } else meta["doi_text"] = ""
+    }
+    else {
+        meta["doi_text"] = ""
+    }
     meta["nodoi_text"] = meta.manifest_map.doi ? "" : "<li>If available, make sure to update the text to include the Zenodo DOI of version of the pipeline used.</li>"

     // Tool references
     meta["tool_citations"] = ""
     meta["tool_bibliography"] = ""

-    meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".")
+    meta["tool_citations"] = toolCitationText().replaceAll(', \\.', ".").replaceAll('\\. \\.', ".").replaceAll(', \\.', ".")
     meta["tool_bibliography"] = toolBibliographyText()

     def methods_text = mqc_methods_yaml.text

-    def engine = new groovy.text.SimpleTemplateEngine()
+    def engine = new groovy.text.SimpleTemplateEngine()
     def description_html = engine.createTemplate(methods_text).make(meta)

     return description_html.toString()
diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf
index ac31f28f..d6e593e8 100644
--- a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf
+++ b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf
@@ -2,18 +2,13 @@
 // Subworkflow with functionality that may be useful for any Nextflow pipeline
 //

-import org.yaml.snakeyaml.Yaml
-import groovy.json.JsonOutput
-import nextflow.extension.FilesEx
-
 /*
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    SUBWORKFLOW DEFINITION
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

 workflow UTILS_NEXTFLOW_PIPELINE {
-
     take:
     print_version      // boolean: print version
     dump_parameters    // boolean: dump parameters
@@ -26,7 +21,7 @@ workflow UTILS_NEXTFLOW_PIPELINE {
     // Print workflow version and exit on --version
     //
     if (print_version) {
-        log.info "${workflow.manifest.name} ${getWorkflowVersion()}"
+        log.info("${workflow.manifest.name} ${getWorkflowVersion()}")
         System.exit(0)
     }

@@ -49,16 +44,16 @@
 }

 /*
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    FUNCTIONS
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

 //
 // Generate version string
 //
 def getWorkflowVersion() {
-    String version_string = ""
+    def version_string = "" as String
     if (workflow.manifest.version) {
         def prefix_v = workflow.manifest.version[0] != 'v' ?
'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -76,13 +71,13 @@ def getWorkflowVersion() { // Dump pipeline parameters to a JSON file // def dumpParametersToJSON(outdir) { - def timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') - def filename = "params_${timestamp}.json" - def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") - def jsonStr = JsonOutput.toJson(params) - temp_pf.text = JsonOutput.prettyPrint(jsonStr) + def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + def filename = "params_${timestamp}.json" + def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") + def jsonStr = groovy.json.JsonOutput.toJson(params) + temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) - FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") temp_pf.delete() } @@ -90,37 +85,42 @@ def dumpParametersToJSON(outdir) { // When running with -profile conda, warn if channels have not been set-up appropriately // def checkCondaChannels() { - Yaml parser = new Yaml() + def parser = new org.yaml.snakeyaml.Yaml() def channels = [] try { def config = parser.load("conda config --show channels".execute().text) channels = config.channels - } catch(NullPointerException | IOException e) { - log.warn "Could not verify conda channel configuration." - return + } + catch (NullPointerException e) { + log.debug(e) + log.warn("Could not verify conda channel configuration.") + return null + } + catch (IOException e) { + log.debug(e) + log.warn("Could not verify conda channel configuration.") + return null } // Check that all channels are present // This channel list is ordered by required channel priority. - def required_channels_in_order = ['conda-forge', 'bioconda', 'defaults'] + def required_channels_in_order = ['conda-forge', 'bioconda'] def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean // Check that they are in the right order - def channel_priority_violation = false - def n = required_channels_in_order.size() - for (int i = 0; i < n - 1; i++) { - channel_priority_violation |= !(channels.indexOf(required_channels_in_order[i]) < channels.indexOf(required_channels_in_order[i+1])) - } + def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order } if (channels_missing | channel_priority_violation) { - log.warn "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" + - " There is a problem with your Conda configuration!\n\n" + - " You will need to set-up the conda-forge and bioconda channels correctly.\n" + - " Please refer to https://bioconda.github.io/\n" + - " The observed channel order is \n" + - " ${channels}\n" + - " but the following channel order is required:\n" + - " ${required_channels_in_order}\n" + - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + log.warn """\ + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + There is a problem with your Conda configuration! + You will need to set-up the conda-forge and bioconda channels correctly. 
+            Please refer to https://bioconda.github.io/
+            The observed channel order is
+            ${channels}
+            but the following channel order is required:
+            ${required_channels_in_order}
+            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+            """.stripIndent(true)
     }
 }
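The rewritten priority check in `checkCondaChannels()` above replaces the old index-based loop: filtering the observed channel list down to the required channels must reproduce `required_channels_in_order` exactly, otherwise the priority is wrong. A small standalone Groovy sketch with toy channel lists (the real function reads them from `conda config --show channels`):

```groovy
// Toy channel lists for illustration only.
def required_channels_in_order = ['conda-forge', 'bioconda']

['correct order': ['conda-forge', 'bioconda', 'defaults'],
 'swapped order': ['bioconda', 'conda-forge']].each { label, channels ->
    // Missing channels: non-empty set difference coerces to true.
    def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean
    // Order violation: keep only the required channels; the survivors must match the required order.
    def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order }
    println "${label}: missing=${channels_missing}, priority violation=${channel_priority_violation}"
}
// correct order: missing=false, priority violation=false
// swapped order: missing=false, priority violation=true
```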
diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test b/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test
index ca964ce8..02dbf094 100644
--- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test
+++ b/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test
@@ -52,10 +52,12 @@ nextflow_workflow {
         }

         then {
-            assertAll(
-                { assert workflow.success },
-                { assert workflow.stdout.contains("nextflow_workflow v9.9.9") }
-            )
+            expect {
+                with(workflow) {
+                    assert success
+                    assert "nextflow_workflow v9.9.9" in stdout
+                }
+            }
         }
     }
diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config
index d0a926bf..a09572e5 100644
--- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config
+++ b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config
@@ -3,7 +3,7 @@ manifest {
     author = """nf-core"""
     homePage = 'https://127.0.0.1'
     description = """Dummy pipeline"""
-    nextflowVersion = '!>=23.04.0'
+    nextflowVersion = '!>=23.04.0'
     version = '9.9.9'
     doi = 'https://doi.org/10.5281/zenodo.5070524'
 }
diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf
index 14558c39..bfd25876 100644
--- a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf
+++ b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf
@@ -2,17 +2,13 @@
 // Subworkflow with utility functions specific to the nf-core pipeline template
 //

-import org.yaml.snakeyaml.Yaml
-import nextflow.extension.FilesEx
-
 /*
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    SUBWORKFLOW DEFINITION
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

 workflow UTILS_NFCORE_PIPELINE {
-
     take:
     nextflow_cli_args

@@ -25,23 +21,20 @@
 }

 /*
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    FUNCTIONS
-========================================================================================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

 //
 // Warn if a -profile or Nextflow config has not been provided to run the pipeline
 //
 def checkConfigProvided() {
-    valid_config = true
+    def valid_config = true as Boolean
     if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) {
-        log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" +
-            "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" +
-            "   (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" +
-            "   (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" +
-            "   (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" +
-            "Please refer to the quick start section and usage docs for the pipeline.\n "
+        log.warn(
+            "[${workflow.manifest.name}] You are attempting to run the pipeline without any custom configuration!\n\n" + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + "   (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + "   (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + "   (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + "Please refer to the quick start section and usage docs for the pipeline.\n "
+        )
        valid_config = false
     }
     return valid_config
@@ -52,39 +45,22 @@ def checkConfigProvided() {
 //
 def checkProfileProvided(nextflow_cli_args) {
     if (workflow.profile.endsWith(',')) {
-        error "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" +
-            "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        error(
+            "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        )
     }
     if (nextflow_cli_args[0]) {
-        log.warn "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" +
-            "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        log.warn(
+            "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        )
     }
 }

-//
-// Citation string for pipeline
-//
-def workflowCitation() {
-    def temp_doi_ref = ""
-    String[] manifest_doi = workflow.manifest.doi.tokenize(",")
-    // Using a loop to handle multiple DOIs
-    // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers
-    // Removing ` ` since the manifest.doi is a string and not a proper list
-    for (String doi_ref: manifest_doi) temp_doi_ref += "  https://doi.org/${doi_ref.replace('https://doi.org/', '').replace(' ', '')}\n"
-    return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" +
-        "* The pipeline\n" +
-        temp_doi_ref + "\n" +
-        "* The nf-core framework\n" +
-        "  https://doi.org/10.1038/s41587-020-0439-x\n\n" +
-        "* Software dependencies\n" +
-        "  https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md"
-}
-
 //
 // Generate workflow version string
 //
 def getWorkflowVersion() {
-    String version_string = ""
+    def version_string = "" as String
     if (workflow.manifest.version) {
         def prefix_v = workflow.manifest.version[0] != 'v' ?
'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -102,8 +78,8 @@ def getWorkflowVersion() { // Get software versions for pipeline // def processVersionsFromYAML(yaml_file) { - Yaml yaml = new Yaml() - versions = yaml.load(yaml_file).collectEntries { k, v -> [ k.tokenize(':')[-1], v ] } + def yaml = new org.yaml.snakeyaml.Yaml() + def versions = yaml.load(yaml_file).collectEntries { k, v -> [k.tokenize(':')[-1], v] } return yaml.dumpAsMap(versions).trim() } @@ -113,8 +89,8 @@ def processVersionsFromYAML(yaml_file) { def workflowVersionToYAML() { return """ Workflow: - $workflow.manifest.name: ${getWorkflowVersion()} - Nextflow: $workflow.nextflow.version + ${workflow.manifest.name}: ${getWorkflowVersion()} + Nextflow: ${workflow.nextflow.version} """.stripIndent().trim() } @@ -122,11 +98,7 @@ def workflowVersionToYAML() { // Get channel of software versions used in pipeline in YAML format // def softwareVersionsToYAML(ch_versions) { - return ch_versions - .unique() - .map { processVersionsFromYAML(it) } - .unique() - .mix(Channel.of(workflowVersionToYAML())) + return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML())) } // @@ -134,61 +106,40 @@ def softwareVersionsToYAML(ch_versions) { // def paramsSummaryMultiqc(summary_params) { def summary_section = '' - for (group in summary_params.keySet()) { - def group_params = summary_params.get(group) // This gets the parameters of that particular group - if (group_params) { - summary_section += "
    <p style=\"font-size:110%\"><b>$group</b></p>\n"
-            summary_section += "    <dl class=\"dl-horizontal\">\n"
-            for (param in group_params.keySet()) {
-                summary_section += "        <dt>$param</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n"
+    summary_params
+        .keySet()
+        .each { group ->
+            def group_params = summary_params.get(group)
+            // This gets the parameters of that particular group
+            if (group_params) {
+                summary_section += "    <p style=\"font-size:110%\"><b>${group}</b></p>\n"
+                summary_section += "    <dl class=\"dl-horizontal\">\n"
+                group_params
+                    .keySet()
+                    .sort()
+                    .each { param ->
+                        summary_section += "        <dt>${param}</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n"
+                    }
+                summary_section += "    </dl>\n"
             }
-            summary_section += "    </dl>\n"
         }
-    }

-    String yaml_file_text = "id: '${workflow.manifest.name.replace('/','-')}-summary'\n"
-    yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
-    yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
-    yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
-    yaml_file_text += "plot_type: 'html'\n"
-    yaml_file_text += "data: |\n"
-    yaml_file_text += "${summary_section}"
+    def yaml_file_text = "id: '${workflow.manifest.name.replace('/', '-')}-summary'\n" as String
+    yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
+    yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
+    yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
+    yaml_file_text += "plot_type: 'html'\n"
+    yaml_file_text += "data: |\n"
+    yaml_file_text += "${summary_section}"

     return yaml_file_text
 }
'' : "\033[1;35m" - colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" - colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m" + colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m" + colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m" + colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m" + colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m" + colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m" + colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m" + colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" + colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m" // Underline - colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" - colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" - colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" - colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" - colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" - colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" - colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" - colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" + colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" + colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" + colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" + colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" + colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" + colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" + colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" + colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" // High Intensity - colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" - colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" - colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" - colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" - colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" - colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" - colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" - colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" + colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" + colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" + colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" + colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" + colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" + colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" + colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" + colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" // Bold High Intensity - colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" - colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" - colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" - colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" - colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" - colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" - colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" - colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" + colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" + colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" + colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" + colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" + colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" + colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" + colorcodes['bicyan'] = monochrome_logs ? 
'' : "\033[1;96m" + colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" return colorcodes } -// -// Attach the multiqc report to email -// -def attachMultiqcReport(multiqc_report) { - def mqc_report = null - try { - if (workflow.success) { - mqc_report = multiqc_report.getVal() - if (mqc_report.getClass() == ArrayList && mqc_report.size() >= 1) { - if (mqc_report.size() > 1) { - log.warn "[$workflow.manifest.name] Found multiple reports from process 'MULTIQC', will use only one" - } - mqc_report = mqc_report[0] - } - } - } catch (all) { - if (multiqc_report) { - log.warn "[$workflow.manifest.name] Could not attach MultiQC report to summary email" +// Return a single report from an object that may be a Path or List +// +def getSingleReport(multiqc_reports) { + if (multiqc_reports instanceof Path) { + return multiqc_reports + } else if (multiqc_reports instanceof List) { + if (multiqc_reports.size() == 0) { + log.warn("[${workflow.manifest.name}] No reports found from process 'MULTIQC'") + return null + } else if (multiqc_reports.size() == 1) { + return multiqc_reports.first() + } else { + log.warn("[${workflow.manifest.name}] Found multiple reports from process 'MULTIQC', will use only one") + return multiqc_reports.first() } + } else { + return null } - return mqc_report } // @@ -281,26 +229,35 @@ def attachMultiqcReport(multiqc_report) { def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs=true, multiqc_report=null) { // Set up the e-mail variables - def subject = "[$workflow.manifest.name] Successful: $workflow.runName" + def subject = "[${workflow.manifest.name}] Successful: ${workflow.runName}" if (!workflow.success) { - subject = "[$workflow.manifest.name] FAILED: $workflow.runName" + subject = "[${workflow.manifest.name}] FAILED: ${workflow.runName}" } def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] misc_fields['Date Started'] = workflow.start misc_fields['Date Completed'] = workflow.complete misc_fields['Pipeline script file path'] = workflow.scriptFile misc_fields['Pipeline script hash ID'] = workflow.scriptId - if (workflow.repository) misc_fields['Pipeline repository Git URL'] = workflow.repository - if (workflow.commitId) misc_fields['Pipeline repository Git Commit'] = workflow.commitId - if (workflow.revision) misc_fields['Pipeline Git branch/tag'] = workflow.revision - misc_fields['Nextflow Version'] = workflow.nextflow.version - misc_fields['Nextflow Build'] = workflow.nextflow.build + if (workflow.repository) { + misc_fields['Pipeline repository Git URL'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['Pipeline repository Git Commit'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['Pipeline Git branch/tag'] = workflow.revision + } + misc_fields['Nextflow Version'] = workflow.nextflow.version + misc_fields['Nextflow Build'] = workflow.nextflow.build misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp def email_fields = [:] @@ -317,7 +274,7 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi email_fields['summary'] = summary << misc_fields // On success try attach the multiqc report - def mqc_report = attachMultiqcReport(multiqc_report) + def mqc_report = getSingleReport(multiqc_report) // Check if we are only sending emails on failure def email_address = email @@ 
@@ -337,40 +294,45 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi
     def email_html = html_template.toString()

     // Render the sendmail template
-    def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? params.max_multiqc_email_size : 0) as nextflow.util.MemoryUnit
-    def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes() ]
+    def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? params.max_multiqc_email_size : 0) as MemoryUnit
+    def smail_fields = [email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes()]
     def sf = new File("${workflow.projectDir}/assets/sendmail_template.txt")
     def sendmail_template = engine.createTemplate(sf).make(smail_fields)
     def sendmail_html = sendmail_template.toString()

     // Send the HTML e-mail
-    Map colors = logColours(monochrome_logs)
+    def colors = logColours(monochrome_logs) as Map
     if (email_address) {
         try {
-            if (plaintext_email) { throw GroovyException('Send plaintext e-mail, not HTML') }
+            if (plaintext_email) {
+                throw new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML')
+            }
             // Try to send HTML e-mail using sendmail
             def sendmail_tf = new File(workflow.launchDir.toString(), ".sendmail_tmp.html")
             sendmail_tf.withWriter { w -> w << sendmail_html }
-            [ 'sendmail', '-t' ].execute() << sendmail_html
-            log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (sendmail)-"
-        } catch (all) {
+            ['sendmail', '-t'].execute() << sendmail_html
+            log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (sendmail)-")
+        }
+        catch (Exception msg) {
+            log.debug(msg.toString())
+            log.debug("Trying with mail instead of sendmail")
             // Catch failures and try with plaintext
-            def mail_cmd = [ 'mail', '-s', subject, '--content-type=text/html', email_address ]
+            def mail_cmd = ['mail', '-s', subject, '--content-type=text/html', email_address]
             mail_cmd.execute() << email_html
-            log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (mail)-"
+            log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (mail)-")
         }
     }

     // Write summary e-mail HTML to a file
     def output_hf = new File(workflow.launchDir.toString(), ".pipeline_report.html")
     output_hf.withWriter { w -> w << email_html }
-    FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html");
+    nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html")
     output_hf.delete()

     // Write summary e-mail TXT to a file
     def output_tf = new File(workflow.launchDir.toString(), ".pipeline_report.txt")
     output_tf.withWriter { w -> w << email_txt }
-    FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt");
+    nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt")
     output_tf.delete()
 }

@@ -378,15 +340,17 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi
 //
 // Print pipeline summary on completion
 //
 def completionSummary(monochrome_logs=true) {
-    Map colors = logColours(monochrome_logs)
+    def colors = logColours(monochrome_logs)
as Map if (workflow.success) { if (workflow.stats.ignoredCount == 0) { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Pipeline completed successfully${colors.reset}-" - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-" + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Pipeline completed successfully${colors.reset}-") } - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-" + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-") + } + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") } } @@ -395,21 +359,30 @@ def completionSummary(monochrome_logs=true) { // def imNotification(summary_params, hook_url) { def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] - misc_fields['start'] = workflow.start - misc_fields['complete'] = workflow.complete - misc_fields['scriptfile'] = workflow.scriptFile - misc_fields['scriptid'] = workflow.scriptId - if (workflow.repository) misc_fields['repository'] = workflow.repository - if (workflow.commitId) misc_fields['commitid'] = workflow.commitId - if (workflow.revision) misc_fields['revision'] = workflow.revision - misc_fields['nxf_version'] = workflow.nextflow.version - misc_fields['nxf_build'] = workflow.nextflow.build - misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp + misc_fields['start'] = workflow.start + misc_fields['complete'] = workflow.complete + misc_fields['scriptfile'] = workflow.scriptFile + misc_fields['scriptid'] = workflow.scriptId + if (workflow.repository) { + misc_fields['repository'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['commitid'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['revision'] = workflow.revision + } + misc_fields['nxf_version'] = workflow.nextflow.version + misc_fields['nxf_build'] = workflow.nextflow.build + misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp def msg_fields = [:] msg_fields['version'] = getWorkflowVersion() @@ -434,13 +407,13 @@ def imNotification(summary_params, hook_url) { def json_message = json_template.toString() // POST - def post = new URL(hook_url).openConnection(); + def post = new URL(hook_url).openConnection() post.setRequestMethod("POST") post.setDoOutput(true) post.setRequestProperty("Content-Type", "application/json") - post.getOutputStream().write(json_message.getBytes("UTF-8")); - def postRC = post.getResponseCode(); - if (! 
postRC.equals(200)) { - log.warn(post.getErrorStream().getText()); + post.getOutputStream().write(json_message.getBytes("UTF-8")) + def postRC = post.getResponseCode() + if (!postRC.equals(200)) { + log.warn(post.getErrorStream().getText()) } } diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test index 1dc317f8..f117040c 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test +++ b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test @@ -41,26 +41,14 @@ nextflow_function { } } - test("Test Function workflowCitation") { - - function "workflowCitation" - - then { - assertAll( - { assert function.success }, - { assert snapshot(function.result).match() } - ) - } - } - - test("Test Function nfCoreLogo") { + test("Test Function without logColours") { - function "nfCoreLogo" + function "logColours" when { function { """ - input[0] = false + input[0] = true """ } } @@ -73,9 +61,8 @@ nextflow_function { } } - test("Test Function dashedLine") { - - function "dashedLine" + test("Test Function with logColours") { + function "logColours" when { function { @@ -93,14 +80,13 @@ nextflow_function { } } - test("Test Function without logColours") { - - function "logColours" + test("Test Function getSingleReport with a single file") { + function "getSingleReport" when { function { """ - input[0] = true + input[0] = file(params.modules_testdata_base_path + '/generic/tsv/test.tsv', checkIfExists: true) """ } } @@ -108,18 +94,22 @@ nextflow_function { then { assertAll( { assert function.success }, - { assert snapshot(function.result).match() } + { assert function.result.contains("test.tsv") } ) } } - test("Test Function with logColours") { - function "logColours" + test("Test Function getSingleReport with multiple files") { + function "getSingleReport" when { function { """ - input[0] = false + input[0] = [ + file(params.modules_testdata_base_path + '/generic/tsv/test.tsv', checkIfExists: true), + file(params.modules_testdata_base_path + '/generic/tsv/network.tsv', checkIfExists: true), + file(params.modules_testdata_base_path + '/generic/tsv/expression.tsv', checkIfExists: true) + ] """ } } @@ -127,7 +117,9 @@ nextflow_function { then { assertAll( { assert function.success }, - { assert snapshot(function.result).match() } + { assert function.result.contains("test.tsv") }, + { assert !function.result.contains("network.tsv") }, + { assert !function.result.contains("expression.tsv") } ) } } diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap index 1037232c..02c67014 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap +++ b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap @@ -17,26 +17,6 @@ }, "timestamp": "2024-02-28T12:02:59.729647" }, - "Test Function nfCoreLogo": { - "content": [ - "\n\n-\u001b[2m----------------------------------------------------\u001b[0m-\n \u001b[0;32m,--.\u001b[0;30m/\u001b[0;32m,-.\u001b[0m\n\u001b[0;34m ___ __ __ __ ___ \u001b[0;32m/,-._.--~'\u001b[0m\n\u001b[0;34m |\\ | |__ __ / ` / \\ |__) |__ \u001b[0;33m} {\u001b[0m\n\u001b[0;34m | \\| | \\__, \\__/ | \\ |___ \u001b[0;32m\\`-._,-`-,\u001b[0m\n \u001b[0;32m`._,._,'\u001b[0m\n\u001b[0;35m nextflow_workflow v9.9.9\u001b[0m\n-\u001b[2m----------------------------------------------------\u001b[0m-\n" - ], - "meta": 
{ - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-28T12:03:10.562934" - }, - "Test Function workflowCitation": { - "content": [ - "If you use nextflow_workflow for your analysis please cite:\n\n* The pipeline\n https://doi.org/10.5281/zenodo.5070524\n\n* The nf-core framework\n https://doi.org/10.1038/s41587-020-0439-x\n\n* Software dependencies\n https://github.com/nextflow_workflow/blob/master/CITATIONS.md" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-28T12:03:07.019761" - }, "Test Function without logColours": { "content": [ { @@ -95,16 +75,6 @@ }, "timestamp": "2024-02-28T12:03:17.969323" }, - "Test Function dashedLine": { - "content": [ - "-\u001b[2m----------------------------------------------------\u001b[0m-" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-28T12:03:14.366181" - }, "Test Function with logColours": { "content": [ { diff --git a/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/subworkflows/nf-core/utils_nfschema_plugin/main.nf new file mode 100644 index 00000000..4994303e --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -0,0 +1,46 @@ +// +// Subworkflow that uses the nf-schema plugin to validate parameters and render the parameter summary +// + +include { paramsSummaryLog } from 'plugin/nf-schema' +include { validateParameters } from 'plugin/nf-schema' + +workflow UTILS_NFSCHEMA_PLUGIN { + + take: + input_workflow // workflow: the workflow object used by nf-schema to get metadata from the workflow + validate_params // boolean: validate the parameters + parameters_schema // string: path to the parameters JSON schema. + // this has to be the same as the schema given to `validation.parametersSchema` + // when this input is empty it will automatically use the configured schema or + // "${projectDir}/nextflow_schema.json" as default. This input should not be empty + // for meta pipelines + + main: + + // + // Print parameter summary to stdout. This will display the parameters + // that differ from the default given in the JSON schema + // + if(parameters_schema) { + log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) + } else { + log.info paramsSummaryLog(input_workflow) + } + + // + // Validate the parameters using nextflow_schema.json or the schema + // given via the validation.parametersSchema configuration option + // + if(validate_params) { + if(parameters_schema) { + validateParameters(parameters_schema:parameters_schema) + } else { + validateParameters() + } + } + + emit: + dummy_emit = true +} + diff --git a/subworkflows/nf-core/utils_nfschema_plugin/meta.yml b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml new file mode 100644 index 00000000..f7d9f028 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml @@ -0,0 +1,35 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "utils_nfschema_plugin" +description: Run nf-schema to validate parameters and create a summary of changed parameters +keywords: + - validation + - JSON schema + - plugin + - parameters + - summary +components: [] +input: + - input_workflow: + type: object + description: | + The workflow object of the used pipeline. + This object contains meta data used to create the params summary log + - validate_params: + type: boolean + description: Validate the parameters and error if invalid. 
+ - parameters_schema: + type: string + description: | + Path to the parameters JSON schema. + This has to be the same as the schema given to the `validation.parametersSchema` config + option. When this input is empty it will automatically use the configured schema or + "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way + for meta pipelines. +output: + - dummy_emit: + type: boolean + description: Dummy emit to make nf-core subworkflows lint happy +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test new file mode 100644 index 00000000..8fb30164 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -0,0 +1,117 @@ +nextflow_workflow { + + name "Test Subworkflow UTILS_NFSCHEMA_PLUGIN" + script "../main.nf" + workflow "UTILS_NFSCHEMA_PLUGIN" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/utils_nfschema_plugin" + tag "plugin/nf-schema" + + config "./nextflow.config" + + test("Should run nothing") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } + + test("Should run nothing - custom schema") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params - custom schema") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } +} diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config new file mode 100644 index 00000000..0907ac58 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -0,0 +1,8 @@ +plugins { + id "nf-schema@2.1.0" +} + +validation { + parametersSchema = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + monochromeLogs = true +} \ No newline at end of file diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json similarity index 95% rename from subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json rename to subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json index 7626c1c9..331e0d2f 100644 --- 
a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/./master/nextflow_schema.json", "title": ". pipeline parameters", "description": "", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -87,10 +87,10 @@ }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" } ] } diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf b/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf deleted file mode 100644 index 2585b65d..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf +++ /dev/null @@ -1,62 +0,0 @@ -// -// Subworkflow that uses the nf-validation plugin to render help text and parameter summary -// - -/* -======================================================================================== - IMPORT NF-VALIDATION PLUGIN -======================================================================================== -*/ - -include { paramsHelp } from 'plugin/nf-validation' -include { paramsSummaryLog } from 'plugin/nf-validation' -include { validateParameters } from 'plugin/nf-validation' - -/* -======================================================================================== - SUBWORKFLOW DEFINITION -======================================================================================== -*/ - -workflow UTILS_NFVALIDATION_PLUGIN { - - take: - print_help // boolean: print help - workflow_command // string: default commmand used to run pipeline - pre_help_text // string: string to be printed before help text and summary log - post_help_text // string: string to be printed after help text and summary log - validate_params // boolean: validate parameters - schema_filename // path: JSON schema file, null to use default value - - main: - - log.debug "Using schema file: ${schema_filename}" - - // Default values for strings - pre_help_text = pre_help_text ?: '' - post_help_text = post_help_text ?: '' - workflow_command = workflow_command ?: '' - - // - // Print help message if needed - // - if (print_help) { - log.info pre_help_text + paramsHelp(workflow_command, parameters_schema: schema_filename) + post_help_text - System.exit(0) - } - - // - // Print parameter summary to stdout - // - log.info pre_help_text + paramsSummaryLog(workflow, parameters_schema: schema_filename) + post_help_text - - // - // Validate parameters relative to the parameter JSON schema - // - if (validate_params){ - validateParameters(parameters_schema: schema_filename) - } - - emit: - dummy_emit = true -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml deleted file mode 100644 index 3d4a6b04..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml +++ /dev/null @@ -1,44 +0,0 @@ -# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json -name: "UTILS_NFVALIDATION_PLUGIN" -description: Use nf-validation to initiate and validate a pipeline -keywords: - - utility - - pipeline - - initialise - - validation -components: [] -input: - - print_help: - type: boolean - 
description: | - Print help message and exit - - workflow_command: - type: string - description: | - The command to run the workflow e.g. "nextflow run main.nf" - - pre_help_text: - type: string - description: | - Text to print before the help message - - post_help_text: - type: string - description: | - Text to print after the help message - - validate_params: - type: boolean - description: | - Validate the parameters and error if invalid. - - schema_filename: - type: string - description: | - The filename of the schema to validate against. -output: - - dummy_emit: - type: boolean - description: | - Dummy emit to make nf-core subworkflows lint happy -authors: - - "@adamrtalbot" -maintainers: - - "@adamrtalbot" - - "@maxulysse" diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test deleted file mode 100644 index 5784a33f..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test +++ /dev/null @@ -1,200 +0,0 @@ -nextflow_workflow { - - name "Test Workflow UTILS_NFVALIDATION_PLUGIN" - script "../main.nf" - workflow "UTILS_NFVALIDATION_PLUGIN" - tag "subworkflows" - tag "subworkflows_nfcore" - tag "plugin/nf-validation" - tag "'plugin/nf-validation'" - tag "utils_nfvalidation_plugin" - tag "subworkflows/utils_nfvalidation_plugin" - - test("Should run nothing") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success } - ) - } - } - - test("Should run help") { - - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with command") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with extra text") { - - - when { - - params { - monochrome_logs = true - test_data = 
'' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = "pre-help-text" - post_help_text = "post-help-text" - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('pre-help-text') } }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } }, - { assert workflow.stdout.any { it.contains('post-help-text') } } - ) - } - } - - test("Should validate params") { - - when { - - params { - monochrome_logs = true - test_data = '' - outdir = 1 - } - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = true - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.failed }, - { assert workflow.stdout.any { it.contains('ERROR ~ ERROR: Validation of pipeline parameters failed!') } } - ) - } - } -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml deleted file mode 100644 index 60b1cfff..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfvalidation_plugin: - - subworkflows/nf-core/utils_nfvalidation_plugin/** diff --git a/tests/test.nf.test b/tests/test.nf.test index c3db6a93..bc801a51 100644 --- a/tests/test.nf.test +++ b/tests/test.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // AMPir @@ -54,10 +54,10 @@ nextflow_pipeline { // AMPcombi { assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("NODE_515831_length_303_cov_1.532258_1") }, + { assert path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log").text.contains("amp_DRAMP_database is found and will be used") }, { assert snapshot( path("$outputDir/reports/ampcombi2/Ampcombi_cluster.log"), path("$outputDir/reports/ampcombi2/Ampcombi_complete.log"), - path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log") ).match("ampcombi_logfiles") }, // DeepARG @@ -71,6 +71,8 @@ nextflow_pipeline { { assert file("$outputDir/arg/deeparg/sample_2/sample_2.align.daa").name }, { assert path("$outputDir/arg/deeparg/sample_1/sample_1.mapping.potential.ARG").text.contains("#ARG") }, { assert path("$outputDir/arg/deeparg/sample_2/sample_2.mapping.potential.ARG").text.contains("#ARG") }, + { assert path("$outputDir/arg/deeparg/sample_1/sample_1.align.daa.tsv").text.contains("rifampin_monooxygenase|rifamycin|rifampin_monooxygenase") }, + { assert 
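Two details worth flagging in the deleted tests above and the `tests/test.nf.test` hunk that follows: the help-rendering tests disappear outright (there is no help input left to exercise in the new subworkflow), and the expected validation-failure message loses its doubled prefix. The new assertion shape, for comparison:

```groovy
// nf-validation (deleted test): 'ERROR ~ ERROR: Validation of pipeline parameters failed!'
// nf-schema (new tests):        'ERROR ~ Validation of pipeline parameters failed!'
then {
    assertAll(
        { assert workflow.failed },
        { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } }
    )
}
```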
path("$outputDir/arg/deeparg/sample_2/sample_2.align.daa.tsv").text.contains("rifampin_monooxygenase|rifamycin|rifampin_monooxygenase") }, // ABRicate { assert snapshot( diff --git a/tests/test.nf.test.snap b/tests/test.nf.test.snap index b8784d4e..4a1ed0f2 100644 --- a/tests/test.nf.test.snap +++ b/tests/test.nf.test.snap @@ -47,15 +47,15 @@ "deeparg_tsv_ARG": { "content": [ "sample_1.align.daa.tsv:md5,21822364379fe8f991d27cdb52a33d1d", - "sample_2.align.daa.tsv:md5,f448465df58785a87cdee53691a77bfe", + "sample_2.align.daa.tsv:md5,d59287f357de198639bdca5dbaede173", "sample_1.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9", "sample_2.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T15:33:45.575881231" + "timestamp": "2024-12-18T12:41:33.325286058" }, "ampir": { "content": [ @@ -66,21 +66,20 @@ ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T15:33:45.512274661" + "timestamp": "2024-12-18T12:41:33.055416682" }, "ampcombi_logfiles": { "content": [ "Ampcombi_cluster.log:md5,4c78f5f134edf566f39e04e3ab7d8558", - "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1", - "Ampcombi_parse_tables.log:md5,cb5dc95f6b64edc2f0eb56bb541660d5" + "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T15:33:45.560675596" + "timestamp": "2024-12-18T12:41:33.230701016" }, "amplify": { "content": [ @@ -89,28 +88,28 @@ ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T15:33:45.522977776" + "timestamp": "2024-12-18T12:41:33.1312123" }, "macrel": { "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T15:33:45.525854315" + "timestamp": "2024-12-18T10:35:54.749106433" }, "amrfinderplus": { "content": [ diff --git a/tests/test_bakta.nf.test b/tests/test_bakta.nf.test index b1913b04..c0ff420c 100644 --- a/tests/test_bakta.nf.test +++ b/tests/test_bakta.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert 
new File("$outputDir/multiqc/multiqc_report.html").exists() }, // AMPir @@ -53,11 +53,11 @@ nextflow_pipeline { ).match("macrel") }, // AMPcombi - { assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("KKEJHB_00100") }, + { assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("KDEMFK_0115") }, + { assert path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log").text.contains("amp_DRAMP_database is found and will be used") }, { assert snapshot( path("$outputDir/reports/ampcombi2/Ampcombi_cluster.log"), - path("$outputDir/reports/ampcombi2/Ampcombi_complete.log"), - path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log") + path("$outputDir/reports/ampcombi2/Ampcombi_complete.log") ).match("ampcombi_logfiles") }, // DeepARG diff --git a/tests/test_bakta.nf.test.snap b/tests/test_bakta.nf.test.snap index ff73f307..21e6633b 100644 --- a/tests/test_bakta.nf.test.snap +++ b/tests/test_bakta.nf.test.snap @@ -47,7 +47,7 @@ "deeparg_tsv_ARG": { "content": [ "sample_1.align.daa.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", - "sample_2.align.daa.tsv:md5,4a86ca69defa4c861fabf236609afe8a", + "sample_2.align.daa.tsv:md5,4557fadca3f90ccb037b59558dddd528", "sample_1.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9", "sample_2.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9" ], @@ -73,14 +73,13 @@ "ampcombi_logfiles": { "content": [ "Ampcombi_cluster.log:md5,4c78f5f134edf566f39e04e3ab7d8558", - "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1", - "Ampcombi_parse_tables.log:md5,cb5dc95f6b64edc2f0eb56bb541660d5" + "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.3" + "nf-test": "0.9.0", + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T16:51:37.230099612" + "timestamp": "2024-12-18T11:04:21.067236601" }, "amplify": { "content": [ @@ -97,20 +96,20 @@ "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.3" + "nf-test": "0.9.0", + "nextflow": "24.10.2" }, - "timestamp": "2024-07-23T16:51:37.208256804" + "timestamp": "2024-12-18T11:04:20.948791843" }, "amrfinderplus": { "content": [ diff --git a/tests/test_bgc_bakta.nf.test b/tests/test_bgc_bakta.nf.test index 37a0a0b1..3debacd6 100644 --- a/tests/test_bgc_bakta.nf.test +++ b/tests/test_bgc_bakta.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new 
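The `test_bakta` expectation changes from `KKEJHB_00100` to `KDEMFK_0115`, and later in this diff `PROKKA_00019` becomes `PROKKA_00119` and `PROKKA_377`: locus tags are evidently renumbered and re-padded when the annotation tools are bumped, so any assertion pinned to a full tag stays coupled to one tool version. A hypothetical loosening (not what this PR does) would assert only the stable sample prefix:

```groovy
// Hypothetical alternative: survive locus-tag renumbering at the cost of
// specificity by checking the sample's tag prefix only.
{ assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("KDEMFK_") }
```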
File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // antiSMASH diff --git a/tests/test_bgc_prokka.nf.test b/tests/test_bgc_prokka.nf.test index 0fe53cd5..f415a95b 100644 --- a/tests/test_bgc_prokka.nf.test +++ b/tests/test_bgc_prokka.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // antiSMASH diff --git a/tests/test_bgc_pyrodigal.nf.test b/tests/test_bgc_pyrodigal.nf.test index cab97577..24ecaeb3 100644 --- a/tests/test_bgc_pyrodigal.nf.test +++ b/tests/test_bgc_pyrodigal.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // antiSMASH diff --git a/tests/test_full.nf.test b/tests/test_full.nf.test index b5d53e6d..28b893de 100644 --- a/tests/test_full.nf.test +++ b/tests/test_full.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // AMP workflow diff --git a/tests/test_nothing.nf.test b/tests/test_minimal.nf.test similarity index 81% rename from tests/test_nothing.nf.test rename to tests/test_minimal.nf.test index a141d401..fe9e91de 100644 --- a/tests/test_nothing.nf.test +++ b/tests/test_minimal.nf.test @@ -4,9 +4,9 @@ nextflow_pipeline { script "main.nf" tag "pipeline" tag "nfcore_funcscan" - tag "test_nothing" + tag "test_minimal" - test("test_nothing_profile") { + test("test_minimal_profile") { when { params { @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, ) } diff --git a/tests/test_preannotated.nf.test b/tests/test_preannotated.nf.test index 32a86ac4..b853316b 100644 --- a/tests/test_preannotated.nf.test +++ b/tests/test_preannotated.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // AMPir @@ -27,7 +27,7 @@ nextflow_pipeline { path("$outputDir/amp/ampir/sample_2/sample_2.ampir.tsv").text.contains("MRWGYPLSLVLMALSVAAPMIYFRRKGWLR"), path("$outputDir/amp/ampir/sample_2/sample_2.ampir.faa"), 
path("$outputDir/amp/ampir/sample_3/sample_3.ampir.tsv").text.contains("IPELEMRWGYPLSLVLMALSVAAPMIYFRRKGWLR"), - path("$outputDir/amp/ampir/sample_3/sample_3.ampir.faa") + path("$outputDir/amp/ampir/sample_3/sample_3.ampir.faa").text.contains(">NODE_882919_length_258_cov_0.935961_1 # 149 # 256 # -1 # ID=") ).match("ampir") }, // AMPlify @@ -48,7 +48,6 @@ nextflow_pipeline { path("$outputDir/amp/macrel/sample_2.macrel/sample_2.macrel.smorfs.faa.gz"), path("$outputDir/amp/macrel/sample_3.macrel/sample_3.macrel.smorfs.faa.gz"), path("$outputDir/amp/macrel/sample_1.macrel/sample_1.macrel.all_orfs.faa.gz"), - path("$outputDir/amp/macrel/sample_2.macrel/sample_2.macrel.all_orfs.faa.gz"), path("$outputDir/amp/macrel/sample_3.macrel/sample_3.macrel.all_orfs.faa.gz"), path("$outputDir/amp/macrel/sample_1.macrel/sample_1.macrel.prediction.gz"), path("$outputDir/amp/macrel/sample_2.macrel/sample_2.macrel.prediction.gz"), @@ -60,6 +59,8 @@ nextflow_pipeline { path("$outputDir/amp/macrel/sample_2.macrel/sample_2.macrel_log.txt"), path("$outputDir/amp/macrel/sample_3.macrel/sample_3.macrel_log.txt") ).match("macrel") }, + { assert new File("$outputDir/amp/macrel/sample_2.macrel/sample_2.macrel.all_orfs.faa.gz").exists() }, + // AMPcombi { assert snapshot( @@ -107,21 +108,18 @@ nextflow_pipeline { ).match("rgi") }, // fARGene - { assert snapshot( - path("$outputDir/arg/fargene/sample_1/class_a/results_summary.txt"), - path("$outputDir/arg/fargene/sample_2/class_a/results_summary.txt"), - path("$outputDir/arg/fargene/sample_3/class_a/results_summary.txt"), - path("$outputDir/arg/fargene/sample_1/class_b_1_2/results_summary.txt"), - path("$outputDir/arg/fargene/sample_2/class_b_1_2/results_summary.txt"), - path("$outputDir/arg/fargene/sample_3/class_b_1_2/results_summary.txt") - ).match("fargene") - }, + { assert path("$outputDir/arg/fargene/sample_1/class_a/results_summary.txt").text.contains("class_A.hmm") }, + { assert path("$outputDir/arg/fargene/sample_2/class_a/results_summary.txt").text.contains("class_A.hmm") }, + { assert path("$outputDir/arg/fargene/sample_3/class_a/results_summary.txt").text.contains("class_A.hmm") }, + { assert path("$outputDir/arg/fargene/sample_1/class_b_1_2/results_summary.txt").text.contains("class_B_1_2.hmm") }, + { assert path("$outputDir/arg/fargene/sample_2/class_b_1_2/results_summary.txt").text.contains("class_B_1_2.hmm") }, + { assert path("$outputDir/arg/fargene/sample_3/class_b_1_2/results_summary.txt").text.contains("class_B_1_2.hmm") }, { assert path("$outputDir/arg/fargene/sample_1/fargene_analysis.log").text.contains("fARGene is done.") }, { assert path("$outputDir/arg/fargene/sample_2/fargene_analysis.log").text.contains("fARGene is done.") }, { assert path("$outputDir/arg/fargene/sample_3/fargene_analysis.log").text.contains("fARGene is done.") }, // hAMRonization - { assert snapshot(path("$outputDir/reports/hamronization_summarize/hamronization_combined_report.tsv").readLines().size()).match("hamronization") }, + { assert new File("$outputDir/reports/hamronization_summarize/hamronization_combined_report.tsv").exists() }, // argNorm { assert snapshot( diff --git a/tests/test_preannotated.nf.test.snap b/tests/test_preannotated.nf.test.snap index e53c7215..bf51c48f 100644 --- a/tests/test_preannotated.nf.test.snap +++ b/tests/test_preannotated.nf.test.snap @@ -1,9 +1,33 @@ { + "abricate": { + "content": [ + "sample_1.txt:md5,427cec26e354ac6b0ab6047ec6621202", + "sample_2.txt:md5,4c140c932a48a22bcd8ae911bda8f4c7", + 
"sample_3.txt:md5,d6534efe3d03173749d003bf9e624e68" + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.4" + }, + "timestamp": "2025-03-04T13:44:42.292890292" + }, + "rgi": { + "content": [ + "sample_1.txt:md5,ff8f179d06d8566d8cf779fc7d1f4955", + "sample_2.txt:md5,cc4ae1fb9e0d5f79ef5105d640c7b748", + "sample_3.txt:md5,ff8f179d06d8566d8cf779fc7d1f4955" + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.4" + }, + "timestamp": "2025-03-04T13:44:42.337377281" + }, "deeparg": { "content": [ "sample_1.align.daa.tsv:md5,0e71c37318bdc6cba792196d0455293d", "sample_2.align.daa.tsv:md5,1092ecd3cd6931653168b46c7afeb9e3", - "sample_3.align.daa.tsv:md5,b79070fe26acd1a10ae3aaf06b0d5901", + "sample_3.align.daa.tsv:md5,a9ed2f0651d75b318fde07a76b06d7b8", true, true, true @@ -21,13 +45,13 @@ true, "sample_2.ampir.faa:md5,12826875bd18623da78770187a7bbd2c", true, - "sample_3.ampir.faa:md5,0a36691485930a1b77c4b68a738fd98d" + true ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.3" }, - "timestamp": "2024-07-27T08:11:24.436374797" + "timestamp": "2024-12-20T11:01:38.808336691" }, "argnorm_amrfinderplus": { "content": [ @@ -53,6 +77,19 @@ }, "timestamp": "2024-07-27T08:23:32.486921338" }, + "ampcombi": { + "content": [ + "Ampcombi_cluster.log:md5,4c78f5f134edf566f39e04e3ab7d8558", + "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1", + true, + true + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.4" + }, + "timestamp": "2025-03-04T13:44:42.24263737" + }, "amplify": { "content": [ true, @@ -71,8 +108,8 @@ "sample_1.potential.ARG.normalized.tsv:md5,d6732b4b9765bfa47e27ba673e24b6a4", "sample_2.ARG.normalized.tsv:md5,1a19b894a7315aaae5f799e4539e6619", "sample_2.potential.ARG.normalized.tsv:md5,b241e22f9116d8f518ba8526d52ac4dc", - "sample_3.ARG.normalized.tsv:md5,d40d387176649ce80827420fef6a0169", - "sample_3.potential.ARG.normalized.tsv:md5,f331efd21ea143c180a15ae56a5210d3" + "sample_3.ARG.normalized.tsv:md5,d7577c0066a31e173f9cb545820650bf", + "sample_3.potential.ARG.normalized.tsv:md5,6d0889215f0ad7f601502ca67c0ca89e" ], "meta": { "nf-test": "0.9.0", @@ -82,89 +119,26 @@ }, "macrel": { "content": [ - "sample_1.macrel.smorfs.faa.gz:md5,9cddad1e4b6dbcb76888f1a87db388ec", - "sample_2.macrel.smorfs.faa.gz:md5,e055dd2a9e44f3dcaa8af7198600349c", - "sample_3.macrel.smorfs.faa.gz:md5,9cddad1e4b6dbcb76888f1a87db388ec", - "sample_1.macrel.all_orfs.faa.gz:md5,c276fb1ec494ff53ded1e6fc118e25b9", - "sample_2.macrel.all_orfs.faa.gz:md5,e75e434a30922d80169d0666fd07e446", - "sample_3.macrel.all_orfs.faa.gz:md5,c276fb1ec494ff53ded1e6fc118e25b9", - "sample_1.macrel.prediction.gz:md5,0277725512f7d2954a99692bb65f1475", - "sample_2.macrel.prediction.gz:md5,06f7ce99cfe6f364d38743aae094402a", - "sample_3.macrel.prediction.gz:md5,0277725512f7d2954a99692bb65f1475", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.smorfs.faa.gz:md5,a4f853b560c6a8c215e0d243c24ec056", + "sample_2.macrel.smorfs.faa.gz:md5,83ae7b9808d7183d87b41c10253c9c9e", + "sample_3.macrel.smorfs.faa.gz:md5,a4f853b560c6a8c215e0d243c24ec056", + "sample_1.macrel.all_orfs.faa.gz:md5,d1ae1cadc3770994b2ed4982aadd5406", + "sample_3.macrel.all_orfs.faa.gz:md5,d1ae1cadc3770994b2ed4982aadd5406", + "sample_1.macrel.prediction.gz:md5,62146cf9f759c9c6c2c2f9e5ba816119", + "sample_2.macrel.prediction.gz:md5,1b479d31bb7dbf636a2028ddef72f5cc", + 
"sample_3.macrel.prediction.gz:md5,62146cf9f759c9c6c2c2f9e5ba816119", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_3.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-27T08:11:24.514344973" - }, - "hamronization": { - "content": [ - 246 - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" - }, - "timestamp": "2024-09-05T10:17:06.711064611" - }, - "abricate": { - "content": [ - "sample_1.txt:md5,427cec26e354ac6b0ab6047ec6621202", - "sample_2.txt:md5,4c140c932a48a22bcd8ae911bda8f4c7", - "sample_3.txt:md5,d6534efe3d03173749d003bf9e624e68" - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" - }, - "timestamp": "2024-07-27T08:11:24.87794287" - }, - "fargene": { - "content": [ - "results_summary.txt:md5,2c8a073d2a7938e8aedcc097e6df2aa5", - "results_summary.txt:md5,3b86a5513e89e22a4c8b9279678ce0c0", - "results_summary.txt:md5,2c8a073d2a7938e8aedcc097e6df2aa5", - "results_summary.txt:md5,59f2e69c670d72f0c0a401e0dc90cbeb", - "results_summary.txt:md5,59f2e69c670d72f0c0a401e0dc90cbeb", - "results_summary.txt:md5,59f2e69c670d72f0c0a401e0dc90cbeb" - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" - }, - "timestamp": "2024-07-27T08:11:25.248986515" - }, - "rgi": { - "content": [ - "sample_1.txt:md5,dde77ae2dc240ee4717d8d33a92dfb66", - "sample_2.txt:md5,0e652d35ef6e9272aa194b55db609e75", - "sample_3.txt:md5,dde77ae2dc240ee4717d8d33a92dfb66" - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" - }, - "timestamp": "2024-07-27T08:11:25.117843821" - }, - "ampcombi": { - "content": [ - "Ampcombi_cluster.log:md5,4c78f5f134edf566f39e04e3ab7d8558", - "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1", - true, - true - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" - }, - "timestamp": "2024-07-27T08:11:24.639509225" + "timestamp": "2025-03-04T13:44:42.200904946" }, "amrfinderplus": { "content": [ @@ -173,9 +147,9 @@ "sample_3.tsv:md5,29cfb6f34f420d802eda95c6d9daa361" ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "24.10.4" }, - "timestamp": "2024-07-27T08:11:24.994284774" + "timestamp": "2025-03-04T13:44:42.316250647" } } \ No newline at end of file diff --git a/tests/test_preannotated_bgc.nf.test b/tests/test_preannotated_bgc.nf.test index 0e9ca618..f6f291fb 100644 --- a/tests/test_preannotated_bgc.nf.test +++ b/tests/test_preannotated_bgc.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // antiSMASH diff --git a/tests/test_preannotated_bgc.nf.test.snap b/tests/test_preannotated_bgc.nf.test.snap index b05b7921..e4967fc0 100644 --- a/tests/test_preannotated_bgc.nf.test.snap +++ b/tests/test_preannotated_bgc.nf.test.snap @@ -21,7 +21,7 @@ "content": [ "sample_1.bgc.gbk:md5,e50e429959e9c4bf0c4b97d9dcd54a08", "sample_2.bgc.gbk:md5,effe3cfc91772eb4e4b50ac46f13a941", - 
"sample_3.bgc.gbk:md5,c9028aca1282b314d296091e1f0b8e52" + "sample_3.bgc.gbk:md5,41920a93524a1bb32ae1003d69327642" ], "meta": { "nf-test": "0.9.0", diff --git a/tests/test_prokka.nf.test b/tests/test_prokka.nf.test index 94e65ae2..c46843eb 100644 --- a/tests/test_prokka.nf.test +++ b/tests/test_prokka.nf.test @@ -17,7 +17,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // AMPir @@ -53,11 +53,11 @@ nextflow_pipeline { ).match("macrel") }, // AMPcombi - { assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("PROKKA_00019") }, + { assert path("$outputDir/reports/ampcombi2/Ampcombi_summary.tsv").text.contains("PROKKA_377") }, + { assert path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log").text.contains("amp_DRAMP_database is found and will be used") }, { assert snapshot( path("$outputDir/reports/ampcombi2/Ampcombi_cluster.log"), - path("$outputDir/reports/ampcombi2/Ampcombi_complete.log"), - path("$outputDir/reports/ampcombi2/Ampcombi_parse_tables.log") + path("$outputDir/reports/ampcombi2/Ampcombi_complete.log") ).match("ampcombi_logfiles") }, // DeepARG diff --git a/tests/test_prokka.nf.test.snap b/tests/test_prokka.nf.test.snap index 07cfeefd..4e30230d 100644 --- a/tests/test_prokka.nf.test.snap +++ b/tests/test_prokka.nf.test.snap @@ -47,7 +47,7 @@ "deeparg_tsv_ARG": { "content": [ "sample_1.align.daa.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", - "sample_2.align.daa.tsv:md5,06648de08caca0b7f42eab9576615226", + "sample_2.align.daa.tsv:md5,7802fb45d7343e492d3677fec67e6d0c", "sample_1.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9", "sample_2.mapping.ARG:md5,0e049e99eab4c55666062df21707d5b9" ], @@ -73,14 +73,13 @@ "ampcombi_logfiles": { "content": [ "Ampcombi_cluster.log:md5,4c78f5f134edf566f39e04e3ab7d8558", - "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1", - "Ampcombi_parse_tables.log:md5,1e2b5abad7d17e03428066f345b91117" + "Ampcombi_complete.log:md5,3dabfea4303bf94bd4f5d78c5b8c83c1" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-24T12:53:09.914363724" + "timestamp": "2024-12-18T11:10:28.28666354" }, "amplify": { "content": [ @@ -97,20 +96,20 @@ "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", 
"sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-24T12:53:09.892460736" + "timestamp": "2024-12-18T11:10:28.238554892" }, "amrfinderplus": { "content": [ diff --git a/tests/test_taxonomy_bakta.nf.test b/tests/test_taxonomy_bakta.nf.test index 5a412fa9..8e8a14ec 100644 --- a/tests/test_taxonomy_bakta.nf.test +++ b/tests/test_taxonomy_bakta.nf.test @@ -18,21 +18,21 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // ampir { assert snapshot( - file("$outputDir/amp/ampir/sample_1/sample_1.ampir.tsv").text.contains("KKEJHB_00005"), - file("$outputDir/amp/ampir/sample_1/sample_1.ampir.faa").text.contains("KKEJHB_00005"), - file("$outputDir/amp/ampir/sample_2/sample_2.ampir.tsv").text.contains("KDEMFK_00005"), - file("$outputDir/amp/ampir/sample_2/sample_2.ampir.faa").text.contains("KDEMFK_00005") + file("$outputDir/amp/ampir/sample_1/sample_1.ampir.tsv").text.contains("KKEJHB_005"), + file("$outputDir/amp/ampir/sample_1/sample_1.ampir.faa").text.contains("KKEJHB_005"), + file("$outputDir/amp/ampir/sample_2/sample_2.ampir.tsv").text.contains("KDEMFK_0005"), + file("$outputDir/amp/ampir/sample_2/sample_2.ampir.faa").text.contains("KDEMFK_0005") ).match("ampir") }, // AMPlify { assert snapshot( - file("$outputDir/amp/amplify/sample_1/sample_1.amplify.tsv").text.contains("KKEJHB_00005"), - file("$outputDir/amp/amplify/sample_2/sample_2.amplify.tsv").text.contains("KDEMFK_00005") + file("$outputDir/amp/amplify/sample_1/sample_1.amplify.tsv").text.contains("KKEJHB_005"), + file("$outputDir/amp/amplify/sample_2/sample_2.amplify.tsv").text.contains("KDEMFK_0005") ).match("amplify") }, // Macrel @@ -55,7 +55,7 @@ nextflow_pipeline { // AMPcombi { assert snapshot ( - file("$outputDir/reports/ampcombi2/sample_2/sample_2_ampcombi.tsv").text.contains("KDEMFK_00575"), + file("$outputDir/reports/ampcombi2/sample_2/sample_2_ampcombi.tsv").text.contains("KDEMFK_0070"), ).match("ampcombi") }, { assert new File("$outputDir/reports/ampcombi2/ampcombi_complete_summary_taxonomy.tsv.gz").exists() }, @@ -79,7 +79,7 @@ nextflow_pipeline { ).match("fargene") }, // hAMRonization - { assert new File("$outputDir/reports/hamronization_summarize/hamronization_combined_report.tsv.gz").exists() }, + { assert new File("$outputDir/reports/hamronization_summarize/hamronization_complete_summary_taxonomy.tsv.gz").exists() }, // antiSMASH { assert snapshot ( diff --git a/tests/test_taxonomy_bakta.nf.test.snap b/tests/test_taxonomy_bakta.nf.test.snap index 5606db1e..c406c942 100644 --- a/tests/test_taxonomy_bakta.nf.test.snap +++ b/tests/test_taxonomy_bakta.nf.test.snap @@ -93,12 +93,12 @@ "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - 
"README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], diff --git a/tests/test_taxonomy_prokka.nf.test b/tests/test_taxonomy_prokka.nf.test index e0992dbf..d1b86fdd 100644 --- a/tests/test_taxonomy_prokka.nf.test +++ b/tests/test_taxonomy_prokka.nf.test @@ -18,7 +18,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // ampir @@ -55,7 +55,7 @@ nextflow_pipeline { // AMPcombi { assert snapshot ( - file("$outputDir/reports/ampcombi2/sample_2/sample_2_ampcombi.tsv").text.contains("PROKKA_00109"), + file("$outputDir/reports/ampcombi2/sample_2/sample_2_ampcombi.tsv").text.contains("PROKKA_00119"), ).match("ampcombi") }, { assert new File("$outputDir/reports/ampcombi2/ampcombi_complete_summary_taxonomy.tsv.gz").exists() }, @@ -79,7 +79,7 @@ nextflow_pipeline { ).match("fargene") }, // hAMRonization - { assert new File("$outputDir/reports/hamronization_summarize/hamronization_combined_report.tsv.gz").exists() }, + { assert new File("$outputDir/reports/hamronization_summarize/hamronization_complete_summary_taxonomy.tsv.gz").exists() }, // antiSMASH { assert snapshot ( diff --git a/tests/test_taxonomy_prokka.nf.test.snap b/tests/test_taxonomy_prokka.nf.test.snap index 8e2e581a..c00c3286 100644 --- a/tests/test_taxonomy_prokka.nf.test.snap +++ b/tests/test_taxonomy_prokka.nf.test.snap @@ -93,12 +93,12 @@ "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], diff --git a/tests/test_taxonomy_pyrodigal.nf.test b/tests/test_taxonomy_pyrodigal.nf.test index 3cc5535e..f0dc1012 100644 --- a/tests/test_taxonomy_pyrodigal.nf.test +++ b/tests/test_taxonomy_pyrodigal.nf.test @@ -18,7 +18,7 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, - { assert new 
File("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").exists() }, + { assert new File("$outputDir/pipeline_info/nf_core_funcscan_software_mqc_versions.yml").exists() }, { assert new File("$outputDir/multiqc/multiqc_report.html").exists() }, // ampir @@ -79,7 +79,7 @@ nextflow_pipeline { ).match("fargene") }, // hAMRonization - { assert new File("$outputDir/reports/hamronization_summarize/hamronization_combined_report.tsv.gz").exists() }, + { assert new File("$outputDir/reports/hamronization_summarize/hamronization_complete_summary_taxonomy.tsv.gz").exists() }, // antiSMASH { assert snapshot ( diff --git a/tests/test_taxonomy_pyrodigal.nf.test.snap b/tests/test_taxonomy_pyrodigal.nf.test.snap index 668aab92..9cde9d2a 100644 --- a/tests/test_taxonomy_pyrodigal.nf.test.snap +++ b/tests/test_taxonomy_pyrodigal.nf.test.snap @@ -93,19 +93,19 @@ "content": [ "sample_1.macrel.smorfs.faa.gz:md5,1b5e2434860e635e95324d1804a3be7b", "sample_2.macrel.smorfs.faa.gz:md5,38108b5cdfdc2196afe67418b9b04682", - "sample_1.macrel.all_orfs.faa.gz:md5,86f6b3b590d1b22d9c5aa164f8a14080", - "sample_2.macrel.all_orfs.faa.gz:md5,fdb384925af50ecade05dccaff68afd8", - "sample_1.macrel.prediction.gz:md5,0c4b16e0838be56e012b99169863a168", - "sample_2.macrel.prediction.gz:md5,440deffd6b6d9986ce098e44c66db9ae", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", - "README.md:md5,fa3706dfc95d0538a52c4d0d824be5fb", + "sample_1.macrel.all_orfs.faa.gz:md5,844bb10e2f84e1a2b2db56eb36391dcf", + "sample_2.macrel.all_orfs.faa.gz:md5,9c0b8b1c3b03d7b20aee0b57103861ab", + "sample_1.macrel.prediction.gz:md5,9553e1dae8a5b912da8d74fa3f1cd9eb", + "sample_2.macrel.prediction.gz:md5,ae155e454eb7abd7c48c06aad9261603", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", + "README.md:md5,cf088d9256ff7b7730699f17b64b4028", "sample_1.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e", "sample_2.macrel_log.txt:md5,d41d8cd98f00b204e9800998ecf8427e" ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.10.2" }, - "timestamp": "2024-07-24T16:24:30.025771" + "timestamp": "2024-12-18T11:35:52.952483937" } } \ No newline at end of file diff --git a/workflows/funcscan.nf b/workflows/funcscan.nf index 07e0a3d8..ba8f997a 100644 --- a/workflows/funcscan.nf +++ b/workflows/funcscan.nf @@ -4,21 +4,11 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' -include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' -include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_funcscan_pipeline' -include { paramsSummaryMap; validateParameters; paramsHelp; paramsSummaryLog; fromSamplesheet } from 'plugin/nf-validation' - -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - CONFIG FILES -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -ch_multiqc_config = Channel.fromPath( "$projectDir/assets/multiqc_config.yml", checkIfExists: true ) -ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath( params.multiqc_config, checkIfExists: true ) : Channel.empty() -ch_multiqc_logo = params.multiqc_logo ? Channel.fromPath( params.multiqc_logo, checkIfExists: true ) : Channel.empty() -ch_multiqc_custom_methods_description = params.multiqc_methods_description ? 
file(params.multiqc_methods_description, checkIfExists: true) : file("$projectDir/assets/methods_description_template.yml", checkIfExists: true) +include { MULTIQC } from '../modules/nf-core/multiqc/main' +include { paramsSummaryMap } from 'plugin/nf-schema' +include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' +include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' +include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_funcscan_pipeline' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -29,11 +19,12 @@ ch_multiqc_custom_methods_description = params.multiqc_methods_description ? fil // // SUBWORKFLOW: Consisting of a mix of local and nf-core/modules // -include { ANNOTATION } from '../subworkflows/local/annotation' -include { AMP } from '../subworkflows/local/amp' -include { ARG } from '../subworkflows/local/arg' -include { BGC } from '../subworkflows/local/bgc' -include { TAXA_CLASS } from '../subworkflows/local/taxa_class' +include { ANNOTATION } from '../subworkflows/local/annotation' +include { PROTEIN_ANNOTATION } from '../subworkflows/local/protein_annotation' +include { AMP } from '../subworkflows/local/amp' +include { ARG } from '../subworkflows/local/arg' +include { BGC } from '../subworkflows/local/bgc' +include { TAXA_CLASS } from '../subworkflows/local/taxa_class' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -44,9 +35,9 @@ include { TAXA_CLASS } from '../subworkflows/local/taxa_class' // // MODULE: Installed directly from nf-core/modules // -include { MULTIQC } from '../modules/nf-core/multiqc/main' -include { GUNZIP as GUNZIP_INPUT_PREP } from '../modules/nf-core/gunzip/main' -include { SEQKIT_SEQ } from '../modules/nf-core/seqkit/seq/main' +include { GUNZIP as GUNZIP_INPUT_PREP } from '../modules/nf-core/gunzip/main' +include { SEQKIT_SEQ as SEQKIT_SEQ_LENGTH } from '../modules/nf-core/seqkit/seq/main' +include { SEQKIT_SEQ as SEQKIT_SEQ_FILTER } from '../modules/nf-core/seqkit/seq/main' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -55,7 +46,6 @@ include { SEQKIT_SEQ } from '../modules/nf-core/seqkit/seq/m */ workflow FUNCSCAN { - take: ch_samplesheet // channel: samplesheet read in from --input @@ -64,57 +54,67 @@ workflow FUNCSCAN { ch_versions = Channel.empty() ch_multiqc_files = Channel.empty() - ch_input = Channel.fromSamplesheet("input") + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + CONFIG FILES + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + ch_multiqc_config = Channel.fromPath("${projectDir}/assets/multiqc_config.yml", checkIfExists: true) + ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath(params.multiqc_config, checkIfExists: true) : Channel.empty() + ch_multiqc_logo = params.multiqc_logo ? Channel.fromPath(params.multiqc_logo, checkIfExists: true) : Channel.empty() + ch_multiqc_custom_methods_description = params.multiqc_methods_description ? 
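The `workflows/funcscan.nf` hunk beginning here swaps the composite nf-validation import for a single nf-schema helper and drops `ch_input = Channel.fromSamplesheet("input")`; the samplesheet now arrives pre-parsed through the workflow's `take:` block. The import delta, isolated:

```nextflow
// Removed:
// include { paramsSummaryMap; validateParameters; paramsHelp; paramsSummaryLog; fromSamplesheet } from 'plugin/nf-validation'

// Added (only the summary helper is still needed inside the workflow body):
include { paramsSummaryMap } from 'plugin/nf-schema'
```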
file(params.multiqc_methods_description, checkIfExists: true) : file("${projectDir}/assets/methods_description_template.yml", checkIfExists: true) + // Some tools require uncompressed input - ch_input_prep = ch_input - .map{ meta, fasta, faa, gbk -> [meta + [category: 'all'], [fasta, faa, gbk]] } - .transpose() - .branch { - compressed: it[1].toString().endsWith('.gz') - uncompressed: it[1] - } + ch_input_prep = ch_samplesheet + .map { meta, fasta, faa, gbk -> [meta + [category: 'all'], [fasta, faa, gbk]] } + .transpose() + .branch { + compressed: it[1].toString().endsWith('.gz') + uncompressed: it[1] + } - GUNZIP_INPUT_PREP ( ch_input_prep.compressed ) - ch_versions = ch_versions.mix( GUNZIP_INPUT_PREP.out.versions ) + GUNZIP_INPUT_PREP(ch_input_prep.compressed) + ch_versions = ch_versions.mix(GUNZIP_INPUT_PREP.out.versions) // Merge all the already uncompressed and newly compressed FASTAs here into // a single input channel for downstream ch_intermediate_input = GUNZIP_INPUT_PREP.out.gunzip - .mix( ch_input_prep.uncompressed ) - .groupTuple() - .map{ - meta, files -> - def fasta_found = files.find{it.toString().tokenize('.').last().matches('fasta|fas|fna|fa')} - def faa_found = files.find{it.toString().endsWith('.faa')} - def gbk_found = files.find{it.toString().tokenize('.').last().matches('gbk|gbff')} - def fasta = fasta_found != null ? fasta_found : [] - def faa = faa_found != null ? faa_found : [] - def gbk = gbk_found != null ? gbk_found : [] - - [meta, fasta, faa, gbk] - } - .branch { - meta, fasta, faa, gbk -> - preannotated: gbk != [] - fastas: true - } + .mix(ch_input_prep.uncompressed) + .groupTuple() + .map { meta, files -> + def fasta_found = files.find { it.toString().tokenize('.').last().matches('fasta|fas|fna|fa') } + def faa_found = files.find { it.toString().endsWith('.faa') } + def gbk_found = files.find { it.toString().tokenize('.').last().matches('gbk|gbff') } + def fasta = fasta_found != null ? fasta_found : [] + def faa = faa_found != null ? faa_found : [] + def gbk = gbk_found != null ? gbk_found : [] + + [meta, fasta, faa, gbk] + } + .branch { meta, fasta, faa, gbk -> + preannotated: gbk != [] + fastas: true + } // Duplicate and filter the duplicated file for long contigs only for BGC // This is to speed up BGC run and prevent 'no hits found' fails - if ( params.run_bgc_screening ){ - SEQKIT_SEQ ( ch_intermediate_input.fastas.map{meta, fasta, faa, gbk -> [ meta, fasta ]} ) + if (params.run_bgc_screening) { + SEQKIT_SEQ_LENGTH(ch_intermediate_input.fastas.map { meta, fasta, faa, gbk -> [meta, fasta] }) ch_input_for_annotation = ch_intermediate_input.fastas - .map { meta, fasta, protein, gbk -> [ meta, fasta ] } - .mix( SEQKIT_SEQ.out.fastx.map{ meta, fasta -> [ meta + [category: 'long'], fasta ] } ) - .filter { - meta, fasta -> - if ( fasta != [] && fasta.isEmpty() ) log.warn("[nf-core/funcscan] Sample ${meta.id} does not have contigs longer than ${params.bgc_mincontiglength} bp. Will not be screened for BGCs.") - !fasta.isEmpty() + .map { meta, fasta, protein, gbk -> [meta, fasta] } + .mix(SEQKIT_SEQ_LENGTH.out.fastx.map { meta, fasta -> [meta + [category: 'long'], fasta] }) + .filter { meta, fasta -> + if (fasta != [] && fasta.isEmpty()) { + log.warn("[nf-core/funcscan] Sample ${meta.id} does not have contigs longer than ${params.bgc_mincontiglength} bp. 
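Two idioms in the hunk above are worth isolating. Aliasing lets the single `seqkit/seq` module serve two roles, each configurable with its own process selector (the usual nf-core reason for aliasing); and the input prep branches on compression state so only `.gz` inputs pass through `GUNZIP_INPUT_PREP`:

```nextflow
// One module, two separately configurable instances:
include { SEQKIT_SEQ as SEQKIT_SEQ_LENGTH } from '../modules/nf-core/seqkit/seq/main' // BGC min-length filter
include { SEQKIT_SEQ as SEQKIT_SEQ_FILTER } from '../modules/nf-core/seqkit/seq/main' // FAA filter for protein annotation

// Decompress only the .gz inputs; pass the rest through untouched.
ch_input_prep = ch_samplesheet
    .map { meta, fasta, faa, gbk -> [meta + [category: 'all'], [fasta, faa, gbk]] }
    .transpose()
    .branch {
        compressed:   it[1].toString().endsWith('.gz')
        uncompressed: it[1]
    }
```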
Will not be screened for BGCs.") + } + !fasta.isEmpty() } - ch_versions = ch_versions.mix( SEQKIT_SEQ.out.versions ) - } else { - ch_input_for_annotation = ch_intermediate_input.fastas.map { meta, fasta, protein, gbk -> [ meta, fasta ] } + ch_versions = ch_versions.mix(SEQKIT_SEQ_LENGTH.out.versions) + } + else { + ch_input_for_annotation = ch_intermediate_input.fastas.map { meta, fasta, protein, gbk -> [meta, fasta] } } /* @@ -122,40 +122,38 @@ workflow FUNCSCAN { */ // Some tools require annotated FASTAs - if ( ( params.run_arg_screening && !params.arg_skip_deeparg ) || ( params.run_amp_screening ) || ( params.run_bgc_screening ) ) { - ANNOTATION( ch_input_for_annotation ) - ch_versions = ch_versions.mix( ANNOTATION.out.versions ) + if ((params.run_arg_screening && !params.arg_skip_deeparg) || params.run_amp_screening || params.run_bgc_screening) { + ANNOTATION(ch_input_for_annotation) + ch_versions = ch_versions.mix(ANNOTATION.out.versions) ch_new_annotation = ch_input_for_annotation - .join( ANNOTATION.out.faa ) - .join( ANNOTATION.out.gbk ) - - } else { + .join(ANNOTATION.out.faa) + .join(ANNOTATION.out.gbk) + } + else { ch_new_annotation = ch_intermediate_input.fastas } // Mix back the preannotated samples with the newly annotated ones ch_prepped_input = ch_new_annotation - .filter { meta, fasta, faa, gbk -> meta.category != 'long' } - .mix( ch_intermediate_input.preannotated ) - .multiMap { - meta, fasta, faa, gbk -> - fastas: [meta, fasta] - faas: [meta, faa] - gbks: [meta, gbk] - } - - if ( params.run_bgc_screening ){ - - ch_prepped_input_long = ch_new_annotation - .filter { meta, fasta, faa, gbk -> meta.category == 'long'} - .mix( ch_intermediate_input.preannotated ) - .multiMap { - meta, fasta, faa, gbk -> - fastas: [meta, fasta] - faas: [meta, faa] - gbks: [meta, gbk] - } + .filter { meta, fasta, faa, gbk -> meta.category != 'long' } + .mix(ch_intermediate_input.preannotated) + .multiMap { meta, fasta, faa, gbk -> + fastas: [meta, fasta] + faas: [meta, faa] + gbks: [meta, gbk] + } + + if (params.run_bgc_screening) { + + ch_prepped_input_long = ch_new_annotation + .filter { meta, fasta, faa, gbk -> meta.category == 'long' } + .mix(ch_intermediate_input.preannotated) + .multiMap { meta, fasta, faa, gbk -> + fastas: [meta, fasta] + faas: [meta, faa] + gbks: [meta, gbk] + } } /* @@ -165,19 +163,52 @@ workflow FUNCSCAN { // The final subworkflow reports need taxonomic classification. // This can be either on NT or AA level depending on annotation. // TODO: Only NT at the moment. AA tax. classification will be added only when its PR is merged. - if ( params.run_taxa_classification ) { - TAXA_CLASS ( ch_prepped_input.fastas ) - ch_versions = ch_versions.mix( TAXA_CLASS.out.versions ) - ch_taxonomy_tsv = TAXA_CLASS.out.sample_taxonomy + if (params.run_taxa_classification) { + TAXA_CLASS(ch_prepped_input.fastas) + ch_versions = ch_versions.mix(TAXA_CLASS.out.versions) + ch_taxonomy_tsv = TAXA_CLASS.out.sample_taxonomy + } + else { - } else { + ch_mmseqs_db = Channel.empty() + ch_taxonomy_querydb = Channel.empty() + ch_taxonomy_querydb_taxdb = Channel.empty() + ch_taxonomy_tsv = Channel.empty() + } + + /* + PROTEIN ANNOTATION + */ + if (params.run_protein_annotation) { + def filtered_faas = ch_prepped_input.faas.filter { meta, file -> + if (file != [] && file.isEmpty()) { + log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty FAA file. 
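The new `PROTEIN_ANNOTATION` block above repeats a guard that appears throughout this workflow: warn once about an empty file, then drop it from the channel so downstream processes never receive it. As a standalone sketch, with channel and message names illustrative rather than taken from the diff:

```nextflow
// Generic empty-file guard used before each screening subworkflow:
// the filter both logs the warning and removes the offending entry.
def ch_usable = ch_files.filter { meta, file ->
    if (file != [] && file.isEmpty()) {
        log.warn("[nf-core/funcscan] Empty file for sample ${meta.id}; dependent steps will be skipped.")
    }
    !file.isEmpty()
}
```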
InterProScan classification of the CDS requiring this file will not be executed: ${meta.id}") + } + !file.isEmpty() + } + + SEQKIT_SEQ_FILTER(filtered_faas) + ch_versions = ch_versions.mix(SEQKIT_SEQ_FILTER.out.versions) + ch_input_for_protein_annotation = SEQKIT_SEQ_FILTER.out.fastx - ch_mmseqs_db = Channel.empty() - ch_taxonomy_querydb = Channel.empty() - ch_taxonomy_querydb_taxdb = Channel.empty() - ch_taxonomy_tsv = Channel.empty() + PROTEIN_ANNOTATION ( ch_input_for_protein_annotation ) + ch_versions = ch_versions.mix(PROTEIN_ANNOTATION.out.versions) + + ch_interproscan_tsv = PROTEIN_ANNOTATION.out.tsv.map { meta, file -> + if (file == [] || file.isEmpty()) { + log.warn("[nf-core/funcscan] Protein annotation with InterProScan produced an empty TSV file. No protein annotation will be added for sample ${meta.id}.") + [meta, []] + } else { + [meta, file] + } + } + } else { + ch_interproscan_tsv = ch_prepped_input.faas.map { meta, _ -> + [meta, []] + } } + /* SCREENING */ @@ -185,139 +216,145 @@ workflow FUNCSCAN { /* AMPs */ - if ( params.run_amp_screening && !params.run_taxa_classification ) { - AMP ( + if (params.run_amp_screening && !params.run_taxa_classification) { + AMP( ch_prepped_input.fastas, - ch_prepped_input.faas - .filter { - meta, file -> - if ( file != [] && file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. AMP screening tools requiring this file will not be executed: ${meta.id}") - !file.isEmpty() - - }, + ch_prepped_input.faas.filter { meta, file -> + if (file != [] && file.isEmpty()) { + log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. AMP screening tools requiring this file will not be executed: ${meta.id}") + } + !file.isEmpty() + }, ch_taxonomy_tsv, - ch_prepped_input.gbks + ch_prepped_input.gbks, + ch_interproscan_tsv ) ch_versions = ch_versions.mix(AMP.out.versions) - } else if ( params.run_amp_screening && params.run_taxa_classification ) { - AMP ( + } + else if (params.run_amp_screening && params.run_taxa_classification) { + AMP( ch_prepped_input.fastas, - ch_prepped_input.faas - .filter { - meta, file -> - if ( file != [] && file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. AMP screening tools requiring this file will not be executed: ${meta.id}") - !file.isEmpty() - }, - ch_taxonomy_tsv - .filter { - meta, file -> - if ( file != [] && file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}") - !file.isEmpty() - }, - ch_prepped_input.gbks + ch_prepped_input.faas.filter { meta, file -> + if (file != [] && file.isEmpty()) { + log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. AMP screening tools requiring this file will not be executed: ${meta.id}") + } + !file.isEmpty() + }, + ch_taxonomy_tsv.filter { meta, file -> + if (file != [] && file.isEmpty()) { + log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. 
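When protein annotation is disabled, the workflow still has to hand `AMP` a fifth input, so the `else` branch above maps every FAA entry to an empty placeholder; the channel cardinality stays constant whichever path runs:

```nextflow
// Fallback when params.run_protein_annotation is false: keep the meta,
// substitute an empty list for the missing InterProScan TSV.
ch_interproscan_tsv = ch_prepped_input.faas.map { meta, _ -> [meta, []] }
```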
Taxonomy merging will not be executed: ${meta.id}") + } + !file.isEmpty() + }, + ch_prepped_input.gbks, + ch_interproscan_tsv ) - ch_versions = ch_versions.mix( AMP.out.versions ) + ch_versions = ch_versions.mix(AMP.out.versions) } /* ARGs */ - if ( params.run_arg_screening && !params.run_taxa_classification ) { - if ( params.arg_skip_deeparg ) { - ARG ( + if (params.run_arg_screening && !params.run_taxa_classification) { + if (params.arg_skip_deeparg) { + ARG( ch_prepped_input.fastas, [], - ch_taxonomy_tsv - ) - } else { - ARG ( + ch_taxonomy_tsv, + ) + } + else { + ARG( ch_prepped_input.fastas, - ch_prepped_input.faas - .filter { - meta, file -> - if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}") - !file.isEmpty() - }, - ch_taxonomy_tsv + ch_prepped_input.faas.filter { meta, file -> + if (file.isEmpty()) { + log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}") + } + !file.isEmpty() + }, + ch_taxonomy_tsv, ) } - ch_versions = ch_versions.mix( ARG.out.versions ) - } else if ( params.run_arg_screening && params.run_taxa_classification ) { - if ( params.arg_skip_deeparg ) { - ARG ( + ch_versions = ch_versions.mix(ARG.out.versions) + } + else if (params.run_arg_screening && params.run_taxa_classification) { + if (params.arg_skip_deeparg) { + ARG( ch_prepped_input.fastas, [], - ch_taxonomy_tsv - .filter { - meta, file -> - if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}") - !file.isEmpty() + ch_taxonomy_tsv.filter { meta, file -> + if (file.isEmpty()) { + log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}") } - ) - } else { - ARG ( + !file.isEmpty() + }, + ) + } + else { + ARG( ch_prepped_input.fastas, - ch_prepped_input.faas - .filter { - meta, file -> - if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}") - !file.isEmpty() - }, - ch_taxonomy_tsv - .filter { - meta, file -> - if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}") - !file.isEmpty() - } + ch_prepped_input.faas.filter { meta, file -> + if (file.isEmpty()) { + log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}") + } + !file.isEmpty() + }, + ch_taxonomy_tsv.filter { meta, file -> + if (file.isEmpty()) { + log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. 
-    if ( params.run_arg_screening && !params.run_taxa_classification ) {
-        if ( params.arg_skip_deeparg ) {
-            ARG (
+    if (params.run_arg_screening && !params.run_taxa_classification) {
+        if (params.arg_skip_deeparg) {
+            ARG(
                 ch_prepped_input.fastas,
                 [],
-                ch_taxonomy_tsv
-            )
-        } else {
-            ARG (
+                ch_taxonomy_tsv,
+            )
+        }
+        else {
+            ARG(
                 ch_prepped_input.fastas,
-                ch_prepped_input.faas
-                    .filter {
-                        meta, file ->
-                            if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}")
-                            !file.isEmpty()
-                    },
-                ch_taxonomy_tsv
+                ch_prepped_input.faas.filter { meta, file ->
+                    if (file.isEmpty()) {
+                        log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}")
+                    }
+                    !file.isEmpty()
+                },
+                ch_taxonomy_tsv,
             )
         }
-        ch_versions = ch_versions.mix( ARG.out.versions )
-    } else if ( params.run_arg_screening && params.run_taxa_classification ) {
-        if ( params.arg_skip_deeparg ) {
-            ARG (
+        ch_versions = ch_versions.mix(ARG.out.versions)
+    }
+    else if (params.run_arg_screening && params.run_taxa_classification) {
+        if (params.arg_skip_deeparg) {
+            ARG(
                 ch_prepped_input.fastas,
                 [],
-                ch_taxonomy_tsv
-                    .filter {
-                        meta, file ->
-                            if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
-                            !file.isEmpty()
+                ch_taxonomy_tsv.filter { meta, file ->
+                    if (file.isEmpty()) {
+                        log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
                     }
-                )
-            } else {
-                ARG (
+                    !file.isEmpty()
+                },
+            )
+        }
+        else {
+            ARG(
                 ch_prepped_input.fastas,
-                ch_prepped_input.faas
-                    .filter {
-                        meta, file ->
-                            if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}")
-                            !file.isEmpty()
-                    },
-                ch_taxonomy_tsv
-                    .filter {
-                        meta, file ->
-                            if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
-                            !file.isEmpty()
-                    }
+                ch_prepped_input.faas.filter { meta, file ->
+                    if (file.isEmpty()) {
+                        log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty FAA file. ARG screening tools requiring this file will not be executed: ${meta.id}")
+                    }
+                    !file.isEmpty()
+                },
+                ch_taxonomy_tsv.filter { meta, file ->
+                    if (file.isEmpty()) {
+                        log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
+                    }
+                    !file.isEmpty()
+                },
            )
        }
-        ch_versions = ch_versions.mix( ARG.out.versions )
+        ch_versions = ch_versions.mix(ARG.out.versions)
    }

     /* BGCs */
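+    // NB: BGC screening consumes the long-contig subset of the annotation outputs
+    // (ch_prepped_input_long), as complete BGCs typically span longer contigs.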
-    if ( params.run_bgc_screening && !params.run_taxa_classification ) {
-        BGC (
+    if (params.run_bgc_screening && !params.run_taxa_classification) {
+        BGC(
             ch_prepped_input_long.fastas,
-            ch_prepped_input_long.faas
-                .filter {
-                    meta, file ->
-                        if ( file != [] && file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty GFF file. BGC screening tools requiring this file will not be executed: ${meta.id}")
-                        !file.isEmpty()
-                },
-            ch_prepped_input_long.gbks
-                .filter {
-                    meta, file ->
-                        if ( file != [] && file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. BGC screening tools requiring this file will not be executed: ${meta.id}")
-                        !file.isEmpty()
-                },
-            ch_taxonomy_tsv
+            ch_prepped_input_long.faas.filter { meta, file ->
+                if (file != [] && file.isEmpty()) {
+                    log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty FAA file. BGC screening tools requiring this file will not be executed: ${meta.id}")
+                }
+                !file.isEmpty()
+            },
+            ch_prepped_input_long.gbks.filter { meta, file ->
+                if (file != [] && file.isEmpty()) {
+                    log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty GBK file. BGC screening tools requiring this file will not be executed: ${meta.id}")
+                }
+                !file.isEmpty()
+            },
+            ch_taxonomy_tsv,
         )
-        ch_versions = ch_versions.mix( BGC.out.versions )
+        ch_versions = ch_versions.mix(BGC.out.versions)
+    }
-    } else if ( params.run_bgc_screening && params.run_taxa_classification ) {
-        BGC (
+    else if (params.run_bgc_screening && params.run_taxa_classification) {
+        BGC(
             ch_prepped_input_long.fastas,
-            ch_prepped_input_long.faas
-                .filter {
-                    meta, file ->
-                        if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty FAA file. BGC screening tools requiring this file will not be executed: ${meta.id}")
-                        !file.isEmpty()
-                },
-            ch_prepped_input_long.gbks
-                .filter {
-                    meta, file ->
-                        if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Annotation of following sample produced an empty GBK file. BGC screening tools requiring this file will not be executed: ${meta.id}")
-                        !file.isEmpty()
-                },
-            ch_taxonomy_tsv
-                .filter {
-                    meta, file ->
-                        if ( file.isEmpty() ) log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
-                        !file.isEmpty()
+            ch_prepped_input_long.faas.filter { meta, file ->
+                if (file.isEmpty()) {
+                    log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty FAA file. BGC screening tools requiring this file will not be executed: ${meta.id}")
                 }
+                !file.isEmpty()
+            },
+            ch_prepped_input_long.gbks.filter { meta, file ->
+                if (file.isEmpty()) {
+                    log.warn("[nf-core/funcscan] Annotation of the following sample produced an empty GBK file. BGC screening tools requiring this file will not be executed: ${meta.id}")
+                }
+                !file.isEmpty()
+            },
+            ch_taxonomy_tsv.filter { meta, file ->
+                if (file.isEmpty()) {
+                    log.warn("[nf-core/funcscan] Taxonomy classification of the following sample produced an empty TSV file. Taxonomy merging will not be executed: ${meta.id}")
+                }
+                !file.isEmpty()
+            },
         )
-        ch_versions = ch_versions.mix( BGC.out.versions )
+        ch_versions = ch_versions.mix(BGC.out.versions)
     }

     //
@@ -326,63 +363,64 @@ workflow FUNCSCAN {
     softwareVersionsToYAML(ch_versions)
         .collectFile(
             storeDir: "${params.outdir}/pipeline_info",
-            name: 'nf_core_pipeline_software_mqc_versions.yml',
+            name: 'nf_core_' + 'funcscan_software_' + 'mqc_' + 'versions.yml',
             sort: true,
-            newLine: true
-        ).set { ch_collated_versions }
+            newLine: true,
+        )
+        .set { ch_collated_versions }
+
     //
     // MODULE: MultiQC
     //
-    ch_multiqc_config = Channel.fromPath(
-        "$projectDir/assets/multiqc_config.yml", checkIfExists: true)
-    ch_multiqc_custom_config = params.multiqc_config ?
-        Channel.fromPath(params.multiqc_config, checkIfExists: true) :
-        Channel.empty()
-    ch_multiqc_logo = params.multiqc_logo ?
-        Channel.fromPath(params.multiqc_logo, checkIfExists: true) :
-        Channel.fromPath("${workflow.projectDir}/docs/images/nf-core-funcscan_logo_light.png", checkIfExists: true)
-
-    summary_params = paramsSummaryMap(
-        workflow, parameters_schema: "nextflow_schema.json")
+    ch_multiqc_config = Channel.fromPath(
+        "${projectDir}/assets/multiqc_config.yml",
+        checkIfExists: true
+    )
+    ch_multiqc_custom_config = params.multiqc_config
+        ? Channel.fromPath(params.multiqc_config, checkIfExists: true)
+        : Channel.empty()
+    ch_multiqc_logo = params.multiqc_logo
+        ? Channel.fromPath(params.multiqc_logo, checkIfExists: true)
+        : Channel.fromPath("${workflow.projectDir}/docs/images/nf-core-funcscan_logo_light.png", checkIfExists: true)
+
+    summary_params = paramsSummaryMap(
+        workflow,
+        parameters_schema: "nextflow_schema.json"
+    )
     ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params))
-
-    ch_multiqc_custom_methods_description = params.multiqc_methods_description ?
-        file(params.multiqc_methods_description, checkIfExists: true) :
-        file("$projectDir/assets/methods_description_template.yml", checkIfExists: true)
-    ch_methods_description = Channel.value(
-        methodsDescriptionText(ch_multiqc_custom_methods_description))
     ch_multiqc_files = ch_multiqc_files.mix(
-        ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'))
+        ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')
+    )
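+    // A custom methods description for the MultiQC report can be supplied via
+    // params.multiqc_methods_description; otherwise the bundled template is used.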
+    ch_multiqc_custom_methods_description = params.multiqc_methods_description
+        ? file(params.multiqc_methods_description, checkIfExists: true)
+        : file("${projectDir}/assets/methods_description_template.yml", checkIfExists: true)
+    ch_methods_description = Channel.value(
+        methodsDescriptionText(ch_multiqc_custom_methods_description)
+    )
+    ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions)
     ch_multiqc_files = ch_multiqc_files.mix(
         ch_methods_description.collectFile(
             name: 'methods_description_mqc.yaml',
-            sort: true
+            sort: true,
         )
     )

-    if ( ( params.run_arg_screening && !params.arg_skip_deeparg ) || ( params.run_amp_screening && ( params.amp_run_hmmsearch || !params.amp_skip_amplify || !params.amp_skip_ampir ) ) || ( params.run_bgc_screening ) ) {
-        ch_multiqc_files = ch_multiqc_files.mix( ANNOTATION.out.multiqc_files.collect{it[1]} )
+    if ((params.run_arg_screening && !params.arg_skip_deeparg) || (params.run_amp_screening && (params.amp_run_hmmsearch || !params.amp_skip_amplify || !params.amp_skip_ampir)) || params.run_bgc_screening) {
+        ch_multiqc_files = ch_multiqc_files.mix(ANNOTATION.out.multiqc_files.collect { it[1] })
     }

-    MULTIQC (
+    MULTIQC(
         ch_multiqc_files.collect(),
         ch_multiqc_config.toList(),
         ch_multiqc_custom_config.toList(),
         ch_multiqc_logo.toList(),
         [],
-        []
+        [],
     )

     emit:
     multiqc_report = MULTIQC.out.report.toList() // channel: /path/to/multiqc_report.html
-    versions       = ch_versions // channel: [ path(versions.yml) ]
+    versions = ch_versions // channel: [ path(versions.yml) ]
 }
-
-/*
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    THE END
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-*/