Pyrodigal #251

Merged: 6 commits, Apr 3, 2023
3 changes: 2 additions & 1 deletion .github/workflows/ci.yml
@@ -83,9 +83,10 @@ jobs:
- "22.10.1"
- "latest-everything"
parameters:
- "--annotation_tool bakta --annotation_bakta_db_downloadtype light"
- "--annotation_tool prodigal"
- "--annotation_tool prokka"
- "--annotation_tool bakta --annotation_bakta_db_downloadtype light"
- "--annotation_tool pyrodigal"

steps:
- name: Check out pipeline code
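
As a rough illustration of what this matrix change means for CI, the sketch below expands the two lists from the hunk into individual jobs. It assumes the two entries above `parameters:` form a Nextflow version axis of the same matrix, and that the post-merge `parameters` list holds each of the four annotation tool settings once. The name `nextflow_versions` is a hypothetical label, since the real key is not visible in the diff.

```python
from itertools import product

# Values copied from the hunk above. The key name for the version list is not
# shown in the diff, so `nextflow_versions` is a hypothetical label.
nextflow_versions = ["22.10.1", "latest-everything"]

# Assumed post-merge state of the `parameters` axis (one entry per tool).
parameters = [
    "--annotation_tool bakta --annotation_bakta_db_downloadtype light",
    "--annotation_tool prodigal",
    "--annotation_tool prokka",
    "--annotation_tool pyrodigal",
]


def expand_matrix(versions, params):
    """Return one (version, parameter string) pair per CI job."""
    return list(product(versions, params))


jobs = expand_matrix(nextflow_versions, parameters)
for version, params in jobs:
    print(f"{version}: {params}")
```

Under these assumptions the matrix yields eight jobs, two of which exercise `--annotation_tool pyrodigal`.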
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#238](https://github.com/nf-core/funcscan/pull/238) Added dedicated DRAMP database downloading step for AMPcombi to prevent parallel downloads when no database provided by user (by @jfy133)
- [#235](https://github.com/nf-core/funcscan/pull/235) Added parameter `annotation_bakta_db_downloadtype` to be able to switch between downloading either full (33.1 GB) or light (1.3 GB excluding UPS, IPS, PSC, see parameter description) versions of the Bakta database. (by @jasmezz)
- [#249](https://github.com/nf-core/funcscan/pull/249) Added bakta annotation to CI tests. (by @jasmezz)
- [#251](https://github.com/nf-core/funcscan/pull/251) Added annotation tool: Pyrodigal. (by @jasmezz)

### `Fixed`

12 changes: 8 additions & 4 deletions CITATIONS.md
@@ -12,7 +12,7 @@

- [ABRicate](https://github.com/tseemann/abricate)

> Seemann T. (2020). ABRicate. Github [https://github.com/tseemann/abricate](https://github.com/tseemann/abricate).
> Seemann, T. (2020). ABRicate. Github [https://github.com/tseemann/abricate](https://github.com/tseemann/abricate).

- [AMPir](https://doi.org/10.1093/bioinformatics/btaa653)

@@ -48,15 +48,15 @@

- [GECCO](https://gecco.embl.de)

> Carroll, L.M. , Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv [DOI: 10.1101/2021.05.03.442509](https://doi.org/10.1101/2021.05.03.442509)
> Carroll, L. M., Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv [DOI: 10.1101/2021.05.03.442509](https://doi.org/10.1101/2021.05.03.442509)

- [hAMRonization](https://github.com/pha4ge/hAMRonization)

> Public Health Alliance for Genomic Epidemiology (pha4ge). (2022). Parse multiple Antimicrobial Resistance Analysis Reports into a common data structure. Github. Retrieved October 5, 2022, from [https://github.com/pha4ge/hAMRonization](https://github.com/pha4ge/hAMRonization)

- [AMPcombi](https://github.com/Darcy220606/AMPcombi)

> Anan Ibrahim, & Louisa Perelo. (2023). Darcy220606/AMPcombi. [DOI: 10.5281/zenodo.7639121](https://doi.org/10.5281/zenodo.7639121).
> Ibrahim, A. & Perelo, L. (2023). Darcy220606/AMPcombi. [DOI: 10.5281/zenodo.7639121](https://doi.org/10.5281/zenodo.7639121).

- [HMMER](https://doi.org/10.1371/journal.pcbi.1002195.)

@@ -72,7 +72,11 @@

- [PROKKA](https://doi.org/10.1093/bioinformatics/btu153)

> Seemann T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. [DOI: 10.1093/bioinformatics/btu153](https://doi.org/10.1093/bioinformatics/btu153)
> Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. [DOI: 10.1093/bioinformatics/btu153](https://doi.org/10.1093/bioinformatics/btu153)

- [Pyrodigal](https://doi.org/10.21105/joss.04296)

> Larralde, M. (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296. [DOI: 10.21105/joss.04296](https://doi.org/10.21105/joss.04296)

- [RGI](https://doi.org/10.1093/nar/gkz935)

16 changes: 16 additions & 0 deletions conf/modules.config
@@ -157,6 +157,22 @@ process {
].join(' ').trim()
}

withName: PYRODIGAL {
publishDir = [
path: { "${params.outdir}/annotation/pyrodigal/${meta.id}" },
mode: params.publish_dir_mode,
enabled: params.save_annotations,
pattern: "*.{faa,fna}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
ext.args = [
params.annotation_pyrodigal_singlemode ? "-p single" : "-p meta",
params.annotation_prodigal_closed ? "-c" : "",
params.annotation_prodigal_forcenonsd ? "-n" : "",
"-g ${params.annotation_prodigal_transtable}"
].join(' ').trim()
}

withName: ABRICATE_RUN {
publishDir = [
path: { "${params.outdir}/arg/abricate/${meta.id}" },
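
To make the `ext.args` logic of the new `PYRODIGAL` block easier to review, here is a minimal Python sketch of the same list-join-trim pattern. It is not pipeline code: the function name `build_pyrodigal_args` and its keyword arguments are hypothetical stand-ins for the `params.annotation_*` values, and the default translation table of 11 is an assumption rather than something shown in this diff.

```python
def build_pyrodigal_args(singlemode=False, closed=False,
                         forcenonsd=False, transtable=11):
    """Mirror the Groovy `[...].join(' ').trim()` expression from the hunk.

    All parameter names and defaults here are illustrative assumptions.
    """
    parts = [
        "-p single" if singlemode else "-p meta",  # single-genome vs. metagenome mode
        "-c" if closed else "",                    # closed ends
        "-n" if forcenonsd else "",                # flag added when forcenonsd is set
        f"-g {transtable}",                        # translation table
    ]
    # Like Groovy's join(' ').trim(): empty entries still contribute a
    # separator, so unset flags leave extra spaces inside the string.
    return " ".join(parts).strip()


print(repr(build_pyrodigal_args()))
print(repr(build_pyrodigal_args(singlemode=True, closed=True,
                                forcenonsd=True, transtable=4)))
```

With every flag unset the result is `'-p meta   -g 11'`, with three spaces in the middle, because only leading and trailing whitespace is trimmed. That should be harmless as long as the string is interpolated unquoted into a shell command, where repeated spaces collapse during word splitting.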
29 changes: 24 additions & 5 deletions docs/output.md
@@ -8,7 +8,7 @@ The output of nf-core/funcscan provides reports for each of the functional group
- antimicrobial peptides (tools: [Macrel](https://github.com/BigDataBiology/macrel), [AMPlify](https://github.com/bcgsc/AMPlify), [ampir](https://ampir.marine-omics.net), [hmmsearch](http://hmmer.org) – summarised by [AMPcombi](https://github.com/Darcy220606/AMPcombi))
- biosynthetic gene clusters (tools: [antiSMASH](https://docs.antismash.secondarymetabolites.org), [DeepBGC](https://github.com/Merck/deepbgc), [GECCO](https://gecco.embl.de), [hmmsearch](http://hmmer.org) – summarised by [comBGC](#combgc))

As a general workflow, we recommend to first look at the summary reports ([ARGs](#hamronization), [AMPs](#ampcombi), [BGCs](#combgc)), to get a general overview of what hits have been found across all the tools of each functional group. After which, you can explore the specific output directories of each tool to get more detailed information about each result. The tool-specific output directories also includes the output from the functional annotation steps of either [prokka](https://github.com/tseemann/prokka), [prodigal](https://github.com/hyattpd/Prodigal), or [Bakta](https://github.com/oschwengers/bakta) if the `--save_annotations` flag was set.
As a general workflow, we recommend first looking at the summary reports ([ARGs](#hamronization), [AMPs](#ampcombi), [BGCs](#combgc)) to get a general overview of what hits have been found across all the tools of each functional group. Afterwards, you can explore the specific output directories of each tool to get more detailed information about each result. The tool-specific output directories also include the output from the functional annotation steps of either [prokka](https://github.com/tseemann/prokka), [pyrodigal](https://github.com/althonos/pyrodigal), [prodigal](https://github.com/hyattpd/Prodigal), or [Bakta](https://github.com/oschwengers/bakta) if the `--save_annotations` flag was set.

Similarly, all downloaded databases are saved (i.e. from [antiSMASH](https://docs.antismash.secondarymetabolites.org), [AMRFinderPlus](https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder), [Bakta](https://github.com/oschwengers/bakta), and/or [DeepARG](https://bitbucket.org/gusphdproj/deeparg-ss/src/master)) into the output directory `<outdir>/downloads/` if the `--save_databases` flag was set.

@@ -19,9 +19,10 @@ The directories listed below will be created in the results directory (specified
```console
results/
├── annotation/
| ├── prodigal/
| ├── bakta/
| ├── prodigal/
| ├── prokka/
| └── bakta/
| └── pyrodigal/
├── amp/
| ├── ampir/
| ├── amplify/
@@ -55,7 +56,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes p

ORF prediction and annotation with any of:

- [Prodigal](#prodigal) (default) – for open reading frame prediction.
- [Pyrodigal](#pyrodigal) (default) – for open reading frame prediction.
- [Prodigal](#prodigal) – for open reading frame prediction.
- [Prokka](#prokka) – open reading frame prediction and functional protein annotation.
- [Bakta](#bakta) – open reading frame prediction and functional protein annotation.

@@ -93,7 +95,7 @@ Output Summaries:

### Annotation tools

[Prodigal](#prodigal), [Prokka](#prokka), [Bakta](#bakta)
[Pyrodigal](#pyrodigal), [Prodigal](#prodigal), [Prokka](#prokka), [Bakta](#bakta)

#### Prodigal

@@ -113,6 +115,23 @@ Output Summaries:

[Prodigal](https://github.com/hyattpd/Prodigal) annotates whole (meta-)genomes by identifying ORFs in a set of genomic DNA sequences. The output is used by some of the functional screening tools.

#### Pyrodigal

<details markdown="1">
<summary>Output files</summary>

- `pyrodigal/`
- `<samplename>/`:
- `*.gff`: annotation in GFF3 format, containing both sequences and annotations
- `*.fna`: nucleotide FASTA file of the input contig sequences
- `*.faa`: protein FASTA file of the translated CDS sequences

> Descriptions taken from the [Pyrodigal documentation](https://pyrodigal.readthedocs.io/)

</details>

[Pyrodigal](https://github.com/althonos/pyrodigal) annotates whole (meta-)genomes by identifying ORFs in a set of genomic DNA sequences. It produces the same results as [Prodigal](#prodigal) while being more resource-optimized and therefore faster. Unlike Prodigal, Pyrodigal cannot produce output in GenBank format. The output is used by some of the functional screening tools.

#### Prokka

<details markdown="1">
2 changes: 1 addition & 1 deletion docs/usage.md
@@ -82,7 +82,7 @@ To prevent entire pipeline failures due to a single 'bad sample', nf-core/funcsc

When the annotation is run with Prokka, the resulting `.gbk` file passed to antiSMASH may produce the error `translation longer than location allows` and end the pipeline run. This Prokka bug has been reported before (see [discussion on GitHub](https://github.com/antismash/antismash/discussions/450)) and is not likely to be fixed soon.

> ⚠️ If antiSMASH is run for BGC detection, we recommend to **not** run Prokka for annotation but instead leave the default annotation tool Prodigal or switch to Bakta (for bacteria only!).
> ⚠️ If antiSMASH is run for BGC detection, we recommend **not** running Prokka for annotation but instead using the default annotation tool (Pyrodigal) or switching to Prodigal or Bakta (the latter for bacteria only!).

## Databases and reference files

35 changes: 20 additions & 15 deletions modules.json
@@ -7,12 +7,12 @@
"nf-core": {
"abricate/run": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"ampcombi": {
"branch": "master",
"git_sha": "37c5127d6c65a818677e5786587efcc2b64a11b7",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"ampir": {
@@ -22,7 +22,7 @@
},
"amplify/predict": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "5293fc55d4d645cf9daffad835bee270d328ce91",
"installed_by": ["modules"]
},
"amrfinderplus/run": {
@@ -32,7 +32,7 @@
},
"amrfinderplus/update": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"antismash/antismashlite": {
@@ -57,18 +57,18 @@
},
"bioawk": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"],
"patch": "modules/nf-core/bioawk/bioawk.diff"
},
"custom/dumpsoftwareversions": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "b6d4d476aee074311c89d82a69c1921bd70c8180",
"installed_by": ["modules"]
},
"deeparg/downloaddata": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"deeparg/predict": {
@@ -88,7 +88,7 @@
},
"fargene": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"fastqc": {
@@ -98,17 +98,17 @@
},
"gecco/run": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"gunzip": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hamronization/abricate": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hamronization/amrfinderplus": {
@@ -118,22 +118,22 @@
},
"hamronization/deeparg": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hamronization/fargene": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hamronization/rgi": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hamronization/summarize": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
"installed_by": ["modules"]
},
"hmmer/hmmsearch": {
@@ -161,6 +161,11 @@
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"installed_by": ["modules"]
},
"pyrodigal": {
"branch": "master",
"git_sha": "93cca9af587f39eaaa357b9e589e3e657d8a0f75",
"installed_by": ["modules"]
},
"rgi/main": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
4 changes: 2 additions & 2 deletions modules/nf-core/abricate/run/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

24 changes: 12 additions & 12 deletions modules/nf-core/ampcombi/meta.yml
