Commit 4b5f36d

Merge pull request #251 from nf-core/pyrodigal
Pyrodigal
2 parents 32cf7ea + fac328d commit 4b5f36d

30 files changed (+285, -98 lines)

.github/workflows/ci.yml

+2 -1
@@ -83,9 +83,10 @@ jobs:
  - "22.10.1"
  - "latest-everything"
  parameters:
+ - "--annotation_tool bakta --annotation_bakta_db_downloadtype light"
  - "--annotation_tool prodigal"
  - "--annotation_tool prokka"
- - "--annotation_tool bakta --annotation_bakta_db_downloadtype light"
+ - "--annotation_tool pyrodigal"

  steps:
  - name: Check out pipeline code
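
As a rough illustration of what this matrix change means for CI: GitHub Actions runs one job per combination of matrix values, so the four `parameters` entries and the two Nextflow versions visible in this hunk would give eight jobs. The sketch below reproduces that expansion in Python. The key name for the version list lies outside the hunk, so `nxf_versions` is a hypothetical label, and any further matrix keys or `include`/`exclude` rules in the real workflow are not modelled.

```python
from itertools import product

# Values copied from the hunk above. The YAML key that holds the version
# list is not visible in the diff, so `nxf_versions` is an assumed name.
nxf_versions = ["22.10.1", "latest-everything"]
parameters = [
    "--annotation_tool bakta --annotation_bakta_db_downloadtype light",
    "--annotation_tool prodigal",
    "--annotation_tool prokka",
    "--annotation_tool pyrodigal",
]

# A job matrix is the Cartesian product of its keys.
jobs = [
    {"nxf_version": version, "parameters": params}
    for version, params in product(nxf_versions, parameters)
]

for job in jobs:
    print(job["nxf_version"], "|", job["parameters"])
```

Under these assumptions, adding the Pyrodigal entry raises the job count from six to eight, since each new `parameters` value is run once per Nextflow version.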

CHANGELOG.md

+1
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - [#238](https://github.com/nf-core/funcscan/pull/238) Added dedicated DRAMP database downloading step for AMPcombi to prevent parallel downloads when no database provided by user (by @jfy133)
  - [#235](https://github.com/nf-core/funcscan/pull/235) Added parameter `annotation_bakta_db_downloadtype` to be able to switch between downloading either full (33.1 GB) or light (1.3 GB excluding UPS, IPS, PSC, see parameter description) versions of the Bakta database. (by @jasmezz)
  - [#249](https://github.com/nf-core/funcscan/pull/249) Added bakta annotation to CI tests. (by @jasmezz)
+ - [#251](https://github.com/nf-core/funcscan/pull/251) Added annotation tool: Pyrodigal. (by @jasmezz)

  ### `Fixed`

CITATIONS.md

+8 -4
@@ -12,7 +12,7 @@

  - [ABRicate](https://github.com/tseemann/abricate)

- > Seemann T. (2020). ABRicate. Github [https://github.com/tseemann/abricate](https://github.com/tseemann/abricate).
+ > Seemann, T. (2020). ABRicate. Github [https://github.com/tseemann/abricate](https://github.com/tseemann/abricate).

  - [AMPir](https://doi.org/10.1093/bioinformatics/btaa653)

@@ -48,15 +48,15 @@

  - [GECCO](https://gecco.embl.de)

- > Carroll, L.M. , Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv [DOI: 10.1101/2021.05.03.442509](https://doi.org/10.1101/2021.05.03.442509)
+ > Carroll, L. M. , Larralde, M., Fleck, J. S., Ponnudurai, R., Milanese, A., Cappio Barazzone, E. & Zeller, G. (2021). Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv [DOI: 10.1101/2021.05.03.442509](https://doi.org/10.1101/2021.05.03.442509)

  - [hAMRonization](https://github.com/pha4ge/hAMRonization)

  > Public Health Alliance for Genomic Epidemiology (pha4ge). (2022). Parse multiple Antimicrobial Resistance Analysis Reports into a common data structure. Github. Retrieved October 5, 2022, from [https://github.com/pha4ge/hAMRonization](https://github.com/pha4ge/hAMRonization)

  - [AMPcombi](https://github.com/Darcy220606/AMPcombi)

- > Anan Ibrahim, & Louisa Perelo. (2023). Darcy220606/AMPcombi. [DOI: 10.5281/zenodo.7639121](https://doi.org/10.5281/zenodo.7639121).
+ > Ibrahim, A. & Perelo, L. (2023). Darcy220606/AMPcombi. [DOI: 10.5281/zenodo.7639121](https://doi.org/10.5281/zenodo.7639121).

  - [HMMER](https://doi.org/10.1371/journal.pcbi.1002195.)

@@ -72,7 +72,11 @@

  - [PROKKA](https://doi.org/10.1093/bioinformatics/btu153)

- > Seemann T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. [DOI: 10.1093/bioinformatics/btu153](https://doi.org/10.1093/bioinformatics/btu153)
+ > Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30(14), 2068–2069. [DOI: 10.1093/bioinformatics/btu153](https://doi.org/10.1093/bioinformatics/btu153)
+
+ - [Pyrodigal](https://doi.org/10.21105/joss.04296)
+
+ > Larralde, M. (2022). Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. Journal of Open Source Software, 7(72), 4296. [DOI: 10.21105/joss.04296](https://doi.org/10.21105/joss.04296)

  - [RGI](https://doi.org/10.1093/nar/gkz935)


conf/modules.config

+16
@@ -157,6 +157,22 @@ process {
  ].join(' ').trim()
  }

+ withName: PYRODIGAL {
+ publishDir = [
+ path: { "${params.outdir}/annotation/pyrodigal/${meta.id}" },
+ mode: params.publish_dir_mode,
+ enabled: params.save_annotations,
+ pattern: "*.{faa,fna}",
+ saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+ ]
+ ext.args = [
+ params.annotation_pyrodigal_singlemode ? "-p single" : "-p meta",
+ params.annotation_prodigal_closed ? "-c" : "",
+ params.annotation_prodigal_forcenonsd ? "-n" : "",
+ "-g ${params.annotation_prodigal_transtable}"
+ ].join(' ').trim()
+ }
+
  withName: ABRICATE_RUN {
  publishDir = [
  path: { "${params.outdir}/arg/abricate/${meta.id}" },
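
To make the `ext.args` construction in the `PYRODIGAL` block easier to follow, here is a minimal Python sketch of the same list, join and trim idiom. The function name and its arguments are hypothetical stand-ins for the four `params.*` values read in the hunk; the flags themselves (`-p`, `-c`, `-n`, `-g`) are taken verbatim from the diff.

```python
def pyrodigal_args(singlemode: bool, closed: bool, forcenonsd: bool, transtable: int) -> str:
    """Assemble the argument string the way the Groovy list/join/trim idiom does.

    Hypothetical helper: the arguments stand in for
    params.annotation_pyrodigal_singlemode, params.annotation_prodigal_closed,
    params.annotation_prodigal_forcenonsd and params.annotation_prodigal_transtable.
    """
    parts = [
        "-p single" if singlemode else "-p meta",
        "-c" if closed else "",
        "-n" if forcenonsd else "",
        f"-g {transtable}",
    ]
    # Joining with a single space keeps empty entries, so disabled flags leave
    # extra spaces in the middle; strip() only removes leading/trailing ones.
    return " ".join(parts).strip()


print(pyrodigal_args(singlemode=True, closed=True, forcenonsd=True, transtable=4))
print(pyrodigal_args(singlemode=False, closed=False, forcenonsd=False, transtable=11).split())
```

The repeated inner spaces left by disabled flags are harmless once the string is interpolated into a shell command, which is presumably why the config only trims the ends.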

docs/output.md

+24 -5
@@ -8,7 +8,7 @@ The output of nf-core/funcscan provides reports for each of the functional group
  - antimicrobial peptides (tools: [Macrel](https://github.com/BigDataBiology/macrel), [AMPlify](https://github.com/bcgsc/AMPlify), [ampir](https://ampir.marine-omics.net), [hmmsearch](http://hmmer.org) – summarised by [AMPcombi](https://github.com/Darcy220606/AMPcombi))
  - biosynthetic gene clusters (tools: [antiSMASH](https://docs.antismash.secondarymetabolites.org), [DeepBGC](https://github.com/Merck/deepbgc), [GECCO](https://gecco.embl.de), [hmmsearch](http://hmmer.org) – summarised by [comBGC](#combgc))

- As a general workflow, we recommend to first look at the summary reports ([ARGs](#hamronization), [AMPs](#ampcombi), [BGCs](#combgc)), to get a general overview of what hits have been found across all the tools of each functional group. After which, you can explore the specific output directories of each tool to get more detailed information about each result. The tool-specific output directories also includes the output from the functional annotation steps of either [prokka](https://github.com/tseemann/prokka), [prodigal](https://github.com/hyattpd/Prodigal), or [Bakta](https://github.com/oschwengers/bakta) if the `--save_annotations` flag was set.
+ As a general workflow, we recommend to first look at the summary reports ([ARGs](#hamronization), [AMPs](#ampcombi), [BGCs](#combgc)), to get a general overview of what hits have been found across all the tools of each functional group. After which, you can explore the specific output directories of each tool to get more detailed information about each result. The tool-specific output directories also includes the output from the functional annotation steps of either [prokka](https://github.com/tseemann/prokka), [pyrodigal](https://github.com/althonos/pyrodigal), [prodigal](https://github.com/hyattpd/Prodigal), or [Bakta](https://github.com/oschwengers/bakta) if the `--save_annotations` flag was set.

  Similarly, all downloaded databases are saved (i.e. from [antiSMASH](https://docs.antismash.secondarymetabolites.org), [AMRFinderPlus](https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder), [Bakta](https://github.com/oschwengers/bakta), and/or [DeepARG](https://bitbucket.org/gusphdproj/deeparg-ss/src/master)) into the output directory `<outdir>/downloads/` if the `--save_databases` flag was set.

@@ -19,9 +19,10 @@ The directories listed below will be created in the results directory (specified
  ```console
  results/
  ├── annotation/
- | ├── prodigal/
+ | ├── bakta/
+ | ├── prodigal/
  | ├── prokka/
- | └── bakta/
+ | └── pyrodigal/
  ├── amp/
  | ├── ampir/
  | ├── amplify/
@@ -55,7 +56,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes p

  ORF prediction and annotation with any of:

- - [Prodigal](#prodigal) (default) – for open reading frame prediction.
+ - [Pyrodigal](#pyrodigal) (default) – for open reading frame prediction.
+ - [Prodigal](#prodigal) – for open reading frame prediction.
  - [Prokka](#prokka) – open reading frame prediction and functional protein annotation.
  - [Bakta](#bakta) – open reading frame prediction and functional protein annotation.

@@ -93,7 +95,7 @@ Output Summaries:

  ### Annotation tools

- [Prodigal](#prodigal), [Prokka](#prokka), [Bakta](#bakta)
+ [Pyrodigal](#pyrodigal), [Prodigal](#prodigal), [Prokka](#prokka), [Bakta](#bakta)

  #### Prodigal

@@ -113,6 +115,23 @@ Output Summaries:

  [Prodigal](https://github.com/hyattpd/Prodigal) annotates whole (meta-)genomes by identifying ORFs in a set of genomic DNA sequences. The output is used by some of the functional screening tools.

+ #### Pyrodigal
+
+ <details markdown="1">
+ <summary>Output files</summary>
+
+ - `pyrodigal/`
+ - `<samplename>/`:
+ - `*.gff`: annotation in GFF3 format, containing both sequences and annotations
+ - `*.fna`: nucleotide FASTA file of the input contig sequences
+ - `*.faa`: protein FASTA file of the translated CDS sequences
+
+ > Descriptions taken from the [Pyrodigal documentation](https://pyrodigal.readthedocs.io/)
+
+ </details>
+
+ [Pyrodigal](https://github.com/althonos/pyrodigal) annotates whole (meta-)genomes by identifying ORFs in a set of genomic DNA sequences. It produces the same results as [Prodigal](#prodigal) while being more resource-optimized and thus faster. Unlike Prodigal, Pyrodigal cannot produce output in GenBank format. The output is used by some of the functional screening tools.
+
  #### Prokka

  <details markdown="1">

docs/usage.md

+1 -1
@@ -82,7 +82,7 @@ To prevent entire pipeline failures due to a single 'bad sample', nf-core/funcsc

  When the annotation is run with Prokka, the resulting `.gbk` file passed to antiSMASH may produce the error `translation longer than location allows` and end the pipeline run. This Prokka bug has been reported before (see [discussion on GitHub](https://github.com/antismash/antismash/discussions/450)) and is not likely to be fixed soon.

- > ⚠️ If antiSMASH is run for BGC detection, we recommend to **not** run Prokka for annotation but instead leave the default annotation tool Prodigal or switch to Bakta (for bacteria only!).
+ > ⚠️ If antiSMASH is run for BGC detection, we recommend to **not** run Prokka for annotation but instead use the default annotation tool (Pyrodigal) or switch to Prodigal or Bakta (the latter for bacteria only!).

  ## Databases and reference files


modules.json

+20 -15
@@ -7,12 +7,12 @@
  "nf-core": {
  "abricate/run": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "ampcombi": {
  "branch": "master",
- "git_sha": "37c5127d6c65a818677e5786587efcc2b64a11b7",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "ampir": {
@@ -22,7 +22,7 @@
  },
  "amplify/predict": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "5293fc55d4d645cf9daffad835bee270d328ce91",
  "installed_by": ["modules"]
  },
  "amrfinderplus/run": {
@@ -32,7 +32,7 @@
  },
  "amrfinderplus/update": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "antismash/antismashlite": {
@@ -57,18 +57,18 @@
  },
  "bioawk": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"],
  "patch": "modules/nf-core/bioawk/bioawk.diff"
  },
  "custom/dumpsoftwareversions": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "b6d4d476aee074311c89d82a69c1921bd70c8180",
  "installed_by": ["modules"]
  },
  "deeparg/downloaddata": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "deeparg/predict": {
@@ -88,7 +88,7 @@
  },
  "fargene": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "fastqc": {
@@ -98,17 +98,17 @@
  },
  "gecco/run": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "gunzip": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hamronization/abricate": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hamronization/amrfinderplus": {
@@ -118,22 +118,22 @@
  },
  "hamronization/deeparg": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hamronization/fargene": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hamronization/rgi": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hamronization/summarize": {
  "branch": "master",
- "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
+ "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b",
  "installed_by": ["modules"]
  },
  "hmmer/hmmsearch": {
@@ -161,6 +161,11 @@
  "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
  "installed_by": ["modules"]
  },
+ "pyrodigal": {
+ "branch": "master",
+ "git_sha": "93cca9af587f39eaaa357b9e589e3e657d8a0f75",
+ "installed_by": ["modules"]
+ },
  "rgi/main": {
  "branch": "master",
  "git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",

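Most of the `modules.json` changes are SHA bumps, which are hard to scan by eye. One way to review them is to group modules by the commit they are pinned to, as in the sketch below. It is illustrative only: the inline JSON copies a handful of post-merge entries from the hunks above, whereas the real file nests the `"nf-core"` mapping more deeply and lists many more modules.

```python
import json
from collections import defaultdict

# Excerpt limited to entries visible in the diff (post-merge values).
# The real modules.json wraps "nf-core" in outer keys not shown in the hunks.
excerpt = json.loads("""
{
  "nf-core": {
    "abricate/run": {"branch": "master", "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b", "installed_by": ["modules"]},
    "ampcombi": {"branch": "master", "git_sha": "0f8a77ff00e65eaeebc509b8156eaa983192474b", "installed_by": ["modules"]},
    "amplify/predict": {"branch": "master", "git_sha": "5293fc55d4d645cf9daffad835bee270d328ce91", "installed_by": ["modules"]},
    "custom/dumpsoftwareversions": {"branch": "master", "git_sha": "b6d4d476aee074311c89d82a69c1921bd70c8180", "installed_by": ["modules"]},
    "pyrodigal": {"branch": "master", "git_sha": "93cca9af587f39eaaa357b9e589e3e657d8a0f75", "installed_by": ["modules"]}
  }
}
""")

# Map each pinned commit to the modules installed from it.
by_sha = defaultdict(list)
for name, entry in excerpt["nf-core"].items():
    by_sha[entry["git_sha"]].append(name)

for sha, names in by_sha.items():
    print(sha[:7], ", ".join(sorted(names)))
```

Grouped this way, the excerpt shows one shared bump target (`0f8a77f`) plus separate pins for `amplify/predict`, `custom/dumpsoftwareversions` and the newly installed `pyrodigal` module.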
modules/nf-core/abricate/run/meta.yml

+2 -2 (generated file; diff not rendered)

modules/nf-core/ampcombi/meta.yml

+12 -12 (generated file; diff not rendered)
