Pyrodigal #251

Merged
merged 6 commits into from
Apr 3, 2023
Changes from 1 commit
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -84,6 +84,7 @@ jobs:
  parameters:
  - "--annotation_tool prodigal"
  - "--annotation_tool prokka"
+ - "--annotation_tool pyrodigal"

  steps:
  - name: Check out pipeline code
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -9,7 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

  - [#238](https://github.com/nf-core/funcscan/pull/238) Added dedicated DRAMP database downloading step for AMPcombi to prevent parallel downloads when no database provided by user (by @jfy133)
  - [#235](https://github.com/nf-core/funcscan/pull/235) Added parameter `annotation_bakta_db_downloadtype` to be able to switch between downloading either full (33.1 GB) or light (1.3 GB excluding UPS, IPS, PSC, see parameter description) versions of the Bakta database. (by @jasmezz)
- - [#251](https://github.com/nf-core/funcscan/pull/251) Added annotation tool: pyrodigal. (by @jasmezz)
+ - [#251](https://github.com/nf-core/funcscan/pull/251) Added annotation tool: Pyrodigal. (by @jasmezz)

  ### `Fixed`

16 changes: 16 additions & 0 deletions conf/modules.config
@@ -157,6 +157,22 @@ process {
  ].join(' ').trim()
  }

+ withName: PYRODIGAL {
+     publishDir = [
+         path: { "${params.outdir}/annotation/pyrodigal/${meta.id}" },
+         mode: params.publish_dir_mode,
+         enabled: params.save_annotations,
+         pattern: "*.{faa,fna}",
+         saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+     ]
+     ext.args = [
+         params.annotation_pyrodigal_singlemode ? "-p single" : "-p meta",
+         params.annotation_pyrodigal_closed ? "-c" : "",
+         params.annotation_pyrodigal_forcenonsd ? "-n" : "",
+         "-g ${params.annotation_pyrodigal_transtable}"
+     ].join(' ').trim()
+ }

withName: ABRICATE_RUN {
publishDir = [
path: { "${params.outdir}/arg/abricate/${meta.id}" },
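
Review note on the `PYRODIGAL` block in this hunk: its `ext.args` list is flattened into one argument string with `.join(' ').trim()`. A minimal Python sketch of that assembly follows, assuming a plain dict keyed like the `annotation_pyrodigal_*` entries in `nextflow.config`. The helper name `build_pyrodigal_args` is hypothetical and only mirrors the Groovy expression; it is not part of the pipeline.

```python
def build_pyrodigal_args(params: dict) -> str:
    """Hypothetical mirror of the Groovy `[ ... ].join(' ').trim()` expression."""
    parts = [
        "-p single" if params["annotation_pyrodigal_singlemode"] else "-p meta",
        "-c" if params["annotation_pyrodigal_closed"] else "",
        "-n" if params["annotation_pyrodigal_forcenonsd"] else "",
        f"-g {params['annotation_pyrodigal_transtable']}",
    ]
    # Disabled flags contribute empty strings, so the joined string may contain
    # runs of spaces; strip() only removes leading and trailing whitespace.
    return " ".join(parts).strip()


# Defaults as listed in the nextflow.config hunk of this PR.
defaults = {
    "annotation_pyrodigal_singlemode": False,
    "annotation_pyrodigal_closed": False,
    "annotation_pyrodigal_transtable": 11,
    "annotation_pyrodigal_forcenonsd": False,
}

# Whitespace splitting collapses the repeated spaces left by disabled flags.
print(build_pyrodigal_args(defaults).split())  # ['-p', 'meta', '-g', '11']
```

The repeated spaces are harmless as long as the string is later split on whitespace, which is why the config can use empty strings for disabled flags.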
2 changes: 1 addition & 1 deletion docs/usage.md
@@ -82,7 +82,7 @@ To prevent entire pipeline failures due to a single 'bad sample', nf-core/funcsc

  When the annotation is run with Prokka, the resulting `.gbk` file passed to antiSMASH may produce the error `translation longer than location allows` and end the pipeline run. This Prokka bug has been reported before (see [discussion on GitHub](https://github.com/antismash/antismash/discussions/450)) and is not likely to be fixed soon.

- > ⚠️ If antiSMASH is run for BGC detection, we recommend to **not** run Prokka for annotation but instead leave the default annotation tool Pyrodigal or switch to Bakta (for bacteria only!).
+ > ⚠️ If antiSMASH is run for BGC detection, we recommend **not** running Prokka for annotation but instead keeping the default annotation tool Pyrodigal or switching to Prodigal or Bakta (for bacteria only!).

  ## Databases and reference files

7 changes: 6 additions & 1 deletion nextflow.config
@@ -18,14 +18,19 @@ params {
  igenomes_ignore = true

  // Annotation options
- annotation_tool = 'prodigal'
+ annotation_tool = 'pyrodigal'
  save_annotations = false

  annotation_prodigal_singlemode = false
  annotation_prodigal_closed = false
  annotation_prodigal_transtable = 11
  annotation_prodigal_forcenonsd = false

+ annotation_pyrodigal_singlemode = false
+ annotation_pyrodigal_closed = false
+ annotation_pyrodigal_transtable = 11
+ annotation_pyrodigal_forcenonsd = false
+
  annotation_bakta_db_localpath = null
  annotation_bakta_db_downloadtype = 'full'
  annotation_bakta_mincontiglen = 1
54 changes: 46 additions & 8 deletions nextflow_schema.json
@@ -75,9 +75,9 @@
  "properties": {
  "annotation_tool": {
  "type": "string",
- "default": "prodigal",
+ "default": "pyrodigal",
  "description": "Specify which annotation tool to use for some downstream tools.",
- "enum": ["prodigal", "prokka", "bakta"],
+ "enum": ["prodigal", "pyrodigal", "prokka", "bakta"],
  "fa_icon": "fas fa-edit"
  },
  "save_annotations": {
@@ -250,7 +250,7 @@
  "default": "Bacteria",
  "fa_icon": "fab fa-accusoft",
  "description": "Specify the kingdom that the input represents.",
- "help_text": "Specifies the kingdom that the input sample is derived from and/or you wish to screen for\n\n> ⚠️ Prokka cannot annotate Eukaryotes.\n\nFor more information please check Prokka [documentation](https://github.com/tseemann/prokka).\n\n> Modifies tool parameter(s):\n> - Prokka: `--kingdom`",
+ "help_text": "Specifies the kingdom that the input sample is derived from and/or you wish to screen for\n\n> \u26a0\ufe0f Prokka cannot annotate Eukaryotes.\n\nFor more information please check Prokka [documentation](https://github.com/tseemann/prokka).\n\n> Modifies tool parameter(s):\n> - Prokka: `--kingdom`",
  "enum": ["Archaea", "Bacteria", "Mitochondria", "Viruses"]
  },
  "annotation_prokka_gcode": {
@@ -328,15 +328,15 @@
  "help_text": "Prokka annotates genomic sequences belonging to bacterial, archaeal and viral genomes.\n\nDocumentation: https://github.com/tseemann/prokka"
  },
  "annotation_prodigal": {
- "title": "Annotation: PRODIGAL",
+ "title": "Annotation: Prodigal",
  "type": "object",
- "description": "These parameters influence the annotation algorithm used by PRODIGAL.",
+ "description": "These parameters influence the annotation algorithm used by Prodigal.",
  "default": "",
  "properties": {
  "annotation_prodigal_singlemode": {
  "type": "boolean",
  "description": "Specify whether to use Prodigal's single-genome mode for long sequences.",
- "help_text": "By default Prodigal runs in 'single genome' mode that requires sequence lengths to be equal or longer than 20000 characters.\n\nHowever, more fragmented reads from MAGs often result in contigs shorter than this. Therefore, nf-core/funcscan will run with the `meta` mode by default, but providing this parameter allows to override this and run in single genome mode again.\n\nFor more information check Prodigal [documentation](https://github.com/hyattpd/prodigal/wiki).\n\n> Modifies tool parameter(s): \n> -PRODIGAL:`-p`",
+ "help_text": "By default Prodigal runs in 'single genome' mode that requires sequence lengths to be equal or longer than 20000 characters.\n\nHowever, more fragmented reads from MAGs often result in contigs shorter than this. Therefore, nf-core/funcscan will run with the `meta` mode by default, but providing this parameter allows to override this and run in single genome mode again.\n\nFor more information check Prodigal [documentation](https://github.com/hyattpd/prodigal/wiki).\n\n> Modifies tool parameter(s): \n> -PRODIGAL: `-p`",
  "fa_icon": "far fa-circle"
  },
  "annotation_prodigal_closed": {
@@ -354,14 +354,49 @@
  },
  "annotation_prodigal_forcenonsd": {
  "type": "boolean",
- "description": "Forces PRODIGAL to scan for motifs.",
+ "description": "Forces Prodigal to scan for motifs.",
  "help_text": "Forces PRODIGAL to a full scan for motifs rather than activating the Shine-Dalgarno RBS finder, the default scanner for PRODIGAL to train for motifs.\n\nFor more information check Prodigal [documentation](https://github.com/hyattpd/prodigal/wiki).\n\n> Modifies tool parameter(s):\n> - PRODIGAL: `-n`",
  "fa_icon": "fas fa-barcode"
  }
  },
  "fa_icon": "fas fa-tools",
  "help_text": "Prodigal is a protein-coding gene prediction tool developed to run on bacterial and archaeal genomes.\n\nDocumentation: https://github.com/hyattpd/prodigal/wiki"
  },
+ "annotation_pyrodigal": {
+     "title": "Annotation: Pyrodigal",
+     "type": "object",
+     "description": "These parameters influence the annotation algorithm used by Pyrodigal.",
+     "default": "",
+     "properties": {
+         "annotation_pyrodigal_singlemode": {
+             "type": "boolean",
+             "fa_icon": "far fa-circle",
+             "description": "Specify whether to use Pyrodigal's single-genome mode for long sequences.",
+             "help_text": "By default Pyrodigal runs in 'single genome' mode, which requires sequences of at least 20000 characters.\n\nHowever, more fragmented assemblies (e.g. MAGs) often result in contigs shorter than this. Therefore, nf-core/funcscan runs Pyrodigal in `meta` mode by default; setting this parameter overrides this and runs in single-genome mode instead.\n\nFor more information check Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `-p`"
+         },
+         "annotation_pyrodigal_closed": {
+             "type": "boolean",
+             "fa_icon": "fas fa-circle",
+             "description": "Does not allow partial genes on contig edges.",
+             "help_text": "Prevents partial genes from being called at contig edges, resulting in closed ends. Should be activated for genomes where it is certain that the first and last bases of the sequence(s) do not fall inside a gene. Run together with `-p single`.\n\nFor more information check Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `-c`"
+         },
+         "annotation_pyrodigal_transtable": {
+             "type": "integer",
+             "default": 11,
+             "fa_icon": "fas fa-border-all",
+             "description": "Specifies the translation table used for gene annotation.",
+             "help_text": "Specifies which translation table should be used for sequence annotation. All possible genetic code translation tables can be found [here](https://en.wikipedia.org/wiki/List_of_genetic_codes). The default is 11, which is used for standard Bacteria/Archaea.\n\nFor more information check Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `-g`"
+         },
+         "annotation_pyrodigal_forcenonsd": {
+             "type": "boolean",
+             "fa_icon": "fas fa-barcode",
+             "description": "Forces Pyrodigal to scan for motifs.",
+             "help_text": "Forces Pyrodigal to perform a full motif scan instead of using the Shine-Dalgarno RBS finder, which is Pyrodigal's default scanner when training for motifs.\n\nFor more information check Pyrodigal [documentation](https://pyrodigal.readthedocs.io).\n\n> Modifies tool parameter(s):\n> - PYRODIGAL: `-n`"
+         }
+     },
+     "fa_icon": "fas fa-tools",
+     "help_text": "Pyrodigal predicts protein-coding genes in bacterial and archaeal genomes. It is based on the tool Prodigal and is resource-optimized. Read more at the [Pyrodigal GitHub](https://github.com/althonos/pyrodigal).\n\nDocumentation: https://pyrodigal.readthedocs.io"
+ },
  "database_downloading_options": {
  "title": "Database downloading options",
  "type": "object",
@@ -828,7 +863,7 @@
  "default": 1000,
  "description": "Minimum longest-contig length a sample must have to be screened with antiSMASH.",
  "fa_icon": "fas fa-ruler-horizontal",
- "help_text": "This specifies the minimum length that the longest contig must have for the entire sample to be screened by antiSMASH.\n\nAny samples that do not reach this length will be not be sent to antiSMASH, therefore you will not receive output for these samples in your `--outdir`.\n\n> ⚠️ This is not the same as `--bgc_antismash_contigminlength`, which specifies to only analyse contigs above that threshold but _within_ a sample that has already passed `--bgc_antismash_sampleminlength` sample filter!"
+ "help_text": "This specifies the minimum length that the longest contig must have for the entire sample to be screened by antiSMASH.\n\nAny samples that do not reach this length will be not be sent to antiSMASH, therefore you will not receive output for these samples in your `--outdir`.\n\n> \u26a0\ufe0f This is not the same as `--bgc_antismash_contigminlength`, which specifies to only analyse contigs above that threshold but _within_ a sample that has already passed `--bgc_antismash_sampleminlength` sample filter!"
  },
  "bgc_antismash_contigminlength": {
  "type": "integer",
@@ -1301,6 +1336,9 @@
  }
  },
  "allOf": [
+ {
+ "$ref": "#/definitions/annotation_pyrodigal"
+ },
{
"$ref": "#/definitions/input_output_options"
},
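
Review note on the `annotation_tool` property changed earlier in this file: the `enum` now lists `pyrodigal`, and the default moves to it. A minimal sketch of the membership check this implies follows, using only the fragment visible in this diff. `check_enum` is a hypothetical helper for illustration; it is not how Nextflow or nf-core actually validates parameters.

```python
import json

# Fragment copied from the `annotation_tool` property in this diff.
SCHEMA_FRAGMENT = json.loads("""
{
  "annotation_tool": {
    "type": "string",
    "default": "pyrodigal",
    "enum": ["prodigal", "pyrodigal", "prokka", "bakta"]
  }
}
""")


def check_enum(name, value, properties=SCHEMA_FRAGMENT):
    """Hypothetical helper: reject values that are not listed in the enum."""
    allowed = properties[name]["enum"]
    if value not in allowed:
        raise ValueError(f"--{name} must be one of {allowed}, got {value!r}")
    return value


# The new default has to be a member of the enum, otherwise a run without
# an explicit --annotation_tool would fail this kind of check.
check_enum("annotation_tool", SCHEMA_FRAGMENT["annotation_tool"]["default"])
```

The comparison is case-sensitive, so `Pyrodigal` (the spelling used in prose) would be rejected while `pyrodigal` passes.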
2 changes: 1 addition & 1 deletion subworkflows/local/bgc.nf
@@ -68,7 +68,7 @@ workflow BGC {

  }

- if ( params.annotation_tool == 'prodigal' ) {
+ if ( params.annotation_tool == 'prodigal' || params.annotation_tool == "pyrodigal" ) {

  ch_antismash_input = fna.join(gff, by: 0)
.filter {
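
Review note on `fna.join(gff, by: 0)` in this hunk: it pairs each sample's `.fna` with its `.gff` through the first tuple element, and Pyrodigal output now takes the same branch as Prodigal output. A rough Python analogue of that keyed join follows. It assumes each channel item is a `(key, path)` tuple, uses plain string keys instead of the `meta` map, and drops unmatched keys (Nextflow's default behaviour, as far as I understand). The sample IDs and file names are made up.

```python
def join_by_key(left, right):
    """Pair items from two lists of (key, path) tuples on their first element."""
    right_by_key = {key: path for key, path in right}
    # Items without a partner on the other side are silently dropped.
    return [
        (key, path, right_by_key[key])
        for key, path in left
        if key in right_by_key
    ]


# Hypothetical channel contents: sample2 has no GFF and so is not emitted.
fna = [("sample1", "sample1.fna"), ("sample2", "sample2.fna")]
gff = [("sample1", "sample1.gff")]

print(join_by_key(fna, gff))  # [('sample1', 'sample1.fna', 'sample1.gff')]
```

The point for this PR is that a sample only reaches antiSMASH if both outputs exist for it.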
2 changes: 1 addition & 1 deletion workflows/funcscan.nf
@@ -169,7 +169,7 @@ workflow FUNCSCAN {
  ch_annotation_gbk = PRODIGAL_GBK.out.gene_annotations
  }
  } else if ( params.annotation_tool == "pyrodigal" ) {
- PYRODIGAL ( ch_prepped_input, "gff" )
+ PYRODIGAL ( ch_prepped_input )
  ch_versions = ch_versions.mix(PYRODIGAL.out.versions)
  ch_annotation_faa = PYRODIGAL.out.faa
  ch_annotation_fna = PYRODIGAL.out.fna