Commit b96a753

Merge pull request #1476 from nf-core/dev
Dev -> Master for 3.18.0
2 parents 00f924c + 324dcdf commit b96a753

97 files changed

+6676
-517
lines changed

.github/workflows/ci.yml

+5
@@ -68,6 +68,11 @@ jobs:
       - name: Check out pipeline code
         uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4
 
+      - uses: actions/setup-java@8df1039502a15bceb9433410b1a100fbe190c53b # v4
+        with:
+          distribution: "temurin"
+          java-version: "17"
+
       - name: Set up Nextflow
         uses: nf-core/setup-nextflow@v2
         with:

CHANGELOG.md

+78-30
@@ -3,6 +3,54 @@
33
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
44
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
55

6+
## 3.18.0 - 2024-12-19
7+
8+
### Credits
9+
10+
Special thanks to the following for their contributions to the release:
11+
12+
- [Caitlin Winkler](https://github.com/oligomyeggo)
13+
- [Jonathan Manning](https://github.com/pinin4fjords)
14+
- [Lorenzo Sola](https://github.com/LorenzoS96)
15+
- [Maxime Garcia](https://github.com/maxulysse)
16+
- [Siddhartha Bagaria](https://github.com/siddharthab)
17+
18+
### Enhancements & fixes
19+
20+
- [PR #1369](https://github.com/nf-core/rnaseq/pull/1369) - Add umicollapse as an alternative to umi-tools
21+
- [PR #1461](https://github.com/nf-core/rnaseq/pull/1461) - Add FASTQ linting during preprocessing
22+
- [PR #1463](https://github.com/nf-core/rnaseq/pull/1463) - Move channel operations outside of the onComplete() block
23+
- [PR #1467](https://github.com/nf-core/rnaseq/pull/1467) - Add test suite for UMI handling functionality
24+
- [PR #1466](https://github.com/nf-core/rnaseq/pull/1466) - Factor out UMI handling
25+
- [PR #1470](https://github.com/nf-core/rnaseq/pull/1470) - Update subworkflow to account for fix to bad argument handling
26+
- [PR #1469](https://github.com/nf-core/rnaseq/pull/1469) - Minor docs fix
27+
- [PR #1459](https://github.com/nf-core/rnaseq/pull/1459) - Remove reference to unused "skip_sample_count" value in email templates
28+
- [PR #1471](https://github.com/nf-core/rnaseq/pull/1471) - Fix prepare_genome subworkflow for sortmerna
29+
- [PR #1473](https://github.com/nf-core/rnaseq/pull/1473) - Bump STAR modules
30+
- [PR #1474](https://github.com/nf-core/rnaseq/pull/1474) - Bump versions to 3.18.0
31+
- [PR #1475](https://github.com/nf-core/rnaseq/pull/1475) - Fix log publishing around umitools/umicollapse
32+
- [PR #1447](https://github.com/nf-core/rnaseq/pull/1447) - Add tutorial series for analysing count data
33+
34+
### Parameters
35+
36+
| Old parameter | New parameter |
37+
| ------------- | --------------------- |
38+
| | `--skip_linting` |
39+
| | `--extra_fqlint_args` |
40+
| | `--umi_dedup_tool` |
41+
42+
### Software dependencies
43+
44+
| Dependency | Old version | New version |
45+
| ------------- | ----------- | ----------- |
46+
| `UMICollapse` | | 1.1.0 |
47+
48+
> **NB:** Dependency has been **updated** if both old and new version information is present.
49+
>
50+
> **NB:** Dependency has been **added** if just the new version information is present.
51+
>
52+
> **NB:** Dependency has been **removed** if new version information isn't present.
53+
654
## [[3.17.0](https://github.com/nf-core/rnaseq/releases/tag/3.17.0)] - 2024-10-23
755

856
### Credits
@@ -1007,14 +1055,14 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi
10071055

10081056
### Parameters
10091057

1010-
| Old parameter | New parameter |
1011-
| --------------------------- | -------------------------------------- |
1012-
| `--fc_extra_attributes` | `--gtf_extra_attributes` |
1013-
|  `--fc_group_features` |  `--gtf_group_features` |
1014-
|  `--fc_count_type` |  `--gtf_count_type` |
1015-
|  `--fc_group_features_type` |  `--gtf_group_features_type` |
1016-
|   |  `--singularity_pull_docker_container` |
1017-
|  `--skip_featurecounts` |   |
1058+
| Old parameter | New parameter |
1059+
| -------------------------- | ------------------------------------- |
1060+
| `--fc_extra_attributes` | `--gtf_extra_attributes` |
1061+
| `--fc_group_features` | `--gtf_group_features` |
1062+
| `--fc_count_type` | `--gtf_count_type` |
1063+
| `--fc_group_features_type` | `--gtf_group_features_type` |
1064+
| | `--singularity_pull_docker_container` |
1065+
| `--skip_featurecounts` | |
10181066

10191067
> **NB:** Parameter has been **updated** if both old and new parameter information is present.
10201068
> **NB:** Parameter has been **added** if just the new parameter information is present.
@@ -1092,28 +1140,28 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi
10921140

10931141
#### Updated
10941142

1095-
| Old parameter | New parameter |
1096-
| ----------------------------- | --------------------------- |
1097-
| `--reads` | `--input` |
1098-
|  `--igenomesIgnore` |  `--igenomes_ignore` |
1099-
|  `--removeRiboRNA` |  `--remove_ribo_rna` |
1100-
|  `--rRNA_database_manifest` |  `--ribo_database_manifest` |
1101-
|  `--save_nonrRNA_reads` |  `--save_non_ribo_reads` |
1102-
|  `--saveAlignedIntermediates` |  `--save_align_intermeds` |
1103-
|  `--saveReference` |  `--save_reference` |
1104-
|  `--saveTrimmed` |  `--save_trimmed` |
1105-
|  `--saveUnaligned` |  `--save_unaligned` |
1106-
|  `--skipAlignment` |  `--skip_alignment` |
1107-
|  `--skipBiotypeQC` |  `--skip_biotype_qc` |
1108-
|  `--skipDupRadar` |  `--skip_dupradar` |
1109-
|  `--skipFastQC` |  `--skip_fastqc` |
1110-
|  `--skipMultiQC` |  `--skip_multiqc` |
1111-
|  `--skipPreseq` |  `--skip_preseq` |
1112-
|  `--skipQC` |  `--skip_qc` |
1113-
|  `--skipQualimap` |  `--skip_qualimap` |
1114-
|  `--skipRseQC` |  `--skip_rseqc` |
1115-
|  `--skipTrimming` |  `--skip_trimming` |
1116-
|  `--stringTieIgnoreGTF` |  `--stringtie_ignore_gtf` |
1143+
| Old parameter | New parameter |
1144+
| ---------------------------- | -------------------------- |
1145+
| `--reads` | `--input` |
1146+
| `--igenomesIgnore` | `--igenomes_ignore` |
1147+
| `--removeRiboRNA` | `--remove_ribo_rna` |
1148+
| `--rRNA_database_manifest` | `--ribo_database_manifest` |
1149+
| `--save_nonrRNA_reads` | `--save_non_ribo_reads` |
1150+
| `--saveAlignedIntermediates` | `--save_align_intermeds` |
1151+
| `--saveReference` | `--save_reference` |
1152+
| `--saveTrimmed` | `--save_trimmed` |
1153+
| `--saveUnaligned` | `--save_unaligned` |
1154+
| `--skipAlignment` | `--skip_alignment` |
1155+
| `--skipBiotypeQC` | `--skip_biotype_qc` |
1156+
| `--skipDupRadar` | `--skip_dupradar` |
1157+
| `--skipFastQC` | `--skip_fastqc` |
1158+
| `--skipMultiQC` | `--skip_multiqc` |
1159+
| `--skipPreseq` | `--skip_preseq` |
1160+
| `--skipQC` | `--skip_qc` |
1161+
| `--skipQualimap` | `--skip_qualimap` |
1162+
| `--skipRseQC` | `--skip_rseqc` |
1163+
| `--skipTrimming` | `--skip_trimming` |
1164+
| `--stringTieIgnoreGTF` | `--stringtie_ignore_gtf` |
11171165

11181166
#### Added
11191167

assets/email_template.html

-19
@@ -34,25 +34,6 @@ <h4 style="margin-top: 0; color: inherit">nf-core/rnaseq execution completed uns
       <p>The full error message was:</p>
       <pre style="white-space: pre-wrap; overflow: visible; margin-bottom: 0">${errorReport}</pre>
     </div>
-    """ } else if(skip_sample_count > 0) { out << """
-    <div
-      style="
-        color: #856404;
-        background-color: #fff3cd;
-        border-color: #ffeeba;
-        padding: 15px;
-        margin-bottom: 20px;
-        border: 1px solid transparent;
-        border-radius: 4px;
-      "
-    >
-      <h4 style="margin-top: 0; color: inherit">nf-core/rnaseq execution completed with warnings!</h4>
-      <p>
-        The pipeline finished successfully, but samples were skipped. Please check warnings at the top of the MultiQC report.
-      </p>
-      <p></p>
-    </div>
-
     """ } else { out << """
     <div
       style="

assets/email_template.txt

-7
@@ -17,13 +17,6 @@ The full error message was:
 
 ${errorReport}
 """
-} else if (skip_sample_count > 0) {
-    out << """##################################################
-## nf-core/rnaseq execution completed with warnings ##
-##################################################
-The pipeline finished successfully, but samples were skipped.
-Please check warnings at the top of the MultiQC report.
-"""
 } else {
     out << "## nf-core/rnaseq execution completed successfully! ##"
 }

docs/images/mqc_fastqc_adapter.png

22.9 KB

docs/images/mqc_fastqc_counts.png

33.1 KB

docs/images/mqc_fastqc_quality.png

54.5 KB

docs/output.md

+19-4
@@ -19,6 +19,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
1919
- [Pipeline overview](#pipeline-overview)
2020
- [Preprocessing](#preprocessing)
2121
- [cat](#cat)
22+
- [fq lint](#fq-lint)
2223
- [FastQC](#fastqc)
2324
- [UMI-tools extract](#umi-tools-extract)
2425
- [TrimGalore](#trimgalore)
@@ -73,6 +74,20 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
7374

7475
If multiple libraries/runs have been provided for the same sample in the input samplesheet (e.g. to increase sequencing depth) then these will be merged at the very beginning of the pipeline in order to have consistent sample naming throughout the pipeline. Please refer to the [usage documentation](https://nf-co.re/rnaseq/usage#samplesheet-input) to see how to specify these samples in the input samplesheet.
7576

77+
### fq lint
78+
79+
<details markdown="1">
80+
<summary>Output files</summary>
81+
82+
- `fq_lint/*`
83+
- `*.fq_lint.txt`: Linting report per library from `fq lint`.
84+
85+
> **NB:** You will see subdirectories here based on the stage of preprocessing for the files that have been linted, for example `raw`, `trimmed`.
86+
87+
</details>
88+
89+
[fq lint](https://github.com/stjude-rust-labs/fq) runs several checks on input FASTQ files. It exits with a non-zero code when issues are found, which terminates the workflow execution. When no issues are found, linting succeeds and produces the logs you will find here.
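
As a rough illustration of how these logs could be inspected after a run, the following Python sketch groups the `*.fq_lint.txt` reports by preprocessing stage. It assumes the reports sit directly inside stage subdirectories such as `fq_lint/raw/` and `fq_lint/trimmed/` under the results directory. The function name `collect_lint_reports` and the `results` path are illustrative only and are not part of the pipeline.

```python
from collections import defaultdict
from pathlib import Path


def collect_lint_reports(outdir):
    """Group fq lint report file names by preprocessing stage.

    Assumes a layout of <outdir>/fq_lint/<stage>/<library>.fq_lint.txt.
    This layout is an assumption based on the output description, not a
    guarantee made by the pipeline.
    """
    lint_dir = Path(outdir) / "fq_lint"
    if not lint_dir.is_dir():
        # Linting was skipped or the run has not produced output yet
        return {}
    reports = defaultdict(list)
    for report in sorted(lint_dir.rglob("*.fq_lint.txt")):
        # The immediate parent directory name is treated as the stage
        reports[report.parent.name].append(report.name)
    return dict(reports)


if __name__ == "__main__":
    for stage, names in collect_lint_reports("results").items():
        print(f"{stage}: {len(names)} report(s)")
```

Because a failed lint stops the workflow, any report found this way should correspond to a library that passed the enabled validators at that stage.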
90+
7691
### FastQC
7792

7893
<details markdown="1">
@@ -105,7 +120,7 @@ If multiple libraries/runs have been provided for the same sample in the input s
105120

106121
</details>
107122

108-
[UMI-tools](https://github.com/CGATOxford/UMI-tools) deduplicates reads based on unique molecular identifiers (UMIs) to address PCR-bias. Firstly, the UMI-tools `extract` command removes the UMI barcode information from the read sequence and adds it to the read name. Secondly, reads are deduplicated based on UMI identifier after mapping as highlighted in the [UMI-tools dedup](#umi-tools-dedup) section.
123+
[UMI-tools](https://github.com/CGATOxford/UMI-tools) and [UMICollapse](https://github.com/Daniel-Liu-c0deb0t/UMICollapse) deduplicate reads based on unique molecular identifiers (UMIs) to address PCR bias. First, the UMI-tools `extract` command removes the UMI barcode information from the read sequence and adds it to the read name. Second, reads are deduplicated based on their UMI after mapping, as described in the [UMI dedup](#umi-dedup) section.
109124

110125
To facilitate processing of input data which has the UMI barcode already embedded in the read name from the start, `--skip_umi_extract` can be specified in conjunction with `--with_umi`.
111126

@@ -290,7 +305,7 @@ The original BAM files generated by the selected alignment algorithm are further
290305

291306
![MultiQC - SAMtools mapped reads per contig plot](images/mqc_samtools_idxstats.png)
292307

293-
### UMI-tools dedup
308+
### UMI dedup
294309

295310
<details markdown="1">
296311
<summary>Output files</summary>
@@ -299,7 +314,7 @@ The original BAM files generated by the selected alignment algorithm are further
299314
- `<SAMPLE>.umi_dedup.sorted.bam`: If `--save_umi_intermeds` is specified the UMI deduplicated, coordinate sorted BAM file containing read alignments will be placed in this directory.
300315
- `<SAMPLE>.umi_dedup.sorted.bam.bai`: If `--save_umi_intermeds` is specified the BAI index file for the UMI deduplicated, coordinate sorted BAM file will be placed in this directory.
301316
- `<SAMPLE>.umi_dedup.sorted.bam.csi`: If `--save_umi_intermeds --bam_csi_index` is specified the CSI index file for the UMI deduplicated, coordinate sorted BAM file will be placed in this directory.
302-
- `<ALIGNER>/umitools/`
317+
- `<ALIGNER>/umitools/` (UMI-tools only)
303318
- `*_edit_distance.tsv`: Reports the (binned) average edit distance between the UMIs at each position.
304319
- `*_per_umi.tsv`: UMI-level summary statistics.
305320
- `*_per_umi_per_position.tsv`: Tabulates the counts for unique combinations of UMI and position.
@@ -308,7 +323,7 @@ The content of the files above is explained in more detail in the [UMI-tools doc
308323

309324
</details>
310325

311-
After extracting the UMI information from the read sequence (see [UMI-tools extract](#umi-tools-extract)), the second step in the removal of UMI barcodes involves deduplicating the reads based on both mapping and UMI barcode information using the UMI-tools `dedup` command. This will generate a filtered BAM file after the removal of PCR duplicates.
326+
After extracting the UMI information from the read sequence (see [UMI-tools extract](#umi-tools-extract)), the second step in UMI handling is to deduplicate the reads based on both mapping and UMI barcode information. UMI deduplication can be carried out with either [UMI-tools](https://github.com/CGATOxford/UMI-tools) or [UMICollapse](https://github.com/Daniel-Liu-c0deb0t/UMICollapse), selected via the `--umi_dedup_tool` parameter. The output BAM files are the same, though UMI-tools produces some additional outputs, as described above. Either tool generates a filtered BAM file with PCR duplicates removed.
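
To make the tool choice concrete, here is a minimal Python sketch that writes a Nextflow params file selecting UMICollapse. The parameter names `with_umi`, `umi_dedup_tool` and `save_umi_intermeds` come from the documentation above. The value `"umicollapse"`, the file name `params.json` and the helper name `write_umi_params` are assumptions for illustration, so check the pipeline's parameter documentation for the accepted values.

```python
import json
from pathlib import Path


def write_umi_params(path, dedup_tool="umicollapse", save_intermeds=False):
    """Write a JSON params file that enables UMI deduplication.

    The accepted values of umi_dedup_tool are assumed here, not confirmed
    by the text above.
    """
    params = {
        "with_umi": True,
        "umi_dedup_tool": dedup_tool,
        "save_umi_intermeds": save_intermeds,
    }
    Path(path).write_text(json.dumps(params, indent=2) + "\n")
    return params
```

Calling `write_umi_params("params.json")` would produce a file that could then be supplied to the run with Nextflow's `-params-file` option, which keeps the tool choice under version control alongside the samplesheet.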
312327

313328
### picard MarkDuplicates
314329

docs/usage.md

+6
@@ -27,6 +27,12 @@ CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,a
2727
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto
2828
```
2929

30+
### Linting
31+
32+
By default, the pipeline will run [fq lint](https://github.com/stjude-rust-labs/fq) on all input FASTQ files, both at the start of preprocessing and after each preprocessing step that manipulates FASTQ files. If errors are found, an error will be reported and the workflow will stop.
33+
34+
The `extra_fqlint_args` parameter can be used to disable [any validator](https://github.com/stjude-rust-labs/fq?tab=readme-ov-file#validators) from `fq`. For example, we have found that checks on the names of paired reads are prone to failure, so that check is disabled by default (by setting `extra_fqlint_args` to `--disable-validator P001`).
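
The sketch below shows one way these linting options might be assembled into a pipeline invocation from Python. The flags `--skip_linting` and `--extra_fqlint_args` are the parameters introduced in this release. The helper name `build_rnaseq_command`, the samplesheet name and the output directory are hypothetical, and a real run would normally also need a `-profile` and reference genome options.

```python
import shlex


def build_rnaseq_command(samplesheet, outdir, extra_fqlint_args=None, skip_linting=False):
    """Assemble an nf-core/rnaseq command line as a list of arguments.

    Illustrative helper only; it is not part of the pipeline.
    """
    cmd = ["nextflow", "run", "nf-core/rnaseq", "--input", samplesheet, "--outdir", outdir]
    if skip_linting:
        # Turn off fq lint entirely
        cmd.append("--skip_linting")
    elif extra_fqlint_args:
        # Pass the whole fq option string as a single argument value
        cmd += ["--extra_fqlint_args", extra_fqlint_args]
    return cmd


if __name__ == "__main__":
    cmd = build_rnaseq_command(
        "samplesheet.csv", "results", extra_fqlint_args="--disable-validator P001"
    )
    # shlex.join quotes the value containing a space so it survives the shell
    print(shlex.join(cmd))
```

Keeping the arguments as a list until the last moment avoids quoting mistakes, since the value of `--extra_fqlint_args` itself contains a space and must reach Nextflow as one argument.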
35+
3036
### Strandedness Prediction
3137

3238
If you set the strandedness value to `auto`, the pipeline will sub-sample the input FastQ files to 1 million reads, use Salmon Quant to automatically infer the strandedness, and then propagate this information through the rest of the pipeline. This behavior is controlled by the `--stranded_threshold` and `--unstranded_threshold` parameters, which are set to 0.8 and 0.1 by default, respectively. This means:
