Commit 0374473

BAM/CRAM support
For milestone 2.2.0
2 parents 0420389 + 55bb07f commit 0374473

32 files changed: +1769 -17 lines changed

CHANGELOG.md

Lines changed: 18 additions & 0 deletions
@@ -3,6 +3,24 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [v2.2.0dev](https://github.com/nf-core/pairgenomealign/releases/tag/2.2.0)
+
+### `Added`
+
+- Support for export to BAM and CRAM formats ([#31](https://github.com/nf-core/pairgenomealign/issues/31)) ([#43](https://github.com/nf-core/pairgenomealign/issues/43)).
+
+### `Dependencies`
+
+| Dependency | Old version | New version |
+| ---------------- | ----------- | ----------- |
+| `SAMTOOLS_BGZIP` | | 1.21 |
+| `SAMTOOLS_DICT` | | 1.21 |
+| `SAMTOOLS_FAIDX` | | 1.21 |
+
+### `Fixed`
+
+- Remove noisy tag in the `MULTIQC_ASSEMBLYSCAN_PLOT_DATA` local module ([#64](https://github.com/nf-core/pairgenomealign/issues/64)).
+
 ## [v2.1.0](https://github.com/nf-core/pairgenomealign/releases/tag/2.1.0) "Goya champuru" - [May 16th 2025]

 ### `Added`

CITATIONS.md

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,10 @@

 > Frith MC, Shaw J, Spouge JL. How to optimally sample a sequence for rapid analysis. doi: 10.1093/bioinformatics/btad057 PubMed PMID: 36702468 (Describes the lastdb -u RY sparsity options.)

+- [SAMtools](https://pubmed.ncbi.nlm.nih.gov/19505943/)
+
+> Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8. PubMed PMID: 19505943; PubMed Central PMCID: PMC2723002.
+
 - [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

 > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

README.md

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ The main steps of the pipeline are:
 2. Genome indexing ([`lastdb`](https://gitlab.com/mcfrith/last/-/blob/main/doc/lastdb.rst)).
 3. Genome pairwise alignments ([`lastal`](https://gitlab.com/mcfrith/last/-/blob/main/doc/lastal.rst)).
 4. Alignment plotting ([`last-dotplot`](https://gitlab.com/mcfrith/last/-/blob/main/doc/last-dotplot.rst)).
+5. Alignment export to various formats with [`maf-convert`](https://gitlab.com/mcfrith/last/-/blob/main/doc/maf-convert.rst), plus [`Samtools`](https://www.htslib.org/) for SAM/BAM/CRAM.

 The pipeline can generate four kinds of outputs, called _many-to-many_, _many-to-one_, _one-to-many_ and _one-to-one_, depending on whether sequences of one genome are allowed match the other genome multiple times or not.


assets/multiqc_config.yml

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 report_comment: >
-  This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/releases/tag/2.1.0"
+  This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/tree/dev"
   target="_blank">nf-core/pairgenomealign</a> analysis pipeline. For information about
-  how to interpret these results, please see the <a href="https://nf-co.re/pairgenomealign/2.1.0/docs/output"
+  how to interpret these results, please see the <a href="https://nf-co.re/pairgenomealign/dev/docs/output"
   target="_blank">documentation</a>.
 report_section_order:
   "nf-core-pairgenomealign-methods-description":

conf/modules.config

Lines changed: 20 additions & 0 deletions
@@ -126,4 +126,24 @@ process {
         ]
     }

+    withName: 'SAMTOOLS_BGZIP' {
+        publishDir = publishDir + [
+            path: { "${params.outdir}/alignment" }
+        ]
+    }
+
+    withName: 'SAMTOOLS_FAIDX' {
+        publishDir = [
+            path: { "${params.outdir}/alignment" },
+            mode: params.publish_dir_mode,
+            saveAs: { filename -> (filename.equals('versions.yml') || filename.endsWith('.sizes')) ? null : filename }
+        ]
+    }
+
+    withName: 'SAMTOOLS_DICT' {
+        publishDir = [
+            enabled: false
+        ]
+    }
+
 }
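
The `saveAs` closure in the `SAMTOOLS_FAIDX` block above decides which outputs are published: a file for which it returns `null` is not copied to the `alignment` directory. As a rough illustration only, the same predicate can be written in Python. The function name `publish_name` and the example file names are hypothetical and do not appear in the pipeline.

```python
from typing import Optional


def publish_name(filename: str) -> Optional[str]:
    """Illustrative mirror of the saveAs closure configured for SAMTOOLS_FAIDX.

    Returns None (do not publish) for 'versions.yml' and for any file whose
    name ends with '.sizes'; otherwise returns the file name unchanged.
    """
    if filename == "versions.yml" or filename.endswith(".sizes"):
        return None
    return filename


if __name__ == "__main__":
    # Hypothetical output names, chosen only to exercise both branches.
    for name in ["genome.fa.gz.fai", "genome.fa.gz.gzi", "genome.sizes", "versions.yml"]:
        print(name, "->", publish_name(name))
```

Under this reading, index files would be published next to the alignments, while the version record and any `.sizes` file would be skipped.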

docs/output.md

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ Basic statistics on nucleotide content and contig length are collected for align
 - `*.m2o_aln.maf.gz` is the _**many-to-one**_ alignment regions of the _target_ genome are matched at most once by the _query_ genome. (optional through the `--m2m` option)
 - `*.o2m_aln.maf.gz` is the _**one-to-many**_ alignment between the _target_ and _query_ genomes. (optional through the `--m2m` option)
 - `*.o2o_aln.maf.gz` is the _**one-to-one**_ alignment between the _target_ and _query_ genomes.
-- For each MAF file there will be an additional file in a format such as Axt, Chain, GFF or SAM if you used the `--export_aln_to` parameter. These files are always compressed.
+- For each _**one-to-one**_ alignment there will be an additional file in a format such as Axt, Chain, GFF or SAM/BAM/CRAM if you used the `--export_aln_to` parameter. These extra files are always compressed with gzip when their format is text-based.

 </details>


modules.json

Lines changed: 15 additions & 0 deletions
@@ -50,6 +50,21 @@
     "git_sha": "7b50cb7be890e4b28cffb82e438cc6a8d7805d3f",
     "installed_by": ["modules"]
 },
+"samtools/bgzip": {
+    "branch": "master",
+    "git_sha": "f2dba87f4793a4015f0611cd1743cbfb598c74e7",
+    "installed_by": ["modules"]
+},
+"samtools/dict": {
+    "branch": "master",
+    "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208",
+    "installed_by": ["modules"]
+},
+"samtools/faidx": {
+    "branch": "master",
+    "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d",
+    "installed_by": ["modules"]
+},
 "seqtk/cutn": {
     "branch": "master",
     "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d",

modules/local/multiqc_assemblyscan_plot_data/main.nf

Lines changed: 0 additions & 1 deletion
@@ -1,5 +1,4 @@
 process MULTIQC_ASSEMBLYSCAN_PLOT_DATA {
-    tag "${json.baseName}"
     label 'process_single'

     conda "${moduleDir}/environment.yml"

modules/nf-core/samtools/bgzip/environment.yml

Lines changed: 8 additions & 0 deletions
Generated file; diff not rendered by default.

modules/nf-core/samtools/bgzip/main.nf

Lines changed: 60 additions & 0 deletions
Generated file; diff not rendered by default.

modules/nf-core/samtools/bgzip/meta.yml

Lines changed: 50 additions & 0 deletions
Generated file; diff not rendered by default.

modules/nf-core/samtools/bgzip/tests/main.nf.test

Lines changed: 109 additions & 0 deletions
Generated file; diff not rendered by default.
