Nextflow Cancer Analysis Workflow Prototype developed at the National Genomics Infastructure at SciLifeLab Stockholm, Sweden.
0.8.3
- Sebastian DiLorenzo (@Sebastian-D)
- Jesper Eisfeldt (@J35P312)
- Maxime Garcia (@MaxUlysse)
- Szilveszter Juhos (@szilvajuhos)
- Max Käller (@gulfshores)
- Malin Larsson (@malinlarsson)
- Björn Nystedt (@bjornnystedt)
- Pall Olason (@pallolason)
- Pelin Sahlén (@pelinakan)
See the Nextflow documentation from SciLifeLab
See the workflow installation documentation
I would recommand to run Nextflow within a screen session (cf help on screen).
nextflow run SciLifeLab/CAW --sample <file.tsv> [--steps STEP[,STEP]]
All variables and parameters are specified in the config (cf configuration options) and the sample files.
To configure which processes will be runned or skipped in the workflow. Different steps to be separated by commas. Possible values are:
- preprocessing (default, will start workflow with FASTQ files)
- recalibrate (will start workflow with non-recalibrated BAM files)
- skipPreprocessing (will start workflow with recalibrated BAM files)
- MuTect1 (use MuTect1 for VC)
- MuTect2 (use MuTect2 for VC)
- VarDict (use VarDict for VC)
- Strelka (use Strelka for VC)
- HaplotypeCaller (use HaplotypeCaller for normal bams VC)
- Manta (use Manta for SV)
- ascat (use ascat for CNV)
To have more information about files being processed, you can use the verbose option
nextflow run SciLifeLab/CAW --sample mysample.tsv --steps preprocessing --verbose
More informations on Nextflow documentation
-c <file.config>
If no config file is specified, Nextflow will look for one in Nextflow intallation $NXF_HOME/config
of for one in the current directory nextflow.config
.
The config file provided as an example is a config file specific to Swedish UPPMAX milou cluster, but can be easily modified to suit any clusters.
You can use this file as an example to make your own config file. And you can even if needed make several config files (for example if you want to have a config file for each UPPMAX project identifier).
Use nextflow clean -f
to remove everything contained in the work
directory. Do not worry, non-recalibrated bam, indexes and recalibration tables as well as recalibrated bams and index are stored respectively in the Preprocessing/NonRecalibrated
and Preprocessing/Recalibrated
directories. And variant calling files are stored in the VariantCalling
directory.
nextflow clean -f
Use -resume
to restart the workflow where it last failed.
nextflow run SciLifeLab/CAW --sample mysample.tsv --steps preprocessing -resume
Use info
to get information about the workflow.
nextflow info SciLifeLab/CAW
Use pull
to update the workflow.
nextflow pull SciLifeLab/CAW
Several processes are run within Nextflow We divide them for the moment into 2 main steps
- Mapping - Map reads with BWA
- MergeBam - Merge BAMs if multilane samples
- RenameSingleBam - Rename BAM if non-multilane sample
- MarkDuplicates - using Picard
- CreateIntervals - using GATK
- Realign - using GATK
- CreateRecalibrationTable - using GATK
- RecalibrateBam - using GATK
- RunMutect2 - using MuTect2 shipped in GATK v3.6
- concatFiles - merge MuTect2 results
- VarDict - run VarDict on multiple intervals
- VarDictCollatedVCF - merge Vardict results
- RunStrelka - using Strelka 1.0.15
- Manta - run Manta 1.0.0
It's a Tab Separated Value file, based on: subject status sample lane fastq1 fastq2
, subject status sample bam bai recal
or subject status sample bam bai
Quite straight-forward:
subject
is the ID of the Patientstatus
is the status of the Patient, (0 for Normal and 1 for Tumor)sample
is the Sample ID (It is possible to have more than one tumor sample for each patient)lane
is used when the sample is multiplexed on several lanesfastq1
is the path to the first pair of the fastq filefastq2
is the path to the second pair of the fastq filebam
is the realigned bam filebai
is the indexrecal
is the recalibration table
See the workflow TSV example documentation
- nextflow 0.17.3
- bwa 0.7.8
- samtools 1.3
- picard 1.118
- GATK 3.6
- R 3.2.3
- gcc 4.9.2
- perl 5.18.4
- strelka 1.0.15
- manta 1.0.0
See the workflow use cases documentation