5. Output files
Håkon Kaspersen edited this page Mar 29, 2023 · 5 revisions
Three output directories are generated: versions, logs, and results. The versions directory contains the version of each tool, while the logs directory contains the Nextflow run report and the logs from each tool. The files listed below are placed in the results directory. Each track also writes its own HTML report file to the results directory.
ANI track
FASTANI_results.txt: The raw FastANI results.
FASTANI_cleaned.txt: Filtered and de-duplicated FastANI results.

cgMLST track
MLST_report.txt: MLST results, including sequence type and individual allele results.
CHEWBBACA_schema: Prepped or downloaded schema.
CHEWBBACA_schema_evaluation: Schema evaluation results.
CHEWBBACA_results_statistics.tsv: Overview of allele calling statistics.
CHEWBBACA_results_alleles.tsv: Main results file from ChewBBACA AlleleCall.
CHEWBBACA_loci_stats.tsv: Allele calling statistics for each locus in the schema.
CHEWBBACA_unclassified_sequences.fasta: Distinct CDSs that were not classified.
CHEWBBACA_filtered_allele_results.tsv: Allele results filtered based on --max_missing.
R_dissimilarity_matrix.tsv: Dissimilarity matrix based on the filtered allele results.
R_hamming_distances.tsv: Hamming distances, number of called and compared alleles, and number of missing alleles for each pairwise comparison.
R_dendrogram.phylo: A dendrogram of the dissimilarity matrix, built with the user-selected clustering method.

Core gene track
PANAROO_mashdist.txt: Mash distances, from PANAROO QC.
MDS_mash_plot.png: MDS plot of the Mash results, from PANAROO QC.
ncontigs_barplot.png: Number of contigs for each included genome, from PANAROO QC.
ngenes_barplot.png: Number of genes per included genome, from PANAROO PANGENOME.
PANAROO_pangenome_results.txt: Number of core genes, accessory genes, and total genes, from PANAROO PANGENOME.
PANAROO_core_gene_alignment.aln: The concatenated core gene alignment of all included genomes in FASTA format, from PANAROO PANGENOME.

Core genome track
PARSNP_alignment.aln: The converted FASTA alignment of all included genomes, from PARSNP.
PARSNP_gingr_archive.ggr: Gingr file used for visualizing the alignment, from PARSNP.
PARSNP_results.txt: Result file that contains the size (bp) and the percent coverage of each genome, from PARSNP.
PARSNP_unaligned.txt: Unaligned regions not included in the core genome alignment.
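
To make the columns of R_hamming_distances.tsv in the cgMLST track easier to interpret, here is a minimal Python sketch of a pairwise Hamming distance on allele profiles. It is an illustration only, not the pipeline's R code: the sample names, the allele numbers, and the use of `None` for a missing allele call are all assumptions, as is the choice to skip any locus where either sample lacks a call.

```python
from itertools import combinations

# Assumption: a missing allele call is represented as None in this sketch.
MISSING = None


def hamming(profile_a, profile_b):
    """Return (distance, compared, missing) for two equal-length allele profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    distance = compared = missing = 0
    for a, b in zip(profile_a, profile_b):
        # A locus is only compared when both samples have a called allele.
        if a is MISSING or b is MISSING:
            missing += 1
            continue
        compared += 1
        if a != b:
            distance += 1
    return distance, compared, missing


# Hypothetical allele profiles: one allele number per locus.
profiles = {
    "sample_A": [1, 2, 3, 4, None],
    "sample_B": [1, 2, 5, 4, 7],
    "sample_C": [1, 9, 5, None, 7],
}

pairwise = {
    (x, y): hamming(profiles[x], profiles[y])
    for x, y in combinations(sorted(profiles), 2)
}

for pair, (distance, compared, missing) in pairwise.items():
    print(pair, "distance:", distance, "compared:", compared, "missing:", missing)
```

Reporting the number of compared and missing loci next to the distance matters because a small distance based on few compared loci is weaker evidence of relatedness than the same distance based on many.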

Mapping track
SNIPPY_alignment.aln: Core genome multiple alignment (reconstituted, see Snippy multi).
SNIPPY_results.txt: File that contains the result statistics, from SNIPPY.

General output files
SEQKIT_deduplicated_alignment.fasta: Deduplicated alignment in FASTA format, from SEQKIT.
SEQKIT_deduplicated_sequences.fasta: The FASTA sequence of each duplicated genome, from SEQKIT.
SEQKIT_duplicated_list.txt: List of duplicated sequence IDs, from SEQKIT. Each line represents one group of identical samples.
GUBBINS_filtered_alignment.aln: Alignment with recombinant sites removed in FASTA format, from GUBBINS.
GUBBINS_statistics.txt: Text file with the result statistics, from GUBBINS.
MASKRC_masked_alignment.aln: Alignment with masked recombinant sites in FASTA format, from MASKRC.
MASKRC_recombinant_regions.txt: Text file with information about the recombinant regions, from MASKRC.
MASKRC_recombinant_plot.svg: Genomic location of recombination according to the alignment, from MASKRC.
IQTREE_tree.phylo: Consensus tree with bootstrap values in NEXUS format, from IQTREE.
IQTREE_ml_tree.phylo: Maximum likelihood tree in NEXUS format, from IQTREE.
IQTREE_results.txt: Text file with results, from IQTREE.
IQTREE_bootstrap_trees.ufboot: Bootstrap trees with branch lengths, from IQTREE.
IQTREE_alninfo.txt: Alignment information, from IQTREE.
IQTREE_splits.nex: Support values in percentages for all splits, from IQTREE.
SNPDIST_results.txt: The SNP distance matrix calculated from the (recombination-masked) alignment, from SNPDIST.
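
As a rough guide to reading SNPDIST_results.txt, the following Python sketch shows one way a pairwise SNP distance can be computed from a FASTA alignment. It is not the SNPDIST implementation: the inline alignment and sample names are made up, and the rule of counting only positions where both sequences carry A, C, G, or T (so gaps, N, and masked sites are ignored) is an assumption of this example.

```python
from itertools import combinations

# Hypothetical toy alignment; real input would be one of the .aln/.fasta files above.
ALIGNMENT = """\
>sample_A
ACGTACGTAC
>sample_B
ACGTACGTTC
>sample_C
ACCTNCGTT-
"""

VALID_BASES = set("ACGT")


def parse_fasta(text):
    """Parse FASTA text into a dict of {sequence id: uppercase sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both bases are A/C/G/T and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


sequences = parse_fasta(ALIGNMENT)
distances = {
    (x, y): snp_distance(sequences[x], sequences[y])
    for x, y in combinations(sorted(sequences), 2)
}

for (x, y), d in distances.items():
    print(x, y, d)
```

Because masked or gapped positions are skipped in this sketch, masking recombinant regions before computing distances can only lower or preserve a pairwise count, never raise it.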