Parallelize distance calculation #142

Open
@gordonkoehn

Description

The distance calculation can take longer than the MCMC run itself.

For the MP3 distance on trees with 30 mutations, computing the distances takes more than twice as long as running the MCMC chain.

This calls for parallelizing the distance calculation, i.e. computing the distances in chunks using multiprocessing in Python.

In particular, the function to be parallelized is:

yg.analyze.analyze_mcmc_run(mcmc_data, metric, base_tree)
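
A minimal sketch of the chunked approach, under stated assumptions: the issue does not show the internals of `yg.analyze.analyze_mcmc_run`, so this does not modify it. `toy_distance`, `chunked`, and `parallel_distances` are hypothetical names, and trees are represented as plain sets of edges purely for illustration. The idea is to split the sampled trees into chunks, let each worker process compute the distances for one chunk, and concatenate the results in the original order.

```python
from functools import partial
from multiprocessing import Pool


def toy_distance(tree, base_tree):
    """Hypothetical stand-in for a real tree metric such as MP3.

    Trees are modelled as sets of edges; the distance is the size of the
    symmetric difference. This is NOT the MP3 distance.
    """
    return len(set(tree) ^ set(base_tree))


def chunked(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def _distances_for_chunk(chunk, metric, base_tree):
    """Worker: distance of every tree in one chunk to base_tree."""
    return [metric(tree, base_tree) for tree in chunk]


def parallel_distances(trees, metric, base_tree, n_workers=1, chunk_size=100):
    """Compute metric(tree, base_tree) for all trees, chunk by chunk.

    With n_workers <= 1 the chunks are processed serially, which is handy
    for debugging; otherwise a multiprocessing.Pool is used. Pool.map
    returns results in input order, so the output lines up with trees.
    """
    chunks = chunked(list(trees), chunk_size)
    worker = partial(_distances_for_chunk, metric=metric, base_tree=base_tree)
    if n_workers <= 1:
        results = [worker(chunk) for chunk in chunks]
    else:
        with Pool(processes=n_workers) as pool:
            results = pool.map(worker, chunks)
    # Flatten the per-chunk lists back into one list of distances.
    return [d for chunk_result in results for d in chunk_result]
```

When called with `n_workers` greater than 1, `metric` must be a picklable, module-level function, and the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes. The chunk size trades scheduling overhead against load balance and would need tuning against the real metric.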

Along with this, the number of threads in the corresponding Snakemake rules should be adjusted to match.
