
Container recipe for wrappers #57


Open
mbhall88 opened this issue Feb 10, 2020 · 18 comments
Labels
enhancement New feature or request

Comments

@mbhall88
Member

Transferring issue from bitbucket.

Adrien Leger:

Conda env files are a great way to easily deploy software in wrappers, but sometimes the program you want is not on anaconda cloud or requires more complicated installation and setting steps.

What about allowing singularity/docker recipes to be used instead to auto-deploy a wrapper?

Johannes:

Yes, that indeed won’t hurt. But the container images have to come from somewhere where sustainability is guaranteed, e.g. biocontainers. Snakemake wrapper implementation needs a minor extension such that it also searches for singularity/docker image definitions in the wrapper repo. I will put this on my TODO list.

Adrien:

Awesome
A syntax for using a local file, similar to what is currently available for a conda recipe, would be nice as well, at least for development?
Thanks

Johannes:

I think it would make sense to simply put a container URL into the meta.yaml.

@mbhall88 mbhall88 added the enhancement New feature or request label Feb 10, 2020
@mbhall88
Member Author

I am transferring this as I would like to reignite the conversation.
This issue is the sole issue preventing me from using wrappers at the moment (and some people in my group).

Would you like some help with this @johanneskoester (that is, if you think this feature should be added)? If so, could you point me to the best place to start investigating a solution?

@a-slide

a-slide commented Feb 10, 2020

Thanks Mike
I am still very interested in this feature.
I also like to use wrappers locally (see https://github.com/a-slide/NanoSnake). So I would say having the possibility to use a local singularity recipe instead of the conda YAML would be very cool.

@vsoch

vsoch commented Feb 19, 2020

I can offer to help as well! @mbhall88 and @a-slide in case you didn't notice at the top, please note #59 as well!

@mbhall88
Member Author

Great! Thanks @vsoch . And yes, I did see the message from Johannes - the poor guy is going to have so many notifications to come back to.

Did you want to have a chat to sort out how best to approach implementing this? What is the best way to reach you? (Maybe we can continue this via DM on Twitter? Anywhere else is fine too.)

@vsoch

vsoch commented Feb 21, 2020

I wouldn't be the right one to suggest best practices for the implementation: I have worked on the snakemake source code but not the snakemake-wrappers source code. If you think you have enough experience to be that guide, then we can definitely give it a go and then get Johannes' feedback! But otherwise, I think we should at least wait for @johanneskoester to come back and give some vision for how it's best done.

And reaching me - yes DM on Twitter works, as does any channel on Gitter.

@mbhall88
Member Author

Ok, fair point. We'll wait for Johannes.

@mbhall88
Member Author

@johanneskoester would you be interested in this?

@johanneskoester
Contributor

Hi folks,
I'm back :-). Absolutely, I would be very happy to include this. As cited above, a container URL in the wrapper meta.yaml should be enough to configure it. Then, one needs a function snakemake.wrapper.get_container_img(path, prefix=None), analogous to snakemake.wrapper.get_conda_env. And then, you just need to use it in workflow.py the same way get_conda_env is used there.

All in all, just a few lines of code I think. Which makes me even more sorry not to have done this myself already.
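
The lookup described here could be sketched roughly as follows. This is a standalone illustration and not Snakemake's actual code: `DEFAULT_PREFIX`, `wrapper_file_url`, `conda_env_url`, and `container_meta_url` are all hypothetical names, and the default prefix is an assumption. The idea is that a function like get_container_img would locate the wrapper's meta.yaml the same way the conda lookup locates environment.yaml, and then read the container URL from it.

```python
import posixpath

# Assumed default location of the wrapper repository; not confirmed by this thread.
DEFAULT_PREFIX = "https://github.com/snakemake/snakemake-wrappers/raw/"


def wrapper_file_url(path, filename, prefix=None):
    """Hypothetical helper: locate `filename` inside a wrapper directory."""
    # Wrapper specs such as "0.0.8/bio/samtools_sort" are relative to the
    # prefix; anything that already looks like a URL is used as given.
    if "://" not in path:
        path = (prefix or DEFAULT_PREFIX) + path
    # If the spec points at the wrapper script itself, step up to its directory.
    if path.endswith((".py", ".R", ".Rmd")):
        path = posixpath.dirname(path)
    return posixpath.join(path, filename)


def conda_env_url(path, prefix=None):
    """Stand-in for the existing conda lookup: the wrapper's environment.yaml."""
    return wrapper_file_url(path, "environment.yaml", prefix=prefix)


def container_meta_url(path, prefix=None):
    """Proposed counterpart: the meta.yaml that would carry the container URL."""
    return wrapper_file_url(path, "meta.yaml", prefix=prefix)


print(container_meta_url("0.0.8/bio/samtools_sort"))
```

Keeping both lookups on one helper means the container path and the conda path cannot drift apart in how they resolve a wrapper spec.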

@vsoch

vsoch commented Jul 27, 2020

@mbhall88 do you have an example of the wrapper meta yaml so I could see what it looks like? I see that the dag also pulls singularity containers and I'm trying to figure out how those two are different.

@mbhall88
Member Author

As cited above, a container URL in the wrapper meta yaml should be enough to configure it.

@johanneskoester are you suggesting that if the container URI is in the meta.yaml then there is no need for environment.yaml? Or should we have both, and if the user doesn't specify --use-singularity then we default to the conda env?

@vsoch I'll provide an example from bwa mem.

meta.yaml

name: "bwa mem"
description: Map reads using bwa mem, with optional sorting using
  samtools or picard.
authors:
  - Johannes Köster
  - Julian de Ruiter

So I guess we could just add something akin to snakemake where we say container: <URI>?
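
To make the idea concrete, here is a minimal sketch of a meta.yaml carrying a `container` key and of reading it back. The key name follows the suggestion above, but the URI is a made-up placeholder, and `find_container_uri` is a hypothetical helper. It scans lines by hand only so the example needs no third-party package; real code would use a proper YAML parser.

```python
from typing import Optional

# Hypothetical meta.yaml: the first entries mirror the bwa mem example, while
# the `container` key and its URI are placeholders invented for illustration.
META_YAML = """\
name: "bwa mem"
description: Map reads using bwa mem, with optional sorting using
  samtools or picard.
authors:
  - Johannes Köster
  - Julian de Ruiter
container: "docker://example.org/snakemake-wrappers/bwa-mem:0.0.8"
"""


def find_container_uri(meta_text: str) -> Optional[str]:
    """Return the value of a top-level `container:` key, or None if absent."""
    for line in meta_text.splitlines():
        # Only unindented lines can be top-level keys in this simple layout.
        if line.startswith("container:"):
            value = line[len("container:"):].strip()
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip("\"'") or None
    return None


print(find_container_uri(META_YAML))
```

A wrapper without the key simply yields `None`, which leaves room to fall back to the conda environment.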

@vsoch

vsoch commented Jul 29, 2020

@mbhall88 ah I think I understand now! So the wrapper would essentially have a singularity (via docker) image backend instead of using conda. @mbhall88 what if a recipe wants to provide both a conda and a container option? Would we want to maintain separate wrappers, e.g.:

rule samtools_sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    params:
        "-m 4G"
    threads: 8
    wrapper:
        "0.0.8/bio/samtools_sort_container"

or even a different "container" namespace that mirrors the top level:

    wrapper:
        "0.0.8/containers/bio/samtools_sort"

Or add some parameter to the recipe for the user to choose (and then have a default?). My worry with doing the first is that the user would expect the exact same outcome / performance between the container and conda runs, and I'm not sure we could promise that. On the other hand, maintaining them separately is also challenging. What do you think?

@mbhall88
Member Author

mbhall88 commented Jul 30, 2020

I was thinking it might be good to provide both a conda environment (if applicable) and a container for the wrapper. Then, we decide which environment to use based on the snakemake CLI options:

# use wrapper's conda env (if there is one)
snakemake --use-conda
# use wrapper's container env (if there is one)
snakemake --use-singularity
# use wrapper's container env. if no container env, use conda env
snakemake --use-singularity --use-conda

This way the user doesn't have to think about additional parameters within the snakefile and it is all controlled with existing CLI options that are already required for using wrappers and containers.
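
The three cases above amount to a small decision rule, sketched here for illustration. `choose_environment` and its parameters are hypothetical and not part of Snakemake; the two `has_*` arguments stand for whether the wrapper ships a conda environment or a container definition.

```python
from typing import Optional


def choose_environment(use_conda: bool, use_singularity: bool,
                       has_conda_env: bool, has_container: bool) -> Optional[str]:
    """Pick the wrapper's software environment from the CLI flags.

    Returns "container", "conda", or None (run with whatever is on the host).
    """
    # --use-singularity takes priority whenever the wrapper has a container.
    if use_singularity and has_container:
        return "container"
    # --use-conda applies on its own, or as the fallback when both flags are
    # given and the wrapper has no container.
    if use_conda and has_conda_env:
        return "conda"
    return None


# --use-singularity --use-conda on a conda-only wrapper falls back to conda.
print(choose_environment(True, True, has_conda_env=True, has_container=False))
```

With `--use-singularity` alone and no container available, the sketch returns `None` rather than silently switching to conda, matching the "if there is one" wording above.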

@vsoch

vsoch commented Jul 30, 2020

I like that too! And probably since most wrappers are conda now, we would default to that if a container isn't available. And this also means we would try our best to match versions, but of course it can't be absolutely guaranteed that the container and conda install would be exactly the same.

One more question for you @mbhall88 before we look at code - I'm concerned about mapping containers to snakemake-wrappers. For example, if you look at the biocontainers bwa Dockerfile, it's only installing bwa, but the wrapper that you linked has bwa, samtools, and picard. We should decide on some strategy for mapping containers to wrappers.

  • we could use biocontainers, and only map those that are absolute matches (I'm not sure I like this approach because the versions would move at different times and many wrappers don't have an absolute match)
  • we could have an automated build to Docker Hub, perhaps under a Snakemake wrappers organization, that always has the correct versions and software (this is my preference). A wrapper here would then have a template for an automated build (we could easily use GitHub Actions), and the build would also be tested when a new wrapper version is created. I'm (personally) more of a fan of using Quay.io because we could have repository-specific robot tokens, as opposed to Docker Hub, which has tokens only for an entire user account. But we could also try setting up a hook (automated build) if Docker Hub is preferred. What do you think?

@mbhall88
Member Author

I'm concerned about mapping containers to snakemake-wrappers. For example, if you look at the biocontainers bwa Dockerfile, it's only installing bwa, but the wrapper that you linked has bwa, samtools, and picard. We should decide on some strategy for mapping containers to wrappers.

Yes, you're right. This is a pain point.
I like your second option.
One question: are you recommending that we have a Dockerfile with each wrapper, or that an image is auto-built from the packages and versions in the environment.yaml?

@vsoch

vsoch commented Jul 31, 2020

Sort of both - we would have a Dockerfile that uses the environment.yaml to install everything. That way, they are as close to identical as possible! The one issue I see for setting up automated builds is the lack of modularity in the single repo. Ideally we would have one automated build per repo, without complicated logic for the trigger and settings. I was thinking we should slowly port over to a Snakemake-wrappers org, but of course we need feedback from @johanneskoester first! But if y’all like the idea, I can take a first shot at a modular repo with an automated build and then work on a Snakemake PR to add the functionality.
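
As a rough illustration of "a Dockerfile that uses the environment.yaml", the sketch below renders such a Dockerfile from a template. Everything in it is an assumption rather than something decided in this thread: the `continuumio/miniconda3` base image, installing into the base conda environment, and the `render_dockerfile` helper itself.

```python
# Template for a per-wrapper Dockerfile that installs exactly what the
# wrapper's conda environment file lists (base image is an assumption).
DOCKERFILE_TEMPLATE = """\
FROM {base_image}
COPY {env_file} /tmp/environment.yaml
RUN conda env update --name base --file /tmp/environment.yaml \\
    && conda clean --all --yes
"""


def render_dockerfile(env_file: str = "environment.yaml",
                      base_image: str = "continuumio/miniconda3") -> str:
    """Hypothetical helper: fill in the template for one wrapper directory."""
    return DOCKERFILE_TEMPLATE.format(base_image=base_image, env_file=env_file)


print(render_dockerfile())
```

Because the image is built from the same file that `--use-conda` consumes, the two environments would stay as close as the package resolver allows, though identical builds still cannot be guaranteed.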

@vsoch

vsoch commented Jul 31, 2020

@mbhall88 one more question - does this approach assume we need to read in the meta.yaml file to discover the container? Or did you have something else in mind? I did a grep of snakemake and (as far as I can tell) it's not read in anywhere, because largely we don't need it for the run.

@vsoch

vsoch commented Aug 1, 2020

hey @mbhall88 - happy Saturday! I got us started on something to get this ball rolling, see snakemake/snakemake#532. Note that the implementation will require changes to the wrappers repository here, and I've detailed what we need to discuss in the PR description.

@mbhall88 mbhall88 changed the title Singularity recipe for wrappers Container recipe for wrappers Aug 3, 2020
@mbhall88
Member Author

mbhall88 commented Aug 3, 2020

@vsoch you are impressive!!

I have some questions/comments but will move the discussion to the PR


4 participants