GCP provider #131

Merged
merged 68 commits into from
Nov 6, 2020

Conversation

quasiben
Member

@quasiben quasiben commented Sep 8, 2020

Attempt at resolving #130

cc @jacobtomlinson @mrocklin

Currently, this code is non-functional, but the scaffolding is in place to get things up and running.

Update: the PR should be in a runnable state now.

Member

@jacobtomlinson jacobtomlinson left a comment

I took this for a spin but am stuck with credentials errors. It's likely something I'm doing, but I'm going to need some docs/guidance.

I've tried:

  • Reauthenticating with gcloud auth login
  • Upgrading my SDK version with pip install google-api-python-client --upgrade

See my comments for details on what I'm seeing.

setup.py Outdated
@@ -8,6 +8,8 @@
extras_require = {
"aws": ["aiobotocore>=0.10.2"],
"azure": ["azureml-sdk>=1.0.83"],
"digitalocean": ["python-digitalocean"],
"googlecloud": ["google-api-python-client"],
Member

This doesn't match the ImportError which says pip install "dask-cloudprovider[gcp]".
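
One way to keep the extra's name and the error message from drifting apart is to define the name once and build the message from it. This is a minimal sketch only: `GCP_EXTRA` and `install_hint` are hypothetical names that do not appear in the PR, and the message wording is an assumption apart from the `pip install` command quoted above.

```python
# Hypothetical sketch: GCP_EXTRA and install_hint are not part of the PR.
GCP_EXTRA = "gcp"

# What setup.py would pass as extras_require (only the GCP entry is shown).
extras_require = {GCP_EXTRA: ["google-api-python-client"]}


def install_hint(extra: str = GCP_EXTRA) -> str:
    """Build the text for the ImportError raised when the SDK is missing."""
    return f'Missing dependencies, please run: pip install "dask-cloudprovider[{extra}]"'


print(install_hint())
```

With a single source for the name, renaming the extra changes the install hint at the same time.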

**kwargs,
):
super().__init__(**kwargs)
self.compute = googleapiclient.discovery.build("compute", "v1")
Member

I'm initially having credentials issues and am seeing the following exception. We should probably catch this and raise something friendlier.

    @pytest.mark.asyncio
    async def test_init():
>       cluster = GCPCluster(asynchronous=True)

dask_cloudprovider/providers/gcp/tests/test_gcp.py:44: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dask_cloudprovider/providers/gcp/instances.py:284: in __init__
    self.compute = googleapiclient.discovery.build("compute", "v1")
../../../miniconda3/envs/daskdev/lib/python3.8/site-packages/googleapiclient/_helpers.py:134: in positional_wrapper
    return wrapped(*args, **kwargs)
../../../miniconda3/envs/daskdev/lib/python3.8/site-packages/googleapiclient/discovery.py:276: in build
    return build_from_document(
../../../miniconda3/envs/daskdev/lib/python3.8/site-packages/googleapiclient/_helpers.py:134: in positional_wrapper
    return wrapped(*args, **kwargs)
../../../miniconda3/envs/daskdev/lib/python3.8/site-packages/googleapiclient/discovery.py:516: in build_from_document
    credentials = _auth.default_credentials(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

scopes = None, quota_project_id = None

    def default_credentials(scopes=None, quota_project_id=None):
        """Returns Application Default Credentials."""
        if HAS_GOOGLE_AUTH:
>           credentials, _ = google.auth.default(scopes=scopes, quota_project_id=quota_project_id)
E           TypeError: default() got an unexpected keyword argument 'quota_project_id'

../../../miniconda3/envs/daskdev/lib/python3.8/site-packages/googleapiclient/_auth.py:54: TypeError
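
A friendlier error could be raised by wrapping the client construction. The sketch below takes the builder as a parameter so that it runs without the Google SDK installed; `GCPClientError` and `build_compute_client` are hypothetical names, and the message text is only a suggestion. The `TypeError` above looks more like a version mismatch (the installed `google-auth` apparently does not accept `quota_project_id`) than missing credentials, so the message mentions both possibilities.

```python
class GCPClientError(RuntimeError):
    """Hypothetical error type; not part of dask-cloudprovider."""


def build_compute_client(build):
    """Call build("compute", "v1") and re-raise failures with setup guidance.

    `build` is expected to be googleapiclient.discovery.build. It is passed in
    so this sketch does not need the SDK to be importable.
    """
    try:
        return build("compute", "v1")
    except Exception as exc:  # broad on purpose: the SDK can raise several types
        raise GCPClientError(
            "Could not create the GCP compute client. Check that "
            "GOOGLE_APPLICATION_CREDENTIALS points to a valid JSON key and that "
            "google-auth and google-api-python-client are up to date."
        ) from exc
```

Chaining with `from exc` keeps the original traceback available for debugging while the top-level message tells the user what to check.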

Comment on lines 17 to 20
You must configure your GCP credentials to run this test.

$ export GOOGLE_APPLICATION_CREDENTIALS=<path-to-gcp-json-credentials>

Member

I have configured credentials as described here but the test is still skipping. See the traceback I raised before.

I'm guessing something else is required to configure the credentials here, or the try/except is being greedy and catching some other exception.
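
To illustrate the "greedy try/except" concern: a skip guard that catches only the credentials error lets unrelated failures, such as the `TypeError` above, surface instead of silently skipping the test. This is a sketch under assumptions: `MissingCredentialsError` is a hypothetical stand-in for whatever exception the SDK raises when credentials are absent, and `skip_reason` is not a function in the PR.

```python
class MissingCredentialsError(Exception):
    """Hypothetical stand-in for the SDK's missing-credentials exception."""


def skip_reason(make_client):
    """Return a skip message only when credentials are missing.

    Any other exception propagates, so real bugs are not hidden as skips.
    """
    try:
        make_client()
    except MissingCredentialsError as exc:
        return f"GCP credentials not configured: {exc}"
    return None
```

In a test module, the returned string could be passed to `pytest.skip` when it is not `None`.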

@quasiben
Member Author

@eric-czech I've added some finishing touches. If you have time or interest in the coming weeks, I think this is safe to test.

@jacobtomlinson
Member

Thanks for the fixes here @quasiben. I have now managed to run the tests locally.

Once #129 is merged this should be good to go.

@eric-czech
Contributor

eric-czech commented Oct 20, 2020

Hey @quasiben, I wasn't quite sure what basic configuration + usage would look like but I started with this:

from dask_cloudprovider.providers.gcp.instances import GCPCluster

cluster = GCPCluster(
    name="dask-gcp-test-1",
    zone="us-east1-c",
    machine_type="n1-standard-8",
    docker_image="daskdev/dask:latest",
    worker_command="dask-worker",
    ngpus=0,
    projectid=PROJECT_ID,  # GCP project ID, defined elsewhere
)

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/deploy/spec.py in _start(self)
    299         if isinstance(cls, str):
    300             cls = import_term(cls)
--> 301         self.scheduler = cls(**self.scheduler_spec.get("options", {}))
    302 
    303         self.status = Status.starting

TypeError: 'NoneType' object is not callable
Full Trace

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in
----> 1 cluster = GCPCluster(
      2     name='dask-gcp-test-1', zone='us-east1-c', machine_type='n1-standard-8',
      3     projectid='uk-biobank-279813',
      4     docker_image="daskdev/dask:latest",
      5     worker_command='dask-worker',

~/repos/dask-cloudprovider/dask_cloudprovider/providers/gcp/instances.py in __init__(self, name, zone, machine_type, projectid, source_image, docker_image, ngpus, gpu_type, worker_command, worker_extra_args, auto_shutdown, **kwargs)
296 **kwargs,
297 ):
--> 298 super().__init__(**kwargs)
299 try:
300 self.compute = googleapiclient.discovery.build("compute", "v1")

~/repos/dask-cloudprovider/dask_cloudprovider/providers/generic/vmcluster.py in __init__(self, n_workers, **kwargs)
92 self.bootstrap = None
93 self.auto_shutdown = True
---> 94 super().__init__(**kwargs)
95
96 async def _start(self,):

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/deploy/spec.py in __init__(self, workers, scheduler, worker, asynchronous, loop, security, silence_logs, name)
274 if not self.asynchronous:
275 self._loop_runner.start()
--> 276 self.sync(self._start)
277 self.sync(self._correct_state)
278

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/deploy/cluster.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
181 return future
182 else:
--> 183 return sync(self.loop, func, *args, **kwargs)
184
185 def _log(self, log):

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
338 if error[0]:
339 typ, exc, tb = error[0]
--> 340 raise exc.with_traceback(tb)
341 else:
342 return result[0]

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/utils.py in f()
322 if callback_timeout is not None:
323 future = asyncio.wait_for(future, callback_timeout)
--> 324 result[0] = yield future
325 except Exception as exc:
326 error[0] = sys.exc_info()

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/tornado/gen.py in run(self)
733
734 try:
--> 735 value = future.result()
736 except Exception:
737 exc_info = sys.exc_info()

~/repos/dask-cloudprovider/dask_cloudprovider/providers/generic/vmcluster.py in _start(self)
115 "Hang tight! ",
116 ):
--> 117 await super()._start()
118
119 def render_cloud_init(self, *args, **kwargs):

~/miniconda3/envs/dask-cp/lib/python3.8/site-packages/distributed/deploy/spec.py in _start(self)
299 if isinstance(cls, str):
300 cls = import_term(cls)
--> 301 self.scheduler = cls(**self.scheduler_spec.get("options", {}))
302
303 self.status = Status.starting

TypeError: 'NoneType' object is not callable

Am I missing something obvious?
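
For context on the error itself: the failing line calls `cls(...)`, so `cls` must have been `None` at that point, meaning no scheduler class had been set in the scheduler spec. The sketch below reproduces that failure in plain Python and shows an explicit check that would give a clearer message. `start_scheduler` and the dict layout are assumptions that mirror the traceback, not code from `distributed` or from this PR.

```python
def start_scheduler(scheduler_spec):
    """Mimic the failing line in the traceback, with an explicit check first."""
    cls = scheduler_spec.get("cls")
    if cls is None:
        raise ValueError(
            "scheduler spec has no 'cls'; was a scheduler class configured?"
        )
    return cls(**scheduler_spec.get("options", {}))


# Without the check, calling a None 'cls' fails exactly as in the traceback.
cls = None
try:
    cls()
except TypeError as exc:
    print(exc)
```

Why the spec ended up without a scheduler class in this run is not visible from the traceback alone.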

@quasiben
Member Author

quasiben commented Nov 4, 2020

@eric-czech things should be in a far better state now, with sync and async support, thanks to @jacobtomlinson.

@jacobtomlinson
Member

I've just pulled master and pushed a commit to shuffle the imports to the new layout.

@quasiben
Member Author

quasiben commented Nov 5, 2020

Thanks @jacobtomlinson. I'll fix the imports in the tests and push. After that, would you be willing to give this another review?

quasiben and others added 3 commits November 5, 2020 08:28
* Migrate to GitHub Actions

* Fix environment file name

* Quiet conda

* Success message

* Exclude versioneer from flake8
@jacobtomlinson jacobtomlinson merged commit ac35f9e into master Nov 6, 2020
@jacobtomlinson jacobtomlinson deleted the gcp-provider branch November 6, 2020 13:48
@quasiben
Member Author

quasiben commented Nov 6, 2020

Thank you @jacobtomlinson!

@mrocklin
Member

mrocklin commented Nov 6, 2020 via email

@jacobtomlinson jacobtomlinson added the provider/gcp/vm Cluster provider for GCP Instances label Nov 17, 2020
Labels
provider/gcp/vm Cluster provider for GCP Instances
4 participants