Getting error urllib3.exceptions.SSLError: [Errno 2] No such file or directory running example script #113

Closed
tsuberim opened this issue Dec 12, 2018 · 8 comments

Comments

@tsuberim

Script

from dask_kubernetes import KubeCluster

cluster = KubeCluster.from_yaml('worker-spec.yml')
cluster.scale_up(10)  # specify number of nodes explicitly

cluster.adapt(minimum=1, maximum=100)

Error

2018-12-12 05:02:58,458 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
2018-12-12 05:02:58,458 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
2018-12-12 05:02:58,634 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
2018-12-12 05:02:58,634 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
2018-12-12 05:02:58,810 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
2018-12-12 05:02:58,810 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(FileNotFoundError(2, 'No such file or directory'))': /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker
Traceback (most recent call last):
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 321, in ssl_wrap_socket
    context.load_verify_locations(ca_certs, ca_cert_dir)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn
    conn.connect()
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connection.py", line 344, in connect
    ssl_context=context)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 323, in ssl_wrap_socket
    raise SSLError(e)
urllib3.exceptions.SSLError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/weakref.py", line 624, in _exitfunc
    f()
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/weakref.py", line 548, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/dask_kubernetes/core.py", line 501, in _cleanup_pods
    pods = api.list_namespaced_pod(namespace, label_selector=format_labels(labels))
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 12514, in list_namespaced_pod
    (data) = self.list_namespaced_pod_with_http_info(namespace, **kwargs)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 12617, in list_namespaced_pod_with_http_info
    collection_formats=collection_formats)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 321, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 155, in __call_api
    _request_timeout=_request_timeout)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 342, in request
    headers=headers)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/rest.py", line 231, in GET
    query_params=query_params)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/kubernetes/client/rest.py", line 205, in request
    headers=headers)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/request.py", line 68, in request
    **urlopen_kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/request.py", line 89, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/poolmanager.py", line 323, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/tsuberim/anaconda2/envs/py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='104.154.176.5', port=443): Max retries exceeded with url: /api/v1/namespaces/default/pods?labelSelector=foo%3Dbar%2Cdask.org%2Fcluster-name%3Ddask-tsuberim-36b1060e-b%2Cuser%3Dtsuberim%2Capp%3Ddask%2Ccomponent%3Ddask-worker (Caused by SSLError(FileNotFoundError(2, 'No such file or directory')))

Environment:

python: 3.7
os: Ubuntu 18.04
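
The innermost frame of the traceback above is `context.load_verify_locations(ca_certs, ca_cert_dir)` raising `FileNotFoundError`, which urllib3 then wraps in `SSLError`. The following is a minimal standard-library sketch of how a CA bundle path that does not exist on disk produces the same errno. The helper name `ca_load_error` and the path are hypothetical and are not taken from the report; the sketch only illustrates the error, it does not establish which file was missing here.

```python
import errno
import ssl


def ca_load_error(ca_path):
    """Try to load CA certificates from ca_path.

    Returns the errno of the OSError raised, or None if loading succeeded.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        context.load_verify_locations(cafile=ca_path)
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        return exc.errno
    return None


# Hypothetical path, chosen only because it should not exist anywhere.
missing = "/nonexistent/ca-bundle.crt"
print(ca_load_error(missing) == errno.ENOENT)
```

If a check like this fails for the certificate path that the Kubernetes client was configured with, the problem is on the local filesystem rather than in the network connection to the API server.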

@jacobtomlinson
Member

Can you give more information on your setup?

Specifically

  • How did you create your Kubernetes cluster?
  • Where are you running dask-kubernetes?

@tsuberim
Author

tsuberim commented Dec 12, 2018

  1. I created the cluster using the GCP Console, via the UI.
  2. I connected kubectl with the cluster.
  3. I ran the mentioned script on my laptop and gave it a GOOGLE_APPLICATION_CREDENTIALS env var pointing to a service-account.json.

Anything wrong?

@jacobtomlinson
Member

Yeah. I'm afraid the workers created in the Kubernetes cluster will need to be able to connect back to your laptop via your IP address. dask-kubernetes assumes that you are also running it inside the Kubernetes cluster.

This does raise a few questions for the project, though none of them will help you immediately:

  • Is this documented well enough?
  • Can we create a better and more descriptive error message?
  • Can we fix this long term (remote scheduler for instance)?

Sorry!

@tsuberim
Author

@jacobtomlinson Hey, I was under the impression that you use this library to create and control the cluster from your laptop, no? That is, your code creates the cluster, connects to it, dispatches remote tasks to it, and shows you the results.

@jacobtomlinson
Member

The library creates a scheduler in the same place where you run the library, and some workers remotely in the Kubernetes cluster. The scheduler and the workers need to be able to communicate with each other. The easiest way to achieve this is to run your session inside the Kubernetes cluster too. It's common to use this in conjunction with Zero to JupyterHub or a derivative like Pangeo.

We are keen to enhance it to make the scheduler remote too, which would resolve the issue, but we aren't there yet.
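
To illustrate the "session inside the cluster" requirement, here is a rough heuristic for checking whether the current Python process is running in a Kubernetes pod. It relies on two conventions that Kubernetes provides to pods: the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables, and the mounted service-account token file. The function name `running_in_cluster` is hypothetical and is not part of dask-kubernetes; treat this as a sketch, not a guarantee that workers can reach the scheduler.

```python
import os

# Well-known location where Kubernetes mounts a pod's service-account token.
SERVICE_ACCOUNT_TOKEN = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def running_in_cluster(environ=None, token_path=SERVICE_ACCOUNT_TOKEN):
    """Heuristically report whether this process runs inside a Kubernetes pod."""
    environ = os.environ if environ is None else environ
    # Pods get the API server address injected as environment variables.
    has_api_env = (
        "KUBERNETES_SERVICE_HOST" in environ
        and "KUBERNETES_SERVICE_PORT" in environ
    )
    # Pods with a service account also get a token file mounted.
    return has_api_env and os.path.isfile(token_path)


if not running_in_cluster():
    print("Not in a pod: workers may be unable to connect back to a local scheduler.")
```

On a laptop this check would normally report `False`, which matches the situation described in this thread.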

@bphi
Contributor

bphi commented Feb 20, 2019

Would be very nice to be able to run the library locally and control a remote cluster.

@jacobtomlinson
Member

It's on the roadmap! 😄

@jacobtomlinson
Member

I'm going to close this in favour of #84.
