Skip to content

Spanner batch inserts >1hr fail with GaxError Exception StatusCode.UNAUTHENTICATED #3387

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dr3x opened this issue May 9, 2017 · 7 comments
Assignees
Labels
api: spanner Issues related to the Spanner API. auth type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@dr3x
Copy link

dr3x commented May 9, 2017

  1. OS type and version Microsoft Windows [Version 10.0.10586]

  2. Python version Python 2.7.12

  3. google-cloud-python version pip show google-cloud, pip show google-<service> or pip freeze
    google-auth==1.0.0
    google-auth-httplib2==0.0.2
    google-cloud-core==0.24.1
    google-cloud-spanner==0.24.1
    google-cloud-storage==1.1.0
    google-gax==0.15.8
    google-resumable-media==0.0.2
    googleapis-common-protos==1.5.2
    grpc-google-iam-v1==0.11.1
    grpcio==1.3.0

  4. Stacktrace if available

Traceback (most recent call last):
  File "C:\src\GitHub\python-docs-samples\quick_table_load.py", line 165, in <module>
    load_table(table_name)
  File "C:\src\GitHub\python-docs-samples\quick_table_load.py", line 143, in load_table
    insert_batch(table_name,col_names,chgs,tsv_arr)
  File "C:\src\GitHub\python-docs-samples\quick_table_load.py", line 113, in insert_batch
    values=tsv_arr)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\spanner\database.py", line 462, in __exit__
    self._batch.commit()
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\spanner\batch.py", line 156, in commit
    options=options)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\gapic\spanner\v1\spanner_client.py", line 832, in commit
    return self._commit(request, options)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 419, in inner
    return api_caller(api_call, this_settings, request)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 407, in base_caller
    return api_call(*args)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 368, in inner
    return a_func(*args, **kwargs)
  File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\retry.py", line 126, in inner
    ' classified as transient', exception)
RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.UNAUTHENTICATED, Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.)>)
  1. Steps to reproduce
  • Parse a file into 500 row batches and insert to a Spanner database
  • Refresh credential when <15 minutes remaining until expiry
    Output (first number is row count):
    Inserting first batch...
    Inserting batch: 502, Table: Inventory_Avail, Expiration: 0:59:58.494000, Elapsed: 0:00:02.076000
    ...
    Inserting batch: 1214425, Table: Inventory_Avail, Expiration: 0:45:00.843000, Elapsed: 1:00:00.710000
    <error here, see stack trace above>
  1. Code example
import argparse, codecs, os, re, sys
from datetime import datetime, timedelta
from oauth2client import client, crypt
import google.auth
from google.cloud import spanner, storage
from google.gax import config, errors

request = google.auth.transport.requests.Request()
credentials, project_id = google.auth.default(scopes=['https://www.googleapis.com/auth/cloud-platform'], request=request)

def check_credentials(credentials, dump_info=False):
    idinfo = client.verify_id_token(credentials.id_token, credentials.client_id)
    ## validate credentials here

credentials.refresh(request)
check_credentials(credentials, True)       
spanner_client = spanner.Client(project=project_id, credentials = credentials)
instance = spanner_client.instance(instance_id)
database = instance.database(database_id)
storage_client = storage.Client(project=project_id, credentials = credentials)

mutation_count = 500

def insert_batch(table_id,col_names,chgs,tsv_arr):
    with database.batch() as batch:
        batch.insert(
            table=table_id,
            columns=(col_names),
            values=tsv_arr)

def load_table(table_name):
    ...
    insert_batch(table_name,col_names,chgs,tsv_arr)
    if ((credentials.expiry - datetime.utcnow()) < timedelta(minutes=15)):
        print('Updating credentials...')
        credentials.refresh(request)
    ...    
@dhermes
Copy link
Contributor

dhermes commented May 9, 2017

@dr3x This is continuing the discussion from #1917, yes?

We fixed this issue today and made a new release of the auth library.

@dhermes dhermes added api: spanner Issues related to the Spanner API. auth type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 9, 2017
@dhermes
Copy link
Contributor

dhermes commented May 9, 2017

@dr3x I am going to pre-emptively close this (so it doesn't linger in limbo) but REALLY am interested to see if the new release fixes the issue for you.

Does that seem OK?

@dhermes dhermes closed this as completed May 9, 2017
@dr3x
Copy link
Author

dr3x commented May 9, 2017

Yes - it is a continuation of the discussion from #1917 but seemed like it might be a new issue

The code does a credentials refresh when ~15 minutes remain so I don't think the clock skew will affect it.

Looks like the error is happening when it is retrieving from the session pool so maybe it is maxed out there and incorrectly reporting lack of available sessions as an auth error.

Updated to the latest auth lib and am running with it now, should know in a couple hours.

Thanks

@dr3x
Copy link
Author

dr3x commented May 9, 2017

Ok now throwing a different error - StatusCode.DEADLINE_EXCEEDED at the 24 minute mark:

Inserting batch: 460420, Table: Inventory_Avail, Expiration: 0:35:28.233000, Elapsed: 0:24:32.318000
 Traceback (most recent call last):
   File "C:\src\GitHub\python-docs-samples\quick_table_load.py", line 121, in insert_batch
    values=tsv_arr)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\spanner\database.py", line 462, in __exit__
    self._batch.commit()
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\spanner\batch.py", line 156, in commit
    options=options)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\cloud\gapic\spanner\v1\spanner_client.py", line 832, in commit
    return self._commit(request, options)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 419, in inner
    return api_caller(api_call, this_settings, request)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 407, in base_caller
    return api_call(*args)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\api_callable.py", line 368, in inner
    return a_func(*args, **kwargs)
   File "C:\src\GitHub\python-docs-samples\env\lib\site-packages\google\gax\retry.py", line 126, in inner
    ' classified as transient', exception)
 RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded)>)

@dhermes
Copy link
Contributor

dhermes commented May 9, 2017

@dr3x That just sounds like bad luck (and unrelated to the new release). I doubt the DEADLINE_EXCEEDED is related (though I'm surprised the gRPC layer doesn't automatically retry that for you).

The fact that you are already refreshing is a bit strange to see:

Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

This is exactly the error that I've reproduced when the invalid expiry keeps a token around too long.

@dr3x
Copy link
Author

dr3x commented May 9, 2017

Ok was able to run the data load script for two large tables (>1hr) and they both successfully completed so I'll consider it fixed. Good news

@dhermes
Copy link
Contributor

dhermes commented May 9, 2017

Great news. Please let us know if you are having other issues and thanks for providing such good info!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the Spanner API. auth type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

No branches or pull requests

2 participants