-
Notifications
You must be signed in to change notification settings - Fork 48
feat: use default session connection #87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -162,3 +164,25 @@ def _get_service_account_if_connection_exists( | |||
pass | |||
|
|||
return service_account | |||
|
|||
|
|||
def get_connection_name_full( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be tidier to make it a class method in BqConnectionManager
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the function is only a helper that does string manipulation. Instead of the BqConnectionManager which contains the states of the clients. So separate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO it would still make sense. A @classmethod
would mean that it does not depend on the instance level state. But conceptually it would fit in nicely - BqConnectionManager sounds like the entity which is supposed to know the low level details of a connection and can provide helper functions about the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could work. Will address in another PR.
@@ -465,6 +465,36 @@ def square(x): | |||
assert_pandas_df_equal_ignore_ordering(bf_result, pd_result) | |||
|
|||
|
|||
@pytest.mark.flaky(retries=2, delay=120) | |||
def test_remote_function_default_connection(scalars_dfs, dataset_id): | |||
@rf.remote_function([int], int, dataset=dataset_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
direct version is deprecated, recommend testing session.remote_function
or bigframes.pandas.remote_function
interface.
This is an internal method. Please use :func:`bigframes.pandas.remote_function` instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to bigframes.pandas.remote_function.
But in that case, if all the use cases are through session.remote_function or bigframes.pandas.remote_function, then it will be always the case that rf.remote_function's input session is not None. And we won't need to resolve to bpd.get_global_session() in rf.remote_function.
I'll leave it for you to resolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's okay, the goal of this test is to assert that default connection takes effect irrespective of the session.
We can add a separate large test test_remote_function_direct_defaults_to_global_session
specifically for that change.
@@ -162,3 +164,25 @@ def _get_service_account_if_connection_exists( | |||
pass | |||
|
|||
return service_account | |||
|
|||
|
|||
def get_connection_name_full( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO it would still make sense. A @classmethod
would mean that it does not depend on the instance level state. But conceptually it would fit in nicely - BqConnectionManager sounds like the entity which is supposed to know the low level details of a connection and can provide helper functions about the same.
|
||
|
||
def get_connection_name_full( | ||
connection_name: Optional[str], default_project: str, default_location: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awkward to have optional argument followed by mandatory arguments, would be nice to have it in the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional only means it can be value None. It is OK to put it at the front, as it is the most important param of the function.
Not really "optional", which means it has a default value. Then it has to be at the end.
@@ -48,7 +49,14 @@ def __init__( | |||
connection_name: Optional[str] = None, | |||
): | |||
self.session = session or bpd.get_global_session() | |||
self.connection_name = connection_name or self.session._bq_connection | |||
|
|||
connection_name = connection_name or self.session._bq_connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this a bit, we are resolving between 3 potential sources of connection:
- provided by the user to llm, i.e.
connection_name
arg - defined by the user in the session, i.e.
self.session._bq_connection
- the default defined in
clients.py
How about we set the default in Session.__init__
like below?
self._bq_connection = context.bq_connection or "bigframes-default-connection"
We are assuming other session defaults there, such as self._location = "US"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yea, I did this way at first place, but then I found remote_function doesn't always have a session object. So I had to put the default away from session. Then we added default session...
We can move it to session.
@@ -162,3 +164,25 @@ def _get_service_account_if_connection_exists( | |||
pass | |||
|
|||
return service_account | |||
|
|||
|
|||
def get_connection_name_full( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] since it takes a connection and returns a connection after resolving the format, would prefer naming it resolve_full_connection_name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will address in another PR.
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕
http://b/303283207