-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[ty] Implement equivalence for protocols with method members #18659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hooray... more protocol-related stack overflows... |
In the corpus test, we're overflowing our stack when pulling types for |
Minimal repro for the overflow: from typing_extensions import Protocol, Self
class _HashObject(Protocol):
def copy(self) -> Self: ...
class Foo: ...
x: Foo | _HashObject |
The stack overflow is because |
I fixed the issue with |
The stack overflow in mypy_primer is on psycopg. I'm attempting to minimize it. |
Self-contained repro for the psycopg overflow: from typing_extensions import Protocol, Self
class PGconn(Protocol):
def connect(self) -> Self: ...
class Connection:
pgconn: PGconn
def is_crdb(conn: PGconn) -> bool:
isinstance(conn, Connection) |
This also repros the overflow: it's the fact that the from typing_extensions import Protocol
class PGconn(Protocol):
def connect[T: PGconn](self: T) -> T: ...
class Connection:
pgconn: PGconn
def f(x: PGconn):
isinstance(x, Connection) |
4094ae6
to
f1b4402
Compare
|
@AlexWaygood I wouldn't spend too much time digging into those psycopg2 stack overflows -- they seem like pretty straightforward cases of recursive protocols, where I'd expect us to stack overflow currently. I think either adding psycopg2 to |
Alrighty -- in that case, this PR is ready for review. It adds 3 projects to |
(I do still wish I understood more specifically what difference the |
f1b4402
to
da8a85b
Compare
da8a85b
to
f7d7bcd
Compare
f7d7bcd
to
85cbc33
Compare
#18659 (comment) and #18659 (comment) no longer cause stack overflows now this branch has been rebased on top of #19003... but there are still stack overflows on kopf/jinja that are causing the mypy_primer run to fail... |
CodSpeed WallTime Performance ReportMerging #18659 will not alter performanceComparing Summary
|
Here's a minimized repro for the overflow on jinja on this branch: from typing import cast, Protocol
class Iterator[T](Protocol):
def __iter__(self) -> Iterator[T]: ...
def f(value: Iterator):
cast(Iterator, value) It only occurs if the |
Okay, the overflow goes away if I apply this change: diff --git a/crates/ty_python_semantic/src/types/function.rs b/crates/ty_python_semantic/src/types/function.rs
index 8149dad8d3..895ef45377 100644
--- a/crates/ty_python_semantic/src/types/function.rs
+++ b/crates/ty_python_semantic/src/types/function.rs
@@ -1148,8 +1148,6 @@ impl KnownFunction {
let contains_unknown_or_todo =
|ty| matches!(ty, Type::Dynamic(dynamic) if dynamic != DynamicType::Any);
if source_type.is_equivalent_to(db, *casted_type)
- && !casted_type.any_over_type(db, &|ty| contains_unknown_or_todo(ty))
- && !source_type.any_over_type(db, &|ty| contains_unknown_or_todo(ty))
{
let builder = context.report_lint(&REDUNDANT_CAST, call_expression)?;
builder.into_diagnostic(format_args!( So it's the This possibly makes the idea of a generalized |
I'll clean this up a bit more |
On second thought, I do think that a recursive fallback of
If we were implementing some kind of more generic recursive type traversal that we might use for |
Couldn't you check to see whether a type is itself recursive by calling I think I prefer #19094 as a solution over what I have here now though:
The codspeed report for #19094 does overall seem to be better than on this PR, though this PR obviously does a few other things aside from the |
82ce2de
to
db7ced5
Compare
This is once again ready for review (and really quite a minimal change at this point!) |
This PR unfortunately appears to cause catastrophic execution time on the following snippet, where we're asked to determine in the final line whether one (very large) protocol is assignable to another (also very large) protocol: from __future__ import annotations
from typing import TypeVar, overload, Protocol
from ty_extensions import is_assignable_to, static_assert
class Foo: ...
class Bar: ...
class Baz: ...
_D = TypeVar("_D", bound="Date")
_GMaybeTZT = TypeVar("_GMaybeTZT", bound=Baz|None, covariant=True)
_GMaybeTZDT = TypeVar("_GMaybeTZDT", bound=Baz|None, covariant=True)
_FuncTZ = TypeVar("_FuncTZ", bound=Baz)
_FuncOptionalTZ = TypeVar("_FuncOptionalTZ", bound=Baz|None)
Self = TypeVar("Self")
class Date(Protocol):
def _subclass_check_hook(cls, instance: object) -> bool: ...
min: Date
max: Date
resolution: Bar
def fromtimestamp(cls, __timestamp: float) -> Date: ...
def today(cls) -> Date: ...
def fromordinal(cls, __n: int) -> Date: ...
def fromisoformat(cls, __date_string: str) -> Date: ...
def fromisocalendar(cls, year: int, week: int, day: int) -> Date: ...
def year(self) -> int: ...
def month(self) -> int: ...
def day(self) -> int: ...
def ctime(self) -> str: ...
def strftime(self, __format: str) -> str: ...
def __format__(self, __fmt: str) -> str: ...
def isoformat(self) -> str: ...
def timetuple(self) -> Foo: ...
def toordinal(self) -> int: ...
def replace(self: Self, year: int = ..., month: int = ..., day: int = ...) -> Self: ...
def __le__(self, __other: Date) -> bool: ...
def __lt__(self, __other: Date) -> bool: ...
def __ge__(self, __other: Date) -> bool: ...
def __gt__(self, __other: Date) -> bool: ...
def __add__(self: Self, __other: Bar) -> Self: ...
def __radd__(self: Self, __other: Bar) -> Self: ...
def __hash__(self) -> int: ...
def weekday(self) -> int: ...
def isoweekday(self) -> int: ...
@overload
def __sub__(self: Self, __other: Bar) -> Self: ...
@overload
def __sub__(self: _D, __other: _D) -> Bar: ...
class Time(Protocol[_GMaybeTZT]):
min: Time[None]
max: Time[None]
resolution: Bar
def hour(self) -> int: ...
def minute(self) -> int: ...
def second(self) -> int: ...
def microsecond(self) -> int: ...
def tzinfo(self) -> _GMaybeTZT: ...
def fold(self) -> int: ...
def __le__(self: Self, __other: Self) -> bool: ...
def __lt__(self: Self, __other: Self) -> bool: ...
def __ge__(self: Self, __other: Self) -> bool: ...
def __gt__(self: Self, __other: Self) -> bool: ...
def __hash__(self) -> int: ...
def isoformat(self, timespec: str = ...) -> str: ...
def strftime(self, __format: str) -> str: ...
def __format__(self, __fmt: str) -> str: ...
def utcoffset(self) -> Bar | None: ...
def tzname(self) -> str | None: ...
def dst(self) -> Bar | None: ...
@overload
def replace(
self: Self,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
fold: int = ...,
) -> Self: ...
@overload
def replace(
self,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
tzinfo: _FuncTZ,
fold: int = ...,
) -> Time[_FuncTZ]: ...
@overload
def replace(
self,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
tzinfo: None,
fold: int = ...,
) -> Time[None]: ...
@overload
def replace(
self,
hour: int,
minute: int,
second: int,
microsecond: int,
tzinfo: None,
*,
fold: int,
) -> Time[None]: ...
@overload
def replace(
self,
hour: int,
minute: int,
second: int,
microsecond: int,
tzinfo: _FuncTZ,
*,
fold: int,
) -> Time[_FuncTZ]: ...
def fromisoformat(cls, __time_string: str) -> Time[Baz | None]: ...
class NaiveTime(Time[None], Protocol): ...
DTSelf = TypeVar("DTSelf", bound="DateTime")
class DateTime(Protocol[_GMaybeTZDT]):
resolution: Bar
def hour(self) -> int: ...
def minute(self) -> int: ...
def second(self) -> int: ...
def microsecond(self) -> int: ...
def tzinfo(self) -> _GMaybeTZDT: ...
def fold(self) -> int: ...
def timestamp(self) -> float: ...
def utctimetuple(self) -> Foo: ...
def date(self) -> Date: ...
def time(self) -> NaiveTime: ...
@overload
def replace(
self: Self,
year: int = ...,
month: int = ...,
day: int = ...,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
tzinfo: _FuncTZ,
fold: int = ...,
) -> DateTime[_FuncTZ]: ...
@overload
def replace(
self,
year: int,
month: int,
day: int,
hour: int,
minute: int,
second: int,
microsecond: int,
tzinfo: _FuncTZ,
*,
fold: int,
) -> DateTime[_FuncTZ]: ...
@overload
def replace(
self,
year: int = ...,
month: int = ...,
day: int = ...,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
tzinfo: None,
fold: int = ...,
) -> NaiveDateTime: ...
@overload
def replace(
self: Self,
year: int,
month: int,
day: int,
hour: int,
minute: int,
second: int,
microsecond: int,
tzinfo: None,
*,
fold: int,
) -> NaiveDateTime: ...
@overload
def replace(
self: Self,
year: int = ...,
month: int = ...,
day: int = ...,
hour: int = ...,
minute: int = ...,
second: int = ...,
microsecond: int = ...,
*,
fold: int = ...,
) -> Self: ...
def fromtimestamp(
cls: type[Self], __timestamp: float, tz: _FuncOptionalTZ
) -> DateTime[_FuncOptionalTZ]: ...
@overload
def now(cls, tz: _FuncOptionalTZ) -> DateTime[_FuncOptionalTZ]: ...
@overload
def now(cls) -> DateTime[None]: ...
def now(cls, tz: Baz | None = None) -> DateTime[Baz | None]: ...
class NaiveDateTime(DateTime[None], Protocol): ...
static_assert(is_assignable_to(NaiveDateTime, DateTime[Baz | None])) (The snippet above uses simplified versions of some protocols found in the DateType library. If you comment out the final line, where we're asked to determine whether one protocol is a subtype of the other, we check the code almost instantly on this branch.) Since I'll look into adding something similar to the above snippet as a regression benchmark. |
db7ced5
to
400dd74
Compare
## Summary The [`DateType`](https://github.com/glyph/DateType) library has some very large protocols in it. Currently we type-check it quite quickly, but the current version of #18659 makes our execution time on this library pathologically slow. That PR doesn't seem to have a big impact on any of our current benchmarks, however, so it seems we have some missing coverage in this area; I therefore propose that we add `DateType` as a benchmark. Currently the benchmark runs pretty quickly (about half the runtime of attrs, which is our fastest real-world benchmark currently), and the library has 0 third-party dependencies, so the benchmark is quick to setup. ## Test Plan `cargo bench -p ruff_benchmark --bench=ty`
400dd74
to
58f1b19
Compare
CodSpeed Instrumentation Performance ReportMerging #18659 will degrade performances by 20.45%Comparing Summary
Benchmarks breakdown
|
Which is nonetheless much better than the previous commit on this PR branch, on which this benchmark timed out 😆 I'm looking to see if there are further optimisations we could apply that are reasonable and effective |
None of my other ideas made any significant difference to the runtime of that benchmark. I think ultimately we're just doing a lot more (necessary!) work on this branch. So I think this is once again ready for review. Things I tried:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I observed that the reason we never finished when trying to type-check DateType
on earlier versions of this PR was that we were repeatedly calling Type::normalized_impl
on the same types over and over again. Caching the results within any one call to Type::normalized()
fixes this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I'm not 100% sure I understand why this new cache
in CycleDetector
is necessary in addition to seen
. And if it is necessary, if seen
and cache
couldn't be merged into a single HashMap
?
If the type we're trying to normalize is present in But if the type we're trying to normalize is present in I'll add some doc-comments. |
5a9645a
to
f81ba79
Compare
…c_tokens * 'main' of https://github.com/astral-sh/ruff: (27 commits) [ty] First cut at semantic token provider (astral-sh#19108) [`flake8-simplify`] Make example error out-of-the-box (`SIM116`) (astral-sh#19111) [`flake8-use-pathlib`] Make example error out-of-the-box (`PTH210`) (astral-sh#19189) [`flake8-use-pathlib`] Add autofixes for `PTH203`, `PTH204`, `PTH205` (astral-sh#18922) [`flake8-type-checking`] Fix syntax error introduced by fix (`TC008`) (astral-sh#19150) [`flake8-pyi`] Make example error out-of-the-box (`PYI007`, `PYI008`) (astral-sh#19103) Update Rust crate indicatif to 0.18.0 (astral-sh#19165) [ty] Add separate CI job for memory usage stats (astral-sh#19134) [ty] Add documentation for server traits (astral-sh#19137) Rename to `SessionSnapshot`, move unwind assertion closer (astral-sh#19177) [`flake8-type-checking`] Make example error out-of-the-box (`TC001`) (astral-sh#19151) [ty] Bare `ClassVar` annotations (astral-sh#15768) [ty] Re-enable multithreaded pydantic benchmark (astral-sh#19176) [ty] Implement equivalence for protocols with method members (astral-sh#18659) [ty] Use RHS inferred type for bare `Final` symbols (astral-sh#19142) [ty] Support declaration-only attributes (astral-sh#19048) [ty] Sync vendored typeshed stubs (astral-sh#19174) Update dependency pyodide to ^0.28.0 (astral-sh#19164) Update NPM Development dependencies (astral-sh#19170) Update taiki-e/install-action action to v2.56.7 (astral-sh#19169) ...
Summary
This PR implements the following pieces of
Protocol
semantics:Foo.x
has no return type; we previously incorrectly considered that it was:P1
andP2
, both with method membersx
, should be considered equivalent if the signature ofP1.x
is equivalent to the signature ofP2.x
. Currently we do not recognize this.Implementing these semantics requires distinguishing between method members and non-method members. The stored type of a method member must be eagerly upcast to a
Callable
type when collecting the protocol's interface: doing otherwise would mean that it would be hard to implement equivalence of protocols even in the face of differently ordered unions, since the two equivalent protocols would have different Salsa IDs even when normalized.The semantics implemented by this PR are that we consider something a method member if:
__get__
method, meaning it can be used as a method when accessed on instances.Note that the spec has complicated things to say about classmethod members and staticmethod members. These semantics are not implemented by this PR; they are all deferred for now.
The infrastructure added in this PR fixes bugs in its own right, but also lays the groundwork for implementing subtyping and assignability rules for method members of protocols. A (currently failing) test is added to verify this.
Test Plan
mdtests