PEP 675: Arbitrary Literal Strings #2167
@JelleZijlstra When you land this, make sure the commit title is more descriptive than "Initial commit." :-)
I'll request some changes and then merge it as "subsequent commit". Thanks for sending the PR! I'll review it later today and hopefully merge it.
Hehe, changed the commit title.
pep-0675.rst
Outdated
doesn't change inference for other ``str`` methods such as
``literal_string.upper()``. If this PEP is accepted, we could also
overload the typeshed stubs for ``str`` to preserve literal-ness in a
broader set of scenarios where it makes sense. For example,
Could there be more of a specification of "where it makes sense"? How should typeshed decide when to use Literal[str]?
Good question. We faced a bit of a dilemma here. We wanted to limit the specification, which is why we enumerated the 4 most-frequently used means of composing literal strings. Listing all of the str methods seemed like overkill and we didn't want the PEP to be rejected because of too many changes to typeshed.
But users might ask for more convenient changes in the future - such as my_literal_str.upper(). So, we wanted to leave this open.
I've replaced "where it makes sense" with "where all inputs are literals" and clarified that this is merely for convenience and is not required by the PEP.
What do you suggest?
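To make "where all inputs are literals" concrete, here is a minimal sketch. The thread does not name the four composition forms, so the ones shown (concatenation, `join`, `format`, f-strings) are an assumption, and `run_query` and `TABLE` are hypothetical names. It also uses the `LiteralString` spelling from Python 3.11 rather than the `Literal[str]` spelling of this draft, with a runtime-only fallback to `str` on older versions.

```python
try:
    from typing import LiteralString  # available in Python 3.11+
except ImportError:
    LiteralString = str  # runtime-only fallback; gives no static checking


def run_query(sql: LiteralString) -> str:
    """Hypothetical sink that should only ever receive literal-derived SQL."""
    return sql


TABLE = "users"

# Each value below is built only from string literals, so a checker that
# preserves literal-ness through these operations could accept it.
by_concat = "SELECT * FROM " + TABLE
by_join = ", ".join(["id", "name"])
by_format = "SELECT {} FROM {}".format(by_join, TABLE)
by_fstring = f"SELECT {by_join} FROM {TABLE}"

run_query(by_concat)
run_query(by_format)
run_query(by_fstring)
```

A call such as `run_query(by_concat.upper())` is the convenience case discussed above: whether it type-checks depends on whether the `str` stubs preserve literal-ness for `upper()`.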
I don't have a concrete suggestion, but as a typeshed maintainer I'd have to come up with some rules, so I'd appreciate if the PEP offered more guidance on when exactly a function should use Literal[str] annotations.
Here's my attempt to make it more concrete. I've sorted all the str functions from typeshed into categories:
Overload Literal[str] output if Literal[str] inputs
def __new__(cls, object: Literal[str] = ...) -> Literal[str]: ...
def capitalize(self: Literal[str]) -> Literal[str]: ...
def casefold(self: Literal[str]) -> Literal[str]: ...
def center(self: Literal[str], __width: SupportsIndex, __fillchar: Literal[str] = ...) -> Literal[str]: ...
if sys.version_info >= (3, 8):
    def expandtabs(self: Literal[str], tabsize: SupportsIndex = ...) -> Literal[str]: ...
else:
    def expandtabs(self: Literal[str], tabsize: int = ...) -> Literal[str]: ...
def format(self: Literal[str], *args: Literal[str], **kwargs: Literal[str]) -> Literal[str]: ...
def format_map(self: Literal[str], map: Mapping[str, Literal[str]]) -> Literal[str]: ...
def join(self: Literal[str], __iterable: Iterable[Literal[str]]) -> Literal[str]: ...
def ljust(self: Literal[str], __width: SupportsIndex, __fillchar: Literal[str] = ...) -> Literal[str]: ...
def lower(self: Literal[str]) -> Literal[str]: ...
def lstrip(self: Literal[str], __chars: Literal[str] | None = ...) -> Literal[str]: ...
def partition(self: Literal[str], __sep: Literal[str]) -> tuple[Literal[str], Literal[str], Literal[str]]: ... # Biasing to overly restrictive: '__sep'
def replace(self: Literal[str], __old: Literal[str], __new: Literal[str], __count: SupportsIndex = ...) -> Literal[str]: ... # Biasing to overly restrictive: '__old'
if sys.version_info >= (3, 9):
    def removeprefix(self: Literal[str], __prefix: Literal[str]) -> Literal[str]: ... # Biasing to overly restrictive: '__prefix'
    def removesuffix(self: Literal[str], __suffix: Literal[str]) -> Literal[str]: ... # Biasing to overly restrictive: '__suffix'
def rjust(self: Literal[str], __width: SupportsIndex, __fillchar: Literal[str] = ...) -> Literal[str]: ...
def rpartition(self: Literal[str], __sep: Literal[str]) -> tuple[Literal[str], Literal[str], Literal[str]]: ... # Biasing to overly restrictive: '__sep'
def rsplit(self: Literal[str], sep: Literal[str] | None = ..., maxsplit: SupportsIndex = ...) -> list[Literal[str]]: ... # Biasing to overly restrictive: 'sep'
def rstrip(self: Literal[str], __chars: Literal[str] | None = ...) -> Literal[str]: ... # Biasing to overly restrictive: '__chars'
def split(self: Literal[str], sep: Literal[str] | None = ..., maxsplit: SupportsIndex = ...) -> list[Literal[str]]: ... #
def splitlines(self: Literal[str], keepends: bool = ...) -> list[Literal[str]]: ...
def strip(self: Literal[str], __chars: Literal[str] | None = ...) -> Literal[str]: ... #
def swapcase(self: Literal[str]) -> Literal[str]: ...
def title(self: Literal[str]) -> Literal[str]: ...
def upper(self: Literal[str]) -> Literal[str]: ...
def zfill(self: Literal[str], __width: SupportsIndex) -> Literal[str]: ...
def __add__(self: Literal[str], __s: Literal[str]) -> Literal[str]: ...
def __iter__(self: Literal[str]) -> Iterator[str]: ...
def __mod__(self: Literal[str], __x: Union[Literal[str], Tuple[Literal[str], ...]]) -> str: ...
def __mul__(self: Literal[str], __n: SupportsIndex) -> Literal[str]: ...
def __repr__(self: Literal[str]) -> Literal[str]: ...
def __rmul__(self: Literal[str], n: SupportsIndex) -> Literal[str]: ...
def __str__(self: Literal[str]) -> Literal[str]: ...
def __getnewargs__(self: Literal[str]) -> tuple[Literal[str]]: ...
Explicitly ruling out override
# Can only be made safe if `Literal[int]` were supported for table
def translate(self, __table: Mapping[int, int | str | None] | Sequence[int | str | None]) -> str: ...
# Allows selecting arbitrary strs (constrained by the contents of self) if given an arbitrary index / slice
def __getitem__(self: Literal[str], __i: SupportsIndex | slice) -> str: ...
Override doesn't make sense (i.e. doesn't return str). Note that some of these could make sense for Literal[int], Literal[bytes], etc. if we choose to support those.
def count(self, x: str, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...) -> int: ...
def encode(self, encoding: str = ..., errors: str = ...) -> bytes: ...
def endswith(
    self, __suffix: str | Tuple[str, ...], __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...
) -> bool: ...
def find(self, __sub: str, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...) -> int: ...
def index(self, __sub: str, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...) -> int: ...
def isalnum(self) -> bool: ...
def isalpha(self) -> bool: ...
if sys.version_info >= (3, 7):
    def isascii(self) -> bool: ...
def isdecimal(self) -> bool: ...
def isdigit(self) -> bool: ...
def isidentifier(self) -> bool: ...
def islower(self) -> bool: ...
def isnumeric(self) -> bool: ...
def isprintable(self) -> bool: ...
def isspace(self) -> bool: ...
def istitle(self) -> bool: ...
def isupper(self) -> bool: ...
def rfind(self, __sub: str, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...) -> int: ...
def rindex(self, __sub: str, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...) -> int: ...
def startswith(
    self, __prefix: str | Tuple[str, ...], __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...
) -> bool: ...
@staticmethod
@overload
def maketrans(__x: dict[int, _T] | dict[str, _T] | dict[str | int, _T]) -> dict[int, _T]: ...
@staticmethod
@overload
def maketrans(__x: str, __y: str, __z: str | None = ...) -> dict[int, int | None]: ...
def __contains__(self, __o: str) -> bool: ... # type: ignore[override]
def __eq__(self, __x: object) -> bool: ...
def __ge__(self, __x: str) -> bool: ...
def __gt__(self, __x: str) -> bool: ...
def __hash__(self) -> int: ...
def __le__(self, __x: str) -> bool: ...
def __len__(self) -> int: ...
def __lt__(self, __x: str) -> bool: ...
def __ne__(self, __x: object) -> bool: ...
@pradeep90 perhaps we incorporate the suggested overrides from the first section as an appendix?
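For readers unfamiliar with the pattern in the first category, a rough sketch of "literal in, literal out" as a pair of overloads follows. It uses a free function named `upper` (a hypothetical stand-in, not the typeshed stub itself) and the `LiteralString` spelling from Python 3.11, with a runtime-only fallback to `str` on older versions.

```python
from typing import overload

try:
    from typing import LiteralString  # available in Python 3.11+
except ImportError:
    LiteralString = str  # runtime-only fallback; gives no static checking


@overload
def upper(s: LiteralString) -> LiteralString: ...  # literal input keeps literal-ness
@overload
def upper(s: str) -> str: ...  # any other str yields a plain str
def upper(s: str) -> str:
    # Single runtime implementation shared by both overloads.
    return s.upper()


shouted = upper("select")
```

The order matters in this sketch: the more specific literal overload comes first so that a checker tries it before falling back to the plain `str` signature.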
I don't have a concrete suggestion, but as a typeshed maintainer I'd have to come up with some rules, so I'd appreciate if the PEP offered more guidance on when exactly a function should use Literal[str] annotations.
@gbleaney Thanks for classifying the str methods.
@JelleZijlstra For now, I've removed the paragraph about typeshed and other str methods. We need to discuss some tradeoffs for those scenarios. Basically, adding a Literal[str] overload to a method in str would affect any class that subclasses str - users could see spurious override errors (snippet) and could violate type safety by returning a value of non-literal type. I'll make that a separate PR since we might indeed decide not to change str in typeshed.
Does that unblock this initial PR?
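One reading of the subclassing concern, as a sketch: `Tagged` below is a hypothetical `str` subclass whose `upper()` mixes in runtime data. Against today's stubs the override is fine, but if `str.upper` gained a literal-preserving overload, a checker might flag this override as incompatible, and the returned value is clearly not derived from literals alone.

```python
class Tagged(str):
    """Hypothetical str subclass carrying a tag supplied at runtime."""

    def __new__(cls, value: str, tag: str) -> "Tagged":
        obj = super().__new__(cls, value)
        obj.tag = tag  # arbitrary runtime data, not a literal
        return obj

    def upper(self) -> str:
        # Overrides str.upper with a plain-str return type and appends the
        # runtime tag, so the result cannot be treated as a literal string.
        return f"{super().upper()}[{self.tag}]"


value = Tagged("abc", "x").upper()
```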
Sounds good, let's just land the PEP first. I have some concerns about how this would interact with typeshed but that's probably better discussed elsewhere.
::

    def execute(self, sql: Literal[str], parameters: Iterable[str] = ...) -> Cursor: ...
Where would you make this change? Would typeshed's stubs for sqlite change?
Yes, we would have to update its typeshed stub. Linked to it.
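As an illustration of what such a signature buys, here is a small sketch using the standard-library `sqlite3` module. The free function `execute` is a hypothetical wrapper, not the typeshed stub, and it uses the `LiteralString` spelling from Python 3.11 with a runtime-only fallback to `str`. The query text is a literal, while user-controlled values travel through `parameters`.

```python
import sqlite3
from typing import Iterable

try:
    from typing import LiteralString  # available in Python 3.11+
except ImportError:
    LiteralString = str  # runtime-only fallback; gives no static checking


def execute(
    conn: sqlite3.Connection, sql: LiteralString, parameters: Iterable[str] = ()
) -> sqlite3.Cursor:
    """Hypothetical wrapper: the SQL text must be literal-derived."""
    return conn.execute(sql, tuple(parameters))


conn = sqlite3.connect(":memory:")
execute(conn, "CREATE TABLE users (name TEXT)")

user_supplied = "alice"  # stands in for untrusted input
execute(conn, "INSERT INTO users VALUES (?)", [user_supplied])
rows = execute(conn, "SELECT name FROM users WHERE name = ?", [user_supplied]).fetchall()
```

With this annotation, a call like `execute(conn, "SELECT ... " + user_input)` is the kind of usage a checker could reject when `user_input` is not known to be a literal, while the parameterized form stays valid.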
+ Add the draft PEP. + Turn unicode quotes to ASCII.
@pradeep90 I'd love to see a proof of this, can you provide the reference?
@NotWearingPants we don't have a formalized proof for you, only the intuitive comparison: similarly to how a pathological program can decide whether or not to halt based on what the analyzing program will predict, a pathological program could decide to perform a malicious operation or not based on whether or not the analyzing program predicted it would. Perhaps we should have just invoked Rice's Theorem, or made a less strong statement. The key is just that it's impossible to know for some programs if they are going to do something malicious or not.
cc @gbleaney @JelleZijlstra
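The intuitive comparison above can be sketched in a few lines. This is only an illustration, not a proof: `make_pathological` and the two toy analyzers are hypothetical names, and the analyzers return constants rather than inspecting the program. The point is that the program consults the analyzer's verdict and then does the opposite.

```python
from typing import Callable

# An analyzer takes a program and returns True if it predicts "malicious".
Program = Callable[[], str]
Analyzer = Callable[[Program], bool]


def make_pathological(analyzer: Analyzer) -> Program:
    """Build a program that contradicts whatever the analyzer predicts."""

    def program() -> str:
        if analyzer(program):
            return "benign"  # predicted malicious, so behave
        return "malicious"  # predicted benign, so misbehave

    return program


def always_says_benign(_program: Program) -> bool:
    return False


def always_says_malicious(_program: Program) -> bool:
    return True


fooled_optimist = make_pathological(always_says_benign)()
fooled_pessimist = make_pathological(always_says_malicious)()
```

Whichever verdict an analyzer gives for its own pathological program, the program's behavior contradicts it, which mirrors the diagonal argument behind the halting problem.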