Skip to content

[red-knot] Reachability analysis #17199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1502,13 +1502,14 @@ if True:
from module import symbol
```

## Unsupported features
## Unreachable code

We do not support full unreachable code analysis yet. We also raise diagnostics from
statically-known to be false branches:
A closely related feature is the ability to detect unreachable code. For example, we do not emit a
diagnostic here:

```py
if False:
# error: [unresolved-reference]
x
```

See [unreachable.md](unreachable.md) for more tests on this topic.
242 changes: 218 additions & 24 deletions crates/red_knot_python_semantic/resources/mdtest/unreachable.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,16 @@
# Unreachable code

This document describes our approach to handling unreachable code. There are two aspects to this.
One is to detect and mark blocks of code that are unreachable. This is useful for notifying the
user, as it can often be indicative of an error. The second aspect of this is to make sure that we
do not emit (incorrect) diagnostics in unreachable code.

## Detecting unreachable code

In this section, we look at various scenarios how sections of code can become unreachable. We should
eventually introduce a new diagnostic that would detect unreachable code.
eventually introduce a new diagnostic that would detect unreachable code. In an editor/LSP context,
there are ways to 'gray out' sections of code, which is helpful for blocks of code that are not
'dead' code, but inactive under certain conditions, like platform-specific code.

### Terminal statements

Expand Down Expand Up @@ -85,21 +92,21 @@ def f():
print("unreachable")
```

## Python version and platform checks
### Python version and platform checks

It is common to have code that is specific to a certain Python version or platform. This case is
special because whether or not the code is reachable depends on externally configured constants. And
if we are checking for a set of parameters that makes one of these branches unreachable, that is
likely not something that the user wants to be warned about, because there are probably other sets
of parameters that make the branch reachable.

### `sys.version_info` branches
#### `sys.version_info` branches

Consider the following example. If we check with a Python version lower than 3.11, the import
statement is unreachable. If we check with a Python version equal to or greater than 3.11, the
import statement is definitely reachable. We should not emit any diagnostics in either case.

#### Checking with Python version 3.10
##### Checking with Python version 3.10

```toml
[environment]
Expand All @@ -115,7 +122,7 @@ if sys.version_info >= (3, 11):
from typing import Self
```

#### Checking with Python version 3.12
##### Checking with Python version 3.12

```toml
[environment]
Expand All @@ -129,12 +136,12 @@ if sys.version_info >= (3, 11):
from typing import Self
```

### `sys.platform` branches
#### `sys.platform` branches

The problem is even more pronounced with `sys.platform` branches, since we don't necessarily have
the platform information available.

#### Checking with platform `win32`
##### Checking with platform `win32`

```toml
[environment]
Expand All @@ -148,7 +155,7 @@ if sys.platform == "win32":
sys.getwindowsversion()
```

#### Checking with platform `linux`
##### Checking with platform `linux`

```toml
[environment]
Expand All @@ -164,13 +171,21 @@ if sys.platform == "win32":
sys.getwindowsversion()
```

#### Checking without a specified platform
##### Checking with platform set to `all`

```toml
[environment]
# python-platform not specified
python-platform = "all"
```

If `python-platform` is set to `all`, we treat the platform as unspecified. This means that we do
not infer a literal type like `Literal["win32"]` for `sys.platform`, but instead fall back to
`LiteralString` (the `typeshed` annotation for `sys.platform`). This means that we can not
statically determine the truthiness of a branch like `sys.platform == "win32"`.

See <https://github.com/astral-sh/ruff/issues/16983#issuecomment-2777146188> for a plan on how this
could be improved.

```py
import sys

Expand All @@ -180,11 +195,13 @@ if sys.platform == "win32":
sys.getwindowsversion()
```

#### Checking with platform set to `all`
##### Checking without a specified platform

If `python-platform` is not specified, we currently default to `all`:

```toml
[environment]
python-platform = "all"
# python-platform not specified
```

```py
Expand All @@ -196,9 +213,29 @@ if sys.platform == "win32":
sys.getwindowsversion()
```

## No false positive diagnostics in unreachable code
## No (incorrect) diagnostics in unreachable code

```toml
[environment]
python-version = "3.10"
```

In this section, we demonstrate that we do not emit (incorrect) diagnostics in unreachable sections
of code.

It could be argued that no diagnostics at all should be emitted in unreachable code. The reasoning
is that any issues inside the unreachable section would not cause problems at runtime. And type
checking the unreachable code under the assumption that it *is* reachable might lead to false
positives (see the "Global constants" example below).

In this section, we make sure that we do not emit false positive diagnostics in unreachable code.
On the other hand, it could be argued that code like `1 + "a"` is incorrect, no matter if it is
reachable or not. Some developers like to use things like early `return` statements while debugging,
and for this use case, it is helpful to still see some diagnostics in unreachable sections.

We currently follow the second approach, but we do not attempt to provide the full set of
diagnostics in unreachable sections. In fact, we silence a certain category of diagnostics
(`unresolved-reference`, `unresolved-attribute`, …), in order to avoid *incorrect* diagnostics. In
the future, we may revisit this decision.

### Use of variables in unreachable code

Expand All @@ -218,26 +255,24 @@ def f():

In the example below, since we use `x` in the `inner` function, we use the "public" type of `x`,
which currently refers to the end-of-scope type of `x`. Since the end of the `outer` scope is
unreachable, we treat `x` as if it was not defined. This behavior can certainly be improved.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing changed in this test, but this explanation was outdated. And the return x # … below as confusing.

unreachable, we need to make sure that we do not emit an `unresolved-reference` diagnostic:

```py
def outer():
x = 1

def inner():
return x # Name `x` used when not defined
reveal_type(x) # revealed: Unknown
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that we reveal Unknown here is tracked in https://github.com/astral-sh/ruff/issues/15777

while True:
pass
```

## No diagnostics in unreachable code

In general, no diagnostics should be emitted in unreachable code. The reasoning is that any issues
inside the unreachable section would not cause problems at runtime. And type checking the
unreachable code under the assumption that it *is* reachable might lead to false positives:
### Global constants

```py
FEATURE_X_ACTIVATED = False
from typing import Literal

FEATURE_X_ACTIVATED: Literal[False] = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support Final yet, so I needed to declare this as Literal[False]. Otherwise, the public type (which we use below) would be Unknown | Literal[False].


if FEATURE_X_ACTIVATED:
def feature_x():
Expand All @@ -248,7 +283,166 @@ def f():
# Type checking this particular section as if it were reachable would
# lead to a false positive, so we should not emit diagnostics here.

# TODO: no error should be emitted here
# error: [unresolved-reference]
feature_x()
```

### Exhaustive check of syntactic constructs

We include some more examples here to make sure that silencing of diagnostics works for
syntactically different cases. To test this, we use `ExceptionGroup`, which is only available in
Python 3.11 and later. We have set the Python version to 3.10 for this whole section, to have
`match` statements available, but not `ExceptionGroup`.

To start, we make sure that we do not emit a diagnostic in this simple case:

```py
import sys

if sys.version_info >= (3, 11):
ExceptionGroup # no error here
```

Similarly, if we negate the logic, we also emit no error:

```py
if sys.version_info < (3, 11):
pass
else:
ExceptionGroup # no error here
```

This also works for more complex `if`-`elif`-`else` chains:

```py
if sys.version_info >= (3, 13):
ExceptionGroup # no error here
elif sys.version_info >= (3, 12):
ExceptionGroup # no error here
elif sys.version_info >= (3, 11):
ExceptionGroup # no error here
elif sys.version_info >= (3, 10):
pass
else:
pass
```

The same works for ternary expressions:

```py
class ExceptionGroupPolyfill: ...

MyExceptionGroup1 = ExceptionGroup if sys.version_info >= (3, 11) else ExceptionGroupPolyfill
MyExceptionGroup1 = ExceptionGroupPolyfill if sys.version_info < (3, 11) else ExceptionGroup
```

Due to short-circuiting, this also works for Boolean operators:

```py
sys.version_info >= (3, 11) and ExceptionGroup
sys.version_info < (3, 11) or ExceptionGroup
```

And in `match` statements:

```py
reveal_type(sys.version_info.minor) # revealed: Literal[10]

match sys.version_info.minor:
case 13:
ExceptionGroup
case 12:
ExceptionGroup
case 11:
ExceptionGroup
case _:
pass
```

Terminal statements can also lead to unreachable code:

```py
def f():
if sys.version_info < (3, 11):
raise RuntimeError("this code only works for Python 3.11+")

ExceptionGroup
```

Finally, not that anyone would ever use it, but it also works for `while` loops:

```py
while sys.version_info >= (3, 11):
ExceptionGroup
```

### Silencing errors for actually unknown symbols

We currently also silence diagnostics for symbols that are not actually defined anywhere. It is
conceivable that this could be improved, but is not a priority for now.

```py
if False:
does_not_exist

def f():
return
does_not_exist
```

### Attributes

When attribute expressions appear in unreachable code, we should not emit `unresolved-attribute`
diagnostics:

```py
import sys
import builtins

if sys.version_info >= (3, 11):
# TODO
# error: [unresolved-attribute]
builtins.ExceptionGroup
```

### Imports

When import statements appear in unreachable code, we should not emit `unresolved-import`
diagnostics:

```py
import sys

if sys.version_info >= (3, 11):
# TODO
# error: [unresolved-import]
from builtins import ExceptionGroup

# TODO
# error: [unresolved-import]
import builtins.ExceptionGroup

# See https://docs.python.org/3/whatsnew/3.11.html#new-modules

# TODO
# error: [unresolved-import]
import tomllib

# TODO
# error: [unresolved-import]
import wsgiref.types
```
Comment on lines +392 to +433
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave attributes and imports for a follow up. I want to get some feedback on this first.


### Emit diagnostics for definitely wrong code

Even though the expressions in the snippet below are unreachable, we still emit diagnostics for
them:

```py
if False:
1 + "a" # error: [unsupported-operator]

def f():
return

1 / 0 # error: [division-by-zero]
```
Loading
Loading