Commit 7ae5d31

[red-knot] Reachability analysis
1 parent 5cee346 commit 7ae5d31

5 files changed

+374
-37
lines changed

crates/red_knot_python_semantic/resources/mdtest/statically_known_branches.md

Lines changed: 5 additions & 4 deletions
@@ -1502,13 +1502,14 @@ if True:
15021502
from module import symbol
15031503
```
15041504

1505-
## Unsupported features
1505+
## Unreachable code
15061506

1507-
We do not support full unreachable code analysis yet. We also raise diagnostics from
1508-
statically-known to be false branches:
1507+
A closely related feature is the ability to detect unreachable code. For example, we do not emit a
1508+
diagnostic here:
15091509

15101510
```py
15111511
if False:
1512-
# error: [unresolved-reference]
15131512
x
15141513
```
1514+
1515+
See [unreachable.md](unreachable.md) for more tests on this topic.

crates/red_knot_python_semantic/resources/mdtest/unreachable.md

Lines changed: 217 additions & 23 deletions
@@ -1,9 +1,16 @@
11
# Unreachable code
22

3+
This document describes our approach to handling unreachable code. There are two aspects to this.
4+
One is to detect and mark blocks of code that are unreachable. This is useful for notifying the
5+
user, as it can often be indicative of an error. The second aspect of this is to make sure that we
6+
do not emit (incorrect) diagnostics in unreachable code.
7+
38
## Detecting unreachable code
49

510
In this section, we look at various scenarios in which sections of code can become unreachable. We should
6-
eventually introduce a new diagnostic that would detect unreachable code.
11+
eventually introduce a new diagnostic that would detect unreachable code. In an editor/LSP context,
12+
there are ways to 'gray out' sections of code, which is helpful for blocks of code that are not
13+
'dead' code, but inactive under certain conditions, like platform-specific code.
714

815
### Terminal statements
916

@@ -85,21 +92,21 @@ def f():
8592
print("unreachable")
8693
```
8794

88-
## Python version and platform checks
95+
### Python version and platform checks
8996

9097
It is common to have code that is specific to a certain Python version or platform. This case is
9198
special because whether or not the code is reachable depends on externally configured constants. And
9299
if we are checking for a set of parameters that makes one of these branches unreachable, that is
93100
likely not something that the user wants to be warned about, because there are probably other sets
94101
of parameters that make the branch reachable.
95102

96-
### `sys.version_info` branches
103+
#### `sys.version_info` branches
97104

98105
Consider the following example. If we check with a Python version lower than 3.11, the import
99106
statement is unreachable. If we check with a Python version equal to or greater than 3.11, the
100107
import statement is definitely reachable. We should not emit any diagnostics in either case.
101108

102-
#### Checking with Python version 3.10
109+
##### Checking with Python version 3.10
103110

104111
```toml
105112
[environment]
@@ -115,7 +122,7 @@ if sys.version_info >= (3, 11):
115122
from typing import Self
116123
```
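
To illustrate why the import above is unreachable under this configuration, here is a minimal sketch. It assumes that the configured version is substituted for `sys.version_info` and that the comparison behaves like ordinary tuple ordering. The helper `branch_is_reachable` is a hypothetical name used for illustration only; it is not part of the checker.

```py
def branch_is_reachable(configured: tuple[int, int], required: tuple[int, int]) -> bool:
    """Model `sys.version_info >= required` for a configured (major, minor) version."""
    # Tuples compare element by element, so (3, 10) >= (3, 11) is False.
    return configured >= required


# Hypothetical configurations matching the two cases in this section.
reachable_on_310 = branch_is_reachable((3, 10), (3, 11))
reachable_on_312 = branch_is_reachable((3, 12), (3, 11))
```

Under these assumptions, the branch is unreachable when checking with 3.10 and reachable when checking with 3.12, which is why neither case should produce a diagnostic.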
117124

118-
#### Checking with Python version 3.12
125+
##### Checking with Python version 3.12
119126

120127
```toml
121128
[environment]
@@ -129,12 +136,12 @@ if sys.version_info >= (3, 11):
129136
from typing import Self
130137
```
131138

132-
### `sys.platform` branches
139+
#### `sys.platform` branches
133140

134141
The problem is even more pronounced with `sys.platform` branches, since we don't necessarily have
135142
the platform information available.
136143

137-
#### Checking with platform `win32`
144+
##### Checking with platform `win32`
138145

139146
```toml
140147
[environment]
@@ -148,7 +155,7 @@ if sys.platform == "win32":
148155
sys.getwindowsversion()
149156
```
150157

151-
#### Checking with platform `linux`
158+
##### Checking with platform `linux`
152159

153160
```toml
154161
[environment]
@@ -164,13 +171,21 @@ if sys.platform == "win32":
164171
sys.getwindowsversion()
165172
```
166173

167-
#### Checking without a specified platform
174+
##### Checking with platform set to `all`
168175

169176
```toml
170177
[environment]
171-
# python-platform not specified
178+
python-platform = "all"
172179
```
173180

181+
If `python-platform` is set to `all`, we treat the platform as unspecified. This means that we do
182+
not infer a literal type like `Literal["win32"]` for `sys.platform`, but instead fall back to
183+
`LiteralString` (the `typeshed` annotation for `sys.platform`). As a result, we cannot
184+
statically determine the truthiness of a branch like `sys.platform == "win32"`.
185+
186+
See <https://github.com/astral-sh/ruff/issues/16983#issuecomment-2777146188> for a plan on how this
187+
could be improved.
188+
174189
```py
175190
import sys
176191

@@ -180,11 +195,13 @@ if sys.platform == "win32":
180195
sys.getwindowsversion()
181196
```
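
The difference between a known and an unknown platform can be modeled with a three-valued result. This is a rough sketch under stated assumptions: the `Truthiness` enum and `platform_check` function are hypothetical names chosen for illustration, and `None` stands in for `python-platform = "all"`. It does not claim to mirror the actual implementation.

```py
from enum import Enum
from typing import Optional


class Truthiness(Enum):
    ALWAYS_TRUE = "always-true"
    ALWAYS_FALSE = "always-false"
    AMBIGUOUS = "ambiguous"


def platform_check(configured: Optional[str], tested: str) -> Truthiness:
    """Model the static truthiness of `sys.platform == tested`.

    `configured=None` represents an unspecified platform (assumption).
    """
    if configured is None:
        # Only `LiteralString` is known, so the comparison cannot be decided.
        return Truthiness.AMBIGUOUS
    if configured == tested:
        return Truthiness.ALWAYS_TRUE
    return Truthiness.ALWAYS_FALSE
```

With an ambiguous result, neither branch can be marked unreachable, so both have to be checked as if they could run.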
182197

183-
#### Checking with platform set to `all`
198+
##### Checking without a specified platform
199+
200+
If `python-platform` is not specified, we currently default to `all`:
184201

185202
```toml
186203
[environment]
187-
python-platform = "all"
204+
# python-platform not specified
188205
```
189206

190207
```py
@@ -196,9 +213,29 @@ if sys.platform == "win32":
196213
sys.getwindowsversion()
197214
```
198215

199-
## No false positive diagnostics in unreachable code
216+
## No (incorrect) diagnostics in unreachable code
217+
218+
```toml
219+
[environment]
220+
python-version = "3.10"
221+
```
222+
223+
In this section, we demonstrate that we do not emit (incorrect) diagnostics in unreachable sections
224+
of code.
225+
226+
It could be argued that no diagnostics at all should be emitted in unreachable code. The reasoning
227+
is that any issues inside the unreachable section would not cause problems at runtime. And type
228+
checking the unreachable code under the assumption that it *is* reachable might lead to false
229+
positives (see the "Global constants" example below).
200230

201-
In this section, we make sure that we do not emit false positive diagnostics in unreachable code.
231+
On the other hand, it could be argued that code like `1 + "a"` is incorrect, no matter if it is
232+
reachable or not. Some developers like to use things like early `return` statements while debugging,
233+
and for this use case, it is helpful to still see some diagnostics in unreachable sections.
234+
235+
We currently follow the second approach, but we do not attempt to provide the full set of
236+
diagnostics in unreachable sections. In fact, we silence a certain category of diagnostics
237+
(`unresolved-reference`, `unresolved-attribute`, …), in order to avoid *incorrect* diagnostics. In
238+
the future, we may revisit this decision.
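
The policy described above can be sketched as a simple filter. This is an illustration only: the `Diagnostic` class, the `should_report` function, and the `SILENCED_IN_UNREACHABLE` set are hypothetical, and the set lists just the two rules named in the text, since the remaining members of that category are not spelled out here.

```py
from dataclasses import dataclass

# Assumption: only the two rules named in the text; the rest of the category is elided.
SILENCED_IN_UNREACHABLE = {"unresolved-reference", "unresolved-attribute"}


@dataclass(frozen=True)
class Diagnostic:
    rule: str
    reachable: bool  # whether the offending code is reachable


def should_report(diagnostic: Diagnostic) -> bool:
    """Report everything in reachable code; in unreachable code, drop silenced rules."""
    if diagnostic.reachable:
        return True
    return diagnostic.rule not in SILENCED_IN_UNREACHABLE
```

Under this model, an `unresolved-reference` in an unreachable branch is dropped, while an `unsupported-operator` error such as `1 + "a"` is still reported.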
202239

203240
### Use of variables in unreachable code
204241

@@ -225,19 +262,17 @@ def outer():
225262
x = 1
226263

227264
def inner():
228-
return x # Name `x` used when not defined
265+
reveal_type(x) # revealed: Unknown
229266
while True:
230267
pass
231268
```
232269

233-
## No diagnostics in unreachable code
234-
235-
In general, no diagnostics should be emitted in unreachable code. The reasoning is that any issues
236-
inside the unreachable section would not cause problems at runtime. And type checking the
237-
unreachable code under the assumption that it *is* reachable might lead to false positives:
270+
### Global constants
238271

239272
```py
240-
FEATURE_X_ACTIVATED = False
273+
from typing import Literal
274+
275+
FEATURE_X_ACTIVATED: Literal[False] = False
241276

242277
if FEATURE_X_ACTIVATED:
243278
def feature_x():
@@ -248,7 +283,166 @@ def f():
248283
# Type checking this particular section as if it were reachable would
249284
# lead to a false positive, so we should not emit diagnostics here.
250285

251-
# TODO: no error should be emitted here
252-
# error: [unresolved-reference]
253286
feature_x()
254287
```
288+
289+
### Exhaustive check of syntactic constructs
290+
291+
We include some more examples here to make sure that silencing of diagnostics works for
292+
syntactically different cases. To test this, we use `ExceptionGroup`, which is only available in
293+
Python 3.11 and later. We have set the Python version to 3.10 for this whole section, to have
294+
`match` statements available, but not `ExceptionGroup`.
295+
296+
To start, we make sure that we do not emit a diagnostic in this simple case:
297+
298+
```py
299+
import sys
300+
301+
if sys.version_info >= (3, 11):
302+
ExceptionGroup # no error here
303+
```
304+
305+
Similarly, if we negate the logic, we also emit no error:
306+
307+
```py
308+
if sys.version_info < (3, 11):
309+
pass
310+
else:
311+
ExceptionGroup # no error here
312+
```
313+
314+
This also works for more complex `if`-`elif`-`else` chains:
315+
316+
```py
317+
if sys.version_info >= (3, 13):
318+
ExceptionGroup # no error here
319+
elif sys.version_info >= (3, 12):
320+
ExceptionGroup # no error here
321+
elif sys.version_info >= (3, 11):
322+
ExceptionGroup # no error here
323+
elif sys.version_info >= (3, 10):
324+
pass
325+
else:
326+
pass
327+
```
328+
329+
The same works for ternary expressions:
330+
331+
```py
332+
class ExceptionGroupPolyfill: ...
333+
334+
MyExceptionGroup1 = ExceptionGroup if sys.version_info >= (3, 11) else ExceptionGroupPolyfill
335+
MyExceptionGroup1 = ExceptionGroupPolyfill if sys.version_info < (3, 11) else ExceptionGroup
336+
```
337+
338+
Due to short-circuiting, this also works for Boolean operators:
339+
340+
```py
341+
sys.version_info >= (3, 11) and ExceptionGroup
342+
sys.version_info < (3, 11) or ExceptionGroup
343+
```
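
The runtime behavior that justifies this can be shown with a small, self-contained sketch. The `probe` function is a hypothetical stand-in for the right-hand operand; it raises if it is ever evaluated.

```py
def probe() -> bool:
    # Stand-in for an operand that must not be evaluated.
    raise AssertionError("right-hand operand was evaluated")


# `and` stops at the first falsy operand, `or` at the first truthy one,
# so `probe()` is never called in either expression.
result_and = False and probe()
result_or = True or probe()
```

Because the right-hand operand is never evaluated when the left-hand side already decides the result, it is unreachable in the same sense as the body of an `if` branch whose condition is statically false.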
344+
345+
And in `match` statements:
346+
347+
```py
348+
reveal_type(sys.version_info.minor) # revealed: Literal[10]
349+
350+
match sys.version_info.minor:
351+
case 13:
352+
ExceptionGroup
353+
case 12:
354+
ExceptionGroup
355+
case 11:
356+
ExceptionGroup
357+
case _:
358+
pass
359+
```
360+
361+
Terminal statements can also lead to unreachable code:
362+
363+
```py
364+
def f():
365+
if sys.version_info < (3, 11):
366+
raise RuntimeError("this code only works for Python 3.11+")
367+
368+
ExceptionGroup
369+
```
370+
371+
Finally, not that anyone would ever use it, but it also works for `while` loops:
372+
373+
```py
374+
while sys.version_info >= (3, 11):
375+
ExceptionGroup
376+
```
377+
378+
### Silencing errors for actually unknown symbols
379+
380+
We currently also silence diagnostics for symbols that are not actually defined anywhere. It is
381+
conceivable that this could be improved, but it is not a priority for now.
382+
383+
```py
384+
if False:
385+
does_not_exist
386+
387+
def f():
388+
return
389+
does_not_exist
390+
```
391+
392+
### Attributes
393+
394+
When attribute expressions appear in unreachable code, we should not emit `unresolved-attribute`
395+
diagnostics:
396+
397+
```py
398+
import sys
399+
import builtins
400+
401+
if sys.version_info >= (3, 11):
402+
# TODO
403+
# error: [unresolved-attribute]
404+
builtins.ExceptionGroup
405+
```
406+
407+
### Imports
408+
409+
When import statements appear in unreachable code, we should not emit `unresolved-import`
410+
diagnostics:
411+
412+
```py
413+
import sys
414+
415+
if sys.version_info >= (3, 11):
416+
# TODO
417+
# error: [unresolved-import]
418+
from builtins import ExceptionGroup
419+
420+
# TODO
421+
# error: [unresolved-import]
422+
import builtins.ExceptionGroup
423+
424+
# See https://docs.python.org/3/whatsnew/3.11.html#new-modules
425+
426+
# TODO
427+
# error: [unresolved-import]
428+
import tomllib
429+
430+
# TODO
431+
# error: [unresolved-import]
432+
import wsgiref.types
433+
```
434+
435+
### Emit diagnostics for definitely wrong code
436+
437+
Even though the expressions in the snippet below are unreachable, we still emit diagnostics for
438+
them:
439+
440+
```py
441+
if False:
442+
1 + "a" # error: [unsupported-operator]
443+
444+
def f():
445+
return
446+
447+
1 / 0 # error: [division-by-zero]
448+
```
