Skip to content

[red-knot] Support custom typeshed Markdown tests #15683

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jan 23, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion crates/red_knot/tests/file_watching.rs
Original file line number Diff line number Diff line change
Expand Up @@ -322,7 +322,7 @@ where
.search_paths
.extra_paths
.iter()
.chain(program_settings.search_paths.typeshed.as_ref())
.chain(program_settings.search_paths.custom_typeshed.as_ref())
{
std::fs::create_dir_all(path.as_std_path())
.with_context(|| format!("Failed to create search path `{path}`"))?;
Expand Down
2 changes: 1 addition & 1 deletion crates/red_knot_project/src/metadata/options.rs
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ impl Options {
.map(|path| path.absolute(project_root, system))
.collect(),
src_roots,
typeshed: typeshed.map(|path| path.absolute(project_root, system)),
custom_typeshed: typeshed.map(|path| path.absolute(project_root, system)),
site_packages: python
.map(|venv_path| SitePackages::Derived {
venv_path: venv_path.absolute(project_root, system),
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,70 @@
# Importing builtin module
# Builtins

## Importing builtin module

Builtin symbols can be explicitly imported:

```py
import builtins

x = builtins.chr
reveal_type(x) # revealed: Literal[chr]
reveal_type(builtins.chr) # revealed: Literal[chr]
```

## Implicit use of builtin

Or used implicitly:

```py
reveal_type(chr) # revealed: Literal[chr]
reveal_type(str) # revealed: Literal[str]
```

## Builtin symbol from custom typeshed

If we specify a custom typeshed, we can use the builtin symbol from it, and no longer access the
builtins from the "actual" vendored typeshed:

```toml
[environment]
typeshed = "/typeshed"
```

```pyi path=/typeshed/stdlib/builtins.pyi
class Custom: ...

custom_builtin: Custom
```

```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```

```py
Comment on lines +38 to +41
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if mdtest should just automatically include this in any custom typeshed (if you don't specify your own typing_extensions), so we don't have to specify it in each test?

But I guess there's value in simple, explicit behavior; we can wait and see how painful this is; maybe not too bad if we don't have too many tests using custom typeshed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think ideally we would have at least a few more tests using custom typesheds, so I wouldn't want to make it too painful to write these tests)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, we already discussed this at #15683 (comment) ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that convo but I didn't see any specific proposal there to do something to make this easier. I guess we could do the override-files-instead-of-replace-entirely thing, but I kind of like the idea that we can write tests that more fully control typeshed contents; I think replacing just one file without causing lots of cascading import errors would be hard (though I guess maybe import errors are fine, we won't surface them in typeshed.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing files is probably also more expensive to implement because it requires reading and copying all vendored files to the memory file system (not hugely expensive but not free).

I do like the idea of an extra environment option that e.g. includes whatever's necessary for reveal_type

reveal_type(custom_builtin) # revealed: Custom

# error: [unresolved-reference]
reveal_type(str) # revealed: Unknown
```

## Unknown builtin (later defined)

`foo` has a type of `Unknown` in this example, as it relies on `bar` which has not been defined at
that point:

```toml
[environment]
typeshed = "/typeshed"
```

```pyi path=/typeshed/stdlib/builtins.pyi
foo = bar
bar = 1
```

```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```

```py
reveal_type(foo) # revealed: Unknown
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took me a second to clock that this was actually an integration test for the feature, as well as documentation 😆

Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# Custom typeshed

The `environment.typeshed` configuration option can be used to specify a custom typeshed directory
for Markdown-based tests. Custom typeshed stubs can then be placed in the specified directory using
fenced code blocks with language `pyi`, and will be used instead of the vendored copy of typeshed.

A fenced code block with language `text` can be used to provide a `stdlib/VERSIONS` file in the
custom typeshed root. If no such file is created explicitly, it will be automatically created with
entries enabling all specified `<typeshed-root>/stdlib` files for all supported Python versions.

## Basic example (auto-generated `VERSIONS` file)

First, we specify `/typeshed` as the custom typeshed directory:

```toml
[environment]
typeshed = "/typeshed"
```

We can then place custom stub files in `/typeshed/stdlib`, for example:

```pyi path=/typeshed/stdlib/builtins.pyi
class BuiltinClass: ...

builtin_symbol: BuiltinClass
```

```pyi path=/typeshed/stdlib/sys/__init__.pyi
version = "my custom Python"
```

And finally write a normal Python code block that makes use of the custom stubs:

```py
b: BuiltinClass = builtin_symbol

class OtherClass: ...

o: OtherClass = builtin_symbol # error: [invalid-assignment]

# Make sure that 'sys' has a proper entry in the auto-generated 'VERSIONS' file
import sys
```

## Custom `VERSIONS` file

If we want to specify a custom `VERSIONS` file, we can do so by creating a fenced code block with
language `text`. In the following test, we set the Python version to `3.10` and then make sure that
we can *not* import `new_module` with a version requirement of `3.11-`:

```toml
[environment]
python-version = "3.10"
typeshed = "/typeshed"
```

```pyi path=/typeshed/stdlib/old_module.pyi
class OldClass: ...
```

```pyi path=/typeshed/stdlib/new_module.pyi
class NewClass: ...
```

```text path=/typeshed/stdlib/VERSIONS
old_module: 3.0-
new_module: 3.11-
```

```py
from old_module import OldClass

# error: [unresolved-import] "Cannot resolve import `new_module`"
from new_module import NewClass
```

## Using `reveal_type` with a custom typeshed

When providing a custom typeshed directory, basic things like `reveal_type` will stop working
because we rely on being able to import it from `typing_extensions`. The actual definition of
`reveal_type` in typeshed is slightly involved (depends on generics, `TypeVar`, etc.), but a very
simple untyped definition is enough to make `reveal_type` work in tests:
Comment on lines +79 to +82
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, that's annoying 😆 but fine for now, I think, since I imagine making reveal_type work even if it's not defined in a custom typeshed would require much more invasive special casing for the function.

Mypy's approach might be interesting to you though -- it by default uses a custom typeshed for its tests, which you can find in https://github.com/python/mypy/tree/master/test-data/unit/lib-stub. The reason for this is largely to improve the runtime of the tests -- the default custom typeshed is very simple, and mypy's tests would be much slower if they all used the much-more-complex "real" typeshed in every test. Mypy's approach has many downsides: there have often been PRs that appear to fix bugs, and pass all tests, but don't actually fix the bug when mypy runs on real user code, because of the difference between mypy's mock typeshed and the real typeshed mypy uses when checking user code. But it also has upsides apart from performance:

  • Mypy's tests are better isolated from random typeshed changes that cause tests to fail
  • It's very easy to override a single file or two from mypy's custom typeshed in a test case, without having to provide definitions for reveal_type in every test case.

Overall I'm definitely not advocating for mypy's approach; I think the costs outweigh the benefits given how much faster our tests are than mypy's. But it is an interesting approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mypy's tests are better isolated from random typeshed changes that cause tests to fail

To clarify my understanding: This part we're fairly well insulated from, since we're vendoring specific SHAs of typeshed. We'll get test failures when we update our vendored copy, if there have been incompatible changes — but that's a good thing, since those are real things we have to fix, and CI forces us to fix them before we can commit the typeshed update. Do I have all of that right?

It's very easy to override a single file or two from mypy's custom typeshed in a test case, without having to provide definitions for reveal_type in every test case.

Does mypy's ability to override individual files depend on it using a custom typeshed by default? i.e. I could imagine an additional environment.typeshed_override = true mdtest config that lets us override individual files but retain the rest of our vendored real typeshed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mypy's tests are better isolated from random typeshed changes that cause tests to fail

To clarify my understanding: This part we're fairly well insulated from, since we're vendoring specific SHAs of typeshed. We'll get test failures when we update our vendored copy, if there have been incompatible changes — but that's a good thing, since those are real things we have to fix, and CI forces us to fix them before we can commit the typeshed update. Do I have all of that right?

Yes, that's all correct! Our tests are more vulnerable to breaking whenever we do a typeshed sync, and already have broken several times when we've tried to sync typeshed. So mypy's tests are closer to being unit tests, in that they control the input to their tests totally; ours are more like integration tests. But as you say, this makes our tests more representative of how we'll actually type-check our users' code, so it's a good thing!

Does mypy's ability to override individual files depend on it using a custom typeshed by default? i.e. I could imagine an additional environment.typeshed_override = true mdtest config that lets us override individual files but retain the rest of our vendored real typeshed.

That's true, I suppose that could work!


```toml
[environment]
typeshed = "/typeshed"
```

```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```

```py
reveal_type(()) # revealed: tuple[()]
```
11 changes: 3 additions & 8 deletions crates/red_knot_python_semantic/src/db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ pub(crate) mod tests {
/// Target Python platform
python_platform: PythonPlatform,
/// Path to a custom typeshed directory
custom_typeshed: Option<SystemPathBuf>,
typeshed: Option<SystemPathBuf>,
/// Path and content pairs for files that should be present
files: Vec<(&'a str, &'a str)>,
}
Expand All @@ -146,7 +146,7 @@ pub(crate) mod tests {
Self {
python_version: PythonVersion::default(),
python_platform: PythonPlatform::default(),
custom_typeshed: None,
typeshed: None,
files: vec![],
}
}
Expand All @@ -156,11 +156,6 @@ pub(crate) mod tests {
self
}

pub(crate) fn with_custom_typeshed(mut self, path: &str) -> Self {
self.custom_typeshed = Some(SystemPathBuf::from(path));
self
}

pub(crate) fn with_file(mut self, path: &'a str, content: &'a str) -> Self {
self.files.push((path, content));
self
Expand All @@ -176,7 +171,7 @@ pub(crate) mod tests {
.context("Failed to write test files")?;

let mut search_paths = SearchPathSettings::new(vec![src_root]);
search_paths.typeshed = self.custom_typeshed;
search_paths.custom_typeshed = self.typeshed;

Program::from_settings(
&db,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -169,7 +169,7 @@ impl SearchPaths {
let SearchPathSettings {
extra_paths,
src_roots,
typeshed,
custom_typeshed: typeshed,
site_packages: site_packages_paths,
} = settings;

Expand Down Expand Up @@ -1308,7 +1308,7 @@ mod tests {
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![src.clone()],
typeshed: Some(custom_typeshed),
custom_typeshed: Some(custom_typeshed),
site_packages: SitePackages::Known(vec![site_packages]),
},
},
Expand Down Expand Up @@ -1814,7 +1814,7 @@ not_a_directory
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![SystemPathBuf::from("/src")],
typeshed: None,
custom_typeshed: None,
site_packages: SitePackages::Known(vec![
venv_site_packages,
system_site_packages,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ pub(crate) struct UnspecifiedTypeshed;
///
/// For tests checking that standard-library module resolution is working
/// correctly, you should usually create a [`MockedTypeshed`] instance
/// and pass it to the [`TestCaseBuilder::with_custom_typeshed`] method.
/// and pass it to the [`TestCaseBuilder::with_mocked_typeshed`] method.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by fix. Unrelated to this PR.

/// If you need to check something that involves the vendored typeshed stubs
/// we include as part of the binary, you can instead use the
/// [`TestCaseBuilder::with_vendored_typeshed`] method.
Expand Down Expand Up @@ -238,7 +238,7 @@ impl TestCaseBuilder<MockedTypeshed> {
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![src.clone()],
typeshed: Some(typeshed.clone()),
custom_typeshed: Some(typeshed.clone()),
site_packages: SitePackages::Known(vec![site_packages.clone()]),
},
},
Expand Down
4 changes: 2 additions & 2 deletions crates/red_knot_python_semantic/src/program.rs
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ pub struct SearchPathSettings {
/// Optional path to a "custom typeshed" directory on disk for us to use for standard-library types.
/// If this is not provided, we will fallback to our vendored typeshed stubs for the stdlib,
/// bundled as a zip file in the binary
pub typeshed: Option<SystemPathBuf>,
pub custom_typeshed: Option<SystemPathBuf>,

/// The path to the user's `site-packages` directory, where third-party packages from ``PyPI`` are installed.
pub site_packages: SitePackages,
Expand All @@ -119,7 +119,7 @@ impl SearchPathSettings {
Self {
src_roots,
extra_paths: vec![],
typeshed: None,
custom_typeshed: None,
site_packages: SitePackages::Known(vec![]),
}
}
Expand Down
53 changes: 1 addition & 52 deletions crates/red_knot_python_semantic/src/types/infer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6003,14 +6003,13 @@ fn perform_membership_test_comparison<'db>(

#[cfg(test)]
mod tests {
use crate::db::tests::{setup_db, TestDb, TestDbBuilder};
use crate::db::tests::{setup_db, TestDb};
use crate::semantic_index::definition::Definition;
use crate::semantic_index::symbol::FileScopeId;
use crate::semantic_index::{global_scope, semantic_index, symbol_table, use_def_map};
use crate::types::check_types;
use crate::{HasType, SemanticModel};
use ruff_db::files::{system_path_to_file, File};
use ruff_db::parsed::parsed_module;
use ruff_db::system::DbWithTestSystem;
use ruff_db::testing::assert_function_query_was_not_run;

Expand Down Expand Up @@ -6281,56 +6280,6 @@ mod tests {
Ok(())
}

#[test]
fn builtin_symbol_vendored_stdlib() -> anyhow::Result<()> {
let mut db = setup_db();

db.write_file("/src/a.py", "c = chr")?;

assert_public_type(&db, "/src/a.py", "c", "Literal[chr]");

Ok(())
}

#[test]
fn builtin_symbol_custom_stdlib() -> anyhow::Result<()> {
let db = TestDbBuilder::new()
.with_custom_typeshed("/typeshed")
.with_file("/src/a.py", "c = copyright")
.with_file(
"/typeshed/stdlib/builtins.pyi",
"def copyright() -> None: ...",
)
.with_file("/typeshed/stdlib/VERSIONS", "builtins: 3.8-")
.build()?;

assert_public_type(&db, "/src/a.py", "c", "Literal[copyright]");

Ok(())
}

#[test]
fn unknown_builtin_later_defined() -> anyhow::Result<()> {
let db = TestDbBuilder::new()
.with_custom_typeshed("/typeshed")
.with_file("/src/a.py", "x = foo")
.with_file("/typeshed/stdlib/builtins.pyi", "foo = bar; bar = 1")
.with_file("/typeshed/stdlib/VERSIONS", "builtins: 3.8-")
.build()?;

assert_public_type(&db, "/src/a.py", "x", "Unknown");

Ok(())
}

#[test]
fn str_builtin() -> anyhow::Result<()> {
let mut db = setup_db();
db.write_file("/src/a.py", "x = str")?;
assert_public_type(&db, "/src/a.py", "x", "Literal[str]");
Ok(())
}

#[test]
fn deferred_annotation_builtin() -> anyhow::Result<()> {
let mut db = setup_db();
Expand Down
Loading
Loading