test_interpolate_by.test_interpolate_vs_numpy fails often #22348
Comments
In the CI, |
It looks like the CI uses the PyTorch index as the index-url, and the exact NumPy package appears to be https://download.pytorch.org/whl/numpy-2.1.2-cp312-cp312-macosx_14_0_arm64.whl.metadata (I'm using this failed example). I'm on WSL, so unfortunately I can't reproduce the macOS installation. I am not sure why we use the PyTorch index instead of PyPI for these packages. There is a more recent version of NumPy available (2.2.5), so upgrading may resolve the issue.
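For comparing a local setup against the CI runner, a small helper like the one below can print the platform and the installed package versions side by side. This is only a sketch: the function name `describe_environment` is hypothetical, and it uses nothing but the standard library, so it reports `None` for a package that is not installed.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("numpy", "polars")):
    """Collect platform and package-version details for comparing runs.

    Hypothetical helper for illustration; not part of the Polars test suite.
    """
    info = {
        "python": sys.version.split()[0],
        "system": platform.system(),    # e.g. "Darwin" or "Linux"
        "machine": platform.machine(),  # e.g. "arm64" or "x86_64"
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # package is not installed in this environment
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this on both machines and diffing the output shows whether the NumPy version or the architecture differs between the failing and passing environments.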
Ok, minor update: it looks like when |
FWIW, I can reproduce the same error values as CI on my Apple Silicon Mac,
Edit: I get the same values after upgrading to |
@anath2 are these enough sig figs to reproduce? Here is the test with the inputs provided from the failed case I linked to:

import numpy as np
import polars as pl
dataframe = (
pl.DataFrame({
"ts": pl.Series([0.0, 0.0, 0.0, -1.1755e-38, -3.0], dtype=pl.Float64),
"value": pl.Series([0.0, None, None, None, 2.9753327e8], dtype=pl.Float64),
})
.fill_nan(None)
.unique("ts")
.sort("ts")
)
result = dataframe.select(pl.col("value").interpolate_by("ts"))["value"]
mask = dataframe["value"].is_not_null()
np_dtype = "float64"
x = dataframe["ts"].to_numpy().astype(np_dtype)
xp = dataframe["ts"].filter(mask).to_numpy().astype(np_dtype)
yp = dataframe["value"].filter(mask).to_numpy().astype("float64")
interp = np.interp(x, xp, yp)
# Polars preserves nulls on boundaries, but NumPy doesn't.
first_non_null = dataframe["value"].is_not_null().arg_max()
last_non_null = len(dataframe) - dataframe["value"][::-1].is_not_null().arg_max() # type: ignore[operator]
interp[:first_non_null] = float("nan")
interp[last_non_null:] = float("nan")
expected = dataframe.with_columns(value=pl.Series(interp, nan_to_null=True))["value"]
print(f"result: {result}")
print(f"expected: {expected}")
@mcrumiller Yep I can reproduce:
Produces the exact error
The assertion fails at |
@anath2 what's the result of |
Yes that's correct. |
Checks
Reproducible example
Log output
Issue description
The test test_interpolate_vs_numpy at tests/unit/operations/test_interpolate_by.py fails sometimes, likely due to a floating-point precision issue. Maybe a more lenient tolerance is needed for the comparison against np.interp for floating-point values.
Expected behavior
Test result should be deterministic.
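To illustrate why a tolerance matters here, the sketch below evaluates three algebraically equivalent linear-interpolation formulas on the inputs from the failed case in the comments. It is not a diagnosis of the CI failure, and it does not claim to match what Polars or np.interp compute internally. The function names are hypothetical, and it assumes that the row kept for ts = 0.0 after deduplication is the one with value 0.0.

```python
import math


def interp_from_left(x, x0, y0, x1, y1):
    """Anchor at the left point: y0 + (x - x0) * slope."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + (x - x0) * slope


def interp_from_right(x, x0, y0, x1, y1):
    """Anchor at the right point: y1 + (x - x1) * slope."""
    slope = (y1 - y0) / (x1 - x0)
    return y1 + (x - x1) * slope


def interp_lerp(x, x0, y0, x1, y1):
    """Weighted average of the endpoints: y0 * (1 - t) + y1 * t."""
    t = (x - x0) / (x1 - x0)
    return y0 * (1.0 - t) + y1 * t


# Known points and query taken from the failed case (assumption: value 0.0 at ts 0.0).
x0, y0 = -3.0, 2.9753327e8
x1, y1 = 0.0, 0.0
x = -1.1755e-38

left = interp_from_left(x, x0, y0, x1, y1)
right = interp_from_right(x, x0, y0, x1, y1)
lerp = interp_lerp(x, x0, y0, x1, y1)

print(f"left-anchored:  {left!r}")
print(f"right-anchored: {right!r}")
print(f"lerp:           {lerp!r}")

# A purely relative comparison rejects the pair, because one result is exactly
# zero and the other is a tiny nonzero number.
print(math.isclose(lerp, right, rel_tol=1e-9))                # False
# Adding an absolute tolerance accepts it.
print(math.isclose(lerp, right, rel_tol=1e-9, abs_tol=1e-6))  # True
```

The lerp form returns exactly 0.0 because x - x0 rounds to 3.0, while the right-anchored form returns a tiny positive value near 1.2e-30. Both are reasonable answers next to y values around 3e8, so a comparison that includes an absolute tolerance is the more forgiving choice. np.testing.assert_allclose exposes this as separate rtol and atol parameters.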
Installed versions