🚀 Performance improvements
- Switch eligible casts to non-strict in optimizer (#22850)
- Allow predicate passing set_sorted (#22797)
- Increase default cross-file parallelism limit for new-streaming multiscan (#22700)
- Add elementwise execution mode for `list.eval` (#22715)
- Support optimised init from non-dict `Mapping` objects in `from_records` and frame/series constructors (#22638)
- Add streaming cross-join node (#22581)
- Switch off `maintain_order` in group-by followed by sort (#22492)
✨ Enhancements
- Load AWS `endpoint_url` using boto3 (#22851)
- Implemented `list.filter` (#22749)
- Support binaryoffset in search sorted (#22786)
- Add `nulls_equal` flag to `list/arr.contains` (#22773)
- Implement `LazyFrame.match_to_schema` (#22726)
- Improved time-string parsing and inference (generally, and via the SQL interface) (#22606)
- Allow for `.over` to be called without `partition_by` (#22712)
- Support `AnyValue` translation from `PyMapping` values (#22722)
- Support optimised init from non-dict `Mapping` objects in `from_records` and frame/series constructors (#22638)
- Support inference of `Int128` dtype from databases that support it (#22682)
- Add options to write Parquet field metadata (#22652)
- Add `cast_options` parameter to control type casting in `scan_parquet` (#22617)
- Allow casting `List<UInt8>` to `Binary` (#22611)
- Allow setting of regex size limit using `POLARS_REGEX_SIZE_LIMIT` (#22651)
- Support use of literal values as "other" when evaluating `Series.zip_with` (#22632)
- Allow to read and write custom file-level parquet metadata (#21806)
- Support PEP702 `@deprecated` decorator behaviour (#22594)
- Support grouping by `pl.Array` (#22575)
- Preserve exception type and traceback for errors raised from Python (#22561)
- Use fixed-width font in streaming phys plan graph (#22540)
🐞 Bug fixes
- Fix RuntimeError when serializing the same DataFrame from multiple threads (#22844)
- Fix map_elements predicate pushdown (#22833)
- Fix reverse list type (#22832)
- Don't require numpy for search_sorted (#22817)
- Add type equality checking for relevant methods (#22802)
- Invalid output for `fill_null` after `when.then` on structs (#22798)
- Don't panic for cross join with misaligned chunking (#22799)
- Panic on quantile over nulls in rolling window (#22792)
- Respect BinaryOffset metadata (#22785)
- Correct the output order of `PartitionByKey` and `PartitionParted` (#22778)
- Fallback to non-strict casting for deprecated casts (#22760)
- Clippy on new stable version (#22771)
- Handle sliced out remainder for bitmaps (#22759)
- Don't merge `Enum` categories on append (#22765)
- Fix unnest() not working on empty struct columns (#22391)
- Fix the default value type in `Schema` init (#22589)
- Correct name in `unnest` error message (#22740)
- Provide "schema" to `DataFrame`, even if empty JSON (#22739)
- Properly account for nulls in the `is_not_nan` check made in `drop_nans` (#22707)
- Incorrect result from SQL `count(*)` with `partition by` (#22728)
- Fix deadlock joining scanned tables with low thread count (#22672)
- Don't allow deserializing incompatible DSL (#22644)
- Incorrect null dtype from binary ops in empty group_by (#22721)
- Don't mark `str.replace_many` with Mapping as deprecated (#22697)
- Gzip has maximum compression of 9, not 10 (#22685)
- Fix predicate pushdown of fallible expressions (#22669)
- Fix `index out of bounds` panic when scanning hugging face (#22661)
- Panic on `group_by` with literal and empty rows (#22621)
- Return input instead of panicking if empty subset in `drop_nulls()` and `drop_nans()` (#22469)
- Bump argminmax to 0.6.3 (#22649)
- DSL version deserialization endianness (#22642)
- Allow Expr.round() to be called on integer dtypes (#22622)
- Fix panic when filtering based on row index column in parquet (#22616)
- WASM and Pyodide compile (#22613)
- Resolve `get()` SchemaMismatch panic (#22350)
- Panic in group_by_dynamic on single-row df with group_by (#22597)
- Add `new_streaming` feature to `polars` crate (#22601)
- Consistently use Unix epoch as origin for `dt.truncate` (except weekly buckets which start on Mondays) (#22592)
- Fix interpolate on dtype Decimal (#22541)
- CSV count rows skipped last line if file did not end with newline (#22577)
- Make nested strict casting actually strict (#22497)
- Make `replace` and `replace_strict` mapping use list literals (#22566)
- Allow pivot on `Time` column (#22550)
- Fix error when providing CSV schema with extra columns (#22544)
- Panic on bitwise op between Series and Expr (#22527)
- Multi-selector regex expansion (#22542)
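
To illustrate what "Unix epoch as origin" means for the `dt.truncate` entry (#22592), here is a minimal sketch in plain Python. It is not Polars' implementation; the helper `truncate` and the constants `EPOCH` and `MONDAY_ORIGIN` are hypothetical names introduced for this example, and it assumes fixed-length buckets expressed as a `timedelta`.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch: 1970-01-01 00:00 UTC, which fell on a Thursday.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption for this sketch: weekly buckets are anchored at the
# Monday on or before the epoch (1969-12-29) so weeks start on Mondays.
MONDAY_ORIGIN = EPOCH - timedelta(days=EPOCH.weekday())


def truncate(ts: datetime, every: timedelta, weekly: bool = False) -> datetime:
    """Floor `ts` to the start of its bucket, counting buckets from the origin."""
    origin = MONDAY_ORIGIN if weekly else EPOCH
    # timedelta // timedelta is floor division and yields an int.
    n_buckets = (ts - origin) // every
    return origin + n_buckets * every


ts = datetime(2024, 1, 10, 13, 47, tzinfo=timezone.utc)  # a Wednesday
three_day_start = truncate(ts, timedelta(days=3))
week_start = truncate(ts, timedelta(weeks=1), weekly=True)
print(three_day_start.date(), week_start.date())
```

With a fixed origin, the bucket a timestamp falls into depends only on the timestamp and the bucket width, not on the first value in the column.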
📖 Documentation
- Add pre-release policy (#22808)
- Fix broken link to service account page in Polars Cloud docs (#22762)
- Add `match_to_schema` to API reference (#22777)
- Provide additional explanation and examples for the `value_counts` "normalize" parameter (#22756)
- Rework documentation for `drop`/`fill` for nulls/nans (#22657)
- Add documentation to new `RoundMode` parameter in `round` (#22555)
- Add missing `repeat_by` to API reference, fixup `list.get` (#22698)
- Fix non-rendering bullet points in `scan_iceberg` (#22694)
- Improve `insert_column` docstring (description and examples) (#22551)
- Improve `join` documentation (#22556)
📦 Build system
- Fix building `polars-lazy` with certain features (#22846)
- Add missing features (#22839)
- Patch pyo3 to disable recompilation (#22796)
🛠️ Other improvements
- Update Rust Polars versions (#22854)
- Add basic smoke test for free-threaded python (#22481)
- Update Polars Rust versions (#22834)
- Fix `nix build` (#22809)
- Fix flake.nix to work on macos (#22803)
- Unused variables on release build (#22800)
- Update cloud docs (#22624)
- Fix unstable `list.eval` performance test (#22729)
- Add proptest implementations for all Array types (#22711)
- Dispatch `.write_*` to `.lazy().sink_*(engine='in-memory')` (#22582)
- Move all optimization flags to `QueryOptFlags` (#22680)
- Add test for `str.replace_many` (#22615)
- Stabilize `sink_*` (#22643)
- Add proptest for row-encode (#22626)
- Update rust version in nix flake (#22627)
- Add a nix flake with a devShell and package (#22246)
- Use a wrapper struct to store time zone (#22523)
- Add `proptest` testing for parquet decoding kernels (#22608)
- Include equiprobable as valid quantile method (#22571)
- Remove confusing error context calling `.collect(_eager=True)` (#22602)
- Fix test_truncate_path test case (#22598)
- Unify function flags into 1 bitset (#22573)
- Display the operation behind `in-memory-map` (#22552)
Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JakubValtar, @Julian-J-S, @LucioFranco, @MarcoGorelli, @WH-2099, @alexander-beedie, @borchero, @bschoenmaeckers, @cmdlineluser, @coastalwhite, @etiennebacher, @florian-klein, @itamarst, @kdn36, @mcrumiller, @nameexhaustion, @nikaltipar, @orlp, @pavelzw, @r-brink, @ritchie46, @stijnherfst, @teotwaki, @timkpaine and @wence-