Python Polars 1.30.0

@github-actions github-actions released this 21 May 13:33
ee0903b

🚀 Performance improvements

  • Switch eligible casts to non-strict in optimizer (#22850)
  • Allow predicate passing set_sorted (#22797)
  • Increase default cross-file parallelism limit for new-streaming multiscan (#22700)
  • Add elementwise execution mode for list.eval (#22715)
  • Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
  • Add streaming cross-join node (#22581)
  • Switch off maintain_order in group-by followed by sort (#22492)

✨ Enhancements

  • Load AWS endpoint_url using boto3 (#22851)
  • Implemented list.filter (#22749)
  • Support binaryoffset in search sorted (#22786)
  • Add nulls_equal flag to list/arr.contains (#22773)
  • Implement LazyFrame.match_to_schema (#22726)
  • Improved time-string parsing and inference (generally, and via the SQL interface) (#22606)
  • Allow for .over to be called without partition_by (#22712)
  • Support AnyValue translation from PyMapping values (#22722)
  • Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
  • Support inference of Int128 dtype from databases that support it (#22682)
  • Add options to write Parquet field metadata (#22652)
  • Add cast_options parameter to control type casting in scan_parquet (#22617)
  • Allow casting List<UInt8> to Binary (#22611)
  • Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT (#22651)
  • Support use of literal values as "other" when evaluating Series.zip_with (#22632)
  • Allow reading and writing custom file-level Parquet metadata (#21806)
  • Support PEP702 @deprecated decorator behaviour (#22594)
  • Support grouping by pl.Array (#22575)
  • Preserve exception type and traceback for errors raised from Python (#22561)
  • Use fixed-width font in streaming phys plan graph (#22540)

🐞 Bug fixes

  • Fix RuntimeError when serializing the same DataFrame from multiple threads (#22844)
  • Fix map_elements predicate pushdown (#22833)
  • Fix reverse list type (#22832)
  • Don't require numpy for search_sorted (#22817)
  • Add type equality checking for relevant methods (#22802)
  • Fix invalid output for fill_null after when.then on structs (#22798)
  • Don't panic for cross join with misaligned chunking (#22799)
  • Fix panic on quantile over nulls in rolling window (#22792)
  • Respect BinaryOffset metadata (#22785)
  • Correct the output order of PartitionByKey and PartitionParted (#22778)
  • Fallback to non-strict casting for deprecated casts (#22760)
  • Clippy on new stable version (#22771)
  • Handle sliced out remainder for bitmaps (#22759)
  • Don't merge Enum categories on append (#22765)
  • Fix unnest() not working on empty struct columns (#22391)
  • Fix the default value type in Schema init (#22589)
  • Correct name in unnest error message (#22740)
  • Provide "schema" to DataFrame, even if empty JSON (#22739)
  • Properly account for nulls in the is_not_nan check made in drop_nans (#22707)
  • Fix incorrect result from SQL count(*) with partition by (#22728)
  • Fix deadlock joining scanned tables with low thread count (#22672)
  • Don't allow deserializing incompatible DSL (#22644)
  • Fix incorrect null dtype from binary ops in empty group_by (#22721)
  • Don't mark str.replace_many with Mapping as deprecated (#22697)
  • Gzip has maximum compression of 9, not 10 (#22685)
  • Fix predicate pushdown of fallible expressions (#22669)
  • Fix index out of bounds panic when scanning Hugging Face (#22661)
  • Fix panic on group_by with literal and empty rows (#22621)
  • Return input instead of panicking if empty subset in drop_nulls() and drop_nans() (#22469)
  • Bump argminmax to 0.6.3 (#22649)
  • Fix DSL version deserialization endianness (#22642)
  • Allow Expr.round() to be called on integer dtypes (#22622)
  • Fix panic when filtering based on row index column in parquet (#22616)
  • Fix WASM and Pyodide compilation (#22613)
  • Resolve get() SchemaMismatch panic (#22350)
  • Fix panic in group_by_dynamic on single-row df with group_by (#22597)
  • Add new_streaming feature to polars crate (#22601)
  • Consistently use Unix epoch as origin for dt.truncate (except weekly buckets which start on Mondays) (#22592)
  • Fix interpolate on dtype Decimal (#22541)
  • Fix CSV row count skipping the last line if the file did not end with a newline (#22577)
  • Make nested strict casting actually strict (#22497)
  • Make replace and replace_strict mapping use list literals (#22566)
  • Allow pivot on Time column (#22550)
  • Fix error when providing CSV schema with extra columns (#22544)
  • Fix panic on bitwise op between Series and Expr (#22527)
  • Fix multi-selector regex expansion (#22542)

📖 Documentation

  • Add pre-release policy (#22808)
  • Fix broken link to service account page in Polars Cloud docs (#22762)
  • Add match_to_schema to API reference (#22777)
  • Provide additional explanation and examples for the value_counts "normalize" parameter (#22756)
  • Rework documentation for drop/fill for nulls/nans (#22657)
  • Add documentation to new RoundMode parameter in round (#22555)
  • Add missing repeat_by to API reference, fixup list.get (#22698)
  • Fix non-rendering bullet points in scan_iceberg (#22694)
  • Improve insert_column docstring (description and examples) (#22551)
  • Improve join documentation (#22556)

📦 Build system

  • Fix building polars-lazy with certain features (#22846)
  • Add missing features (#22839)
  • Patch pyo3 to disable recompilation (#22796)

🛠️ Other improvements

  • Update Rust Polars versions (#22854)
  • Add basic smoke test for free-threaded Python (#22481)
  • Update Polars Rust versions (#22834)
  • Fix nix build (#22809)
  • Fix flake.nix to work on macOS (#22803)
  • Unused variables on release build (#22800)
  • Update cloud docs (#22624)
  • Fix unstable list.eval performance test (#22729)
  • Add proptest implementations for all Array types (#22711)
  • Dispatch .write_* to .lazy().sink_*(engine='in-memory') (#22582)
  • Move all optimization flags to QueryOptFlags (#22680)
  • Add test for str.replace_many (#22615)
  • Stabilize sink_* (#22643)
  • Add proptest for row-encode (#22626)
  • Update Rust version in nix flake (#22627)
  • Add a nix flake with a devShell and package (#22246)
  • Use a wrapper struct to store time zone (#22523)
  • Add proptest testing for Parquet decoding kernels (#22608)
  • Include equiprobable as valid quantile method (#22571)
  • Remove confusing error context calling .collect(_eager=True) (#22602)
  • Fix test_truncate_path test case (#22598)
  • Unify function flags into 1 bitset (#22573)
  • Display the operation behind in-memory-map (#22552)

Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JakubValtar, @Julian-J-S, @LucioFranco, @MarcoGorelli, @WH-2099, @alexander-beedie, @borchero, @bschoenmaeckers, @cmdlineluser, @coastalwhite, @etiennebacher, @florian-klein, @itamarst, @kdn36, @mcrumiller, @nameexhaustion, @nikaltipar, @orlp, @pavelzw, @r-brink, @ritchie46, @stijnherfst, @teotwaki, @timkpaine and @wence-