@@ -7,6 +7,8 @@ Release notes
Release 0.6.0
=============
+ Thanks to our new contributors: Kim Hammar and Joshua Goller!
+
Breaking changes
----------------
- ``petastorm.etl.dataset_metadata.materialize_dataset()`` should be passed a filesystem factory method
@@ -15,12 +17,20 @@ Breaking changes
New features and bug fixes
--------------------------
+ - Added functionality for transform-on-worker thread/pool. The transform enables PyTorch users to run preprocessing
+   code on worker processes/threads. It enables TensorFlow users to parallelize Python preprocessing code on
+   a process pool, as part of the training/evaluation graph. Users now specify a ``transform_spec`` when calling
+   ``make_reader()`` or ``make_batch_reader()``.
+ - Added ``hdfs_driver`` argument to the following functions: ``get_schema_from_dataset_url``, ``FilesystemResolver``,
+   ``generate_petastorm_metadata``, ``build_rowgroup_index``, ``RowGroupLoader``, ``dataset_as_rdd`` and ``copy_dataset``.
- The Docker container in ``/docker`` has been made into a workspace container intended to support development on macOS.
- New `hello_world` examples added for using non-Petastorm datasets.
- - Added functionality for transform-on-worker thread/pool. Users now specify a ``transform_spec`` when calling ``make_reader()``
-   or ``make_batch_reader()``
- - Fixed a bug that caused all columns of a dataset to be read when ``schema_fields=NGram(...)`` was used.
- Allow for unicode strings to be passed as regex filters in Unischema when selecting which columns to read.
+ - Fixed a bug that caused all columns of a dataset to be read when ``schema_fields=NGram(...)`` was used.
+ - Fixed the type of the argument passed to a predicate when the predicate is defined on a numeric partition field.
+ - Support regular unicode strings as expressions in the value of the ``schema_fields`` argument of ``make_reader``.
+ - Emit a warning when opening a Petastorm-created dataset using ``make_batch_reader`` (``make_batch_reader`` currently
+   does not support Petastorm-specific types, such as tensors).
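
The transform-on-worker idea described above can be illustrated with a minimal, standard-library-only sketch. It is not Petastorm's implementation and does not call the Petastorm API: ``normalize_row`` and ``read_with_transform`` are hypothetical names, and the rows are made-up dictionaries. It only shows the general pattern of running a per-row preprocessing function on a pool of worker threads while keeping the rows in order.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize_row(row):
    """Hypothetical per-row preprocessing: scale pixel values to [0, 1]."""
    return {"id": row["id"], "pixels": [p / 255.0 for p in row["pixels"]]}


def read_with_transform(rows, transform, workers=4):
    """Apply `transform` to every row on worker threads, preserving row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input rows.
        return list(pool.map(transform, rows))


# Made-up in-memory rows standing in for records loaded from a dataset.
rows = [{"id": i, "pixels": [0, 51, 255]} for i in range(3)]
transformed = read_with_transform(rows, normalize_row)
```

In Petastorm itself, the preprocessing function would presumably be supplied through the ``transform_spec`` argument named in the notes above rather than through a helper like this one.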
Release 0.5.1
=============