Dear experts,
I am starting to use dask and dask_histogram, but I get an error when I try to fill a dask_histogram.boost histogram from dataframe columns, as below:
import numpy as np
import dask.dataframe as dd
import dask_histogram.boost as dhb

# this is reproducible
d = {
    'A': np.random.normal(0., 1., 100000),
    'W': np.random.uniform(0.2, 0.8, 100000),
}
ddf = dd.from_dict(d, npartitions=10)
h = dhb.Histogram(
    dhb.axis.Regular(10, -3, 3),
    storage=dhb.storage.Weight()
).fill(ddf['A'], weight=ddf['W']).compute()
print(h)
This example gives me:
Traceback (most recent call last):
  File "/gpfs/home/belle2/rlebouch/darkphotontodimuons/background_rejection/testdask.py", line 15, in <module>
    ).fill(ddf['A'], weight=ddf['W']).compute()
                                      ^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/base.py", line 372, in compute
    (result,) = compute(self, traverse=False, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/base.py", line 653, in compute
    dsk = collections_to_dsk(collections, optimize_graph, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/base.py", line 422, in collections_to_dsk
    dsk = opt(dsk, keys, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask_histogram/core.py", line 514, in optimize
    dsk = fuse_roots(dsk, keys=keys) # type: ignore
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/blockwise.py", line 1564, in fuse_roots
    new = toolz.merge(layer, *[layers[dep] for dep in deps])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/toolz/dicttoolz.py", line 39, in merge
    rv.update(d)
  File "<frozen _collections_abc>", line 836, in __iter__
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/blockwise.py", line 641, in __iter__
    return iter(self._dict)
                ^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/blockwise.py", line 607, in _dict
    dsk = _make_blockwise_graph(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/blockwise.py", line 958, in _make_blockwise_graph
    itertools.product(*[range(dims[i]) for i in out_indices])
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/belle2/rlebouch/.local/lib/python3.11/site-packages/dask/blockwise.py", line 958, in <listcomp>
    itertools.product(*[range(dims[i]) for i in out_indices])
                              ~~~~^^^
KeyError: '.0'
Is it really possible to fill a histogram from a dataframe?
I currently use:
Name: dask-histogram
Version: 2024.12.1
Name: dask
Version: 2024.12.1
Name: boost_histogram
Version: 1.4.1
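
For reference, here is a minimal NumPy-only sketch of what the filled histogram would be expected to contain. It makes a few assumptions: a Weight storage holds, per bin, the sum of weights and the sum of squared weights; the seeded generator `rng` is introduced only to make the sketch repeatable and is not part of the reproducer above; and `np.histogram` treats the upper edge of the last bin as inclusive, whereas a regular histogram axis would send a value of exactly 3 to overflow, which should not matter for continuous random data.

```python
import numpy as np

# Hypothetical seeded generator, used here only for repeatability
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 100_000)
w = rng.uniform(0.2, 0.8, 100_000)

# Per-bin sum of weights (the "value" of a weighted histogram)
values, edges = np.histogram(a, bins=10, range=(-3, 3), weights=w)

# Per-bin sum of squared weights (the "variance" of a weighted histogram)
variances, _ = np.histogram(a, bins=10, range=(-3, 3), weights=w**2)

print(edges)
print(values)
print(variances)
```

Once the dask-based fill works, its per-bin values and variances could be compared against these arrays, given identical input data.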