Description
Short description explaining the high-level reason for the new issue.
# Current behavior
This should work:
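```python
import pandas as pd

from hamilton.function_modifiers import load_from, source


# Both functions receive their loaded CSV through a parameter named `data`.
@load_from.csv(path=source("weekday_file"))
def weekday_data(data: pd.DataFrame) -> pd.DataFrame:
    return data


@load_from.csv(path=source("weekend_file"))
def weekend_data(data: pd.DataFrame) -> pd.DataFrame:
    return data
```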
But it does not. This is sloppy -- each decorated function produces a node called `data`, so the two definitions collide. The injected node should have some namespace associated with it.
## Stack Traces
```
Traceback (most recent call last):
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/examples/parallelism/run.py", line 17, in <module>
    main()
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/examples/parallelism/run.py", line 7, in main
    dr = driver.Builder(). \
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/driver.py", line 51, in wrapped_fn
    return call_fn(*args, **kwargs)
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/driver.py", line 954, in build
    return DriverV2(
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/driver.py", line 720, in __init__
    super(DriverV2, self).__init__(config, *modules)
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/driver.py", line 123, in __init__
    raise e
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/driver.py", line 118, in __init__
    self.graph = graph.FunctionGraph(*modules, config=config, adapter=adapter)
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/graph.py", line 221, in __init__
    self.nodes = create_function_graph(*modules, config=self._config, adapter=adapter)
  File "/Users/elijahbenizzy/dev/dagworks/os/hamilton/hamilton/graph.py", line 95, in create_function_graph
    raise ValueError(
ValueError: Cannot define function data more than once. Already defined by function <function weekend_data at 0x16c059310>
```
# Expected behavior
Should not try to create a node called `data` -- `NodeInjector` should be smart enough to handle this with namespaces.
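
To make the idea concrete, here is a toy sketch of what namespacing could look like. It does not use Hamilton's internals and is not how `NodeInjector` is implemented. The helper `build_nodes`, its `namespaced` flag, and the `<function name>.<parameter name>` naming scheme are all assumptions made up for illustration.

```python
from typing import Dict, List, Tuple


def build_nodes(functions: List[Tuple[str, str]], namespaced: bool) -> Dict[str, str]:
    """Register one loader node per (function_name, injected_param) pair.

    Toy model only: maps each node name to the function that defined it.
    """
    nodes: Dict[str, str] = {}
    for fn_name, param in functions:
        # Hypothetical scheme: prefix the injected parameter with the function name.
        loader_name = f"{fn_name}.{param}" if namespaced else param
        for name in (loader_name, fn_name):
            if name in nodes:
                raise ValueError(
                    f"Cannot define function {name} more than once. "
                    f"Already defined by function {nodes[name]}"
                )
            nodes[name] = fn_name
    return nodes


functions = [("weekday_data", "data"), ("weekend_data", "data")]

# Without namespaces, both loaders try to claim the name "data".
try:
    build_nodes(functions, namespaced=False)
    collided = False
except ValueError:
    collided = True

# With namespaces, each loader gets a distinct name.
nodes = build_nodes(functions, namespaced=True)
```

In this sketch the un-namespaced call fails the same way the stack trace does, while the namespaced call yields `weekday_data.data` and `weekend_data.data` alongside the two function nodes.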
# Additional context
Add any other context about the problem here.