Commit a64393e

Updates first dataflow docs
They weren't up to date. Also added note about disabling plugins.
1 parent a483a07 commit a64393e

1 file changed: +18 -6 lines changed

docs/get-started/your-first-dataflow.rst

Lines changed: 18 additions & 6 deletions
@@ -72,8 +72,11 @@ To actually run the dataflow, we'll need to write :doc:`a driver <../concepts/dr

 import pandas as pd

+# We add this to speed up running things if you have a lot in your python environment.
+from hamilton import registry; registry.disable_autoload()
+from hamilton import driver, base
 import my_functions # we import the module here!
-from hamilton import driver
+

 logger = logging.getLogger(__name__)
 logging.basicConfig(stream=sys.stdout)
@@ -86,10 +89,14 @@ To actually run the dataflow, we'll need to write :doc:`a driver <../concepts/dr
     'signups': pd.Series([1, 10, 50, 100, 200, 400], index=index),
     'spend': pd.Series([10, 10, 20, 40, 40, 50], index=index),
 }
-# we need to tell hamilton where to load function definitions from
-config = {} # we don't have any configuration or invariant data for this example.
-dr = driver.Driver(config, my_functions) # can pass in multiple modules
-# we need to specify what we want in the final dataframe.
+dr = (
+    driver.Builder()
+    .with_config({}) # we don't have any configuration or invariant data for this example.
+    .with_modules(my_functions) # we need to tell hamilton where to load function definitions from
+    .with_adapters(base.PandasDataFrameResult()) # we want a pandas dataframe as output
+    .build()
+)
+# we need to specify what we want in the final dataframe (these could be function pointers).
 output_columns = [
     'spend',
     'signups',
@@ -99,7 +106,7 @@ To actually run the dataflow, we'll need to write :doc:`a driver <../concepts/dr
 # let's create the dataframe!
 df = dr.execute(output_columns, inputs=initial_columns)
 # `pip install sf-hamilton[visualization]` earlier you can also do
-# dr.visualize_execution(output_columns,'./my_dag.dot', {})
+# dr.visualize_execution(output_columns,'./my_dag.png', {})
 print(df)

 Run the script with the following command:
@@ -122,3 +129,8 @@ Not only is your spend to signup ratio decreasing exponentially (your product is
 successfully run your first Hamilton Dataflow. Kudos!

 See, wasn't that quick and easy?
+
+Note: if you're ever like "why are things taking a while to execute?", then you might have too much
+in your python environment and Hamilton is auto-loading all the extensions. You can disable this by
+setting the environment variable ``HAMILTON_AUTOLOAD_EXTENSIONS=0`` or programmatically via
+``from hamilton import registry; registry.disable_autoload()`` - for more see :doc:`../how-tos/extensions-autoloading`.
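
The note added in the final hunk names two ways to turn off extension autoloading. The first hunk already shows the programmatic call, so here is a minimal sketch of the environment-variable route, set from inside a script. It assumes Hamilton reads ``HAMILTON_AUTOLOAD_EXTENSIONS`` when its modules are first imported, which is why the assignment comes before the import. The ``try``/``except`` guard and the ``hamilton_available`` flag are illustrative additions, not part of this commit.

```python
import os

# Assumption: Hamilton checks this variable when its modules are first
# imported, so it has to be set before any `hamilton` import runs.
os.environ["HAMILTON_AUTOLOAD_EXTENSIONS"] = "0"

try:
    # Imported only after the variable is set.
    from hamilton import driver  # noqa: E402,F401
    hamilton_available = True
except ImportError:
    # Hypothetical fallback so the sketch still runs without Hamilton installed.
    hamilton_available = False

print("autoload setting:", os.environ["HAMILTON_AUTOLOAD_EXTENSIONS"])
print("hamilton importable:", hamilton_available)
```

Exporting the variable in the shell before launching the script should have the same effect and leaves the script itself unchanged.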
