Commit 33d307a

Switch to orbitalml name
Let's start switching to the `orbitalml` name. If the `orbital` name becomes available, moving to it from `orbitalml` will be easier to communicate and less confusing than moving from `mustela`.
1 parent 9fdebf6 commit 33d307a
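
A rename of this kind can be scripted. The following is a minimal sketch, not the tooling used for this commit: it assumes only the two spellings visible in the diff (`mustela` to `orbitalml`, `Mustela` to `OrbitalML`), and the names `RENAMES` and `rename_text` are hypothetical.

```python
import re

# Old -> new spellings, taken from the lines changed in this commit.
# Any other casing (e.g. "MUSTELA") is not covered by this sketch.
RENAMES = {
    "mustela": "orbitalml",
    "Mustela": "OrbitalML",
}

# One alternation pattern so each match is replaced in a single pass.
_PATTERN = re.compile("|".join(re.escape(old) for old in RENAMES))


def rename_text(text: str) -> str:
    """Replace every old spelling in `text` with its new counterpart."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], text)


print(rename_text("import mustela.types"))
print(rename_text("Copyright (c) 2024 Mustela authors"))
```

Because the replacement strings never contain an old spelling, running the function twice over the same text gives the same result as running it once.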

61 files changed, with 177 additions and 177 deletions.

.github/workflows/tests.yml

+4 -4
@@ -30,10 +30,10 @@ jobs:
 - name: Set up PostgreSQL
 uses: harmon758/postgresql-action@v1
 with:
-postgresql db: mustelatestdb
-postgresql user: mustelatestuser
-postgresql password: mustelatestpassword
+postgresql db: orbitalmltestdb
+postgresql user: orbitalmltestuser
+postgresql password: orbitalmltestpassword

 - name: Run Test Suite
 run: |
-uv run pytest -v --tb=short --disable-warnings --maxfail=1 --cov=mustela
+uv run pytest -v --tb=short --disable-warnings --maxfail=1 --cov=orbitalml

LICENSE.md

+1 -1
@@ -1,6 +1,6 @@
 # MIT License

-Copyright (c) 2024 Mustela authors
+Copyright (c) 2024 OrbitalML authors

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.rst

+17 -17
@@ -1,4 +1,4 @@
-Mustela
+OrbitalML
 =======

 Convert SKLearn pipelines into SQL queries for execution in a database
@@ -14,15 +14,15 @@ See `examples` directory for example pipelines.
 **Note**::

 Not all transformations and models can be represented as SQL queries,
-so Mustela might not be able to implement the specific pipeline you are using.
+so OrbitalML might not be able to implement the specific pipeline you are using.

 Getting Started
 ----------------

-Install Mustela::
+Install OrbitalML::

-$ git clone https://github.com/posit-dev/mustela.git
-$ pip install ./mustela
+$ git clone https://github.com/posit-dev/orbitalml.git
+$ pip install ./orbitalml

 Prepare some data::

@@ -34,7 +34,7 @@ Prepare some data::
 iris = load_iris(as_frame=True)
 iris_x = iris.data.set_axis(COLUMNS, axis=1)

-# SQL and Mustela don't like dots in column names, replace them with underscores
+# SQL and OrbitalML don't like dots in column names, replace them with underscores
 iris_x.columns = COLUMNS = [cname.replace(".", "_") for cname in COLUMNS]

 X_train, X_test, y_train, y_test = train_test_split(
@@ -57,21 +57,21 @@ Define a Scikit-Learn pipeline and train it::
 )
 pipeline.fit(X_train, y_train)

-Convert the pipeline to Mustela::
+Convert the pipeline to OrbitalML::

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

-mustela_pipeline = mustela.parse_pipeline(pipeline, features={
-"sepal_length": mustela.types.DoubleColumnType(),
-"sepal_width": mustela.types.DoubleColumnType(),
-"petal_length": mustela.types.DoubleColumnType(),
-"petal_width": mustela.types.DoubleColumnType(),
+orbitalml_pipeline = orbitalml.parse_pipeline(pipeline, features={
+"sepal_length": orbitalml.types.DoubleColumnType(),
+"sepal_width": orbitalml.types.DoubleColumnType(),
+"petal_length": orbitalml.types.DoubleColumnType(),
+"petal_width": orbitalml.types.DoubleColumnType(),
 })

 You can print the pipeline to see the result::

->>> print(mustela_pipeline)
+>>> print(orbitalml_pipeline)

 ParsedPipeline(
 features={
@@ -107,7 +107,7 @@ You can print the pipeline to see the result::

 Now we can generate the SQL from the pipeline::

-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")

 And check the resulting query::

@@ -141,7 +141,7 @@ by running the scikitlearn pipeline on the same set of data::
 Supported Models
 -----------------

-Mustela currently supports the following models:
+OrbitalML currently supports the following models:

 - Linear Regression
 - Logistic Regression

examples/README.rst

+1 -1
@@ -1,2 +1,2 @@
 A few examples with significant test cases
-that show how to use the mustela library.
+that show how to use the orbitalml library.

examples/minimal.py

+10 -10
@@ -7,15 +7,15 @@

 import duckdb

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

 COLUMNS = ["sepal.length", "sepal.width", "petal.length", "petal.width"]

 iris = load_iris(as_frame=True)
 iris_x = iris.data.set_axis(COLUMNS, axis=1)

-# SQL and Mustela don't like dots in column names, replace them with underscores
+# SQL and OrbitalML don't like dots in column names, replace them with underscores
 iris_x.columns = COLUMNS = [cname.replace(".", "_") for cname in COLUMNS]

 X_train, X_test, y_train, y_test = train_test_split(
@@ -33,15 +33,15 @@
 pipeline.fit(X_train, y_train)


-mustela_pipeline = mustela.parse_pipeline(pipeline, features={
-"sepal_length": mustela.types.DoubleColumnType(),
-"sepal_width": mustela.types.DoubleColumnType(),
-"petal_length": mustela.types.DoubleColumnType(),
-"petal_width": mustela.types.DoubleColumnType(),
+orbitalml_pipeline = orbitalml.parse_pipeline(pipeline, features={
+"sepal_length": orbitalml.types.DoubleColumnType(),
+"sepal_width": orbitalml.types.DoubleColumnType(),
+"petal_length": orbitalml.types.DoubleColumnType(),
+"petal_width": orbitalml.types.DoubleColumnType(),
 })
-print(mustela_pipeline)
+print(orbitalml_pipeline)

-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")
 print("\nGenerated Query for DuckDB:")
 print(sql)
 print("\nPrediction with SQL")

examples/pipeline_boosted_tree_classifier.py

+9 -9
@@ -12,14 +12,14 @@
 from sklearn.pipeline import Pipeline
 from sklearn.preprocessing import OneHotEncoder, StandardScaler

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

 PRINT_SQL = int(os.environ.get("PRINTSQL", "0"))
 ASSERT = int(os.environ.get("ASSERT", "0"))

 logging.basicConfig(level=logging.INFO)
-logging.getLogger("mustela").setLevel(logging.INFO) # Set DEBUG to see translation process.
+logging.getLogger("orbitalml").setLevel(logging.INFO) # Set DEBUG to see translation process.

 # Load Ames Housing for classification
 ames = fetch_openml(name="house_prices", as_frame=True)
@@ -92,23 +92,23 @@ def categorize_price(price: float) -> str:

 model.fit(X, y)

-# Convert types from numpy to mustela types
-features = mustela.types.guess_datatypes(X)
+# Convert types from numpy to orbitalml types
+features = orbitalml.types.guess_datatypes(X)

 # Target only 5 rows, so that it's easier for a human to understand
 data_sample = X.head(5)

 # Convert the model to an execution pipeline
-mustela_pipeline = mustela.parse_pipeline(model, features=features)
-print(mustela_pipeline)
+orbitalml_pipeline = orbitalml.parse_pipeline(model, features=features)
+print(orbitalml_pipeline)

 # Translate the pipeline to a query
 ibis_table = ibis.memtable(data_sample, name="DATA_TABLE")
-ibis_expression = mustela.translate(ibis_table, mustela_pipeline)
+ibis_expression = orbitalml.translate(ibis_table, orbitalml_pipeline)

 con = ibis.duckdb.connect()
 if PRINT_SQL:
-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")
 print("\nGenerated Query for DuckDB:")
 print(sql)
 print("\nPrediction with SQL")

examples/pipeline_boosted_tree_regressor.py

+10 -10
@@ -12,14 +12,14 @@
 from sklearn.pipeline import Pipeline
 from sklearn.preprocessing import OneHotEncoder, StandardScaler

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

 PRINT_SQL = int(os.environ.get("PRINTSQL", "0"))
 ASSERT = int(os.environ.get("ASSERT", "0"))

 logging.basicConfig(level=logging.INFO)
-logging.getLogger("mustela").setLevel(logging.INFO) # Set DEBUG to see translation process.
+logging.getLogger("orbitalml").setLevel(logging.INFO) # Set DEBUG to see translation process.

 ames = fetch_openml(name="house_prices", as_frame=True)
 ames = ames.frame
@@ -36,7 +36,7 @@
 ]
 categorical_features = ames.select_dtypes(include=["object", "category"]).columns

-# Mustela requires the input and outputs of an imputer to
+# OrbitalML requires the input and outputs of an imputer to
 # be of the same type, as SimpleImputer has to compute the mean
 # the result is always a float. Which makes sense.
 # Let's convert all numeric features to doubles so
@@ -87,19 +87,19 @@
 # It's easier to understand if it's small
 data_sample = X.head(5)

-features = mustela.types.guess_datatypes(X)
-mustela_pipeline = mustela.parse_pipeline(model, features=features)
-print(mustela_pipeline)
+features = orbitalml.types.guess_datatypes(X)
+orbitalml_pipeline = orbitalml.parse_pipeline(model, features=features)
+print(orbitalml_pipeline)

 ibis_table = ibis.memtable(data_sample, name="DATA_TABLE")
-ibis_expression = mustela.translate(ibis_table, mustela_pipeline)
+ibis_expression = orbitalml.translate(ibis_table, orbitalml_pipeline)
 con = ibis.duckdb.connect()

 if PRINT_SQL:
 con = ibis.duckdb.connect()
 print(con.compile(ibis_expression))

-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")
 print("\nGenerated Query for DuckDB:")
 print(sql)
 print("\nPrediction with SQL")
@@ -114,7 +114,7 @@
 # NOTE: Interestingly the DuckDB optimizer has a bug on this query too
 # and unless disabled the query never completes.
 # That's why we run using SQLite.
-# The Mustela optimizer when enabled is able to preoptimize the query
+# The OrbitalML optimizer when enabled is able to preoptimize the query
 # which seems to allow DuckDB to complete the query as probably the DuckDB
 # optimizer has less work to do in that case.
 print("\nPrediction with Ibis")

examples/pipeline_decision_tree_classifier.py

+9 -9
@@ -12,14 +12,14 @@
 from sklearn.preprocessing import OneHotEncoder
 from sklearn.tree import DecisionTreeClassifier

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

 PRINT_SQL = int(os.environ.get("PRINTSQL", "0"))
 ASSERT = int(os.environ.get("ASSERT", "0"))

 logging.basicConfig(level=logging.INFO)
-logging.getLogger("mustela").setLevel(logging.INFO) # Change to DEBUG to see each translation step.
+logging.getLogger("orbitalml").setLevel(logging.INFO) # Change to DEBUG to see each translation step.

 iris = load_iris()
 df = pd.DataFrame(
@@ -82,11 +82,11 @@ def categorize_area(a: float) -> str:

 pipeline.fit(X, y)

-features = mustela.types.guess_datatypes(X)
-print("Mustela Features:", features)
+features = orbitalml.types.guess_datatypes(X)
+print("OrbitalML Features:", features)

-mustela_pipeline = mustela.parse_pipeline(pipeline, features=features)
-print(mustela_pipeline)
+orbitalml_pipeline = orbitalml.parse_pipeline(pipeline, features=features)
+print(orbitalml_pipeline)

 # Test data
 example_data = pa.table(
@@ -101,10 +101,10 @@ def categorize_area(a: float) -> str:

 con = ibis.duckdb.connect()
 ibis_table = ibis.memtable(example_data, name="DATA_TABLE")
-ibis_expression = mustela.translate(ibis_table, mustela_pipeline)
+ibis_expression = orbitalml.translate(ibis_table, orbitalml_pipeline)

 if PRINT_SQL:
-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")
 print("\nGenerated Query for DuckDB:")
 print(sql)
 print("\nPrediction with SQL")

examples/pipeline_decision_tree_regressor.py

+12 -12
@@ -12,14 +12,14 @@
 from sklearn.preprocessing import OneHotEncoder
 from sklearn.tree import DecisionTreeRegressor

-import mustela
-import mustela.types
+import orbitalml
+import orbitalml.types

 PRINT_SQL = int(os.environ.get("PRINTSQL", "0"))
 ASSERT = int(os.environ.get("ASSERT", "0"))

 logging.basicConfig(level=logging.INFO)
-logging.getLogger("mustela").setLevel(logging.INFO) # Set DEBUG to see translation process.
+logging.getLogger("orbitalml").setLevel(logging.INFO) # Set DEBUG to see translation process.

 # Load the dataset
 iris = load_iris()
@@ -63,13 +63,13 @@

 pipeline.fit(X, y)

-# Convert the features for Mustela
-features = mustela.types.guess_datatypes(X)
-print("Mustela Features:", features)
+# Convert the features for OrbitalML
+features = orbitalml.types.guess_datatypes(X)
+print("OrbitalML Features:", features)

-# Convert the pipeline to SQL with Mustela
-mustela_pipeline = mustela.parse_pipeline(pipeline, features=features)
-print(mustela_pipeline)
+# Convert the pipeline to SQL with OrbitalML
+orbitalml_pipeline = orbitalml.parse_pipeline(pipeline, features=features)
+print(orbitalml_pipeline)

 # Test data
 example_data = pa.table(
@@ -81,13 +81,13 @@
 }
 )

-# Generate the SQL query with Mustela
+# Generate the SQL query with OrbitalML
 ibis_table = ibis.memtable(example_data, name="DATA_TABLE")
-ibis_expression = mustela.translate(ibis_table, mustela_pipeline)
+ibis_expression = orbitalml.translate(ibis_table, orbitalml_pipeline)

 con = ibis.duckdb.connect()
 if PRINT_SQL:
-sql = mustela.export_sql("DATA_TABLE", mustela_pipeline, dialect="duckdb")
+sql = orbitalml.export_sql("DATA_TABLE", orbitalml_pipeline, dialect="duckdb")
 print("\nGenerated Query for DuckDB:")
 print(sql)
 print("\nPrediction with SQL")
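
In the hunks shown here, each removed line pairs with one added line that differs only in the project name. One way to confirm that a diff is a pure rename is sketched below. It assumes the removed and added lines have already been extracted in matching order, and `OLD_TO_NEW`, `apply_rename`, and `is_pure_rename` are hypothetical names, not part of the project.

```python
# Spellings taken from the diff; everything else here is illustrative.
OLD_TO_NEW = {"mustela": "orbitalml", "Mustela": "OrbitalML"}


def apply_rename(line: str) -> str:
    """Apply each old -> new substitution to a single line."""
    for old, new in OLD_TO_NEW.items():
        line = line.replace(old, new)
    return line


def is_pure_rename(removed: list[str], added: list[str]) -> bool:
    """Return True when every added line equals its removed counterpart after renaming."""
    if len(removed) != len(added):
        return False
    return all(apply_rename(old) == new for old, new in zip(removed, added))


removed = [
    "import mustela",
    'print("Mustela Features:", features)',
]
added = [
    "import orbitalml",
    'print("OrbitalML Features:", features)',
]
print(is_pure_rename(removed, added))  # True
```

A `False` result points at a hunk that changed something other than the name and deserves a closer look during review.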
