
Commit 1e1eb9f

Merge pull request #19 from st-tech/feat/update-synthetic-generator
update synthetic generator
2 parents 2713321 + 16bfda8 commit 1e1eb9f

File tree

6 files changed (+539 −119 lines)

examples/examples_with_synthetic/README.md

Lines changed: 13 additions & 15 deletions
@@ -30,15 +30,14 @@ python evaluate_off_policy_estimators.py\
     --n_rounds $n_rounds\
     --n_actions $n_actions\
     --dim_context $dim_context\
-    --dim_action_context $dim_action_context\
     --base_model_for_evaluation_policy $base_model_for_evaluation_policy\
     --base_model_for_reg_model $base_model_for_reg_model\
     --n_jobs $n_jobs\
     --random_state $random_state
 ```
 - `$n_runs` specifies the number of simulation runs in the experiment to estimate standard deviations of the performance of OPE estimators.
 - `$n_rounds` and `$n_actions` specify the number of rounds (or samples) and the number of actions of the synthetic bandit data.
-- `$dim_context` and `$dim_action_context` specify the number of dimensions of context vectors characterizing each round and action, respectively.
+- `$dim_context` specifies the number of dimensions of context vectors.
 - `$base_model_for_evaluation_policy` specifies the base ML model for defining evaluation policy and should be one of "logistic_regression", "random_forest", or "lightgbm".
 - `$base_model_for_reg_model` specifies the base ML model for defining regression model and should be one of "logistic_regression", "random_forest", or "lightgbm".
 - `$n_jobs` is the maximum number of concurrently running jobs.
- `$n_jobs` is the maximum number of concurrently running jobs.
@@ -51,30 +50,29 @@ python evaluate_off_policy_estimators.py\
     --n_rounds 100000\
     --n_actions 30\
     --dim_context 5\
-    --dim_action_context 5\
     --base_model_for_evaluation_policy logistic_regression\
     --base_model_for_reg_model logistic_regression\
     --n_jobs -1\
     --random_state 12345
 
 # relative estimation errors of OPE estimators and their standard deviations (lower is better).
-# our evaluation of OPE procedure suggests that Switch-IPW (tau=100) performs better than the other estimators.
+# our evaluation of OPE procedure suggests that DR and Switch-DR (tau=100) perform better than the other estimators.
 # Moreover, it appears that the performances of some OPE estimators depend on the choice of hyperparameters.
 # =============================================
 # random_state=12345
 # ---------------------------------------------
 #                           mean       std
-# dm                    0.016460  0.005503
-# ipw                   0.006724  0.000955
-# snipw                 0.006394  0.000793
-# dr                    0.006275  0.003067
-# sndr                  0.005942  0.001321
-# switch-ipw (tau=1)    0.392871  0.001192
-# switch-ipw (tau=100)  0.000768  0.000436
-# switch-dr (tau=1)     0.019167  0.005687
-# switch-dr (tau=100)   0.008104  0.001072
-# dr-os (lambda=1)      0.017385  0.005749
-# dr-os (lambda=100)    0.004148  0.000415
+# dm                    0.029343  0.000410
+# ipw                   0.002255  0.000587
+# snipw                 0.001914  0.001268
+# dr                    0.001645  0.000919
+# sndr                  0.002550  0.000035
+# switch-ipw (tau=1)    0.195059  0.000136
+# switch-ipw (tau=100)  0.002255  0.000587
+# switch-dr (tau=1)     0.046846  0.001251
+# switch-dr (tau=100)   0.001645  0.000919
+# dr-os (lambda=1)      0.028386  0.000369
+# dr-os (lambda=100)    0.002516  0.001351
 # =============================================
 ```
 
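The README above pairs `dim_context` with a logistic reward function (this commit's Python diff passes `reward_function=logistic_reward_function` to the dataset). As a rough, stdlib-only sketch of what `dim_context` controls — NOT obp's actual implementation, and with a hypothetical per-action weight parametrization — a context vector of that dimension can be mapped to per-action expected rewards in [0, 1] via a sigmoid:

```python
import math
import random

def sigmoid(x: float) -> float:
    """Squash a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

rng = random.Random(12345)
dim_context, n_actions = 5, 30  # mirrors the example invocation above

# one context vector per round; one weight vector per action (hypothetical)
context = [rng.gauss(0.0, 1.0) for _ in range(dim_context)]
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim_context)] for _ in range(n_actions)]

# expected reward of each action = sigmoid of the context-weight dot product
expected_rewards = [
    sigmoid(sum(w * x for w, x in zip(ws, context))) for ws in weights
]
```

Each value in `expected_rewards` lies in (0, 1), which is why these synthetic rewards can be treated as Bernoulli success probabilities.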

examples/examples_with_synthetic/evaluate_off_policy_estimators.py

Lines changed: 0 additions & 8 deletions
@@ -79,12 +79,6 @@
         default=5,
         help="dimensions of context vectors characterizing each round.",
     )
-    parser.add_argument(
-        "--dim_action_context",
-        type=int,
-        default=5,
-        help="dimensions of context vectors characterizing each action.",
-    )
     parser.add_argument(
         "--base_model_for_evaluation_policy",
         type=str,
@@ -114,7 +108,6 @@
     n_rounds = args.n_rounds
     n_actions = args.n_actions
     dim_context = args.dim_context
-    dim_action_context = args.dim_action_context
     base_model_for_evaluation_policy = args.base_model_for_evaluation_policy
     base_model_for_reg_model = args.base_model_for_reg_model
     n_jobs = args.n_jobs
@@ -125,7 +118,6 @@
     dataset = SyntheticBanditDataset(
         n_actions=n_actions,
         dim_context=dim_context,
-        dim_action_context=dim_action_context,
         reward_function=logistic_reward_function,
         behavior_policy_function=linear_behavior_policy,
         random_state=random_state,
    )
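After this commit the script no longer accepts `--dim_action_context`. A minimal stdlib sketch of the resulting argument surface (illustrative only — the defaults and the restriction to three flags are assumptions; the real script defines more arguments):

```python
import argparse

# Sketch of the CLI after the change: only --dim_context remains for
# context dimensionality; --dim_action_context has been removed.
parser = argparse.ArgumentParser(
    description="evaluate off-policy estimators with synthetic bandit data."
)
parser.add_argument("--n_rounds", type=int, default=10000)
parser.add_argument("--n_actions", type=int, default=10)
parser.add_argument(
    "--dim_context",
    type=int,
    default=5,
    help="dimensions of context vectors characterizing each round.",
)

# parse the flags from the README's example invocation
args = parser.parse_args(["--n_rounds", "100000", "--n_actions", "30", "--dim_context", "5"])
print(args.n_rounds, args.n_actions, args.dim_context)  # → 100000 30 5
```

Passing the removed flag (e.g. `--dim_action_context 5`) would now make `parse_args` exit with an "unrecognized arguments" error, which is the intended behavior after this cleanup.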

examples/quickstart/quickstart_synthetic.ipynb

Lines changed: 81 additions & 38 deletions
Large diffs are not rendered by default.

obp/dataset/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 from .base import *
 from .real import *
 from .synthetic import *
+from .multiclass import *
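The one-line change above re-exports everything the new `multiclass` submodule defines at the `obp.dataset` package level. A stdlib-only sketch of that mechanism, using made-up module and class names (`fakeobp_dataset`, `MultiClassToBandit` are placeholders, not obp's actual identifiers):

```python
import sys
import types

# Build a fake package and submodule in memory to mimic the layout.
pkg = types.ModuleType("fakeobp_dataset")
sub = types.ModuleType("fakeobp_dataset.multiclass")
exec(
    "__all__ = ['MultiClassToBandit']\n"
    "class MultiClassToBandit:\n"
    "    pass\n",
    sub.__dict__,
)
sys.modules["fakeobp_dataset"] = pkg
sys.modules["fakeobp_dataset.multiclass"] = sub

# Emulate `from .multiclass import *` inside the package __init__:
# every name listed in the submodule's __all__ lands on the package.
for name in sub.__all__:
    setattr(pkg, name, getattr(sub, name))

from fakeobp_dataset import MultiClassToBandit  # now importable at package level
```

In other words, after this commit users can import the multiclass dataset helpers directly from `obp.dataset` instead of spelling out the submodule path.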

0 commit comments
