feature: slate ope estimators #88

fullflu · 2021-04-10T08:15:29Z

% pytest -s tests/ope/test_ipw_estimators_slate.py
=========================================================== test session starts ============================================================
platform darwin -- Python 3.9.1, pytest-6.2.1, py-1.10.0, pluggy-0.13.1
rootdir: /Users/kt/projects/zr-obp
collected 61 items                                                                                                                         

[sample_action_and_obtain_pscore]: 100%|████████████████████████████████████████████████████████████████| 1000/1000 [00:37<00:00, 26.67it/s]
[sample_action_and_obtain_pscore]: 100%|██████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 6504.97it/s]
Cascade additive
gt_mean: 1.858, 3 * gt_std / sqrt(n): 0.08164060992661183
estimated_value: 1.821030610225593 ------ estimator: sips, 
estimated_value: 1.91435928121275 ------ estimator: iips, 
estimated_value: 1.8354943639350463 ------ estimator: rips, 
[sample_action_and_obtain_pscore]: 100%|████████████████████████████████████████████████████████████████| 1000/1000 [00:38<00:00, 25.95it/s]
[sample_action_and_obtain_pscore]: 100%|██████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 6537.97it/s]
Independent
gt_mean: 1.89, 3 * gt_std / sqrt(n): 0.08467744841615311
estimated_value: 1.8635144688428213 ------ estimator: sips, 
estimated_value: 1.9260419198232916 ------ estimator: iips, 
estimated_value: 1.8520557384244258 ------ estimator: rips, 
[sample_action_and_obtain_pscore]: 100%|████████████████████████████████████████████████████████████████| 1000/1000 [00:38<00:00, 25.90it/s]
[sample_action_and_obtain_pscore]: 100%|██████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 6498.48it/s]
Standard additive
gt_mean: 1.825, 3 * gt_std / sqrt(n): 0.0965333812403445
estimated_value: 1.8013433082242036 ------ estimator: sips, 
estimated_value: 1.8968268102209969 ------ estimator: iips, 
estimated_value: 1.8084251396365023 ------ estimator: rips, 
.

====================================================== 61 passed in 118.21s (0:01:58) ======================================================
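The log above prints, for each reward structure, the ground-truth mean and a tolerance of 3 * gt_std / sqrt(n). A minimal sketch of the comparison this suggests, assuming the test accepts an estimate when it lies within that tolerance of gt_mean (the helper name `within_tolerance` is hypothetical and not taken from the test file):

```python
# Values copied from the "Cascade additive" block of the log above.
gt_mean = 1.858
tolerance = 0.08164060992661183  # 3 * gt_std / sqrt(n), as printed

estimates = {
    "sips": 1.821030610225593,
    "iips": 1.91435928121275,
    "rips": 1.8354943639350463,
}


def within_tolerance(estimate: float, target: float, tol: float) -> bool:
    """Hypothetical helper: True if the estimate is within tol of target."""
    return abs(estimate - target) <= tol


# Map each estimator name to whether its estimate passes the check.
results = {
    name: within_tolerance(value, gt_mean, tolerance)
    for name, value in estimates.items()
}
print(results)
```

The same comparison can be repeated with the "Independent" and "Standard additive" values from the log.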


usaito · 2021-05-03T07:14:43Z

@fullflu

Thanks! Can you address the following points?

general

for each round and slot -> given round (slate_id) and slot (position)
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L59
:math:`r_{t}(k)`
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L64
IDs to differentiate slot (i.e., position) in each slate.
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L67
IDs to differentiate slates (i.e., rounds or lists of actions).
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L98
SlateRecursiveIPS -> SlateRewardInteractionIPS (following the Spotify paper)
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L417

obp/utils.py (checks)

[must]

The evaluation policy can be deterministic, so its pscore can legitimately be 0; the checks below should therefore raise only when evaluation_policy_pscore < 0 (please also fix the corresponding error messages)
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L396
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L494
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L581
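A minimal sketch of the requested behavior, not the code in obp/utils.py: the function name `check_pscores` and the error messages are hypothetical. It also assumes that the behavior policy pscore must stay strictly positive because importance weights divide by it.

```python
import numpy as np


def check_pscores(
    behavior_pscore: np.ndarray, evaluation_policy_pscore: np.ndarray
) -> None:
    """Hypothetical validation sketch for a pair of propensity-score arrays."""
    # Assumption: the behavior policy pscore is a divisor, so 0 is rejected.
    if np.any(behavior_pscore <= 0) or np.any(behavior_pscore > 1):
        raise ValueError("behavior_pscore must be in the range of (0, 1]")
    # A deterministic evaluation policy assigns probability 0 to some actions,
    # so only negative values (and values above 1) are rejected here.
    if np.any(evaluation_policy_pscore < 0) or np.any(evaluation_policy_pscore > 1):
        raise ValueError("evaluation_policy_pscore must be in the range of [0, 1]")


# A deterministic evaluation policy (probabilities 1 and 0) passes the check.
check_pscores(np.array([0.5, 0.5]), np.array([1.0, 0.0]))
```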

[imo]

The three check functions for sips, rips, and iips share many lines (checks for position, reward, and some others). I think it is better to create an additional function that gathers these repeated lines.
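One possible shape for such a refactoring is sketched here. The helper name `_check_common_slate_inputs`, the argument lists, and the specific checks are assumptions for illustration; only the name `check_sips_inputs` appears in this review.

```python
import numpy as np


def _check_common_slate_inputs(slate_id, reward, position):
    """Hypothetical shared helper holding checks common to all three estimators."""
    arrays = {"slate_id": slate_id, "reward": reward, "position": position}
    for name, arr in arrays.items():
        if not isinstance(arr, np.ndarray) or arr.ndim != 1:
            raise ValueError(f"{name} must be a 1-dimensional ndarray")
    if not (slate_id.shape[0] == reward.shape[0] == position.shape[0]):
        raise ValueError("slate_id, reward, and position must have the same length")


def check_sips_inputs(slate_id, reward, position, pscore, evaluation_policy_pscore):
    """Sketch of one estimator-specific wrapper; the signature is an assumption."""
    # Shared checks live in one place.
    _check_common_slate_inputs(slate_id, reward, position)
    # Only the estimator-specific arguments are validated here.
    scores = {"pscore": pscore, "evaluation_policy_pscore": evaluation_policy_pscore}
    for name, arr in scores.items():
        if not isinstance(arr, np.ndarray) or arr.shape != reward.shape:
            raise ValueError(f"{name} must be an ndarray with the same shape as reward")
```

The iips and rips checks could call the same helper and differ only in which pscore arguments they validate.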

[nits]

Check inputs of SlateStandardIPS.
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L349
Check inputs of SlateIndependentIPS.
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L447
Check inputs of SlateRewardInteractionIPS.
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L534
check_sips_inputs(, check_iips_inputs(, check_rips_inputs( are sufficient
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L342
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L440
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/utils.py#L527

obp/ope/estimators_slate.py

[nits]

"Base Class of Slate Inverse Probability Weighting Estimators."
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L41
"Slate Standard Inverse Propensity Scoring (SIPS) estimates the policy value of a given evaluation policy :math:`\\pi_e` without any assumption about user behavior."
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L138
"Slate Independent Inverse Propensity Scoring (IIPS) estimates the policy value of a given evaluation policy :math:`\\pi_e` assuming the item-position click model."
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L281
Marginal probabilities that action :math:`a` is chosen at position (slot) :math:`k` by a behavior policy given context :math:`x`, i.e., :math:`\\pi_e(a_{t}(k) |x_t)`.
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L320
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L323
"Slate Recursive Inverse Propensity Scoring (RIPS) estimates the policy value of a given evaluation policy :math:`\\pi_e` assuming the cascade click model."
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L422
"Action choice probabilities under the cascade behavior model"
:math:`\\pi_b(\\{a_{t, j}\\}_{j \\le k}|x_t)` -> :math:`\\pi_b(a_t(k) | x_t, a_t(1), \\ldots, a_t(k-1))`
:math:`\\pi_e(\\{a_{t, j}\\}_{j \\le k}|x_t)` -> :math:`\\pi_e(a_t(k) | x_t, a_t(1), \\ldots, a_t(k-1))`
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L461
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/estimators_slate.py#L464
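To make the difference between the three docstrings concrete, here is a minimal sketch of a shared weighting step. It is not the implementation in estimators_slate.py. It assumes long-format arrays with one row per (slate, slot), and assumes the three estimators differ only in which pair of probabilities is passed in: whole-slate probabilities for SIPS, per-slot marginals for IIPS, and the probability of the actions up to and including the slot for RIPS. The function name `slate_ips_value` and all numbers are made up for illustration.

```python
import numpy as np


def slate_ips_value(slate_id, reward, behavior_pscore, evaluation_policy_pscore):
    """Hypothetical sketch: importance-weighted reward, averaged over slates."""
    # Per-row importance weight; behavior_pscore is assumed strictly positive.
    iw = evaluation_policy_pscore / behavior_pscore
    n_slates = np.unique(slate_id).shape[0]
    return float((iw * reward).sum() / n_slates)


# Toy data: two slates with two slots each (numbers are illustrative only).
slate_id = np.array([0, 0, 1, 1])
reward = np.array([1.0, 0.0, 1.0, 1.0])

# SIPS-style input: the whole-slate probability is repeated on every row.
sips = slate_ips_value(
    slate_id, reward, np.array([0.25, 0.25, 0.25, 0.25]), np.array([0.5, 0.5, 0.0, 0.0])
)
# IIPS-style input: marginal probability of each slot's action.
iips = slate_ips_value(
    slate_id, reward, np.array([0.5, 0.5, 0.5, 0.5]), np.array([0.5, 1.0, 0.25, 0.0])
)
# RIPS-style input: probability of the actions up to and including each slot.
rips = slate_ips_value(
    slate_id, reward, np.array([0.5, 0.25, 0.5, 0.25]), np.array([0.5, 0.5, 0.5, 0.0])
)
print(sips, iips, rips)
```

Note that the evaluation policy probabilities may be 0 here, which is consistent with allowing a deterministic evaluation policy.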

examples/quickstart/README.md

add the following line

- [`synthetic_slate.ipynb`](./synthetic_slate.ipynb): a quickstart guide to implement off-policy evaluation (OPE) and the evaluation of OPE procedures for the slate recommendation setting with the Open Bandit Pipeline.

ope/meta_slate.py

"Logged bandit feedback data used for off-policy evaluation for the slate recommendation setting."
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/meta_slate.py#L29
this Examples section has to be adjusted to the slate implementation
https://github.com/fullflu/zr-obp/blob/837ea49eebfbc2be6426ccc5e31722d7de3fd155/obp/ope/meta_slate.py#L35

fullflu added 5 commits April 10, 2021 17:14

add slate ope estimators

9f69925

fix confidence interval of slate ope

3a7aa95

add comment of SIPS

c8df15c

fix slate ope estimators and add validations

88ff2a9

add slate ope tests

59b7e95

fullflu changed the title ~~[WIP] feature: slate ope estimators~~ feature: slate ope estimators Apr 18, 2021

fullflu added 5 commits April 19, 2021 02:27

Merge branch 'master' into feature/estimators_slate

5771954

fix synthetic slate bug

4258c7c

add slate ope performance test

6c245cd

fix column names related to evaluation_policy_pscore; add efficient bootstrap method

6d13599

add meta_slate.py and test the meta module

837ea49

usaito merged commit 3781edc into st-tech:master May 17, 2021

aiueola mentioned this pull request May 17, 2021

Review: Slate OPE Estimators #96

Merged
