Review: SyntheticSlateBanditDataset #93

aiueola · 2021-05-05T12:19:39Z

important

Please double-check the calculation of pscore_item_position_i_l. I changed the condition as follows:

# from
if sampled_action_index not in action_list:
# to
if sampled_action != action_list[position_]:

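because the position-wise pscore is calculated as :math:`\pi_k(a_k \mid x) = \sum_{a} \pi(a \mid x) \mathbbm{1}\{a(k) = a_k\}`, i.e., it sums the probabilities of only those slates whose item at position k is a_k.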
https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L376
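The marginal in the formula above can be checked by brute force for small problems. The following is a minimal sketch, not the code in synthetic_slate.py: it assumes the behavior policy fills the slate position by position without replacement, renormalizing the remaining action probabilities at each step. The names `slate_probability` and `marginal_pscore` are hypothetical.

```python
from itertools import permutations


def slate_probability(slate, probs):
    """Probability of drawing `slate` in this order, without replacement.

    Assumption: at each position the remaining actions are renormalized.
    """
    prob, remaining = 1.0, 1.0
    for action in slate:
        prob *= probs[action] / remaining
        remaining -= probs[action]
    return prob


def marginal_pscore(probs, len_list, position, action):
    """Sum pi(a | x) over all slates a whose item at `position` is `action`."""
    n_unique_action = len(probs)
    return sum(
        slate_probability(slate, probs)
        for slate in permutations(range(n_unique_action), len_list)
        if slate[position] == action  # the indicator 1{a(k) = a_k}
    )


probs = [0.5, 0.3, 0.2]
# The marginals at one position form a distribution over actions.
total = sum(marginal_pscore(probs, 2, 1, a) for a in range(3))
print(round(total, 6))
```

Note that the indicator compares the item at the given position, which matches the `sampled_action != action_list[position_]` condition rather than a membership test on the whole slate.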

Also, I revised pscore_item_position_i_l for the uniform random policy (i.e., when self.behavior_policy_function is None):

# from
pscore_item_position_i_l = self.len_list / self.n_unique_action
# to
pscore_item_position_i_l = 1 / self.n_unique_action

https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L368
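As a sanity check on the two candidate values, one can enumerate every ordered slate under a uniform random policy. This is an illustrative sketch under the assumption that slates contain no duplicate actions; `uniform_slate_frequencies` is a hypothetical helper, not part of obp.

```python
from fractions import Fraction
from itertools import permutations


def uniform_slate_frequencies(n_unique_action, len_list, position, action):
    """Return (P(action at `position`), P(action anywhere in the slate))
    when every ordered slate without duplicates is equally likely."""
    slates = list(permutations(range(n_unique_action), len_list))
    at_position = sum(slate[position] == action for slate in slates)
    anywhere = sum(action in slate for slate in slates)
    return Fraction(at_position, len(slates)), Fraction(anywhere, len(slates))


at_position, anywhere = uniform_slate_frequencies(5, 3, position=1, action=2)
print(at_position)  # 1/5, i.e. 1 / n_unique_action
print(anywhere)     # 3/5, i.e. len_list / n_unique_action
```

Under these assumptions, len_list / n_unique_action is the probability that the action appears somewhere in the slate, while the probability of it appearing at one specific position is 1 / n_unique_action.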

refactor

Removed return_exact_uniform_pscore_item_position parameter from obtain_batch_bandit_feedback() and sample_action_and_obtain_pscore().

# from
if return_exact_uniform_pscore_item_position: 
# to
if self.behavior_policy_function is None:  # uniform random

It seems more intuitive to use the ground-truth pscore (not the approximated one) by default for the uniform random policy.
https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L447

Renamed self.behavior_policy to self.uniform_behavior_policy, as it always represents the uniform behavior policy.

# from
self.behavior_policy = np.ones(self.n_unique_action) / self.n_unique_action
# to
self.uniform_behavior_policy = np.ones(self.n_unique_action) / self.n_unique_action

https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L245

Renamed action_interaction_matrix to action_interaction_weight_matrix.

https://github.com/aiueola/zr-obp/blob/35f893fd660f627b0bb470d24eaf970aeb541109/obp/dataset/synthetic_slate.py#L49

tests

Fixed the expected pscore_item_position value for the uniform random policy:

# from
pscore_item_position = len_list / n_unique_action
# to
pscore_item_position = 1 / n_unique_action

https://github.com/aiueola/zr-obp/blob/d46cd4c9a626020a14c73014a21b8749b9dd88ad/tests/dataset/test_synthetic_slate.py#L229

others

Minor fixes to typos and docstrings.


aiueola added 19 commits May 5, 2021 15:53

fix typo

1ccbaec

fix typo

fff4f8c

fix pscore_item_position calculation

0b0df8c

minor fix

fb58f67

bug fix

cf228ce

bug fix

f958a77

bug fix

944c2a8

fix test

beb8036

black

94a1836

bug fix

226e588

minor fix

225f625

fix test

2d5bf95

rm return_exact_uniform_pscore_item_position

57cae57

fix test

d46cd4c

rename from action_interaction_matrix to action_interaction_weight_matrix

6d904ed

rename from action_interaction_matrix to action_interaction_weight_matrix

9dc24a6

refactor

35f893f

fix test

3d9ee6d

minor fix

1cbb84a

usaito merged commit 57b4bc7 into st-tech:master May 8, 2021
