Review: SyntheticSlateBanditDataset #93
Merged
important
`pscore_item_position_i_l`: the pscore is calculated as :math:`\pi_k(a_k \mid x) = \sum_{a} \pi(a \mid x) \mathbbm{1}\{a(k) = a_k\}`. https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L376
`pscore_item_position_i_l` when using the uniform random policy (`self.behavior_policy_function is None`): https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L368
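The marginal formula above can be checked by brute force on a tiny problem. The following is a minimal sketch, not the code in `synthetic_slate.py`: it assumes slates are sampled without replacement from a softmax (Plackett-Luce) policy, and the helper names `slate_probability` and `marginal_pscore` are hypothetical. With all-zero logits the policy is uniform random, and the marginal pscore at any position should reduce to `1 / n_actions`.

```python
from itertools import permutations
from math import exp


def slate_probability(slate, logits):
    """Joint probability pi(a | x) of an ordered slate under an assumed
    softmax policy that samples actions without replacement."""
    weights = [exp(v) for v in logits]
    remaining = sum(weights)  # total weight of actions not yet placed
    prob = 1.0
    for action in slate:
        prob *= weights[action] / remaining
        remaining -= weights[action]
    return prob


def marginal_pscore(position, action, logits, len_list):
    """pi_k(a_k | x): sum pi(a | x) over every slate a with a(k) = a_k."""
    n_actions = len(logits)
    return sum(
        slate_probability(slate, logits)
        for slate in permutations(range(n_actions), len_list)
        if slate[position] == action
    )


# Uniform random policy: 4 actions, slates of length 3.
uniform_marginal = marginal_pscore(
    position=1, action=2, logits=[0.0] * 4, len_list=3
)

# Non-uniform policy: the marginals at one position still sum to 1.
skewed_logits = [2.0, 0.0, 0.0, 0.0]
skewed_marginals = [
    marginal_pscore(position=0, action=a, logits=skewed_logits, len_list=3)
    for a in range(4)
]
```

Enumerating every slate is only feasible for very small `n_actions` and `len_list`, but it gives a ground-truth value that a test target can be compared against.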
refactor
Remove the `return_exact_uniform_pscore_item_position` parameter from `obtain_batch_bandit_feedback()` and `sample_action_and_obtain_pscore()`. It might be more intuitive to use the ground-truth pscore (not the approximated one) by default for the uniform random policy. https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L447
Rename `self.behavior_policy` to `self.uniform_behavior_policy`, as it always indicates the uniform behavior policy. https://github.com/aiueola/zr-obp/blob/57cae57736f00a8f4887c4d3ac41a6544b880b34/obp/dataset/synthetic_slate.py#L245
Rename `action_interaction_matrix` to `action_interaction_weight_matrix`. https://github.com/aiueola/zr-obp/blob/35f893fd660f627b0bb470d24eaf970aeb541109/obp/dataset/synthetic_slate.py#L49
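To illustrate why the ground-truth pscore is a natural default for the uniform random policy, here is a small sketch comparing the closed-form value with a sampling-based estimate. This is an assumption for illustration only: the review does not say how the approximated pscore is computed in the dataset class, and both function names below are hypothetical. For a uniform random ranking without replacement, each action is equally likely at every position, so the exact marginal is `1 / n_actions`.

```python
import random


def exact_uniform_marginal_pscore(n_actions):
    """Closed-form marginal pscore at any position under uniform random ranking."""
    return 1.0 / n_actions


def approximate_uniform_marginal_pscore(
    n_actions, len_list, position, action, n_samples, seed=0
):
    """Monte Carlo estimate: fraction of sampled slates with `action` at `position`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # random.sample draws without replacement, in random order
        slate = rng.sample(range(n_actions), len_list)
        hits += slate[position] == action
    return hits / n_samples


exact = exact_uniform_marginal_pscore(n_actions=5)
approx = approximate_uniform_marginal_pscore(
    n_actions=5, len_list=3, position=2, action=4, n_samples=20000
)
```

The estimate carries sampling noise and costs `n_samples` draws, while the exact value is free, which is consistent with preferring it as the default.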
tests
Fix the `pscore_item_position` target of the uniform random policy: https://github.com/aiueola/zr-obp/blob/d46cd4c9a626020a14c73014a21b8749b9dd88ad/tests/dataset/test_synthetic_slate.py#L229
others
Minor fixes to typos and docstrings.