Commit 73e065c

Merge pull request #150 from st-tech/feature/new-datasets
Add/Modify Dataset Class for Handling Multiple Loggers and Deficient Data
2 parents 4f075e9 + 10d7c81 commit 73e065c

38 files changed, with 2222 additions and 1210 deletions

README.md

Lines changed: 12 additions & 28 deletions
@@ -9,19 +9,19 @@
 [![arXiv](https://img.shields.io/badge/arXiv-2008.07146-b31b1b.svg)](https://arxiv.org/abs/2008.07146)
 
 [[arXiv]](https://arxiv.org/abs/2008.07146)
-# Open Bandit Pipeline: a research framework for bandit algorithms and off-policy evaluation
+[[NeurIPS2021 Proceedings]](https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/33e75ff09dd601bbe69f351039152189-Abstract-round2.html)
+# Open Bandit Pipeline: a research framework for off-policy evaluation and learning
 
 **[Docs](https://zr-obp.readthedocs.io/en/latest/)** | **[Google Group](https://groups.google.com/g/open-bandit-project)** | **[Tutorial](https://sites.google.com/cornell.edu/recsys2021tutorial)** | **[Installation](#installation)** | **[Usage](#usage)** | **[Slides](./slides/slides_EN.pdf)** | **[Quickstart](./examples/quickstart)** | **[Open Bandit Dataset](./obd)** | **[日本語](./README_JN.md)**
 
 <details>
 <summary><strong>Table of Contents</strong></summary>
 
-- [Open Bandit Pipeline: a research framework for bandit algorithms and off-policy evaluation](#open-bandit-pipeline-a-research-framework-for-bandit-algorithms-and-off-policy-evaluation)
+- [Open Bandit Pipeline: a research framework for off-policy evaluation and learning](#open-bandit-pipeline-a-research-framework-for-bandit-algorithms-and-off-policy-evaluation)
 - [Overview](#overview)
 - [Open Bandit Dataset (OBD)](#open-bandit-dataset-obd)
 - [Open Bandit Pipeline (OBP)](#open-bandit-pipeline-obp)
 - [Algorithms and OPE Estimators Supported](#algorithms-and-ope-estimators-supported)
-- [Topics and Tasks](#topics-and-tasks)
 - [Installation](#installation)
 - [Usage](#usage)
 - [(1) Data loading and preprocessing](#1-data-loading-and-preprocessing)
@@ -44,7 +44,7 @@
 *Open Bandit Dataset* is a public real-world logged bandit dataset.
 This dataset is provided by [ZOZO, Inc.](https://corp.zozo.com/en/about/profile/), the largest fashion e-commerce company in Japan.
 The company uses some multi-armed bandit algorithms to recommend fashion items to users in a large-scale fashion e-commerce platform called [ZOZOTOWN](https://zozo.jp/).
-The following figure presents examples of displayed fashion items as actions.
+The following figure presents the displayed fashion items as actions.
 The figure shows that there are three *positions* in the data.
 
 <div align="center"><img src="https://raw.githubusercontent.com/st-tech/zr-obp/master/images/recommended_fashion_items.png" width="45%"/></div>
@@ -56,7 +56,7 @@ The figure shows that there are three *positions* in the data.
 
 We collected the dataset in a 7-day experiment in late November 2019 on three “campaigns,” corresponding to all, men's, and women's items, respectively.
 Each campaign randomly used either the Uniform Random policy or the Bernoulli Thompson Sampling (Bernoulli TS) policy for the data collection.
-This dataset is unique in that it contains a set of *multiple* logged bandit datasets collected by running different policies on the same platform. This enables realistic and reproducible experimental comparisons of different OPE estimators for the first time (see Section 5 of the reference [paper](https://arxiv.org/abs/2008.07146) or the package [documentation](https://zr-obp.readthedocs.io/en/latest/evaluation_ope.html) for the details of the evaluation of OPE protocol with Open Bandit Dataset).
+Open Bandit Dataset is unique in that it contains a set of *multiple* logged bandit datasets collected by running different policies on the same platform. This enables realistic and reproducible experimental comparisons of different OPE estimators for the first time (see Section 5 of the reference [paper](https://arxiv.org/abs/2008.07146) or the package [documentation](https://zr-obp.readthedocs.io/en/latest/evaluation_ope.html) for the details of the evaluation of OPE protocol with Open Bandit Dataset).
 
 <div align="center"><img src="https://raw.githubusercontent.com/st-tech/zr-obp/master/images/obd_stats.png" width="90%"/></div>
 
@@ -123,45 +123,27 @@ Open Bandit Pipeline consists of the following main modules.
 - [Switch Estimators](https://arxiv.org/abs/1612.01205)
 - [More Robust Doubly Robust (MRDR)](https://arxiv.org/abs/1802.03493)
 - [Doubly Robust with Optimistic Shrinkage (DRos)](https://arxiv.org/abs/1907.09623)
+- [Sub-Gaussian Inverse Probability Weighting (SGIPW)](https://proceedings.neurips.cc/paper/2021/hash/4476b929e30dd0c4e8bdbcc82c6ba23a-Abstract.html)
+- [Sub-Gaussian Doubly Robust (SGDR)](https://proceedings.neurips.cc/paper/2021/hash/4476b929e30dd0c4e8bdbcc82c6ba23a-Abstract.html)
 - [Double Machine Learning (DML)](https://arxiv.org/abs/2002.08536)
 - OPE of Offline Slate Bandit Algorithms
 - [Independent Inverse Propensity Scoring (IIPS)](https://arxiv.org/abs/1804.10488)
 - [Reward Interaction Inverse Propensity Scoring (RIPS)](https://arxiv.org/abs/2007)
+- Cascade Doubly Robust (Cascade-DR)
 - OPE of Offline Bandit Algorithms with Continuous Actions
 - [Kernelized Inverse Probability Weighting](https://arxiv.org/abs/1802.06037)
 - [Kernelized Self-Normalized Inverse Probability Weighting](https://arxiv.org/abs/1802.06037)
 - [Kernelized Doubly Robust](https://arxiv.org/abs/1802.06037)
 
 </details>
 
-Please refer to Section 2/Appendix of the reference [paper](https://arxiv.org/abs/2008.07146) or the package [documentation](https://zr-obp.readthedocs.io/en/latest/ope.html) for the basic formulation of OPE and the definitions of supported OPE estimators.
+Please refer to Section 2/Appendix of the reference [paper](https://arxiv.org/abs/2008.07146) or the package [documentation](https://zr-obp.readthedocs.io/en/latest/ope.html) for the basic formulation of OPE and the supported estimators.
 Note that, in addition to the above algorithms and estimators, Open Bandit Pipeline provides flexible interfaces.
 Therefore, researchers can easily implement their own algorithms or estimators and evaluate them with our data and pipeline.
 Moreover, Open Bandit Pipeline provides an interface for handling real-world logged bandit data.
 Thus, practitioners can combine their own real-world data with Open Bandit Pipeline and easily evaluate bandit algorithms' performance in their settings with OPE.
 
 
-## Topics and Tasks
-Open Bandit Dataset and Pipeline facilitate the following research topics or practical tasks.
-
-### Research
-
-Researchers can evaluate the performance of their bandit algorithms (in bandit papers) or the accuracy of their OPE estimators (in OPE papers) in an easy, standardized manner with Open Bandit Pipeline. One can implement these types of experiments for their research papers using synthetic bandit data, multi-class classification data, or the real-world Open Bandit Dataset.
-
-- **Evaluation of Bandit Algorithms with Synthetic/Classification/Open Bandit Data**
-- **Evaluation of OPE with Synthetic/Classification/Open Bandit Data**
-
-In particular, we prepare some example experiments about the evaluation and comparison of OPE estimators in [examples](./examples/).
-
-### Practice
-
-Practitioners can improve their automated decision making systems using online/batch bandit policies implemented in the policy module. Moreover, they can easily evaluate such bandit policies using historical logged bandit data and OPE without A/B testing. Specifically, one can implement OPE of batch bandit algorithms with the standard OPE procedure introduced in [examples/quickstart/obd.ipynb](./examples/quickstart/obd.ipynb).
-
-- **Implementing Online/Offline(Batch) Bandit Algorithms**
-- **Off-Policy Evaluation of Online Bandit Algorithms**
-- **Off-Policy Evaluation of Offline(Batch) Bandit Algorithms**
-
-
 # Installation
 
 You can install OBP using Python's package manager `pip`.
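
Reviewer note: several estimators listed in the hunk above are variants of inverse probability weighting. For orientation, here is a minimal, context-free sketch of the basic idea, written with the standard library only. It does not use the OBP API. The click-through rates, the evaluation policy, and every function name below are assumptions made up for illustration; the only detail taken from the README is that a uniform random policy can act as the logging policy.

```python
import random
from typing import List, Tuple

# Hypothetical toy environment (not part of OBP): three actions with
# fixed expected rewards and no context features.
TRUE_CTR = [0.1, 0.3, 0.6]


def uniform_random_policy(n_actions: int) -> List[float]:
    """Action distribution of a uniform random logging policy."""
    return [1.0 / n_actions] * n_actions


def collect_logged_data(
    n_rounds: int, behavior_probs: List[float], rng: random.Random
) -> List[Tuple[int, int, float]]:
    """Simulate logged data as (action, reward, propensity) triples."""
    logs = []
    for _ in range(n_rounds):
        action = rng.choices(range(len(behavior_probs)), weights=behavior_probs)[0]
        reward = 1 if rng.random() < TRUE_CTR[action] else 0
        logs.append((action, reward, behavior_probs[action]))
    return logs


def ipw_estimate(
    logs: List[Tuple[int, int, float]], evaluation_probs: List[float]
) -> float:
    """Mean of rewards reweighted by pi_e(a) / pi_b(a)."""
    total = 0.0
    for action, reward, propensity in logs:
        total += evaluation_probs[action] / propensity * reward
    return total / len(logs)


rng = random.Random(12345)
logs = collect_logged_data(50_000, uniform_random_policy(len(TRUE_CTR)), rng)

# Hypothetical evaluation policy that favors the best action.
evaluation_probs = [0.0, 0.2, 0.8]
true_value = sum(p * q for p, q in zip(evaluation_probs, TRUE_CTR))
estimate = ipw_estimate(logs, evaluation_probs)
print(f"true value: {true_value:.3f}, IPW estimate: {estimate:.3f}")
```

Because the logging policy here gives every action a nonzero probability, the reweighted average is an unbiased estimate of the evaluation policy's value, and it should land close to the true value at this sample size.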
@@ -179,7 +161,7 @@ python setup.py install
 
 Open Bandit Pipeline supports Python 3.7 or newer. See [pyproject.toml](./pyproject.toml) for other requirements.
 
-# Usage Examples
+# Usage
 
 ## Example with Synthetic Bandit Data
 
@@ -343,6 +325,8 @@ Bibtex:
 }
 ```
 
+The paper has been accepted at *NeurIPS2021 Datasets and Benchmarks Track*. The camera-ready version of the paper is available [here](https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/33e75ff09dd601bbe69f351039152189-Abstract-round2.html).
+
 # Google Group
 If you are interested in the Open Bandit Project, you can follow the updates at its google group: https://groups.google.com/g/open-bandit-project
 

obp/dataset/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,7 @@
 from obp.dataset.synthetic_continuous import sign_synthetic_policy_continuous
 from obp.dataset.synthetic_continuous import SyntheticContinuousBanditDataset
 from obp.dataset.synthetic_continuous import threshold_synthetic_policy_continuous
+from obp.dataset.synthetic_multi import SyntheticBanditDatasetWithMultiLoggers
 from obp.dataset.synthetic_slate import action_interaction_reward_function
 from obp.dataset.synthetic_slate import linear_behavior_policy_logit
 from obp.dataset.synthetic_slate import SyntheticSlateBanditDataset
@@ -47,4 +48,5 @@
     "SyntheticSlateBanditDataset",
     "action_interaction_reward_function",
     "linear_behavior_policy_logit",
+    "SyntheticBanditDatasetWithMultiLoggers",
 ]
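
Reviewer note: the export added above is named for data logged by multiple loggers. The sketch below illustrates that idea only in general terms, using the standard library. It is not the implementation or signature of `SyntheticBanditDatasetWithMultiLoggers`, which this diff does not show. The function name, the record fields, and the two behavior policies are all hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List


def generate_multi_logger_data(
    behavior_policies: List[List[float]],
    rounds_per_logger: List[int],
    seed: int = 0,
) -> List[Dict[str, float]]:
    """Pool logged rounds from several behavior policies (loggers).

    Each record keeps the index of the logger that produced it and the
    propensity of the chosen action under that logger.
    """
    if len(behavior_policies) != len(rounds_per_logger):
        raise ValueError("need one round count per behavior policy")
    rng = random.Random(seed)
    records = []
    for logger_id, (probs, n_rounds) in enumerate(
        zip(behavior_policies, rounds_per_logger)
    ):
        for _ in range(n_rounds):
            action = rng.choices(range(len(probs)), weights=probs)[0]
            records.append(
                {"logger": logger_id, "action": action, "pscore": probs[action]}
            )
    return records


# Two hypothetical loggers: a uniform one and a skewed one.
behavior_policies = [[1 / 3, 1 / 3, 1 / 3], [0.7, 0.2, 0.1]]
records = generate_multi_logger_data(behavior_policies, rounds_per_logger=[300, 700])
logger_counts = Counter(r["logger"] for r in records)
print(logger_counts)
```

Keeping the logger index next to each propensity is one way to let a downstream estimator tell which policy generated a given round.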

obp/dataset/base.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ class BaseBanditDataset(metaclass=ABCMeta):
 
     @abstractmethod
     def obtain_batch_bandit_feedback(self) -> None:
-        """Obtain batch logged bandit feedback."""
+        """Obtain batch logged bandit data."""
         raise NotImplementedError
 
 
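
Reviewer note: the hunk above only touches a docstring, but it shows the abstract interface that dataset classes implement. A minimal sketch of how a subclass might satisfy it follows. `BaseBanditDataset` is re-declared here as a stand-in so the snippet runs without OBP installed. `InMemoryBanditDataset` and the keys of the dictionary it returns are hypothetical and not taken from the library.

```python
from abc import ABCMeta, abstractmethod
from typing import Dict, List


class BaseBanditDataset(metaclass=ABCMeta):
    """Stand-in mirroring the abstract interface shown in the diff."""

    @abstractmethod
    def obtain_batch_bandit_feedback(self) -> None:
        """Obtain batch logged bandit data."""
        raise NotImplementedError


class InMemoryBanditDataset(BaseBanditDataset):
    """Hypothetical subclass that serves pre-collected logs from memory."""

    def __init__(self, actions: List[int], rewards: List[int]) -> None:
        if len(actions) != len(rewards):
            raise ValueError("actions and rewards must have the same length")
        self.actions = actions
        self.rewards = rewards

    def obtain_batch_bandit_feedback(self) -> Dict[str, object]:
        # Returning a dictionary is an assumption made for this sketch.
        return {
            "n_rounds": len(self.actions),
            "action": self.actions,
            "reward": self.rewards,
        }


dataset = InMemoryBanditDataset(actions=[0, 2, 1], rewards=[1, 0, 1])
feedback = dataset.obtain_batch_bandit_feedback()
print(feedback["n_rounds"])
```

Because the base class uses `ABCMeta` with an abstract method, instantiating it directly raises `TypeError`; only subclasses that override `obtain_batch_bandit_feedback` can be created.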
