Training on test set?

Hi, on the README you write:

```python
estimated_rewards_by_reg_model = regression_model.fit_predict(
    context=bandit_feedback_test["context"],
    action=bandit_feedback_test["action"],
    reward=bandit_feedback_test["reward"],
)
```

But this is basically fitting on test rewards. Is this legal?