Closed
Labels: bug, non-stationary (for non-stationary bandits simulations)
Description
My `Environment` class is becoming too messy, especially in how it handles the different cases of stationary and non-stationary bandits.
- I should remove the optimization that uses the `self._historyOfChangePoints` and `self._historyOfMeans` dictionaries in `NonStationaryMAB`, and just store the full list of means, of size K * T!
- I should just store this full list (I don't care about its size: it will be of the order of T anyway, which is the storage space already needed for the actual rewards!)
- And trick the `.maxArm` attribute of `env` in `Evaluator` so that the previously written methods for regret / last regret / accumulated rewards etc. stay the same.
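A minimal sketch of the "store the full list of means" idea, assuming the means are piecewise constant. The function names `full_means_history` and `cumulated_regret` and their arguments are hypothetical, not the actual `NonStationaryMAB` or `Evaluator` API; the point is only that a dense `(T, K)` array gives a per-time-step best mean that could play the role of `.maxArm`.

```python
import numpy as np

def full_means_history(change_points, means_per_segment, horizon):
    """Expand piecewise-constant means into a dense (horizon, K) array.

    change_points: sorted time steps where a segment starts (first one is 0).
    means_per_segment: one length-K sequence of means per change point.
    """
    means_per_segment = np.asarray(means_per_segment, dtype=float)
    all_means = np.empty((horizon, means_per_segment.shape[1]))
    # Each segment runs from its change point to the next one (or the horizon).
    bounds = list(change_points) + [horizon]
    for i, (start, stop) in enumerate(zip(bounds[:-1], bounds[1:])):
        all_means[start:stop] = means_per_segment[i]
    return all_means

def cumulated_regret(all_means, cumulated_rewards):
    """Regret measured against the best arm at each time step."""
    # Time-dependent analogue of a single max-arm mean: one value per step.
    best_mean_per_step = all_means.max(axis=1)
    return np.cumsum(best_mean_per_step) - cumulated_rewards
```

For example, `full_means_history([0, 3], [[0.1, 0.9], [0.8, 0.2]], 5)` yields five rows, the first three equal to `[0.1, 0.9]` and the last two to `[0.8, 0.2]`. With this layout the stationary case is just a single segment, so the same regret code could serve both cases.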
For plots
- Giving the number of breakpoints in the ylabel is fine, but maybe I could add red star markers on the x-axis at each breakpoint? Or a dashed vertical line?
- Fixed for regret (I think, at least for `PieceWiseStationaryMAB`)
- TODO: fix for best arm pull selections!
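Both marker ideas can be sketched with matplotlib as follows. The helper `mark_change_points` and its parameters are hypothetical, not part of the existing plotting code; it assumes the breakpoints are available as a list of time steps.

```python
import matplotlib.pyplot as plt

def mark_change_points(ax, change_points, use_lines=True):
    """Mark each breakpoint on an existing plot; returns the created artists."""
    artists = []
    for tau in change_points:
        if use_lines:
            # Dashed red vertical line spanning the full height of the axes.
            artists.append(ax.axvline(tau, color="red", linestyle="--",
                                      linewidth=1, alpha=0.7))
        else:
            # Red star sitting on the x-axis: x is in data coordinates,
            # y is in axes coordinates (0 = bottom of the axes).
            artists.extend(ax.plot([tau], [0], marker="*", color="red",
                                   linestyle="none", clip_on=False,
                                   transform=ax.get_xaxis_transform()))
    return artists
```

Calling `mark_change_points(ax, change_points)` after drawing the regret curve adds the dashed lines; passing `use_lines=False` draws the stars instead. Dashed lines are probably easier to read when the curves are close to the x-axis.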