banditpylib.learners.mab_fbbai_learner
Classes
MABFixedBudgetBAILearner
: Abstract class for best-arm identification learners playing with the ordinary multi-armed bandit
Uniform
: Uniform sampling policy
- class banditpylib.learners.mab_fbbai_learner.MABFixedBudgetBAILearner(arm_num: int, budget: int, name: Optional[str])
Abstract class for best-arm identification learners playing with the ordinary multi-armed bandit.
Learners of this kind aim to identify the best arm within a fixed budget of pulls.
- Parameters
arm_num (int) – number of arms
budget (int) – total number of pulls
name (Optional[str]) – alias name
- property arm_num: int
Number of arms
- abstract property best_arm: int
Index of the best arm identified by the learner
- property budget: int
Budget of the learner
- property goal: banditpylib.learners.utils.Goal
Goal of the learner
- property running_environment: Union[type, List[type]]
Type of bandit environment the learner plays with
- class banditpylib.learners.mab_fbbai_learner.Uniform(arm_num: int, budget: int, name: Optional[str] = None)
Uniform sampling policy
Play each arm the same number of times and then output the arm with the highest empirical mean.
- Parameters
arm_num (int) – number of arms
budget (int) – total number of pulls
name (Optional[str]) – alias name
- actions(context: data_pb2.Context) → data_pb2.Actions
Actions of the learner
- Parameters
context – contextual information about the bandit environment
- Returns
actions to take
- property best_arm: int
Index of the best arm identified by the learner
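The uniform sampling policy described above can be illustrated with a small standalone sketch. This is not banditpylib's implementation and does not use its `actions` / `data_pb2` interface; `uniform_best_arm` and the `pull` callback (which returns one stochastic reward for a given arm index) are hypothetical names introduced only for illustration. The sketch also assumes that any pulls left over after dividing the budget evenly are simply unused.

```python
import random
from typing import Callable, List


def uniform_best_arm(pull: Callable[[int], float], arm_num: int, budget: int) -> int:
    """Pull every arm equally often and return the arm with the highest empirical mean."""
    if budget < arm_num:
        raise ValueError("budget must allow at least one pull per arm")
    # Assumption of this sketch: leftover pulls (budget % arm_num) are not used.
    pulls_per_arm = budget // arm_num
    means: List[float] = []
    for arm in range(arm_num):
        total = sum(pull(arm) for _ in range(pulls_per_arm))
        means.append(total / pulls_per_arm)
    return max(range(arm_num), key=lambda arm: means[arm])


# Hypothetical usage: three Bernoulli arms simulated with a seeded generator.
rng = random.Random(0)
success_probs = [0.2, 0.5, 0.8]
guess = uniform_best_arm(lambda arm: float(rng.random() < success_probs[arm]),
                         arm_num=3, budget=300)
print("identified arm:", guess)
```

Because every arm receives the same number of pulls, the policy spends as much of the budget on clearly bad arms as on close contenders, which is the inefficiency the elimination-based policies below try to avoid.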
- class banditpylib.learners.mab_fbbai_learner.SH(arm_num: int, budget: int, threshold: int = 2, name: Optional[str] = None)
Sequential halving policy [KKS13]
Eliminate half of the remaining arms in each round.
- Parameters
arm_num (int) – number of arms
budget (int) – total number of pulls
threshold (int) – switch to uniform sampling once the number of remaining arms is no greater than this value
name (Optional[str]) – alias name
- actions(context: data_pb2.Context) → data_pb2.Actions
Actions of the learner
- Parameters
context – contextual information about the bandit environment
- Returns
actions to take
- property best_arm: int
Index of the best arm identified by the learner
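As a rough illustration of sequential halving with a `threshold`, the sketch below follows the usual textbook budget split (an equal share of the budget per halving round, with each round's estimates computed from that round's pulls only). It is a standalone approximation, not banditpylib's code: the function name, the `pull` callback, and the exact way the budget is divided between rounds and the final uniform phase are all assumptions made for this example.

```python
import math
from typing import Callable, Dict, List


def sequential_halving(pull: Callable[[int], float], arm_num: int,
                       budget: int, threshold: int = 2) -> int:
    """Halve the set of arms each round, then sample the survivors uniformly."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    remaining: List[int] = list(range(arm_num))
    # Assumed budget split: an equal share for each of ceil(log2(arm_num)) rounds.
    rounds = max(1, math.ceil(math.log2(arm_num)))
    used = 0

    def estimate(arms: List[int], pulls: int) -> Dict[int, float]:
        if pulls == 0:
            raise ValueError("budget too small for this number of arms")
        return {arm: sum(pull(arm) for _ in range(pulls)) / pulls for arm in arms}

    while len(remaining) > threshold:
        pulls = budget // (len(remaining) * rounds)
        means = estimate(remaining, pulls)
        used += pulls * len(remaining)
        # Keep the better half (rounded up) of the remaining arms.
        remaining.sort(key=lambda arm: means[arm], reverse=True)
        remaining = remaining[:math.ceil(len(remaining) / 2)]

    if len(remaining) == 1:
        return remaining[0]
    # Final phase: spend what is left of the budget uniformly on the survivors.
    means = estimate(remaining, (budget - used) // len(remaining))
    return max(remaining, key=lambda arm: means[arm])
```

With eight arms, a budget of 96 and the default threshold, this sketch pulls each of the 8 arms 4 times, each of the 4 survivors 8 times, and each of the final 2 arms 16 times, so later rounds concentrate the budget on the strongest candidates.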
- class banditpylib.learners.mab_fbbai_learner.SR(arm_num: int, budget: int, name: Optional[str] = None)
Successive rejects policy [AB10]
Eliminate one arm in each round.
- Parameters
arm_num (int) – number of arms
budget (int) – total number of pulls
name (Optional[str]) – alias name
- actions(context: data_pb2.Context) → data_pb2.Actions
Actions of the learner
- Parameters
context – contextual information about the bandit environment
- Returns
actions to take
- property best_arm: int
Index of the best arm identified by the learner
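The one-arm-per-round elimination of successive rejects can be sketched as follows, using the phase lengths commonly given for the algorithm in [AB10]: with K arms and budget n, each surviving arm has been pulled n_k = ceil((n - K) / (logbar(K) * (K + 1 - k))) times by the end of phase k, where logbar(K) = 1/2 + sum_{i=2..K} 1/i. This is a standalone sketch rather than banditpylib's implementation; `successive_rejects` and the `pull` callback are hypothetical names, and the sketch assumes `budget > arm_num` so that every phase makes at least one pull per arm.

```python
import math
from typing import Callable, List


def successive_rejects(pull: Callable[[int], float], arm_num: int, budget: int) -> int:
    """Drop the empirically worst arm at the end of each of arm_num - 1 phases."""
    if arm_num == 1:
        return 0
    if budget <= arm_num:
        # Assumption of this sketch, so the first phase has at least one pull per arm.
        raise ValueError("this sketch requires budget > arm_num")
    log_bar = 0.5 + sum(1.0 / i for i in range(2, arm_num + 1))
    active: List[int] = list(range(arm_num))
    sums = [0.0] * arm_num
    pulled_so_far = 0  # cumulative pulls per surviving arm (n_{k-1})
    for phase in range(1, arm_num):
        target = math.ceil((budget - arm_num) / (log_bar * (arm_num + 1 - phase)))
        for arm in active:
            for _ in range(target - pulled_so_far):
                sums[arm] += pull(arm)
        pulled_so_far = target
        # All active arms share the same pull count, so means are sums / target.
        worst = min(active, key=lambda arm: sums[arm] / target)
        active.remove(worst)
    return active[0]
```

Unlike the halving sketch, the empirical means here accumulate over all phases, so an arm that survives longer is judged on more samples.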