`banditpylib.learners.thresholding_bandit_learner`

Classes

ThresholdingBanditLearner: Abstract class for learners playing with the thresholding bandit
APT: Anytime Parameter-free Thresholding algorithm
Uniform: Uniform Sampling

class banditpylib.learners.thresholding_bandit_learner.ThresholdingBanditLearner(arm_num: int, name: Optional[str])

Abstract class for learners playing with the thresholding bandit

Parameters

arm_num (int) – number of arms
name (Optional[str]) – alias name

property arm_num: int

Number of arms

property running_environment: Union[type, List[type]]

Type of bandit environment the learner plays with

class banditpylib.learners.thresholding_bandit_learner.APT(arm_num: int, theta: float, eps: float, name: Optional[str] = None)

Anytime Parameter-free Thresholding algorithm [LGC16]

Parameters

arm_num (int) – number of arms
theta (float) – threshold
eps (float) – radius of the indifference zone
name (Optional[str]) – alias name
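
To make the roles of `theta` and `eps` concrete, here is a minimal standalone sketch of the arm-selection rule as it is usually stated for APT: pull every arm once, then pull the arm with the smallest index sqrt(T_i) * (|mean_i - theta| + eps). It does not use banditpylib; the helper names `apt_choose_arm` and `run_apt` and the Bernoulli reward simulation are assumptions made for illustration, and the library's own implementation may differ in detail.

```python
import math
import random


def apt_choose_arm(pulls, means, theta, eps):
    """Pick the next arm under a sketch of the APT index rule (hypothetical helper)."""
    # Assumption: every arm is pulled once before the index is used.
    for arm, count in enumerate(pulls):
        if count == 0:
            return arm
    # Index: sqrt(T_i) * (|empirical mean - theta| + eps); pull the smallest.
    scores = [
        math.sqrt(n) * (abs(m - theta) + eps) for n, m in zip(pulls, means)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


def run_apt(true_means, theta, eps, budget, seed=0):
    """Simulate the rule on Bernoulli arms (simulation setup is an assumption)."""
    rng = random.Random(seed)
    arm_num = len(true_means)
    pulls = [0] * arm_num
    means = [0.0] * arm_num
    for _ in range(budget):
        arm = apt_choose_arm(pulls, means, theta, eps)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / pulls[arm]
    # Report the arms whose empirical mean is at or above the threshold.
    above = [i for i in range(arm_num) if means[i] >= theta]
    return pulls, above
```

Arms whose empirical mean sits close to `theta` get a small index and are therefore sampled more often, while `eps` keeps the index positive for an arm whose empirical mean equals the threshold.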

actions(context: data_pb2.Context) → data_pb2.Actions

Actions of the learner

Parameters: context – contextual information about the bandit environment
Returns: actions to take

property goal: banditpylib.learners.utils.Goal

Goal of the learner

reset()

Reset the learner

Warning

This function should be called before the start of the game.

update(feedback: data_pb2.Feedback)

Update the learner

Parameters: feedback – feedback returned by the bandit environment after actions() is executed
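
The call order implied by `reset()`, `actions()` and `update()` can be sketched with a toy stand-in. `ToyRoundRobinLearner` and `play` below are hypothetical and are not part of banditpylib; they pass a plain arm index and an `(arm, reward)` tuple where the real methods take `data_pb2.Context` and `data_pb2.Feedback` messages, whose fields are not described on this page.

```python
class ToyRoundRobinLearner:
    """Hypothetical stand-in that mirrors the reset/actions/update method names."""

    def __init__(self, arm_num):
        self.arm_num = arm_num
        self.reset()

    def reset(self):
        # Clear all statistics; called before the start of each game.
        self.pulls = [0] * self.arm_num
        self.means = [0.0] * self.arm_num
        self.next_arm = 0

    def actions(self, context=None):
        # Assumption: an action is a single arm index, not a protobuf message.
        return self.next_arm

    def update(self, feedback):
        # Assumption: feedback is an (arm, reward) tuple for the last action.
        arm, reward = feedback
        self.pulls[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.pulls[arm]
        self.next_arm = (arm + 1) % self.arm_num


def play(learner, reward_fn, horizon):
    """Drive one game: reset once, then alternate actions() and update()."""
    learner.reset()
    for _ in range(horizon):
        arm = learner.actions()
        learner.update((arm, reward_fn(arm)))
    return learner
```

Here `reset()` runs exactly once per game, and each `update()` receives the feedback for the most recent `actions()` call.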

class banditpylib.learners.thresholding_bandit_learner.Uniform(arm_num: int, theta: float, eps: float, name: Optional[str] = None)

Uniform Sampling

Sample each arm in a round-robin way.

Parameters

arm_num (int) – number of arms
theta (float) – threshold
eps (float) – radius of the indifference zone
name (Optional[str]) – alias name
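
As an illustration of round-robin sampling, the following standalone sketch lists which arm would be pulled at each step for a given budget. The helper names `round_robin_schedule` and `pull_counts` are hypothetical, and starting the cycle at arm 0 is an assumption; the sketch does not call banditpylib.

```python
from collections import Counter
from itertools import cycle, islice


def round_robin_schedule(arm_num, budget):
    """Return the arm pulled at each of `budget` steps (hypothetical helper)."""
    # Assumption: arms are visited in index order, starting from arm 0.
    return list(islice(cycle(range(arm_num)), budget))


def pull_counts(arm_num, budget):
    """Count how many times each arm is pulled under the schedule above."""
    counts = Counter(round_robin_schedule(arm_num, budget))
    return [counts[arm] for arm in range(arm_num)]
```

For example, `round_robin_schedule(3, 7)` returns `[0, 1, 2, 0, 1, 2, 0]`, so the pull counts of any two arms differ by at most one regardless of the observed rewards.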

actions(context: data_pb2.Context) → data_pb2.Actions

Actions of the learner

Parameters: context – contextual information about the bandit environment
Returns: actions to take

property goal: banditpylib.learners.utils.Goal

Goal of the learner

reset()

Reset the learner

Warning

This function should be called before the start of the game.

update(feedback: data_pb2.Feedback)

Update the learner

Parameters: feedback – feedback returned by the bandit environment after actions() is executed
