banditpylib.learners.linear_bandit_learner
Classes
LinearBanditLearner: Abstract class for learners playing with linear bandits
LinUCB: Linear Upper Confidence Bound policy
- class banditpylib.learners.linear_bandit_learner.LinearBanditLearner(arm_num: int, name: Optional[str])
Abstract class for learners playing with linear bandits
- Parameters
arm_num (int) – number of arms
name (Optional[str]) – alias name
- property arm_num: int
Number of arms
- property goal: banditpylib.learners.utils.Goal
Goal of the learner
- property running_environment: Union[type, List[type]]
Type of bandit environment the learner plays with
- class banditpylib.learners.linear_bandit_learner.LinUCB(features: List[numpy.ndarray], delta: float, lambda_reg: float, name: Optional[str] = None)
Linear Upper Confidence Bound policy
Todo
Add algorithm description.
- Parameters
features (List[np.ndarray]) – feature vector of each arm in a list
delta (float) – confidence parameter delta
lambda_reg (float) – regularization parameter lambda
name (Optional[str]) – alias name
- actions(context: data_pb2.Context) → data_pb2.Actions
Actions of the learner
- Parameters
context – contextual information about the bandit environment
- Returns
actions to take
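Since the algorithm description for LinUCB is still marked as a todo, here is a minimal stand-alone sketch of a generic linear UCB rule. It is not banditpylib's implementation and does not use its Context/Actions protobuf interface. The class name LinUCBSketch, the methods select_arm and update, and the exact form of the confidence radius are assumptions made for illustration. Only the constructor arguments (features, delta, lambda_reg) mirror the signature documented above.

```python
from typing import List

import numpy as np


class LinUCBSketch:
    """Hypothetical stand-alone linear UCB learner (illustration only)."""

    def __init__(self, features: List[np.ndarray], delta: float,
                 lambda_reg: float):
        # One feature vector per arm, flattened to shape (d,)
        self.features = [np.asarray(f, dtype=float).reshape(-1)
                         for f in features]
        self.delta = delta
        self.lambda_reg = lambda_reg
        self.dim = self.features[0].shape[0]
        # Regularized Gram matrix V = lambda * I + sum of x x^T
        self.gram = lambda_reg * np.eye(self.dim)
        # Reward-weighted feature sum b = sum of r * x
        self.b = np.zeros(self.dim)
        self.t = 0  # number of completed rounds

    def select_arm(self) -> int:
        # Ridge-regression estimate of the unknown parameter vector
        theta_hat = np.linalg.solve(self.gram, self.b)
        gram_inv = np.linalg.inv(self.gram)
        # Assumed confidence radius: one common choice when feature and
        # parameter norms are at most 1 and the noise is 1-sub-Gaussian.
        beta = np.sqrt(self.lambda_reg) + np.sqrt(
            2 * np.log(1 / self.delta)
            + self.dim * np.log(1 + self.t / (self.lambda_reg * self.dim)))
        # Upper confidence bound: estimated reward plus exploration bonus
        ucbs = [x @ theta_hat + beta * np.sqrt(x @ gram_inv @ x)
                for x in self.features]
        return int(np.argmax(ucbs))

    def update(self, arm: int, reward: float) -> None:
        # Fold the observed (feature, reward) pair into the statistics
        x = self.features[arm]
        self.gram += np.outer(x, x)
        self.b += reward * x
        self.t += 1
```

In a simulation loop, one would call select_arm() to choose an arm, observe a reward from the environment, and pass both to update(). The bonus term shrinks for arms whose feature direction has been sampled often, which is what shifts the learner from exploration to exploitation.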