`banditpylib.learners.linear_bandit_learner`

Classes

LinearBanditLearner: Abstract class for learners that play linear bandits
LinUCB: Linear Upper Confidence Bound policy

class banditpylib.learners.linear_bandit_learner.LinearBanditLearner(arm_num: int, name: Optional[str])

Abstract class for learners that play linear bandits

Parameters

arm_num (int) – number of arms
name (Optional[str]) – alias name

property arm_num: int – Number of arms

property goal: banditpylib.learners.utils.Goal – Goal of the learner

property running_environment: Union[type, List[type]] – Type of bandit environment the learner plays with

class banditpylib.learners.linear_bandit_learner.LinUCB(features: List[numpy.ndarray], delta: float, lambda_reg: float, name: Optional[str] = None)

Linear Upper Confidence Bound policy

Todo

Add algorithm description.

Parameters

features (List[np.ndarray]) – list of feature vectors, one per arm
delta (float) – confidence parameter delta
lambda_reg (float) – regularization parameter lambda
name (Optional[str]) – alias name
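
Because the algorithm description is still marked as a to-do, here is a minimal NumPy sketch of one textbook LinUCB step. It is not banditpylib's implementation: the helper names (`confidence_radius`, `choose_arm`, `observe`) are hypothetical, and the confidence radius is one common form that assumes bounded feature and parameter norms. `features`, `delta`, and `lambda_reg` play the roles of the constructor parameters above.

```python
import numpy as np


def confidence_radius(d, t, delta, lambda_reg):
    """One common confidence radius for LinUCB (an assumed, textbook form)."""
    return np.sqrt(lambda_reg) + np.sqrt(
        2 * np.log(1 / delta) + d * np.log(1 + (t - 1) / (lambda_reg * d)))


def choose_arm(features, gram, moment, t, delta, lambda_reg):
    """Return the index of the arm with the largest upper confidence bound.

    gram   -- d x d matrix: lambda_reg * I plus the sum of x x^T over pulls
    moment -- d-vector: sum of reward * x over pulls
    t      -- 1-based round number
    """
    d = len(features[0])
    gram_inv = np.linalg.inv(gram)
    theta_hat = gram_inv @ moment  # ridge-regression estimate
    radius = confidence_radius(d, t, delta, lambda_reg)
    # Estimated reward plus an exploration bonus for each arm.
    ucb = [x @ theta_hat + radius * np.sqrt(x @ gram_inv @ x)
           for x in features]
    return int(np.argmax(ucb))


def observe(gram, moment, x, reward):
    """Fold one (feature vector, reward) observation into the statistics."""
    return gram + np.outer(x, x), moment + reward * x
```

A possible usage pattern: start from `gram = lambda_reg * np.eye(d)` and `moment = np.zeros(d)`, call `choose_arm` each round, then pass the pulled arm's feature vector and the observed reward to `observe`.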

actions(context: data_pb2.Context) → data_pb2.Actions

Actions of the learner

Parameters: context – contextual information about the bandit environment
Returns: actions to take

reset()

Reset the learner

Warning

This function should be called before the start of the game.

update(feedback: data_pb2.Feedback)

Update the learner

Parameters: feedback – feedback returned by the bandit environment after actions() is executed
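
The call order implied by these three methods (reset first, then alternate actions and update) can be sketched with a toy stand-in. Everything below is hypothetical and does not use banditpylib: `ToyLearner` and `play` are illustrative names, and plain Python values replace the `data_pb2` protobuf messages.

```python
import random


class ToyLearner:
    """Hypothetical greedy learner exposing reset/actions/update."""

    def __init__(self, arm_num):
        self.arm_num = arm_num
        self.reset()

    def reset(self):
        # Clear all statistics; call before the start of each game.
        self.pulls = [0] * self.arm_num
        self.totals = [0.0] * self.arm_num

    def actions(self, context=None):
        # Pull every arm once, then pick the best empirical mean.
        for arm in range(self.arm_num):
            if self.pulls[arm] == 0:
                return arm
        return max(range(self.arm_num),
                   key=lambda a: self.totals[a] / self.pulls[a])

    def update(self, feedback):
        # Here feedback is a plain (arm, reward) tuple.
        arm, reward = feedback
        self.pulls[arm] += 1
        self.totals[arm] += reward


def play(learner, means, horizon, seed=0):
    """Run one game: reset, then alternate actions() and update()."""
    rng = random.Random(seed)
    learner.reset()
    for _ in range(horizon):
        arm = learner.actions()
        reward = means[arm] + rng.gauss(0.0, 0.1)  # simulated environment
        learner.update((arm, reward))
    return learner.pulls
```

For instance, `play(ToyLearner(2), [0.1, 0.9], horizon=50)` returns the number of pulls per arm after a 50-round game.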
