banditpylib.learners.linear_bandit_learner

Classes

class banditpylib.learners.linear_bandit_learner.LinearBanditLearner(arm_num: int, name: Optional[str])

Abstract base class for learners that play with a linear bandit

Parameters
  • arm_num (int) – number of arms

  • name (Optional[str]) – alias name

Inheritance

[Figure: inheritance diagram of LinearBanditLearner]

property arm_num: int

Number of arms

property goal: banditpylib.learners.utils.Goal

Goal of the learner

property running_environment: Union[type, List[type]]

Type of bandit environment the learner plays with

class banditpylib.learners.linear_bandit_learner.LinUCB(features: List[numpy.ndarray], delta: float, lambda_reg: float, name: Optional[str] = None)

Linear Upper Confidence Bound policy

Todo

Add algorithm description.

Parameters
  • features (List[np.ndarray]) – feature vector of each arm in a list

  • delta (float) – confidence parameter delta used in the upper confidence bound

  • lambda_reg (float) – regularization parameter lambda

  • name (Optional[str]) – alias name
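
Since the algorithm description is still marked as a to-do, the following is a minimal sketch of how the three numeric parameters typically enter a textbook LinUCB index, written in plain NumPy. It is not taken from banditpylib's source and may differ from what the library computes. The function name linucb_indices, the arguments gram, b and t, the use of 1-D feature vectors, and the particular confidence radius (which assumes feature and parameter norms of at most 1 and 1-sub-Gaussian noise) are all assumptions made for illustration.

```python
import numpy as np


def linucb_indices(features, gram, b, delta, lambda_reg, t):
    """Textbook LinUCB index per arm (hypothetical helper, not banditpylib's API).

    features:   list of 1-D feature vectors, one per arm (assumed shape (d,))
    gram:       regularized Gram matrix, lambda_reg * I + sum of x x^T
    b:          sum of reward * x over past pulls
    t:          number of rounds played so far
    """
    d = features[0].shape[0]
    gram_inv = np.linalg.inv(gram)
    # Ridge-regression estimate of the unknown parameter vector.
    theta_hat = gram_inv @ b
    # One common choice of confidence radius (an assumption of this sketch).
    root_beta = np.sqrt(lambda_reg) + np.sqrt(
        2 * np.log(1 / delta) + d * np.log(1 + t / (lambda_reg * d)))
    # Estimated reward plus an exploration bonus for each arm.
    return np.array([
        x @ theta_hat + root_beta * np.sqrt(x @ gram_inv @ x)
        for x in features
    ])


# Two orthogonal arms, no data observed yet.
features = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
lambda_reg = 1.0
gram = lambda_reg * np.eye(2)
b = np.zeros(2)
indices = linucb_indices(features, gram, b,
                         delta=0.1, lambda_reg=lambda_reg, t=0)
```

Before any feedback arrives the estimate is zero, so both arms receive the same index and only the exploration bonus matters. In this sketch, a smaller delta widens the bonus, and a larger lambda_reg shrinks the estimate towards zero.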

Inheritance

[Figure: inheritance diagram of LinUCB]

actions(context: data_pb2.Context) → data_pb2.Actions

Actions of the learner

Parameters

context – contextual information about the bandit environment

Returns

actions to take

reset()

Reset the learner

Warning

This function should be called before the start of the game.

update(feedback: data_pb2.Feedback)

Update the learner

Parameters

feedback – feedback returned by the bandit environment after actions() is executed
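
The call order implied by the three methods above (reset() once before the game, then actions() and update() alternating each round) can be sketched with a small stand-in class. ToyLinUCB below is hypothetical and does not use banditpylib. It exchanges plain integers and floats rather than the data_pb2.Context, data_pb2.Actions and data_pb2.Feedback messages, and its index formula is a textbook choice, not necessarily the library's. The simulated parameter theta and the noise level are likewise assumptions.

```python
import numpy as np


class ToyLinUCB:
    """Hypothetical stand-in mirroring the reset/actions/update cycle."""

    def __init__(self, features, delta, lambda_reg):
        self.features = [np.asarray(x, dtype=float) for x in features]
        self.delta = delta
        self.lambda_reg = lambda_reg
        self.reset()

    def reset(self):
        # Clear all statistics; called before the game starts.
        d = self.features[0].shape[0]
        self.gram = self.lambda_reg * np.eye(d)
        self.b = np.zeros(d)
        self.t = 0
        self.last_arm = None

    def actions(self):
        # Pick the arm with the largest upper confidence bound.
        d = self.b.shape[0]
        gram_inv = np.linalg.inv(self.gram)
        theta_hat = gram_inv @ self.b
        root_beta = np.sqrt(self.lambda_reg) + np.sqrt(
            2 * np.log(1 / self.delta)
            + d * np.log(1 + self.t / (self.lambda_reg * d)))
        ucb = [x @ theta_hat + root_beta * np.sqrt(x @ gram_inv @ x)
               for x in self.features]
        self.last_arm = int(np.argmax(ucb))
        return self.last_arm

    def update(self, reward):
        # Fold the reward of the most recently chosen arm into the statistics.
        x = self.features[self.last_arm]
        self.gram += np.outer(x, x)
        self.b += reward * x
        self.t += 1


rng = np.random.default_rng(0)
theta = np.array([0.9, 0.1])  # hypothetical true parameter of the simulation
learner = ToyLinUCB([np.array([1.0, 0.0]), np.array([0.0, 1.0])],
                    delta=0.1, lambda_reg=1.0)

learner.reset()  # before the start of the game
pulls = [0, 0]
for _ in range(200):
    arm = learner.actions()
    reward = learner.features[arm] @ theta + 0.1 * rng.normal()
    learner.update(reward)  # feedback for the action just taken
    pulls[arm] += 1
```

Calling update() only after actions() matters in this sketch because the learner remembers which arm it last chose. The real class receives that information through the feedback message.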