`banditpylib.learners`¶

Classes ¶

Goal: Abstract class for the goal of a learner
IdentifyBestArm: Best arm identification
MaximizeTotalRewards: Reward maximization
MaximizeCorrectAnswers: Maximize correct answers
MakeAllAnswersCorrect: Make all answers correct
Learner: Abstract class for learners
SinglePlayerLearner: Abstract class for single player learners
CollaborativeLearner: Abstract class for collaborative learners
CollaborativeAgent: Abstract class for collaborative agents
CollaborativeMaster: Abstract class for collaborative masters that handle arm assignment and elimination

class banditpylib.learners.Goal[source]¶

Abstract class for the goal of a learner

abstract property name: str¶: Name of the goal
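
A minimal sketch of the pattern this class describes, assuming it is built on Python's `abc` module. The `Goal` class below is a stand-in written for this example, not the library's own, and `MinimizeRegret` is a hypothetical goal used only for illustration.

```python
from abc import ABC, abstractmethod


class Goal(ABC):
    """Stand-in for an abstract goal (not the banditpylib class)."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Name of the goal."""


class MinimizeRegret(Goal):
    """Hypothetical concrete goal, used only for illustration."""

    @property
    def name(self) -> str:
        return "minimize_regret"


goal = MinimizeRegret()
print(goal.name)

# The abstract base cannot be instantiated because `name` is abstract.
try:
    Goal()
    instantiated = True
except TypeError:
    instantiated = False
print("abstract base instantiated:", instantiated)
```

Every concrete goal has to provide `name`; leaving it out makes the subclass abstract as well.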

class banditpylib.learners.IdentifyBestArm(best_arm: data_pb2.Arm)[source]¶

Best arm identification

Parameters: best_arm (Arm) – best arm identified by the learner

property name: str¶: Name of the goal

class banditpylib.learners.MaximizeTotalRewards[source]¶

Reward maximization

property name: str¶: Name of the goal

class banditpylib.learners.MaximizeCorrectAnswers(answers: List[int])[source]¶

Maximize correct answers

This is used by thresholding bandit learners.

Parameters: answers (List[int]) – answers obtained by the learner

property name: str¶: Name of the goal

class banditpylib.learners.MakeAllAnswersCorrect(answers: List[int])[source]¶

Make all answers correct

This is used by thresholding bandit learners.

Parameters: answers (List[int]) – answers obtained by the learner

property name: str¶: Name of the goal
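
The two thresholding goals differ in how a list of answers is judged. The sketch below is one plausible reading, assuming an answer of 1 means "the arm's mean is at or above the threshold" and 0 means it is below. The helper names, the answer encoding, and the threshold value are assumptions made for illustration and are not part of the library's API.

```python
from typing import List


def count_correct(answers: List[int], means: List[float],
                  threshold: float) -> int:
    """Number of arms whose answer matches the ground truth."""
    # Assumed encoding: 1 if the arm's mean is >= threshold, else 0.
    truth = [1 if mean >= threshold else 0 for mean in means]
    return sum(1 for answer, label in zip(answers, truth) if answer == label)


def all_correct(answers: List[int], means: List[float],
                threshold: float) -> bool:
    """True only when every single answer matches the ground truth."""
    return count_correct(answers, means, threshold) == len(means)


means = [0.1, 0.6, 0.9]
answers = [0, 1, 0]  # the last answer is wrong

num_correct = count_correct(answers, means, threshold=0.5)
everything_correct = all_correct(answers, means, threshold=0.5)
print(num_correct)         # 2
print(everything_correct)  # False
```

Under this reading, a learner can score well on the first goal while still failing the second.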

class banditpylib.learners.Learner(name: Optional[str])[source]¶

Abstract class for learners

Parameters: name (Optional[str]) – alias name

abstract property goal: banditpylib.learners.utils.Goal¶: Goal of the learner

property name: str¶: Name of the learner

abstract reset()[source]¶: Reset the learner

Warning

This function should be called before the start of the game.

abstract property running_environment: Union[type, List[type]]¶: Type of bandit environment the learner plays with

class banditpylib.learners.SinglePlayerLearner(name: Optional[str])[source]¶

Abstract class for single player learners

Parameters: name (Optional[str]) – alias name

abstract actions(context: data_pb2.Context) → data_pb2.Actions[source]¶

Actions of the learner

Parameters: context – contextual information about the bandit environment
Returns: actions to take

abstract update(feedback: data_pb2.Feedback)[source]¶

Update the learner

Parameters: feedback – feedback returned by the bandit environment after actions() is executed
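
The methods above suggest a simple interaction loop: call reset() once, then alternate actions() and update(). The sketch below shows that loop with a stand-in learner. It uses plain tuples in place of the `data_pb2` protobuf messages, and `ToyLearner`, `play`, and the fixed per-arm rewards are all hypothetical.

```python
from typing import List, Optional, Tuple


class ToyLearner:
    """Stand-in with the same method names as a single player learner."""

    def __init__(self, arm_num: int, name: Optional[str] = None):
        self._arm_num = arm_num
        self._name = name if name is not None else "toy_learner"
        self.reset()

    @property
    def name(self) -> str:
        return self._name

    def reset(self) -> None:
        # Forget everything learnt in a previous game.
        self.pulls = [0] * self._arm_num
        self.total_rewards = [0.0] * self._arm_num

    def actions(self, context=None) -> List[Tuple[int, int]]:
        # Returns (arm id, number of pulls) pairs.
        # Pull each arm once, then the arm with the best empirical mean.
        for arm_id, pulls in enumerate(self.pulls):
            if pulls == 0:
                return [(arm_id, 1)]
        means = [r / p for r, p in zip(self.total_rewards, self.pulls)]
        return [(means.index(max(means)), 1)]

    def update(self, feedback: List[Tuple[int, float, int]]) -> None:
        # Feedback entries are (arm id, summed reward, number of pulls).
        for arm_id, reward, pulls in feedback:
            self.total_rewards[arm_id] += reward
            self.pulls[arm_id] += pulls


def play(learner: ToyLearner, arm_rewards: List[float],
         horizon: int) -> List[int]:
    """Run one game against deterministic rewards; return pulls per arm."""
    learner.reset()  # must happen before the game starts
    for _ in range(horizon):
        actions = learner.actions(context=None)
        feedback = [(arm_id, arm_rewards[arm_id] * pulls, pulls)
                    for arm_id, pulls in actions]
        learner.update(feedback)
    return learner.pulls


pulls = play(ToyLearner(arm_num=3), arm_rewards=[0.2, 0.8, 0.5], horizon=10)
print(pulls)  # [1, 8, 1]
```

Calling reset() inside `play` keeps state from an earlier game out of the new one, which is the situation the warning on reset() is about.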

class banditpylib.learners.CollaborativeLearner(agent: banditpylib.learners.utils.CollaborativeAgent, master: banditpylib.learners.utils.CollaborativeMaster, num_agents: int, name: Optional[str] = None)[source]¶

Abstract class for collaborative learners

Parameters

agent (CollaborativeAgent) – one instance of a collaborative agent
master (CollaborativeMaster) – instance of a collaborative master
num_agents (int) – total number of agents involved
name (Optional[str]) – alias name

property agents: List[banditpylib.learners.utils.CollaborativeAgent]¶: Involved agents

property master: banditpylib.learners.utils.CollaborativeMaster¶: Controlling master

reset()[source]¶: Reset the learner

Warning

This function should be called before the start of the game.

class banditpylib.learners.CollaborativeAgent(name: Optional[str])[source]¶

Abstract class for collaborative agents

Parameters: name (Optional[str]) – alias name

abstract actions(context: data_pb2.Context) → data_pb2.Actions[source]¶

Actions of the agent

Parameters: context – contextual information about the bandit environment
Returns: actions to take

abstract broadcast() → Dict[int, Tuple[float, int]][source]¶

Broadcast information learnt in the current round

Returns: a dict mapping each arm id to a tuple of the average reward seen and the number of pulls used to compute that average

property name: str¶: Name of the agent

abstract reset()[source]¶: Reset the agent

Warning

This function should be called before the start of each game.

abstract set_input_arms(arms: List[int])[source]¶

Assign a set of arms to the agent

Parameters: arms – arm indices that have been assigned

abstract update(feedback: data_pb2.Feedback)[source]¶

Update the agent

Parameters: feedback – feedback returned by the bandit environment after actions() is executed
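
To make the shape of the broadcast() return value concrete, here is a minimal sketch of an agent that tracks rewards for its assigned arms. `ToyAgent` is a stand-in rather than a subclass of the library class, and `record` is a hypothetical helper that replaces update(feedback) so the example does not depend on the protobuf types.

```python
from typing import Dict, List, Tuple


class ToyAgent:
    """Stand-in agent that tracks rewards for its assigned arms."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self._arms: List[int] = []
        self._rewards: Dict[int, float] = {}
        self._pulls: Dict[int, int] = {}

    def set_input_arms(self, arms: List[int]) -> None:
        # Start fresh statistics for the newly assigned arms.
        self._arms = list(arms)
        self._rewards = {arm_id: 0.0 for arm_id in arms}
        self._pulls = {arm_id: 0 for arm_id in arms}

    def record(self, arm_id: int, reward: float) -> None:
        """Hypothetical helper standing in for update(feedback)."""
        self._rewards[arm_id] += reward
        self._pulls[arm_id] += 1

    def broadcast(self) -> Dict[int, Tuple[float, int]]:
        # arm id -> (average reward seen, number of pulls behind it)
        return {
            arm_id: (self._rewards[arm_id] / self._pulls[arm_id],
                     self._pulls[arm_id])
            for arm_id in self._arms
            if self._pulls[arm_id] > 0
        }


agent = ToyAgent()
agent.set_input_arms([2, 5])
agent.record(2, 1.0)
agent.record(2, 0.0)
agent.record(5, 1.0)
message = agent.broadcast()
print(message)  # {2: (0.5, 2), 5: (1.0, 1)}
```

Reporting the pull count next to each average lets the receiver weight averages from different agents correctly.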

class banditpylib.learners.CollaborativeMaster(name: Optional[str])[source]¶

Abstract class for collaborative masters that handle arm assignment and elimination

Parameters: name (Optional[str]) – alias name

abstract elimination(messages: Dict[int, Dict[int, Tuple[float, int]]]) → Dict[int, List[int]][source]¶

Update the set of active arms based on some criteria and return arm assignment

Parameters: messages – dict of messages broadcast by the agents, keyed by agent id
Returns: arm assignment per agent
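
The sketch below illustrates the data flow implied by the signature: per-agent messages go in, a per-agent arm assignment comes out. The pooling by pull count, the `gap` rule for dropping arms, and the round-robin reassignment are assumptions chosen for illustration; the abstract method leaves the actual criteria to each subclass.

```python
from typing import Dict, List, Tuple

Message = Dict[int, Tuple[float, int]]  # arm id -> (average reward, pulls)


def pool_messages(messages: Dict[int, Message]) -> Message:
    """Combine per-agent averages into one pull-weighted average per arm."""
    reward_sums: Dict[int, float] = {}
    pull_sums: Dict[int, int] = {}
    for message in messages.values():
        for arm_id, (mean, pulls) in message.items():
            reward_sums[arm_id] = reward_sums.get(arm_id, 0.0) + mean * pulls
            pull_sums[arm_id] = pull_sums.get(arm_id, 0) + pulls
    return {
        arm_id: (reward_sums[arm_id] / pull_sums[arm_id], pull_sums[arm_id])
        for arm_id in reward_sums
        if pull_sums[arm_id] > 0
    }


def elimination(messages: Dict[int, Message],
                gap: float = 0.2) -> Dict[int, List[int]]:
    """Drop arms far below the best pooled mean, then reassign the rest."""
    pooled = pool_messages(messages)
    best_mean = max(mean for mean, _ in pooled.values())
    # Hypothetical rule: keep arms within `gap` of the best pooled mean.
    active = sorted(arm_id for arm_id, (mean, _) in pooled.items()
                    if mean >= best_mean - gap)
    # Hand out the surviving arms to agents in round-robin order.
    agent_ids = sorted(messages)
    assignment: Dict[int, List[int]] = {agent_id: [] for agent_id in agent_ids}
    for index, arm_id in enumerate(active):
        assignment[agent_ids[index % len(agent_ids)]].append(arm_id)
    return assignment


messages = {
    0: {0: (0.9, 10), 1: (0.3, 10)},
    1: {1: (0.5, 10), 2: (0.8, 10)},
}
assignment = elimination(messages)
print(assignment)  # {0: [0], 1: [2]}
```

Here arm 1 is reported by both agents; its pooled mean is about 0.4, which falls more than `gap` below the best mean of 0.9, so it is eliminated.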

abstract initial_arm_assignment() → Dict[int, List[int]][source]¶

The arm assignment before the first round

Returns: arm assignment per agent for all agents

property name: str¶: Name of the master

abstract reset()[source]¶: Reset the master

Warning

This function should be called before the start of each game.
