banditpylib.learners
Submodules
Classes
Goal
: Abstract class for the goal of a learner
IdentifyBestArm
: Best arm identification
MaximizeTotalRewards
: Reward maximization
MaximizeCorrectAnswers
: Maximize correct answers
MakeAllAnswersCorrect
: Make all answers correct
Learner
: Abstract class for learners
SinglePlayerLearner
: Abstract class for single player learners
CollaborativeLearner
: Abstract class for collaborative learners
CollaborativeAgent
: Abstract class for collaborative agents
CollaborativeMaster
: Abstract class for collaborative masters that handle arm assignment and elimination
- class banditpylib.learners.Goal
Abstract class for the goal of a learner
Inheritance
- abstract property name: str
Name of the goal
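The abstract `name` property can be illustrated with a minimal sketch. The `Goal` class below is a local stand-in built with the standard `abc` module, not the library's implementation, and `MinimizeRegret` is a hypothetical subclass invented for this example.

```python
from abc import ABC, abstractmethod


class Goal(ABC):
    """Local stand-in mirroring the documented interface (not the library class)."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Name of the goal."""


class MinimizeRegret(Goal):
    """Hypothetical goal, used only for illustration."""

    @property
    def name(self) -> str:
        return "minimize_regret"


# The abstract base cannot be instantiated; a subclass that
# implements `name` can.
goal = MinimizeRegret()
goal_name = goal.name
```

Because `name` is abstract, a subclass that omits it would fail at instantiation time rather than at the first property access.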
- class banditpylib.learners.IdentifyBestArm(best_arm: data_pb2.Arm)
Best arm identification
- Parameters
best_arm (Arm) – best arm identified by the learner
Inheritance
- property name: str
Name of the goal
- class banditpylib.learners.MaximizeTotalRewards
Reward maximization
Inheritance
- property name: str
Name of the goal
- class banditpylib.learners.MaximizeCorrectAnswers(answers: List[int])
Maximize correct answers
This is used by thresholding bandit learners.
- Parameters
answers (List[int]) – answers obtained by the learner
Inheritance
- property name: str
Name of the goal
- class banditpylib.learners.MakeAllAnswersCorrect(answers: List[int])
Make all answers correct
This is used by thresholding bandit learners.
- Parameters
answers (List[int]) – answers obtained by the learner
Inheritance
- property name: str
Name of the goal
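The difference between the two thresholding goals can be sketched as follows. This is an illustration under stated assumptions, not the library's scoring code: it assumes each answer is 0 or 1 and marks whether the learner believes an arm's mean is at least the threshold. The helper names `count_correct` and `all_correct` are hypothetical.

```python
from typing import List


def count_correct(answers: List[int], truth: List[int]) -> int:
    """Number of arms answered correctly (the 'maximize correct answers' view)."""
    if len(answers) != len(truth):
        raise ValueError("answers and truth must have the same length")
    return sum(int(a == t) for a, t in zip(answers, truth))


def all_correct(answers: List[int], truth: List[int]) -> bool:
    """True only when every arm is answered correctly (the 'all correct' view)."""
    return count_correct(answers, truth) == len(truth)


# Assumed encoding: 1 if the arm's mean is at least the threshold, else 0.
means = [0.2, 0.6, 0.9]
threshold = 0.5
truth = [int(mean >= threshold) for mean in means]

answers = [0, 1, 0]  # the learner misjudges the last arm
num_correct = count_correct(answers, truth)
everything_correct = all_correct(answers, truth)
```

Under this encoding a learner can score well on the first criterion while still failing the second, since a single wrong answer is enough to fail it.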
- class banditpylib.learners.Learner(name: Optional[str])
Abstract class for learners
- Parameters
name (Optional[str]) – alias name
Inheritance
- abstract property goal: banditpylib.learners.utils.Goal
Goal of the learner
- property name: str
Name of the learner
- abstract reset()
Reset the learner
Warning
This function should be called before the start of the game.
- abstract property running_environment: Union[type, List[type]]
Type of bandit environment the learner plays with
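The members documented for `Learner` can be sketched with a local stand-in class. Everything below is an assumption made for illustration: the stand-in `Learner`, the `FakeBandit` environment type, the `RoundRobinLearner` subclass, its `next_arm` method, and the fallback to the class name when no alias is given. None of it is the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Union


class Learner(ABC):
    """Local stand-in mirroring the documented members; not the library class."""

    def __init__(self, name: Optional[str]):
        # Assumption: use the class name when no alias is supplied.
        self._name = name if name is not None else type(self).__name__

    @property
    def name(self) -> str:
        return self._name

    @property
    @abstractmethod
    def goal(self) -> object:
        """Goal of the learner."""

    @property
    @abstractmethod
    def running_environment(self) -> Union[type, List[type]]:
        """Type of bandit environment the learner plays with."""

    @abstractmethod
    def reset(self) -> None:
        """Reset the learner."""


class FakeBandit:
    """Hypothetical environment type, for illustration only."""


class RoundRobinLearner(Learner):
    """Hypothetical learner that cycles through the arms in order."""

    def __init__(self, arm_num: int, name: Optional[str] = None):
        super().__init__(name)
        self._arm_num = arm_num
        self._next_arm = 0

    @property
    def goal(self) -> object:
        return "maximize_total_rewards"  # placeholder for a Goal instance

    @property
    def running_environment(self) -> type:
        return FakeBandit

    def reset(self) -> None:
        self._next_arm = 0

    def next_arm(self) -> int:
        arm = self._next_arm
        self._next_arm = (arm + 1) % self._arm_num
        return arm


learner = RoundRobinLearner(arm_num=3)
learner.reset()  # called before the game starts, as the warning requires
first_game = [learner.next_arm() for _ in range(4)]
learner.reset()  # clears state left over from the first game
second_game_first_arm = learner.next_arm()
```

Calling `reset()` between games matters here because the learner carries state (`_next_arm`) that would otherwise leak from one game into the next.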
- class banditpylib.learners.SinglePlayerLearner(name: Optional[str])
Abstract class for single player learners
- Parameters
name (Optional[str]) – alias name
Inheritance
- class banditpylib.learners.CollaborativeLearner(agent: banditpylib.learners.utils.CollaborativeAgent, master: banditpylib.learners.utils.CollaborativeMaster, num_agents: int, name: Optional[str] = None)
Abstract class for collaborative learners
- Parameters
agent (CollaborativeAgent) – one instance of a collaborative agent
master (CollaborativeMaster) – instance of a collaborative master
num_agents (int) – total number of agents involved
name (Optional[str]) – alias name
Inheritance
- property agents: List[banditpylib.learners.utils.CollaborativeAgent]
Involved agents
- property master: banditpylib.learners.utils.CollaborativeMaster
Controlling master
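The constructor takes one agent instance plus `num_agents`, while `agents` exposes a list. One plausible way to get from the former to the latter is to deep-copy the template agent, sketched below. Whether the library does exactly this is an assumption. `CountingAgent` and `replicate` are hypothetical names used only in this example.

```python
import copy
from typing import List


class CountingAgent:
    """Hypothetical agent with a little mutable state."""

    def __init__(self, name: str):
        self.name = name
        self.pulls = 0


def replicate(agent: CountingAgent, num_agents: int) -> List[CountingAgent]:
    """Create independent copies of a template agent (assumed behaviour)."""
    if num_agents <= 0:
        raise ValueError("num_agents must be positive")
    return [copy.deepcopy(agent) for _ in range(num_agents)]


agents = replicate(CountingAgent("template"), num_agents=3)
agents[0].pulls += 5  # only the first copy changes
pull_counts = [agent.pulls for agent in agents]
```

Deep copies keep each agent's state separate, so one agent's pulls do not affect another's.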
- class banditpylib.learners.CollaborativeAgent(name: Optional[str])
Abstract class for collaborative agents
- Parameters
name (Optional[str]) – alias name
Inheritance
- abstract actions(context: data_pb2.Context) → data_pb2.Actions
Actions of the agent
- Parameters
context – contextual information about the bandit environment
- Returns
actions to take
- abstract broadcast() → Dict[int, Tuple[float, int]]
Broadcast information learnt in the current round
- Returns
a dict mapping each arm id to a tuple of the average reward seen for that arm and the number of pulls used to compute that average
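The shape of the broadcast message, `Dict[int, Tuple[float, int]]`, can be illustrated with a short sketch. The helper `summarize_pulls` and its input format are assumptions made for this example. They are not part of the library.

```python
from typing import Dict, List, Tuple


def summarize_pulls(
    rewards_by_arm: Dict[int, List[float]]
) -> Dict[int, Tuple[float, int]]:
    """Map each pulled arm id to (average reward, number of pulls)."""
    message: Dict[int, Tuple[float, int]] = {}
    for arm_id, rewards in rewards_by_arm.items():
        if rewards:  # skip arms that were never pulled in this round
            message[arm_id] = (sum(rewards) / len(rewards), len(rewards))
    return message


# Rewards observed by one agent in the current round (made-up values).
message = summarize_pulls({0: [1.0, 0.0, 1.0, 1.0], 2: [0.0, 1.0], 5: []})
```

Sending the pull count with each average lets the receiver weight the estimate by the number of observations behind it.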
- property name: str
Name of the agent
- abstract reset()
Reset the agent
Warning
This function should be called before the start of each game.
- class banditpylib.learners.CollaborativeMaster(name: Optional[str])
Abstract class for collaborative masters that handle arm assignment and elimination
- Parameters
name (Optional[str]) – alias name
Inheritance
- abstract elimination(messages: Dict[int, Dict[int, Tuple[float, int]]]) → Dict[int, List[int]]
Update the set of active arms based on some criteria and return arm assignment
- Parameters
messages – dict of messages broadcast by agents, keyed by agent_id
- Returns
arm assignment per agent
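One way an elimination step could consume these messages is sketched below. The source only says arms are updated "based on some criteria", so the criterion used here (keep arms whose pooled mean is within `margin` of the best pooled mean) and the round-robin reassignment are assumptions. `pool_messages` and `eliminate` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

Message = Dict[int, Tuple[float, int]]  # arm id -> (average reward, pulls)


def pool_messages(messages: Dict[int, Message]) -> Message:
    """Combine per-agent averages into one pull-weighted average per arm."""
    totals: Dict[int, Tuple[float, int]] = {}
    for message in messages.values():
        for arm_id, (avg, pulls) in message.items():
            reward_sum, pull_sum = totals.get(arm_id, (0.0, 0))
            totals[arm_id] = (reward_sum + avg * pulls, pull_sum + pulls)
    return {
        arm_id: (reward_sum / pull_sum, pull_sum)
        for arm_id, (reward_sum, pull_sum) in totals.items()
        if pull_sum > 0
    }


def eliminate(messages: Dict[int, Message], margin: float) -> Dict[int, List[int]]:
    """Drop arms far below the best pooled mean, then reassign survivors."""
    agent_ids = sorted(messages)
    assignment: Dict[int, List[int]] = {agent_id: [] for agent_id in agent_ids}
    pooled = pool_messages(messages)
    if not pooled:
        return assignment
    best = max(mean for mean, _ in pooled.values())
    survivors = sorted(
        arm_id for arm_id, (mean, _) in pooled.items() if mean >= best - margin
    )
    # Assumed policy: hand surviving arms to agents in round-robin order.
    for index, arm_id in enumerate(survivors):
        assignment[agent_ids[index % len(agent_ids)]].append(arm_id)
    return assignment


messages = {
    0: {0: (0.75, 8), 1: (0.25, 8)},
    1: {0: (0.5, 8), 2: (0.5, 4)},
}
pooled = pool_messages(messages)
assignment = eliminate(messages, margin=0.25)
```

In this made-up round, arm 1 falls more than `margin` below the best pooled mean and is dropped, and the two surviving arms are split between the two agents.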
- abstract initial_arm_assignment() → Dict[int, List[int]]
The arm assignment before the first round
- Returns
arm assignment per agent for all agents
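A simple initial assignment with the documented return shape, `Dict[int, List[int]]`, might distribute arms evenly across agents. The sketch below assumes agent ids run from 0 to `num_agents - 1` and uses round-robin distribution. Both are illustrative choices, not the library's documented behaviour. `round_robin_assignment` is a hypothetical helper.

```python
from typing import Dict, List


def round_robin_assignment(arm_ids: List[int], num_agents: int) -> Dict[int, List[int]]:
    """Spread arms across agents as evenly as possible."""
    if num_agents <= 0:
        raise ValueError("num_agents must be positive")
    # Assumption: agents are identified by 0, 1, ..., num_agents - 1.
    assignment: Dict[int, List[int]] = {agent_id: [] for agent_id in range(num_agents)}
    for index, arm_id in enumerate(arm_ids):
        assignment[index % num_agents].append(arm_id)
    return assignment


initial = round_robin_assignment(list(range(5)), num_agents=2)
```

Every agent appears as a key even when it receives no arms, which matches the "for all agents" wording of the return description.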
- property name: str
Name of the master