

class banditpylib.learners.mab_collaborative_ftbai_learner.MABCollaborativeFixedTimeBAILearner(agent: banditpylib.learners.mab_collaborative_ftbai_learner.utils.MABCollaborativeFixedTimeBAIAgent, master: banditpylib.learners.mab_collaborative_ftbai_learner.utils.MABCollaborativeFixedTimeBAIMaster, num_agents: int, name: Optional[str] = None)

Collaborative fixed-time learner aiming to identify the best arm in the ordinary multi-armed bandit environment

  • agent (CollaborativeAgent) – one instance of an agent

  • master (CollaborativeMaster) – instance of the master

  • num_agents (int) – total number of agents involved

  • name (Optional[str]) – alias name
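
The signature takes a single agent instance together with num_agents, which suggests that the learner replicates that agent internally. The sketch below illustrates this pattern only; `ToyAgent`, `ToyMaster` and `ToyCollaborativeLearner` are hypothetical stand-ins, not banditpylib classes, and whether the library copies the agent in this way is an assumption that this page does not confirm.

```python
import copy
from typing import List, Optional


class ToyAgent:
    """Hypothetical stand-in for an agent; not a banditpylib class."""

    def __init__(self) -> None:
        self.pulls = 0


class ToyMaster:
    """Hypothetical stand-in for the master; not a banditpylib class."""


class ToyCollaborativeLearner:
    """Illustrative container: one master plus num_agents independent agents."""

    def __init__(self, agent: ToyAgent, master: ToyMaster, num_agents: int,
                 name: Optional[str] = None) -> None:
        if num_agents < 1:
            raise ValueError("num_agents must be at least 1")
        # Assumption: every agent is an independent deep copy of the template.
        self.agents: List[ToyAgent] = [
            copy.deepcopy(agent) for _ in range(num_agents)
        ]
        self.master = master
        # Fall back to a default alias when no name is given.
        self.name = name if name is not None else "toy_collaborative_learner"


learner = ToyCollaborativeLearner(ToyAgent(), ToyMaster(), num_agents=3)
learner.agents[0].pulls += 1  # changing one copy leaves the others untouched
```

With deep copies, per-agent state such as pull counts stays separate, which is why a single template instance can be enough to describe all agents.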


Inheritance diagram of MABCollaborativeFixedTimeBAILearner
property goal: banditpylib.learners.utils.Goal

Goal of the learner

property running_environment: Union[type, List[type]]

Type of bandit environment the learner plays with

class banditpylib.learners.mab_collaborative_ftbai_learner.LilUCBHeuristicCollaborative(num_agents: int, arm_num: int, rounds: int, horizon: int, name: Optional[str] = None)

Collaborative learner using the lil'UCB heuristic as the centralized policy

  • num_agents (int) – number of agents

  • arm_num (int) – number of arms of the bandit

  • rounds (int) – total number of rounds allowed

  • horizon (int) – maximum number of pulls the agent can make (over all rounds combined)

  • name (Optional[str]) – alias name
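
A minimal construction sketch based on the signature documented above. The `learner_kwargs` and `build_learner` helpers are hypothetical, the positivity checks are an assumption (this page does not say which values the library rejects), and the numbers are purely illustrative. The import is deferred so that the parameter bundle can be prepared without banditpylib installed.

```python
from typing import Any, Dict, Optional


def learner_kwargs(num_agents: int, arm_num: int, rounds: int, horizon: int,
                   name: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: sanity-check and bundle the documented parameters."""
    for label, value in (("num_agents", num_agents), ("arm_num", arm_num),
                         ("rounds", rounds), ("horizon", horizon)):
        # Assumption: all four counts are expected to be positive integers.
        if value < 1:
            raise ValueError(f"{label} must be a positive integer, got {value}")
    return {"num_agents": num_agents, "arm_num": arm_num, "rounds": rounds,
            "horizon": horizon, "name": name}


def build_learner(kwargs: Dict[str, Any]):
    """Construct the learner; requires banditpylib to be installed."""
    # Deferred import so that learner_kwargs works without the library.
    from banditpylib.learners.mab_collaborative_ftbai_learner import (
        LilUCBHeuristicCollaborative,
    )
    return LilUCBHeuristicCollaborative(**kwargs)


# Illustrative values: 4 agents, 10 arms, 3 rounds, and a budget of
# 2000 pulls per agent over all rounds combined.
kwargs = learner_kwargs(num_agents=4, arm_num=10, rounds=3, horizon=2000)
```

Calling `build_learner(kwargs)` would then create the learner in an environment where banditpylib is available.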


Inheritance diagram of LilUCBHeuristicCollaborative