Bayesian Bandits
bayesianbandits is a Python library that provides a simple, elegant interface for implementing Bayesian multi-armed bandit algorithms. Users define their action space, a reward function, and a prior distribution, and the library takes care of the rest.
from bayesianbandits import Bandit, Arm, epsilon_greedy, DirichletClassifier

def reward_func(x):
    return x[..., 0]

class Agent(Bandit,
            learner=DirichletClassifier({"yes": 1.0, "no": 1.0}),
            policy=epsilon_greedy(0.1)):
    arm1 = Arm("action 1", reward_func)
    arm2 = Arm("action 2", reward_func)

agent = Agent()

action = agent.pull()  # receive an action token

# act on the action token, observe reward

agent.update("yes")  # update with observed reward
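To make the role of the policy in the example above more concrete, here is a minimal standard-library sketch of an epsilon-greedy choice rule. It is not the library's implementation: the function name `epsilon_greedy_choice`, the list of estimated mean rewards, and the seeded random generator are all assumptions made for illustration.

```python
import random

def epsilon_greedy_choice(posterior_means, epsilon, rng):
    """Hypothetical helper: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any arm uniformly at random
        return rng.randrange(len(posterior_means))
    # Exploit: pick the arm with the highest estimated mean reward
    return max(range(len(posterior_means)), key=lambda i: posterior_means[i])

rng = random.Random(0)   # seeded only so the sketch is repeatable
means = [0.2, 0.7]       # assumed current estimates for two arms
choices = [epsilon_greedy_choice(means, 0.1, rng) for _ in range(1000)]
share_best = choices.count(1) / len(choices)
```

With `epsilon=0.1`, the better arm should be chosen in roughly 95% of pulls: 90% from exploitation plus half of the 10% exploratory pulls.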
The library supports contextual bandits, where additional information (or context) is available for decision making, restless bandits that change their reward probabilities over time, and bandits with delayed rewards where the outcome of an action may not be immediately available.
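As a rough illustration of the restless case, one generic technique is to shrink old evidence before adding each new observation, so that the posterior tracks recent behavior. The sketch below applies that idea to Dirichlet pseudo-counts. It is an assumption about how such a setting could be handled, not a description of the library's internals, and `decayed_update` and `decay` are hypothetical names.

```python
def decayed_update(alpha, observation, decay):
    """Hypothetical helper: discount existing pseudo-counts, then add one observation."""
    posterior = {k: v * decay for k, v in alpha.items()}
    posterior[observation] += 1.0
    return posterior

counts = {"yes": 1.0, "no": 1.0}
# The arm pays off for 50 rounds, then stops paying off for 50 rounds
for outcome in ["yes"] * 50 + ["no"] * 50:
    counts = decayed_update(counts, outcome, decay=0.9)

# Estimated probability of "yes" after the shift
p_yes = counts["yes"] / (counts["yes"] + counts["no"])
```

Without discounting, the estimate after these 100 rounds would sit at 0.5. With `decay=0.9` it falls close to zero, reflecting the arm's recent behavior.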
bayesianbandits allows users to create sophisticated reinforcement learning agents that accumulate knowledge about different actions, or "arms", and update their beliefs based on the rewards they receive. The library comes bundled with a number of conjugate Bayesian models, including Bayesian linear regression with either Normal-Normal or Normal-Inverse-Gamma conjugate priors, and intercept-only Gamma-Poisson and Dirichlet-Multinomial models.
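For readers unfamiliar with conjugate models, the following standard-library sketch shows the textbook update rules for the two intercept-only cases mentioned above. It illustrates the math only and does not use the library's classes. The function names are hypothetical.

```python
def dirichlet_update(alpha, observation):
    """Dirichlet-Multinomial: add 1 to the concentration of the observed category."""
    posterior = dict(alpha)  # copy so the prior is left untouched
    posterior[observation] += 1.0
    return posterior

def gamma_poisson_update(shape, rate, count):
    """Gamma-Poisson: add the observed count to the shape and 1 to the rate."""
    return shape + count, rate + 1.0

# Dirichlet prior matching the {"yes": 1.0, "no": 1.0} prior used earlier
prior = {"yes": 1.0, "no": 1.0}
posterior = prior
for outcome in ["yes", "yes", "no", "yes"]:
    posterior = dirichlet_update(posterior, outcome)
p_yes = posterior["yes"] / sum(posterior.values())

# Gamma(shape=1, rate=1) prior on a Poisson rate, updated with three counts
shape, rate = 1.0, 1.0
for count in [3, 5, 4]:
    shape, rate = gamma_poisson_update(shape, rate, count)
expected_rate = shape / rate
```

After three "yes" and one "no", the Dirichlet concentrations are 4 and 2, so the posterior mean for "yes" is 2/3. The Gamma posterior has shape 13 and rate 4, giving a mean of 3.25.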
The library is useful in a variety of applications. For example, it can be leveraged to optimize click-through rates for an email newsletter. bayesianbandits is being used in production at IntelyCare!
Source
The source code for bayesianbandits is available on GitHub.
Introduction
bayesianbandits is compatible with joblib, a package widely used by scikit-learn to simplify model persistence. This compatibility makes it easy to store learning agents and retrieve them for later use. The library also addresses memory management concerns by allowing the delayed_reward cache to be stored in any dict-like object, so pending rewards can be kept on disk rather than in memory.
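As an illustration of what a dict-like on-disk object can look like, the sketch below uses `shelve` from the Python standard library. How such an object is handed to an agent is not shown here, so that step is omitted. The key `"decision-001"` and the stored payload are hypothetical.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "reward_cache")

    # A shelf behaves like a dict with string keys, but its entries live on disk
    with shelve.open(path) as cache:
        cache["decision-001"] = {"arm": "action 1", "context": [1.0, 0.5]}

    # Reopening the shelf later recovers the pending entry
    with shelve.open(path) as cache:
        restored = cache["decision-001"]
        pending = sorted(cache.keys())
```

Any other mapping with the same interface, such as a wrapper around a key-value store, would fit the same role.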
Dependencies
- numpy
- scipy
Installation
Install bayesianbandits from PyPI with pip:
pip install bayesianbandits
Documentation
A detailed user guide and documentation are available on Read the Docs.