Scippy

SCIP

Solving Constraint Integer Programs

Detailed Description

methods for bandit algorithms

Epsilon greedy

Epsilon greedy is a randomized algorithm for the multi-armed bandit problem.

In every iteration, it either selects an action uniformly at random with probability \( \varepsilon_t\) or it greedily exploits the best action seen so far with probability \( 1 - \varepsilon_t \). In this implementation, \( \varepsilon_t \) decreases over time (number of selections performed), controlled by the epsilon parameter.

Exp.3

Exp.3 is a randomized selection method for the multi-armed bandit problem

Exp3 maintains a probability distribution according to which an action is drawn in every iteration. The probability distribution is a mixture between a uniform distribution and a softmax distribution based on the cumulative rewards of the actions. The weight of the uniform distribution in the mixture is controlled by the parameter \( \gamma \), ie., setting \( \gamma = 1\) uses a uniform distribution in every selection step. The cumulative reward for the actions can be fine-tuned by adding a general bias for all actions. The bias is given by the parameter \( \beta \).

Upper Confidence Bounds (UCB)

UCB (Upper confidence bounds) is a deterministic selection algorithm for the multi-armed bandit problem. In every iteration, UCB selects the action that maximizes a tradeoff between its performance in the past and a variance term. The influence of the variance (confidence width) can be controlled by the parameter \( \alpha \).

Functions

SCIP_RETCODE SCIPbanditSelect (SCIP_BANDIT *bandit, int *action)
 
SCIP_RETCODE SCIPbanditUpdate (SCIP_BANDIT *bandit, int action, SCIP_Real score)
 
const char * SCIPbanditvtableGetName (SCIP_BANDITVTABLE *banditvtable)
 
SCIP_RANDNUMGENSCIPbanditGetRandnumgen (SCIP_BANDIT *bandit)
 
int SCIPbanditGetNActions (SCIP_BANDIT *bandit)
 
SCIP_RETCODE SCIPcreateBanditEpsgreedy (SCIP *scip, SCIP_BANDIT **epsgreedy, SCIP_Real *priorities, SCIP_Real eps, SCIP_Bool preferrecent, SCIP_Real decayfactor, int avglim, int nactions, unsigned int initseed)
 
SCIP_RealSCIPgetWeightsEpsgreedy (SCIP_BANDIT *epsgreedy)
 
void SCIPsetEpsilonEpsgreedy (SCIP_BANDIT *epsgreedy, SCIP_Real eps)
 
SCIP_RETCODE SCIPcreateBanditExp3 (SCIP *scip, SCIP_BANDIT **exp3, SCIP_Real *priorities, SCIP_Real gammaparam, SCIP_Real beta, int nactions, unsigned int initseed)
 
void SCIPsetGammaExp3 (SCIP_BANDIT *exp3, SCIP_Real gammaparam)
 
void SCIPsetBetaExp3 (SCIP_BANDIT *exp3, SCIP_Real beta)
 
SCIP_Real SCIPgetProbabilityExp3 (SCIP_BANDIT *exp3, int action)
 
SCIP_RETCODE SCIPcreateBanditUcb (SCIP *scip, SCIP_BANDIT **ucb, SCIP_Real *priorities, SCIP_Real alpha, int nactions, unsigned int initseed)
 
SCIP_Real SCIPgetConfidenceBoundUcb (SCIP_BANDIT *ucb, int action)
 
int * SCIPgetStartPermutationUcb (SCIP_BANDIT *ucb)
 
SCIP_RETCODE SCIPincludeBanditvtable (SCIP *scip, SCIP_BANDITVTABLE **banditvtable, const char *name, SCIP_DECL_BANDITFREE((*banditfree)), SCIP_DECL_BANDITSELECT((*banditselect)), SCIP_DECL_BANDITUPDATE((*banditupdate)), SCIP_DECL_BANDITRESET((*banditreset)))
 
SCIP_BANDITVTABLESCIPfindBanditvtable (SCIP *scip, const char *name)
 
SCIP_RETCODE SCIPfreeBandit (SCIP *scip, SCIP_BANDIT **bandit)
 
SCIP_RETCODE SCIPresetBandit (SCIP *scip, SCIP_BANDIT *bandit, SCIP_Real *priorities, unsigned int seed)
 

Files

file  pub_bandit.h
 public methods for bandit algorithms
 
file  pub_bandit_epsgreedy.h
 public methods for the epsilon greedy bandit selector
 
file  pub_bandit_exp3.h
 public methods for Exp.3
 
file  pub_bandit_ucb.h
 public methods for UCB bandit selection
 

Function Documentation

◆ SCIPbanditSelect()

SCIP_RETCODE SCIPbanditSelect ( SCIP_BANDIT bandit,
int *  action 
)

select the next action

Parameters
banditbandit algorithm data structure
actionpointer to store the selected action

Definition at line 143 of file bandit.c.

References NULL, SCIP_CALL, SCIP_OKAY, SCIPbanditGetNActions(), and SCIP_Bandit::vtable.

Referenced by selectNeighborhood().

◆ SCIPbanditUpdate()

SCIP_RETCODE SCIPbanditUpdate ( SCIP_BANDIT bandit,
int  action,
SCIP_Real  score 
)

update the score of the selected action

Parameters
banditbandit algorithm data structure
actionindex of action for which the score should be updated
scoreobserved gain of the i'th action

Definition at line 164 of file bandit.c.

References NULL, SCIP_CALL, SCIP_OKAY, SCIPbanditGetNActions(), and SCIP_Bandit::vtable.

Referenced by updateBanditAlgorithm().

◆ SCIPbanditvtableGetName()

const char* SCIPbanditvtableGetName ( SCIP_BANDITVTABLE banditvtable)

return the name of this bandit virtual function table

Parameters
banditvtablevirtual table for bandit algorithm

Definition at line 272 of file bandit.c.

References SCIP_BanditVTable::name, and NULL.

Referenced by SCIPsetSortProps().

◆ SCIPbanditGetRandnumgen()

SCIP_RANDNUMGEN* SCIPbanditGetRandnumgen ( SCIP_BANDIT bandit)

return the random number generator of a bandit algorithm

Parameters
banditbandit algorithm data structure

Definition at line 283 of file bandit.c.

References NULL, and SCIP_Bandit::rng.

Referenced by alnsFixMoreVariables(), alnsUnfixVariables(), dataReset(), SCIP_DECL_BANDITRESET(), and SCIP_DECL_BANDITSELECT().

◆ SCIPbanditGetNActions()

int SCIPbanditGetNActions ( SCIP_BANDIT bandit)

return number of actions of this bandit algorithm

Parameters
banditbandit algorithm data structure

Definition at line 293 of file bandit.c.

References SCIP_Bandit::nactions, and NULL.

Referenced by SCIP_DECL_BANDITFREE(), SCIP_DECL_BANDITRESET(), SCIP_DECL_BANDITSELECT(), SCIP_DECL_BANDITUPDATE(), SCIP_DECL_HEURINITSOL(), SCIPbanditReset(), SCIPbanditSelect(), SCIPbanditUpdate(), SCIPgetConfidenceBoundUcb(), and SCIPgetProbabilityExp3().

◆ SCIPcreateBanditEpsgreedy()

SCIP_RETCODE SCIPcreateBanditEpsgreedy ( SCIP scip,
SCIP_BANDIT **  epsgreedy,
SCIP_Real priorities,
SCIP_Real  eps,
SCIP_Bool  preferrecent,
SCIP_Real  decayfactor,
int  avglim,
int  nactions,
unsigned int  initseed 
)

create and resets an epsilon greedy bandit algorithm

Parameters
scipSCIP data structure
epsgreedypointer to store the epsilon greedy bandit algorithm
prioritiesnonnegative priorities for each action, or NULL if not needed
epsparameter to increase probability for exploration between all actions
preferrecentshould the weights be updated in an exponentially decaying way?
decayfactorthe factor to reduce the weight of older observations if exponential decay is enabled
avglimnonnegative limit on observation number before the exponential decay starts, only relevant if exponential decay is enabled
nactionsthe positive number of possible actions
initseedinitial seed for random number generation

Definition at line 270 of file bandit_epsgreedy.c.

References BANDIT_NAME, NULL, SCIP_CALL, SCIP_INVALIDDATA, SCIP_OKAY, SCIPbanditCreateEpsgreedy(), SCIPblkmem(), SCIPbuffer(), SCIPerrorMessage, SCIPfindBanditvtable(), and SCIPinitializeRandomSeed().

Referenced by createBandit().

◆ SCIPgetWeightsEpsgreedy()

SCIP_Real* SCIPgetWeightsEpsgreedy ( SCIP_BANDIT epsgreedy)

get weights array of epsilon greedy bandit algorithm

Parameters
epsgreedyepsilon greedy bandit algorithm

Definition at line 302 of file bandit_epsgreedy.c.

References NULL, and SCIPbanditGetData().

Referenced by printNeighborhoodStatistics().

◆ SCIPsetEpsilonEpsgreedy()

void SCIPsetEpsilonEpsgreedy ( SCIP_BANDIT epsgreedy,
SCIP_Real  eps 
)

set epsilon parameter of epsilon greedy bandit algorithm

Parameters
epsgreedyepsilon greedy bandit algorithm
epsparameter to increase probability for exploration between all actions

Definition at line 315 of file bandit_epsgreedy.c.

References eps, NULL, and SCIPbanditGetData().

◆ SCIPcreateBanditExp3()

SCIP_RETCODE SCIPcreateBanditExp3 ( SCIP scip,
SCIP_BANDIT **  exp3,
SCIP_Real priorities,
SCIP_Real  gammaparam,
SCIP_Real  beta,
int  nactions,
unsigned int  initseed 
)

creates and resets an Exp.3 bandit algorithm using scip pointer

Parameters
scipSCIP data structure
exp3pointer to store bandit algorithm
prioritiesnonnegative priorities for each action, or NULL if not needed
gammaparamweight between uniform (gamma ~ 1) and weight driven (gamma ~ 0) probability distribution
betagain offset between 0 and 1 at every observation
nactionsthe positive number of actions for this bandit algorithm
initseedinitial seed for random number generation

Definition at line 301 of file bandit_exp3.c.

References BANDIT_NAME, NULL, SCIP_CALL, SCIP_INVALIDDATA, SCIP_OKAY, SCIPbanditCreateExp3(), SCIPblkmem(), SCIPbuffer(), SCIPerrorMessage, SCIPfindBanditvtable(), and SCIPinitializeRandomSeed().

Referenced by createBandit().

◆ SCIPsetGammaExp3()

void SCIPsetGammaExp3 ( SCIP_BANDIT exp3,
SCIP_Real  gammaparam 
)

set gamma parameter of Exp.3 bandit algorithm to increase weight of uniform distribution

Parameters
exp3bandit algorithm
gammaparamweight between uniform (gamma ~ 1) and weight driven (gamma ~ 0) probability distribution

Definition at line 327 of file bandit_exp3.c.

References SCIPbanditGetData().

◆ SCIPsetBetaExp3()

void SCIPsetBetaExp3 ( SCIP_BANDIT exp3,
SCIP_Real  beta 
)

set beta parameter of Exp.3 bandit algorithm to increase gain offset for actions that were not played

Parameters
exp3bandit algorithm
betagain offset between 0 and 1 at every observation

Definition at line 340 of file bandit_exp3.c.

References SCIPbanditGetData().

◆ SCIPgetProbabilityExp3()

SCIP_Real SCIPgetProbabilityExp3 ( SCIP_BANDIT exp3,
int  action 
)

returns probability to play an action

Parameters
exp3bandit algorithm
actionindex of the requested action

Definition at line 353 of file bandit_exp3.c.

References SCIP_Real, SCIPbanditGetData(), and SCIPbanditGetNActions().

Referenced by printNeighborhoodStatistics().

◆ SCIPcreateBanditUcb()

SCIP_RETCODE SCIPcreateBanditUcb ( SCIP scip,
SCIP_BANDIT **  ucb,
SCIP_Real priorities,
SCIP_Real  alpha,
int  nactions,
unsigned int  initseed 
)

create and reset UCB bandit algorithm

Parameters
scipSCIP data structure
ucbpointer to store bandit algorithm
prioritiesnonnegative priorities for each action, or NULL if not needed
alphaparameter to increase confidence width
nactionsthe positive number of actions for this bandit algorithm
initseedinitial random number seed

Definition at line 329 of file bandit_ucb.c.

References BANDIT_NAME, NULL, SCIP_CALL, SCIP_INVALIDDATA, SCIP_OKAY, SCIPbanditCreateUcb(), SCIPblkmem(), SCIPbuffer(), SCIPerrorMessage, SCIPfindBanditvtable(), and SCIPinitializeRandomSeed().

Referenced by createBandit().

◆ SCIPgetConfidenceBoundUcb()

SCIP_Real SCIPgetConfidenceBoundUcb ( SCIP_BANDIT ucb,
int  action 
)

returns the upper confidence bound of a selected action

Parameters
ucbUCB bandit algorithm
actionindex of the queried action

Definition at line 254 of file bandit_ucb.c.

References LOG1P, NULL, SCIP_Real, SCIPbanditGetData(), SCIPbanditGetNActions(), and sqrt().

Referenced by printNeighborhoodStatistics().

◆ SCIPgetStartPermutationUcb()

int* SCIPgetStartPermutationUcb ( SCIP_BANDIT ucb)

return start permutation of the UCB bandit algorithm

Parameters
ucbUCB bandit algorithm

Definition at line 283 of file bandit_ucb.c.

References NULL, and SCIPbanditGetData().

◆ SCIPincludeBanditvtable()

SCIP_RETCODE SCIPincludeBanditvtable ( SCIP scip,
SCIP_BANDITVTABLE **  banditvtable,
const char *  name,
SCIP_DECL_BANDITFREE((*banditfree))  ,
SCIP_DECL_BANDITSELECT((*banditselect))  ,
SCIP_DECL_BANDITUPDATE((*banditupdate))  ,
SCIP_DECL_BANDITRESET((*banditreset))   
)

includes a bandit algorithm virtual function table

Parameters
scipSCIP data structure
banditvtablebandit algorithm virtual function table
namea name for the algorithm represented by this vtable

Definition at line 32 of file scip_bandit.c.

References NULL, SCIP_CALL, SCIP_INVALIDDATA, SCIP_OKAY, SCIPbanditvtableCreate(), SCIPerrorMessage, SCIPfindBanditvtable(), SCIPsetIncludeBanditvtable(), and Scip::set.

Referenced by SCIPincludeBanditvtableEpsgreedy(), SCIPincludeBanditvtableExp3(), and SCIPincludeBanditvtableUcb().

◆ SCIPfindBanditvtable()

SCIP_BANDITVTABLE* SCIPfindBanditvtable ( SCIP scip,
const char *  name 
)

returns the bandit virtual function table of the given name, or NULL if not existing

Parameters
scipSCIP data structure
namename of bandit algorithm virtual function table

Definition at line 64 of file scip_bandit.c.

References NULL, SCIPsetFindBanditvtable(), and Scip::set.

Referenced by SCIPcreateBanditEpsgreedy(), SCIPcreateBanditExp3(), SCIPcreateBanditUcb(), and SCIPincludeBanditvtable().

◆ SCIPfreeBandit()

SCIP_RETCODE SCIPfreeBandit ( SCIP scip,
SCIP_BANDIT **  bandit 
)

calls destructor and frees memory of bandit algorithm

Parameters
scipSCIP data structure
banditpointer to bandit algorithm data structure

Definition at line 91 of file scip_bandit.c.

References NULL, SCIP_CALL, SCIP_OKAY, SCIPbanditFree(), and SCIPblkmem().

Referenced by SCIP_DECL_HEURFREE(), and SCIP_DECL_HEURINITSOL().

◆ SCIPresetBandit()

SCIP_RETCODE SCIPresetBandit ( SCIP scip,
SCIP_BANDIT bandit,
SCIP_Real priorities,
unsigned int  seed 
)

reset the bandit algorithm

Parameters
scipSCIP data structure
banditpointer to bandit algorithm data structure
prioritiespriorities for every action, or NULL if not needed
seedinitial random seed for bandit selection

Definition at line 75 of file scip_bandit.c.

References NULL, SCIP_CALL, SCIP_OKAY, SCIPbanditReset(), SCIPbuffer(), and SCIPinitializeRandomSeed().

Referenced by SCIP_DECL_HEURINITSOL().