Reinforcement Learning for Selfish Load Balancing
in a Distributed Memory Environment
Stephen M. Majercik and Michael L. Littman
December 1996
Abstract
Load balancing is a difficult problem whose solution can greatly
increase the speedup one achieves in a parallel distributed memory
environment. The necessity for load balancing can arise not only
from the structure or dynamics of one's problem, but also from the
need to compete for processor time with other users. Given a lengthy
computation, the ability to exploit changes in processor loads when
allocating work or deciding whether to reallocate work is critical in
making the computation time-feasible. We show how this aspect of the
load balancing problem can be formulated as a Markov decision process
(MDP), and describe some preliminary attempts to solve this MDP using
guided off-line Q-learning and a linear value-function approximator.
In particular, we describe difficulties with value-function
approximator divergence and techniques we applied to correct this
problem.
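To make the method named in the abstract concrete, the following is a minimal sketch of off-line Q-learning with a linear value-function approximator. The two-state MDP, its rewards, and all parameter values here are hypothetical illustrations, not the load balancing MDP or the settings studied in the paper; with the one-hot features used below, the linear approximator reduces to the tabular case, so the divergence issues the abstract mentions do not arise in this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
n_features = n_states * n_actions

def features(s, a):
    # One-hot encoding of the (state, action) pair; a real load
    # balancer would use informative features of processor loads.
    phi = np.zeros(n_features)
    phi[s * n_actions + a] = 1.0
    return phi

def step(s, a):
    # Hypothetical deterministic dynamics: reward 1 for landing in
    # state 1, reward 0 otherwise.
    next_s = (s + a) % n_states
    reward = 1.0 if next_s == 1 else 0.0
    return next_s, reward

w = np.zeros(n_features)   # weights of the linear approximator
alpha, gamma = 0.1, 0.9    # step size and discount factor

def q(s, a):
    return w @ features(s, a)

# Off-line Q-learning: sample (s, a) pairs uniformly, observe the
# transition, and move the weights toward the bootstrapped target.
for _ in range(2000):
    s = int(rng.integers(n_states))
    a = int(rng.integers(n_actions))
    s2, r = step(s, a)
    target = r + gamma * max(q(s2, b) for b in range(n_actions))
    w += alpha * (target - q(s, a)) * features(s, a)
```

Under these assumed dynamics the optimal policy always moves to state 1, so after training the learned values satisfy, for example, q(0, 1) > q(0, 0).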