This paper investigates learning algorithm design for potential-game-theoretic cooperative control, where the agents' collective action is generally required to converge to the most efficient equilibria, whereas standard game theory aims only at computing a Nash equilibrium. In particular, when the utility functions are already aligned with a global objective function, the equilibria maximizing the potential function should be selected. To meet this requirement,
this paper develops a learning algorithm called Payoff-based
Inhomogeneous Partially Irrational Play (PIPIP). The main feature of PIPIP is that it allows agents to make irrational decisions with a specified probability; that is, an agent can choose a lower-utility action from among the past actions stored in its memory. We then prove convergence in probability of the collective action to the set of potential function maximizers. Finally, the effectiveness
of the proposed algorithm is demonstrated through simulations of a sensor coverage problem.
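The abstract does not spell out the update rule, but the mechanism it describes (a memory of past actions plus occasional irrational choices under a decaying exploration rate) can be sketched as follows. This is a minimal illustration only: the function name pipip_step, the decay schedule t ** -0.5, and the weight kappa are assumptions for exposition, not the paper's actual algorithm or constants.

```python
import random

def pipip_step(memory, action_set, t, kappa=0.1):
    """One hypothetical PIPIP-style update for a single agent.

    memory     -- the agent's two remembered (action, payoff) pairs
    action_set -- the agent's finite action set
    t          -- iteration count (t >= 1)
    kappa      -- illustrative irrationality weight (assumed, not from the paper)
    """
    eps = t ** -0.5  # "inhomogeneous": exploration rate decays over time (assumed schedule)
    (a_prev, u_prev), (a_curr, u_curr) = memory

    if random.random() < eps:
        # Exploration: try an action uniformly at random.
        return random.choice(action_set)

    # Exploitation with partial irrationality: with a small probability the
    # remembered action with the LOWER payoff is chosen, which is what lets
    # the collective action escape inefficient equilibria.
    better, worse = (a_curr, a_prev) if u_curr >= u_prev else (a_prev, a_curr)
    if random.random() < kappa * eps:
        return worse   # irrational choice from memory
    return better      # rational choice from memory

# Example: an agent remembering action 0 (payoff 0.2) and action 2 (payoff 0.7)
print(pipip_step([(0, 0.2), (2, 0.7)], action_set=[0, 1, 2], t=10))
```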