The concept of replicator dynamics is used to express the evolutionary dynamics of an entity called a replicator, which has means of making more or less accurate copies of itself. The replicator can be a gene, an organism, a strategy in a game, a belief, a technique, a convention, or any institutional or cultural form. In the following, game strategies will be considered.
The concept assumes a large population of replicators, in which different types meet in proportion to their share in the population. These meetings, i.e. the interactions of different replicators (e.g. different strategies in a game), generate payoffs, which are interpreted as a replicator's fitness. Replicators reproduce according to their fitness relative to the fitness of others. The general idea is that replicators whose fitness is larger (smaller) than the average fitness of the population will increase (decrease) their share in the population.
In evolutionary game theory, replicators are strategies, which compete for dominance according to the payoff they yield in interaction. Typical examples are the strategies of cooperation and defection in games like the Prisoner's Dilemma or the Public Good Game. Similar to dominant strategies bringing forth Nash equilibria when games are repeated, strategies in replicator dynamics can become evolutionarily stable.
An Evolutionarily Stable Strategy (ESS) is a strategy which, if adopted by a population in a given environment, cannot be invaded by any alternative strategy that is initially rare.
Maynard Smith, J.; Price, G.R. (1973). The logic of animal conflict. Nature. 246 (5427): 15–18.
Maynard Smith, J. (1972). Game Theory and The Evolution of Fighting. On Evolution. Edinburgh University Press.
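The ESS definition can be made concrete for the simplest case, a symmetric game with two pure strategies. The sketch below applies the two standard conditions (a strict best reply to itself, or a tie that is broken by doing better against the invader). The function name `is_pure_ess` and the payoff values are assumptions chosen for illustration only; the check covers pure strategies, not mixed ones.

```python
def is_pure_ess(payoff, s):
    """Check whether pure strategy s (0 or 1) is an ESS of a symmetric 2x2 game.

    payoff[i][j] is the payoff to a player using strategy i against strategy j.
    """
    t = 1 - s  # the only alternative (invading) pure strategy
    # Condition 1: s earns strictly more against s than the invader does
    if payoff[s][s] > payoff[t][s]:
        return True
    # Condition 2: tie against s, but s does better against the invader
    # than the invader does against itself
    return payoff[s][s] == payoff[t][s] and payoff[s][t] > payoff[t][t]


# Illustrative payoffs (assumed for this sketch): a > c and b < d
a, b, c, d = 4, 3, 1, 5
coordination = [[a, b], [c, d]]
print(is_pure_ess(coordination, 0))  # first strategy
print(is_pure_ess(coordination, 1))  # second strategy
```

With these payoffs each pure strategy is a strict best reply to itself, so both calls report an ESS.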
Mathematically, replicator dynamics are expressed in the form of so-called replicator equations, a set of differential equations used to study dynamics in evolutionary game theory. The replicator dynamics provide a simple model of evolution and success-driven (or prestige-biased) learning in games.
Consider a large population with $N$ replicators. In each period, each replicator is randomly matched with another replicator to play a two-player game.
Replicators are assigned strategies $A$ or $B$.
The share of the population playing strategy $A$ is $x_A$, so: $x_A = \frac{N_A}{N}$, respectively $x_B = \frac{N_B}{N}$
The state of the population is given by $(x_A, x_B)$ where $x_A \geq 0, x_B \geq 0$, and $x_A + x_B = 1$, thus $x_B = 1 - x_A$.
A replicator's fitness $f$ is determined by the payoff (as expressed in the payoff table) and the share of each strategy in the population.
For example, consider the payoff table of the coordination game:
\begin{array}{|c|c|c|} \hline & A & B \\\hline A & a, a & b, c \\\hline B & c, b & d, d \\\hline \end{array}
with $a > c$ and $b < d$ and the starting frequencies $x_A, x_B$
The fitness of a player who is playing $A$ is $f_A$
Since $f_A$ depends on $x_A$ and $x_B$, we write $f_A(x_A, x_B)$ and $\pi$ for payoff ($\pi_A(A, A)$ is the payoff of an $A$-player if the opponent also plays $A$)
$f_A(x_A, x_B)$ = (probability of interacting with $A$ player)$*\pi_A(A, A)$ + (probability of interacting with $B$ player)$*\pi_A(A, B)$
$= x_A*a + x_B*b = x_A*a + (1 - x_A)*b$
Correspondingly, the fitness for the player who is playing $B$ is $f_B$:
$f_B(x_A, x_B)$ = (probability of interacting with $A$ player)$*\pi_B(B, A)$ + (probability of interacting with $B$ player)$*\pi_B(B, B)$
$= x_A*c + x_B*d = x_A*c + (1 - x_A)*d $
# shares in the population
xA = .75 # probability of interacting with A player
xB = (1 - xA) # probability of interacting with B player
# payoffs
a = 2; b = 3; c = 1; d = 4
#        A       B
# A    2, 2    3, 1
# B    1, 3    4, 4
# fitness for the player who is playing A
fA = xA * a + xB * b
# fitness for the player who is playing B
fB = xA * c + xB * d
print(fA)
print(fB)
Fitness is interpreted as rate of reproduction
The average fitness, $\bar{f}$, of a population is the weighted average of the two fitness values.
$\bar{f}(x_A, x_B) = x_A*f_A(x_A, x_B) + x_B*f_B(x_A, x_B) = x_A*f_A(x_A, x_B) + (1 - x_A)*f_B(x_A, x_B) $
# average fitness
f = xA * fA + (1 - xA) * fB
print(f)
Recall $x_A = \frac{N_A}{N}$
First, how fast does $N_A$ grow?
Each individual reproduces at a rate $f_A$, and there are $N_A$ of them. So:
$\frac{dN_A}{dt} = N_A * f_A(x_A, x_B)$
Next, how fast does $N$ grow? By the same logic:
$\frac{dN}{dt} = N * \bar{f}(x_A, x_B)$
With the quotient rule, and with a little simplification, we get
$\frac{dx_A}{dt} = x_A * (f_A(x_A, x_B) - \bar{f}(x_A, x_B))$
where $\frac{dx_A}{dt}$ is the rate of change of the share of $A$,
$x_A$ is the current frequency (proportion) of strategy $A$ in the population (indicating how many $A$-players can reproduce)
$f_A(x_A, x_B)$ is the payoff, resp. fitness of an $A$-player,
$\bar{f}(x_A, x_B)$ is the average fitness of the population, and
$(f_A(x_A, x_B) - \bar{f}(x_A, x_B))$ is an $A$-player's fitness relative to the average fitness (i.e. the key property: more successful strategies grow faster)
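The quotient-rule step can be written out explicitly. This is a sketch that uses only $x_A = \frac{N_A}{N}$ and the two growth equations for $N_A$ and $N$ given above (the arguments $(x_A, x_B)$ are dropped for brevity):

```latex
\begin{aligned}
\frac{dx_A}{dt}
  &= \frac{d}{dt}\left(\frac{N_A}{N}\right)
   = \frac{\frac{dN_A}{dt}\,N - N_A\,\frac{dN}{dt}}{N^2} \\
  &= \frac{N_A\,f_A\,N - N_A\,N\,\bar{f}}{N^2}
   = \frac{N_A}{N}\,\left(f_A - \bar{f}\right)
   = x_A\,\left(f_A - \bar{f}\right)
\end{aligned}
```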
If:
$x_A > 0$: The proportion of $A$-players is non-zero
$f_A > \bar{f}$: The fitness of $A$-players is above average
then $\frac{dx_A}{dt} > 0$: the share of $A$-players in the population grows.
# simulate
import matplotlib.pyplot as plt
%matplotlib inline
# payoffs a > c and b < d
#a = 2; b = 3; c = 1; d = 4
a = 4; b = 3; c = 1; d = 5
xA = [0.55]
xB = [1 - xA[0]]
dt = 0.1
# fitness of A and B
FA = [(xA[0] * a + xB[0] * b) * dt]
FB = [(xA[0] * c + xB[0] * d) * dt]
# average fitness
F = [(xA[0] * (xA[0] * a + xB[0] * b) + xB[0] * (xA[0] * c + xB[0] * d)) * dt]
for t in range(100):
    # fitnesses
    fA = xA[t] * a + xB[t] * b
    fB = xA[t] * c + xB[t] * d
    f = xA[t] * fA + xB[t] * fB
    FA.append(fA*dt)
    FB.append(fB*dt)
    F.append(f*dt)
    # differential equations (explicit Euler step)
    xA.append(xA[t] + (xA[t] * (fA - f)) * dt)
    xB.append(xB[t] + (xB[t] * (fB - f)) * dt)
plt.plot(xA, 'r', label ='share of strategy A')
plt.plot(xB, 'b', label ='share of strategy B')
plt.plot(FA, 'r--', label ='fitness of strategy A')
plt.plot(FB, 'b--', label ='fitness of strategy B')
plt.plot(F, 'g--', label ='mean population fitness')
plt.grid()
plt.ylim(0, 1)
plt.legend(loc='best')
The pure Nash-equilibria are $(A, A)$ and $(B, B)$, with payoffs $(a, a)$ and $(d, d)$,
and the mixed strategy equilibrium is: $x_A = \frac{d - b} {d - b + a - c} $
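The mixed equilibrium follows from requiring that both strategies earn the same fitness, $f_A = f_B$. A short derivation, using the fitness expressions defined earlier and $x_B = 1 - x_A$:

```latex
\begin{aligned}
x_A\,a + (1 - x_A)\,b &= x_A\,c + (1 - x_A)\,d \\
x_A\,(a - c) &= (1 - x_A)\,(d - b) \\
x_A\,(d - b + a - c) &= d - b \\
x_A &= \frac{d - b}{d - b + a - c}
\end{aligned}
```

For the payoffs used in the simulation ($a = 4$, $b = 3$, $c = 1$, $d = 5$) this gives $x_A = \frac{2}{5}$.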
# use sympy to calculate steady states
from sympy import *
#a = 2; b = 3; c = 1; d = 4
a = 4; b = 3; c = 1; d = 5
xA, xB = symbols('xA, xB')
dA = xA * ((a*xA + b*xB) - (xA * (a*xA + b*xB) + xB * (c*xA + d*xB)))
dB = xB * ((c*xA + d*xB) - (xA * (a*xA + b*xB) + xB * (c*xA + d*xB)))
# use sympy's way of setting equations to zero
AEqual = Eq(dA, 0)
BEqual = Eq(dB, 0)
# compute fixed points
equilibria = solve([AEqual, BEqual], [xA, xB])
print(equilibria)
# simulate the non-trivial equilibrium (= mixed strategy equilibrium, mse)
import matplotlib.pyplot as plt
%matplotlib inline
# payoffs a > c and b < d
# a = 2; b = 3; c = 1; d = 4
a = 4; b = 3; c = 1; d = 5
mse = (d - b) / float(d - b + a - c) # mixed strategy equilibrium
print(mse)
# deviate slightly from the mse
xA = [2/5. + 0.000000001]
xB = [1 - xA[0]]
dt = 0.1
for t in range(200):
    fA = xA[t] * a + xB[t] * b
    fB = xA[t] * c + xB[t] * d
    f = xA[t] * fA + xB[t] * fB
    xA.append(xA[t] + (xA[t] * (fA - f)) * dt)
    xB.append(xB[t] + (xB[t] * (fB - f)) * dt)
plt.plot(xA, 'r', label ='share of strategy A')
plt.plot(xB, 'b', label ='share of strategy B')
plt.grid()
plt.ylim(0, 1)
plt.legend(loc='best')
Note: the non-trivial equilibrium (the mixed strategy equilibrium) is not asymptotically stable, i.e. it is not a stable fixed point (see: stability analysis).
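A minimal sketch of such a stability check, under the assumption that a fixed point $x^*$ of the one-dimensional dynamics $\frac{dx_A}{dt} = g(x_A)$ is classified by the sign of the slope $g'(x^*)$ (negative: stable, positive: unstable). The helper names `g` and `slope` and the finite-difference step `h` are illustrative choices, not part of the original notebook.

```python
def g(x, a, b, c, d):
    """Right-hand side of the replicator equation for the share x of strategy A."""
    fA = x * a + (1 - x) * b
    fB = x * c + (1 - x) * d
    f_mean = x * fA + (1 - x) * fB
    return x * (fA - f_mean)


def slope(x, a, b, c, d, h=1e-6):
    """Central-difference estimate of dg/dx at x (h is an assumed step size)."""
    return (g(x + h, a, b, c, d) - g(x - h, a, b, c, d)) / (2 * h)


a, b, c, d = 4, 3, 1, 5
mse = (d - b) / (d - b + a - c)  # mixed strategy equilibrium
for x_star in (0.0, mse, 1.0):
    s = slope(x_star, a, b, c, d)
    print(x_star, "stable" if s < 0 else "unstable")
```

For these payoffs the slope is negative at $x_A = 0$ and $x_A = 1$ and positive at the mixed equilibrium, which matches the simulation: a tiny deviation from $x_A = \frac{2}{5}$ grows until one of the pure states is reached.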
Payoffs in the Public Good Game are
\begin{array}{|c|c|c|} \hline & C & D \\\hline C & E - I_i + f * \frac{\sum_{j=1}^{n} I_j}{n}, E - I_i + f * \frac{\sum_{j=1}^{n} I_j}{n} & E - I_i + f * \frac{\sum_{j=1}^{n} I_j}{n}, E + f * \frac{\sum_{j=1}^{n} I_j}{n} \\\hline D & E + f * \frac{\sum_{j=1}^{n} I_j}{n}, E - I_i + f * \frac{\sum_{j=1}^{n} I_j}{n} & E + f * \frac{\sum_{j=1}^{n} I_j}{n}, E + f * \frac{\sum_{j=1}^{n} I_j}{n} \\\hline \end{array}
with $n = 2$, $E = 1$ and $I = (0, 1)$
\begin{array}{|c|c|c|} \hline & C & D \\\hline C & E - I + f * \frac{2*I}{2}, E - I + f * \frac{2*I}{2} & E - I + f * \frac{I}{2}, E + f * \frac{I}{2} \\\hline D & E + f * \frac{I}{2}, E - I + f * \frac{I}{2} & E, E \\\hline \end{array}
The share of the population cooperating, i.e. playing strategy $C$, is $x_C$, so: $x_C = \frac{N_C}{N}, x_D = \frac{N_D}{N}$
The state of the population is given by $(x_C, x_D)$ where $x_C \geq 0, x_D \geq 0$, and $x_C + x_D = 1$, thus $x_D = 1 - x_C$.
The fitness of a cooperating player is $f_C$
Since $f_C$ depends on $x_C$ and $x_D$ and the payoff $P_C$, we write $f_C(x_C, x_D)$
$f_C(x_C, x_D)$ = (probability of interacting with a $C$-player)$*P_C(C, C)$ + (probability of interacting with a $D$-player)$*P_C(C, D)$
$= x_C*(E - I + f * \frac{2*I}{2}) + x_D*(E - I + f * \frac{I}{2}) = x_C*(E - I + f * \frac{2*I}{2}) + (1 - x_C)*(E - I + f * \frac{I}{2})$
Correspondingly, the fitness for the player who is defecting (playing $D$) is $f_D$:
$f_D(x_C, x_D)$ = (probability of interacting with a $C$-player)$*P_D(D, C)$ + (probability of interacting with a $D$-player)$*P_D(D, D)$
$= x_C*(E + f * \frac{I}{2}) + x_D*E = x_C*(E + f * \frac{I}{2}) + (1 - x_C)*E $
# shares in the population
xC = .5 # probability of interacting with C player
xD = (1 - xC) # probability of interacting with D player
# parameters
E = 1; I = 1; f = 1.6
#        C           D
# C   1.6, 1.6    0.8, 1.8
# D   1.8, 0.8    1, 1
# fitness of cooperators
fC = xC * (E - I + f*2*I/2.) + xD * (E - I + f*I/2.)
# fitness of defectors
fD = xC * (E + f*I/2.) + xD * E
print(fC)
print(fD)
Fitness is interpreted as rate of reproduction
The average fitness $\bar{f}$ of the population is the weighted average of the two fitness values.
$\bar{f}(x_C, x_D) = x_C*f_C(x_C, x_D) + x_D*f_D(x_C, x_D) = x_C*f_C(x_C, x_D) + (1 - x_C)*f_D(x_C, x_D)$
# average fitness
F = xC * fC + (1 - xC) * fD
print(F)
# simulate
import matplotlib.pyplot as plt
%matplotlib inline
# paras
E = 1; I = 1; f = 1.6
xC = [0.7]
xD = [1 - xC[0]]
dt = 0.1
# fitnesses
FC = [(xC[0] * (E - I + f*2*I/2.) + xD[0] * (E - I + f*I/2.)) * dt]
FD = [(xC[0] * (E + f*I/2.) + xD[0] * E) * dt]
# FC[0] and FD[0] are already scaled by dt, so no further factor dt is needed
F = [xC[0] * FC[0] + xD[0] * FD[0]]
for t in range(300):
    # fitnesses
    fC = xC[t] * (E - I + f*2*I/2.) + xD[t] * (E - I + f*I/2.)
    fD = xC[t] * (E + f*I/2.) + xD[t] * E
    ff = xC[t] * fC + xD[t] * fD
    FC.append(fC*dt)
    FD.append(fD*dt)
    F.append(ff*dt)
    # differential equations for shares (explicit Euler step)
    xC.append(xC[t] + (xC[t] * (fC - ff)) * dt)
    xD.append(xD[t] + (xD[t] * (fD - ff)) * dt)
plt.plot(xC, 'b', label = 'share of cooperators')
plt.plot(xD, 'r', label = 'share of defectors')
plt.plot(FC, 'b--', label ='fitness of cooperators')
plt.plot(FD, 'r--', label ='fitness of defectors')
plt.plot(F, 'g--', label ='mean population fitness')
plt.legend(loc = 'best')
plt.ylim(0, 1)
plt.grid()
The Nash-equilibrium is $(D, D)$.
# use sympy to calculate steady states
from sympy import *
E = 1; I = 1; f = 1.6
xC, xD = symbols('xC, xD')
dC = xC * ((xC * (E - I + f*2*I/2.) + xD * (E - I + f*I/2.)) -
(xC * (xC * (E - I + f*2*I/2.) + xD * (E - I + f*I/2.)) + xD * (xC * (E + f*I/2.) + xD * E)))
dD = xD * ((xC * (E + f*I/2.) + xD * E) -
(xC * (xC * (E - I + f*2*I/2.) + xD * (E - I + f*I/2.)) + xD * (xC * (E + f*I/2.) + xD * E)))
# use sympy's way of setting equations to zero
CEqual = Eq(dC, 0)
DEqual = Eq(dD, 0)
# compute fixed points
equilibria = solve([CEqual, DEqual], [xC, xD])
print(equilibria)
# simulate with steady state
import matplotlib.pyplot as plt
%matplotlib inline
# paras
E = 1; I = 1; f = 1.99
# deviate slightly from the fixed point value
xC = [1 - 0.00001]
xD = [1 - xC[0]]
dt = 0.1
for t in range(30000):
    # fitnesses
    fC = xC[t] * (E - I + f*2*I/2.) + xD[t] * (E - I + f*I/2.)
    fD = xC[t] * (E + f*I/2.) + xD[t] * E
    ff = xC[t] * fC + xD[t] * fD
    # differential equations for shares (explicit Euler step)
    xC.append(xC[t] + (xC[t] * (fC - ff)) * dt)
    xD.append(xD[t] + (xD[t] * (fD - ff)) * dt)
plt.plot(xC, 'b', label = 'share of cooperators')
plt.plot(xD, 'r', label = 'share of defectors')
plt.legend(loc = 'best')
plt.ylim(0, 1.1)
plt.grid()
Note that the all-cooperate equilibrium in the RPGG is not asymptotically stable either (i.e. it is not a stable fixed point). Cooperation is lost at the slightest deviation. The Nash-equilibrium is the all-defect state.
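Why cooperation cannot recover can be read off the two fitness expressions for $f_C$ and $f_D$ given earlier. A short worked step, using $x_D = 1 - x_C$ (here $f$ is the multiplication factor of the game, not a fitness):

```latex
\begin{aligned}
f_C - f_D
  &= x_C\left(E - I + f I\right) + x_D\left(E - I + \tfrac{f I}{2}\right)
     - x_C\left(E + \tfrac{f I}{2}\right) - x_D\,E \\
  &= (x_C + x_D)\left(\tfrac{f I}{2} - I\right)
   = I\left(\tfrac{f}{2} - 1\right) \\
\frac{dx_C}{dt}
  &= x_C\,(f_C - \bar{f})
   = x_C\,(1 - x_C)\,(f_C - f_D)
   = x_C\,(1 - x_C)\,I\left(\tfrac{f}{2} - 1\right)
\end{aligned}
```

For $I > 0$ and $f < 2$ the right-hand side is negative for every $0 < x_C < 1$, so the share of cooperators shrinks from any interior state. With $f = 1.99$ the factor $\frac{f}{2} - 1$ is close to zero, which is consistent with the long run of 30000 steps used in the simulation.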
In the Rock, Paper, Scissors game, $x_R$, $x_P$ and $x_S$ are the proportions of players playing $R$, $P$ and $S$, respectively.
$x_R + x_S + x_P = 1 $
Payoffs are:
\begin{array}{|c|c|c|c|} \hline & R & P & S \\\hline R & 0 & -1 & 1 \\\hline P & 1 & 0 & -1 \\\hline S & -1 & 1 & 0 \\\hline \end{array}
Starting frequencies are: $x_R = 0.25$, $x_P = 0.25$, $x_S = 0.5$
Fitness for player playing $R$ is $f_R = 0.25 * 0 + 0.25 * (-1) + 0.5 * 1 = 0.25$
Fitness for player playing $P$ is $f_P = 0.25 * 1 + 0.25 * 0 + 0.5 * (-1) = -0.25$
Fitness for player playing $S$ is $f_S = 0.25 * (-1) + 0.25 * 1 + 0.5 * 0 = 0$
Average fitness $\bar{f}$ of the population is:
$\bar{f} = x_R * f_R + x_P * f_P + x_S * f_S$
The replicator equations are
$\frac{dx_R}{dt} = x_R * (f_R - \bar{f})$
$\frac{dx_P}{dt} = x_P * (f_P - \bar{f})$
$\frac{dx_S}{dt} = x_S * (f_S - \bar{f})$
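The three equations share one pattern, so they can be evaluated for any number of strategies from a payoff table. The following is a minimal sketch in plain Python; the function name `replicator_rhs` and the nested-list layout of the payoff table are assumptions made for illustration.

```python
def replicator_rhs(x, payoff):
    """Return dx_i/dt = x_i * (f_i - f_mean) for every strategy i.

    x is a list of population shares; payoff[i][j] is the payoff of
    strategy i against strategy j.
    """
    n = len(x)
    # fitness of each strategy: payoffs weighted by the opponents' shares
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # average fitness of the population
    f_mean = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] * (fitness[i] - f_mean) for i in range(n)]


# Rock, paper, scissors payoffs and starting frequencies from the text
rps = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
x0 = [0.25, 0.25, 0.5]  # order: R, P, S
print(replicator_rhs(x0, rps))
```

The returned rates sum to zero, which keeps $x_R + x_P + x_S = 1$ along the continuous dynamics.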
# simulate
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure(figsize=(15,5))
fig.add_subplot(1,2,1)
xR = [0.25]
xP = [0.25]
xS = [0.5]
dt = 0.001
for t in range(50000):
    # oscillations
    fR = xR[t] * 0 + xP[t] * -1 + xS[t] * 1
    fP = xR[t] * 1 + xP[t] * 0 + xS[t] * -1
    fS = xR[t] * -1 + xP[t] * 1 + xS[t] * 0
    # asymptotically stable (no cycling)
    #fR = xR[t] * 0 + xP[t] * -1 + xS[t] * 2
    #fP = xR[t] * 2 + xP[t] * 0 + xS[t] * -1
    #fS = xR[t] * -1 + xP[t] * 2 + xS[t] * 0
    f = xR[t] * fR + xP[t] * fP + xS[t] * fS
    xR.append(xR[t] + (xR[t] * (fR - f)) * dt)
    xP.append(xP[t] + (xP[t] * (fP - f)) * dt)
    xS.append(xS[t] + (xS[t] * (fS - f)) * dt)
plt.plot(xR, 'g', label = 'rock')
plt.plot(xP, 'b', label = 'paper')
plt.plot(xS, 'r', label = 'scissors')
plt.title('Replicator dynamics of the "Rock, paper, scissors"-game')
plt.legend(loc='best')
plt.grid()
fig.add_subplot(1,2,2)
plt.plot(xR, xS)
plt.title('Phase space of the "Rock, paper, scissors"-game')
plt.xlabel('rock')
plt.ylabel('scissors')
plt.grid()