Logo

Modelling modules

The following provides code snippets for modelling issues - and for getting acquainted with the programming language Python.

Many aspects of equation-based models reappear in different models. Stored in a modules repository, they can be reused when needed.

Often for instance, variables are known to be influenced, but parameters values are not (yet) known. Data might be missing or will be obtained later in a survey. The model should hold the possibility to include data once it is available.

Most simply, assume for a start that in a model for logistic growth the growth rate $g$ is not known. One way to consider various consequences is to loop over different values of $g$.

$$\frac{dN}{dt} = g * N * (1 - \frac{N}{K})$$
In [13]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline

# prepare drawing canvas and plot
fig = plt.figure(figsize=(8,5))
fig.add_subplot(1,1,1)

# define initerval with different growth rates
G = np.linspace(0.3, 1.2, 5)
# define time range
T = range(2000)
dt = 0.01     # "differential"
K = 10.     # carrying capacity

# loop over different growth rates
for g in G:
    # define list to store results, include initial value
    N = [1.]
    # loop over time range
    for t in T:
        # append calculated values to list
        N.append(N[t] + (g * N[t] * (1 - N[t]/K))*dt)
    # plot list with resulting value
    plt.plot(N, label = 'growth rate = %0.1f' %g)

# add title, grid lines and legent to plot
plt.title('Logistic growth with various growth rates')
plt.grid()
plt.legend(loc = 'best')
Out[13]:
<matplotlib.legend.Legend at 0x10501e48>

Allee effect

Reproductive success can be density dependent (see here for an explanation). This can be modeled in the following way.

$$\frac{dN}{dt} = g * N * (1 - \frac{N}{K}) * \frac{N - a}{K}$$

with the Allee extinction threshold $a$ being unknown in this case.

In [2]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline

fig = plt.figure(figsize=(8,5))
fig.add_subplot(1,1,1)

# growth rate is fixed in this case
g = 0.1
# define interval with various a-values 
A = np.linspace(1., 5., 5)
# alternatively
#A = [1,2,3]
T = range(1600)
dt = 0.1
K = 10.

# loop over a values
for a in A:
    N = [3.] # list with iniitial value
    for t in T:
        N.append(N[t] + (g * N[t] * (1 - N[t]/K) * (N[t] - a)/K) * dt)
    plt.plot(N, label = 'a = %0.1f' %a)

plt.title('Allee effect with various extinction thresholds')
plt.ylim(0.10)
plt.grid()
plt.legend(loc = 'best')
Out[2]:
<matplotlib.legend.Legend at 0xb312d68>

Lake eutriphication model

$x$ is a nutrient suspended in phytoplankton in a lake causing turbidity. $a$ indicates the loading rate of the nutrient, $b$ its removal rate, and $r$ is an internal recycling rate of the nutrient. $h$ indicates the level at which the development reaches half of its saturation. The exponent $p$ determines the strength with which the internal recycling rate supports the loading of the nutrient.

$$\frac{dx}{dt}=a-b*x+r*\frac{x^p}{x^p + h^p}$$

When $a = 0.7$ this system shows two alternative equilibria. Close to $a = 0.7$ however, dynamics are attracted in varying extent to one of the two equilibria. This can be shown by looping through different initial values (for a more detailed analysis of this model see Critical transitions with Python).

In [171]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline

# prepare canvas for drawing plots
fig = plt.figure(figsize=(20,5))

# parameters
A = [0.6,0.7,0.8]            # a-values
b = 1.
r = 0.6
h = 1.
p = 12
X0 = np.linspace(0.01, 2, 8) # initial values
T = range(1600)              # time
dt = 0.01

#loop through a-values, additionally count the values
for n,a in enumerate(A):
    #loop through initial values
    for x0 in X0:
        x = [x0]
        for t in T:
            x.append(x[t] + (a - b*x[t] + r * (x[t]**p)/(x[t]**p + h**p)) * dt)
        # indicate one plot of three
        fig.add_subplot(1,3,n+1)
        plt.plot(x, label = 'x0 = %0.1f' %x0)
    plt.title('Dynamics with a = %0.1f from different initial values' %a)
    plt.grid()
# legend only in last plot
plt.legend(loc = 'best')
Out[171]:
<matplotlib.legend.Legend at 0x27426a58>

The interesting part in this model is: $\frac{dx}{dt}=r*\frac{x^p}{x^p + h^p}$. Dependent on the initial values, on the $h$-half-saturation parameter and/or on the exponent $p$, the dynamics show interesting behaviors.

In [1]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline

# prepare canvas for drawing plots
fig = plt.figure(figsize=(8,5))

# parameters
r = 0.5
h = 1.
p = 9
#P = [3, 9, 12]
X0 = np.linspace(0.6, 0.9, 5) # initial values
#x0 = 0.8
T = range(2000)              # time
#T = range(500)
dt = 0.01

#loop
#for p in P: # alternative for looping p
for x0 in X0:
    x = [x0]
    for t in T:
        x.append(x[t] + (r * (x[t]**p)/(x[t]**p + h**p)) * dt)
    # define plot
    fig.add_subplot(1,1,1)
    plt.plot(x, label = 'x0 = %0.1f' %x0)
    #plt.plot(x, label = 'p = %0.f' %p)  # in case of looping p
plt.title('Dynamics from different initial values')
#plt.ylim(0, 6) # in case of looping p
plt.grid()
plt.legend(loc = 'best')
Out[1]:
<matplotlib.legend.Legend at 0xb3cfeb8>

Adding noise

This model demonstrates the shift of a harvested resource to overexploitation. Resource biomass $x$ grows logistically and is harvested according to $-c\frac{x^2}{x^2+h^2}$

$$\frac{dx}{dt} = rx(1-\frac{x}{K})-c\frac{x^2}{x^2+h^2}+\sigma xdW $$

The term $\sigma xdW$ indicates added noise, which is used in analytical contexts to make the model's behavior appear more realistic. The following code demonstrates this addition of noise.

In [2]:
# import libraries
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline

#parameters
#c = 1.   # grazing rate
C = np.linspace(1, 2.6771, 1000)  # according to Dakos et al 2012
r = 1.    # growth rate
h = 1.    # half-saturation value
p = 2     # exponent for Hill-function
K = 10.   # carrying capacity

# determine strength of noise
sigma = 0.05

# define the system in a function
def Sys(X, t=0):
    # X[0] = x    
    return np.array([r * X[0] * (1 - X[0]/K) - c * (X[0]**p / (X[0]**p + h**p))])

# generate 1000 linearly spaced numbers for x-axes
t = np.linspace(0, 100,  1000)
# initial value
Sys0 = np.array([10])

# prepare plots
fig = plt.figure(figsize=(8,5))
plt.subplot(1,1,1)

XX = [] # empty list

# loop through c values
for c in C:
    # this time use scipy for integration 
    X, infodict = integrate.odeint(Sys, Sys0, t, full_output=True)
    #generate noise, sigma * x * dW
    noise = sigma * np.random.normal(0, 1, 1000)
    # transform x, and add noise
    x = X.T + noise*X.T
    # add only the last value of dynamics
    XX.append(x[0][-1])
    
plt.plot(C, XX, 'b-')
plt.grid()
plt.xlabel('grazing rate')
plt.title('Shift to overexploitation - with noise added')
Out[2]:
<matplotlib.text.Text at 0xb8c5898>

Confidence interval

A confidence interval is an estimate of an interval which, after infinite repetition of random experiments, contains the true value of a parameter (e.g. the mean) with probability $p%$. Typically, p is stated at the 95% confidence level. In modeling it can be used to quantify the confidence in the replicative power of the model.

The following code uses the above logistic growth model with added noise to generate artificial data points. It calculates a Least squares polynomial fit from these data points, and then defines a bandwidth into which, according to the data points, 95% of the fit would fall.

In [17]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# show plots in notebook
% matplotlib inline
# prepare plots
fig = plt.figure(figsize=(8,5))
fig.add_subplot(1,1,1)

# genertate artificial data points from logistic dynamics
g = 1.9   # growth rate
T = np.array(range(1000)) # time range
dt = 0.01 
K = 10.

N = [1.]   # list with initial value
Ndata = [] # empty list to hold generated data
for t in T:
    NN = N[t] + (g * N[t] * (1 - N[t]/K))*dt
    N.append(NN)
    # generate normally distributed noise, mu = 0, sigma = 1
    noise = np.random.normal(0, 1)
    # add noise
    Ndata.append(NN + noise)

# plot artificial data points
plt.plot(T, Ndata, 'wo', label = 'data')

# calculate least squares polynomial fit through artificial data points
# (assuming the original ODE is not known)
p = np.polyfit(T, Ndata, 7)
fitted_data = np.polyval(p, T)
# plot fit
plt.plot(T, fitted_data, 'r-', label = 'fitted curve')

# use statsmodels library for confidence and prediction (included in Anaconda)
# for further explanations: 
# http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/ols.html
import statsmodels.api as sm
from statsmodels.sandbox.regression.predstd import wls_prediction_std

# sm needs a column with the fitted data in addition to T   
TT = np.column_stack((T, fitted_data))
# also needs an intercept, add a column of 1s
TTT = sm.add_constant(TT)
# print TTT    # to see what it does

# use sm's Ordinary Least Square method, and fit it
re = sm.OLS(Ndata, TTT).fit()
#print re.summary()    #print resulting statistics

# use the wls_prediction_std command to calculate confidence interval 
prstd, iv_l, iv_u = wls_prediction_std(re)
plt.plot(T, iv_l, 'g--', label = '%0.f%% confidence interval' %(i * 100))
plt.plot(T, iv_u, 'g--')

plt.title('Confidence interval')
plt.grid()
plt.legend(loc = 'lower right')
Out[17]:
<matplotlib.legend.Legend at 0x225d7080>