At each time step, once the updated pollutant level is known, it is possible to calculate the economic return that any given policy would have yielded had it been chosen in the previous time step. Weights for the propagation of agents to the next time step are based on relative policy performance in the past time step (note that this comparison could have been carried out over an arbitrary number of past time steps). The group of agents with the best-performing policy receives weight *w* (0.5 < *w* ≤ 1), and the other group receives weight 1 − *w*. Let **W_{t}** be a column vector of these weights and

$$\boldsymbol{\varphi} \,/\, \left[\Sigma\left([p_{t-1}]\,\mathbf{W}_{t}\,\boldsymbol{\varphi}\right)\right] \qquad \text{(A.5.1)}$$

where **φ** = [φ 1−φ]′ is a vector that introduces autocorrelation into the time series of **N** and Σ(·) denotes the sum of matrix elements. Stochastic propagation of agents is introduced by drawing the agents for the next time step,

The weights **W_{t}** control the bias in propagation due to the success of one policy relative to the other. If
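
The assignment of the weights in **W_{t}** can be illustrated with a minimal Python sketch. It covers only the weighting rule stated above (weight *w* for the group whose policy performed best in the past time step, 1 − *w* for the other group) and does not attempt to reproduce Equation A.5.1. The function name `policy_weights`, its argument names, the example returns, and the tie-breaking rule are assumptions made for illustration; they are not part of the model description.

```python
def policy_weights(return_a, return_b, w=0.75):
    """Return (weight_a, weight_b) for two groups of agents.

    The group whose policy earned the higher economic return in the
    past time step receives weight w; the other receives 1 - w.
    """
    # The text requires 0.5 < w <= 1.
    if not 0.5 < w <= 1.0:
        raise ValueError("w must satisfy 0.5 < w <= 1")
    # Assumption: a tie is resolved in favour of group A, because the
    # text does not say how equal returns are handled.
    if return_a >= return_b:
        return (w, 1.0 - w)
    return (1.0 - w, w)


# Hypothetical returns for the two policies in the past time step.
print(policy_weights(3.2, 2.9, w=0.75))
```

With *w* = 1 the group with the worse-performing policy receives zero weight, whereas values of *w* close to 0.5 weight the two groups almost equally.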

The weights φ control the autocorrelation of the propagation process. If φ is near 1.0, then the past **N** is weighted heavily. Thus, the next