Research, part of a Special Feature on A Framework for Analyzing, Comparing, and Diagnosing Social-Ecological Systems
Using Artificial Neural Networks for the Analysis of Social-Ecological Systems

The literature on common pool resource (CPR) governance lists numerous factors that influence whether a given CPR system achieves long-term ecological sustainability. Up to now there is no comprehensive model to integrate these factors or to explain success within or across cases and sectors. Difficulties include the absence of large-N studies, the incomparability of single case studies, and the interdependence of factors. We propose (1) a synthesis of 24 success factors based on the current social-ecological systems (SES) framework and a literature review and (2) the application of neural networks to a database of CPR management case studies in an attempt to test the viability of this synthesis. This method allows us to obtain an implicit quantitative and rather precise model of the interdependencies in CPR systems. Given such a model, every success factor in each case can be manipulated separately, yielding different predictions for success. This could become a fast and inexpensive way to analyze, predict, and optimize performance for communities worldwide facing CPR challenges. Existing theoretical frameworks could be improved as well.


INTRODUCTION

Motivation
Common pool resource (CPR) problems are ubiquitous. Considering the impact of climate change, the handling of such problems becomes even more important. There are many different kinds of CPR problems; a comprehensive overview can be found in Hess (2008). Our analysis is restricted to traditional commons: land use, forest management, irrigation, and fisheries. The central questions are: Why do some communities fail while others thrive? How can sustainability, efficiency, and justice be achieved in managing CPRs?

Social-ecological systems framework
In the last decades of research in the field of social-ecological systems (SES), it has become clear that no single factor, such as user participation or monitoring of user compliance, accounts for success in managing CPRs. Institutions and settings are very heterogeneous; therefore, panaceas are not available. In addition, most attempts to transfer successful designs from one system to another have failed (Meinzen-Dick 2007, Ostrom et al. 2007).
However, the interaction of a set of factors, hereafter called success factors, makes success highly probable in rather diverse settings (Ostrom 2005). Elinor Ostrom was the first to construct such a set of success factors, which she called design principles, in her seminal work Governing the Commons. She defined a design principle as follows: "By 'design principle' I mean an essential element or condition that helps to account for the success of these institutions in sustaining the CPRs and gaining the compliance of generation after generation of appropriators to the rules in use" (Ostrom 1990:90). Subsequently, this work has been further developed in the SES framework (Ostrom 2009). A recent meta-analysis demonstrates the theoretical and practical strength of the SES framework: since 1990, at least 91 studies have used or discussed these design principles as contributing to success (Cox et al. 2010). Furthermore, empirical studies support the conclusion that they contribute to success (see Nilsson 2001 for an in-depth study). Two essential follow-up questions are: Are these indeed the crucial factors that determine success? What is the individual role and relevance of each factor?
There have been several attempts to validate and extend the design principles. Agrawal undertook one of the most comprehensive attempts at summary and synthesis (Agrawal 2001, 2002). He derived a comprehensive list of success factors by combining various compilations (Ostrom 1990, Wade 1994, Baland and Platteau 1996) and incorporating his own extensions. To a substantial extent, all these analyses use the same factors. One may even speak of a consensus on a core set of factors. However, relationships and positions of concepts and variables in these compilations remain debated. Because of its acceptance, empirical validity, and comprehensiveness, the SES framework (Table 1) serves as the starting point for our own synthesis of success factors.
There have been many more attempts to identify success factors in CPR problems. Pomeroy et al. (1998) evaluated the importance of Ostrom's original design principles based on empirical research of 25 research projects on Asian fishery cooperatives. Clear boundaries of the resource and clear boundaries of the appropriating group, for example, were rated as highly important. Ostrom's other factors were rated as only somewhat important. Based on their own field research, the authors of that study identified another 28 factors as important (Pomeroy et al. 1998).
Many studies do not work at the success factor level of analysis, but at a more detailed level. They look at positive or negative correlations of single variables with success. An example of this is Shiferaw et al. (2008), who found positive correlations of success with variables like the amount of precipitation, the distance to the nearest market, and others for irrigation systems in 87 Indian villages.
Interestingly, a recent study on overcoming anticommons situations in German forests, i.e., management problems caused by small, private parcels of land, points out success factors similar to those in Ostrom's or Agrawal's lists (Schurr 2006). The existence of institutional frameworks or of financial support during the start-up, for example, is important.
Finally, a metastudy on community forestry encompassing 69 case studies worldwide identifies 43 variables as factors determining success. Factors discussed by all authors of case studies and found to be important for success of CPR management are "... well-defined property rights, effective institutional arrangements, and community interests and incentives" (Pagdee et al. 2006:49). Because each of these factors comprises at least five subcategories, it is difficult to assess individual contributions to success. This difficulty stems from the incomparability of studies, a separate problem that is itself hard to overcome.
These studies show clearly that there is consensus among some authors on approximately 20 to 30 core success factors, although many more are discussed. Few studies attach the same importance to the same success factors. Hence, there is no convergence on an overall relevance of each factor in comparison to the others. Therefore, many articles conclude that success is likely case specific. This, of course, is unsatisfactory, but it seems impossible to abstract from specific local contexts. Ostrom's (1990) The SES framework guides this process (Ostrom 2009).

Synthesis of success factors
If we aim to construct a comprehensive set of success factors, the SES framework as well as other compilations of system attributes critical for success (Pagdee and Daugherty 2006) can be used. Unfortunately, a complete list would be impractical because it would comprise more than 100 success factors. Clearly, there are no case studies that include that number of variables in that exact or a sufficiently similar form to support these factors empirically.
However, because an SES framework-based consensus on a set of about 20 to 30 success factors does exist, this may well be the starting point. Moreover, empirical evidence supports this set as at least relevant, although the weighting and importance of the individual factors differ. This may be because of complex interactions between the success factors. Our synthesis used Ostrom (2009) and Agrawal (2001) as a starting point. A literature search using keywords and references cited by authors that contributed success factors was then performed. These included, but were not limited to: Ostrom (1990), Berkes (1992), Tang (1992), Thomson et al. (1992), Wade (1994), Baland and Platteau (1996), Varughese and Ostrom (2001), Agrawal (2002), Pagdee et al. (2006), Schurr (2006), Nagendra (2007), and Shiferaw et al. (2008). Like many of the authors noted, we used the categories of the SES framework under which the success factors are typically subsumed: resource, resource units, actors, governance system, and external environment. Success factors were included if they occurred in at least four peer-reviewed studies based on empirical case studies. A list is available upon request. Wherever possible, small variations of one factor were merged into one, often slightly more abstract factor. In addition, inconsistencies such as wrong categorizations or a wrong level of abstraction were resolved where possible.
For example, a positive cost-to-benefit ratio is often mentioned, but it is actually a metafactor; it is calculated by individuals (more or less) intuitively weighting some or most of the factors listed above. Only if it is positive will individuals contribute to CPR management.
A synthesis like this may be directed at one of two opposite goals. The first goal is to create a listing that is as comprehensive as possible. The advantage is that all success factors can be considered and none is overlooked. Later analyses will benefit from such a listing as well because subselections can easily be made according to the research focus. The disadvantages are equally obvious: irrelevant factors for the success of CPR management remain and are included in analyses, consuming time and interfering with the appropriate factors. Moreover, the large number of factors makes modeling unnecessarily complex and unmanageable with conventional analytical tools. However, ease of use is a desired feature in modeling (Schlüter et al. 2005). Worse, overly complex models may be outright misleading.
The second goal is a merged listing comprising only factors that have a high probability of relevance for the success of CPR management. Modeling is easier, and models are more concise and less cluttered with insignificant factors. However, it is dangerous to try to determine, a priori, which factors may be excluded. Another problem is to find a common level of abstraction. Again, the core set of success factors provides a standard. Our approach opts for the following merged list, ordered according to the SES framework. It is supported by both theoretical work and empirical data and has been validated by external experts (Table 2).
Because each factor is assumed to be relevant for success, it is important to know in which way it contributes. Table 3 lists the respective contributions of each factor to the success, including a reference in which this factor and its contribution are discussed in more detail than can be covered here. We explicitly state that our synthesis is meant as a starting point, one supported by theoretical and empirical work, to determine which factors may be relevant for success and which may not. It is the neural networks that allow the determination of their respective interplay and relevance.

Outcomes and success factors
One of the crucial points in researching success factors is how to measure success. Not surprisingly, opinions differ as to what constitutes the success of CPR management. Again, there exists a consensus on a core set of variables relevant for measuring success (Ostrom 1990, 2009, Berkes 1992, Pagdee et al. 2006). This set can be separated into (1) ecological, (2) social, (3) economic objectives, and (4) effects on other social-ecological systems. The following lists are syntheses.
(1) Ecological objectives: condition of resource; stability, sustainability; productivity, resilience; biodiversity; avoiding or halting environmental degradation.
(2) Social objectives: equity, i.e., participation in management, appropriation process, benefit distribution, etc.; stability, sustainability; accountability; rights; investment in future productivity; satisfaction of users, i.e., meeting local needs; improvement of local living conditions and decrease in poverty; conflict management; degree of compliance with rules; balance between conflicting management goals.
(3) Economic objectives: productivity; cost-to-benefit ratio of appropriation process.
(4) External effects on other social-ecological systems: ecological effects; social effects; economic effects.
There seem to be few systematic data for point 4 because most research focuses on the CPR itself and not on its effects on other systems. Additionally, most case studies do not collect data on the economic efficiency of CPR management.
http://www.ecologyandsociety.org/vol18/iss2/art40/
The next obstacle is to measure these parameters. Because there is no direct measure of success, indicators have to be used. For example, an indicator for ecological success may be the condition of the resource. However, 'condition' could refer to the forest, irrigation system, etc., as a whole, but it could also be restricted to the sort of tree or species of fish harvested.
A second problem is that the condition of a forest, for example, can again be measured by indicators only. In the case of forests, this may be biodiversity, the productivity of the forest for a particular tree, the vegetation density, or the trunk density. Several difficulties arise. A set of indicators has to be chosen or developed that satisfies certain criteria (OECD 2008, Binder et al. 2010). Also, worldwide databases pose their own problems concerning comparability, e.g., how to compare the condition of boreal to tropical forests (Tucker et al. 2008). Moreover, it has been suggested that one parameter is not enough, and many parameters should be combined in a multivariate analysis (Wollenberg et al. 2007). Although methodologically sound, the last suggestion frequently cannot be put into practice because of the lack of precise data on outcomes. This is why most studies limit themselves to one parameter, often combined with a subjective estimation of a local expert, e.g., a forest warden, or the users themselves.
This in turn causes serious problems for a comparative study such as ours because each case study measures outcomes such as equity through different indicators. This problem is solved by using large databases in which success is coded with the same standards across all cases, combined with a model integrating and weighting multiple criteria (see Lam 1998 for irrigation performance). Because of data availability, we restricted our analysis to ecological success.

Obstacles in determining the impact of success factors
One of the central questions in CPR research has been to determine the impact of various success factors on success or failure. Consequently, this question has been the topic of much research (Agrawal 2002, Hess 2008, Ostrom 2009).
However, some major obstacles remain, and thus far there is no model that contains most or all relevant success factors including their mutual interactions. It is still unclear whether it is possible to generalize from irrigation projects to forest management or fisheries (Agrawal 2001), although the general assumption is that this is not possible. This is mainly due to missing empirical evidence because almost no empirical study we are aware of does cross comparisons (Poteete et al. 2010). Even more problematic is the fact that even analyses restricted to one type of CPR, e.g., irrigation, are not consistent in their evaluations when compared; there is no agreement on the importance of each factor across studies. As a consequence, this adds to the skepticism about panaceas (Meinzen-Dick 2007, Ostrom et al. 2007).
Furthermore, the methodology of each study is unique, making comparisons or metastudies next to impossible. Moreover, most studies limit themselves to one or a few CPR institutions, focusing on a few variables. Consequently, there are only a few large-N studies, despite the huge number of singular case studies: "This metastudy shows that measures of success discussed by the authors vary across all case studies. None of the selected articles discussed all measures of success simultaneously" (Pagdee et al. 2006:48).
Unfortunately, this situation has not improved significantly in the last years (Poteete et al. 2010). Up to now, no methodology existed to compute or even capture the complexity of interactions of possible factors (Agrawal 2001): "Although much of this writing acknowledges the importance of a large number of different causal variables and processes, knowledge about the magnitude, relative contribution, and even direction of influence of different causal processes on resource management outcomes is still poor at best" (Agrawal and Chhatre 2006:149).
Not surprisingly, the conclusion of many writers is deeply pessimistic, rating a comprehensive and general list of success factors as impossible (Agrawal 2001). Accordingly, we are not aware of any study that develops a model of causal factors, because only singular studies with no cross-consistent correlations are available. A large-N study with a set of more than 30 variables would hardly be feasible because of the required sample size and the associated costs (Agrawal 2001).
In addition, meta-analyses trying to find general statistical correlations face serious problems (Pagdee et al. 2006). The most problematic is perhaps that interactions between the factors analyzed are not known, which in turn leads to wrong estimations of relevance. When a study excludes a relevant factor, the estimated importance of all other factors shifts dramatically (Agrawal and Chhatre 2006). Because studies typically encompass only 2 to 4 variables and not the full set of 30 to 40 potential success factors, this problem is not trivial. All these problems are very serious, but might, in our opinion, be solved by making use of large-N databases and a methodology that is new to this research area: artificial neural networks.

Data selection
It is costly and time consuming to conduct empirical field studies with large samples. For that reason there are hardly any large-N studies. In contrast, a vast number of empirical studies with one case and a few independent variables, two to four in most cases, do exist. This problem is well known and addressed by recent publications (Poteete and Ostrom 2008, Poteete et al. 2010). By analyzing publication trends, Poteete et al. (2010) demonstrated that there has been no substantial improvement in the past years. Nevertheless, there are a few large databases available. Our research project collects as many cases from these databases as possible.
One major problem is the cross comparability between studies because research focus and methodologies differ (Rudel 2008). Therefore, we decided to use only large data sets that were collected using a consistent methodology. We briefly exemplify our methodology using data on Nepal irrigation systems collected in the Nepal Irrigation Institutions and Systems (NIIS) database of the Workshop in Political Theory and Policy Analysis, Indiana University. It contains 263 cases with 478 variables per case. The cases were coded between 1982 and 1997. For further information see Tang (1989).
In this database, there is information about the geographical location, including resource characteristics like rainfall or yields depending on the season. Variables on institutions, rules-in-use, organizations like water user associations or forest user committees, and specifically the group that is using the CPR are included as well. The NIIS database contains mostly data collected at a single point in time, although about 30 cases have been revisited.

Data preparation
Our research design required us to code the 24 success factors, i.e., the independent variables, as 24 real numbers, and ecological success, i.e., the dependent variable, as 1 real-valued measure (henceforth 24 + 1 factors for simplicity), from the case data in the databases. To do this, several steps were necessary: variable selection, combination, and recoding.

1. We systematically screened all 615 variables and decided, for each variable, which indicator of the 24 + 1 factors it was relevant to, or whether it was irrelevant for our set of factors. There were 480 variables included in the final set, an average of 19 for each success factor. Three members of our team, independently using the previously synthesized catalogue of 24 + 1 factors with their respective indicators, carried out this evaluation. Inter-rater reliability was satisfactory (α NOMINAL = 0.778). Remaining disagreements were resolved in group discussion.

2. We then recoded all selected variables to the same format. During this step, several redundant variables were combined into single variables to improve data density. To reduce subjective interpretation as much as possible, text variables were used mostly to inform the recoding of numerical variables when these were sparsely populated.

3. We assessed the relative weight of all variables associated with an indicator (3 raters, α INTERVAL = 0.901) and combined the recoded variables to give us real values for the indicators.

4. We assessed the relative weight of the indicators in the composition of the 24 + 1 factors and combined them to form our final set of data. This assessment was also carried out individually by three members of our team (α INTERVAL = 0.913).

5. Although data density could be improved significantly by combining redundant variables, 38 data points (out of 25 × N = 6,575) were still missing in the final data set. They were imputed by replacing the missing values with the mean of the existing data for the respective factors.
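The combination and imputation steps (3 to 5) can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the toy values and weights, and in particular the choice to renormalize weights over the non-missing entries are assumptions made only for illustration.

```python
from statistics import mean

def weighted_combine(values, weights):
    """Weighted average of recoded values; missing entries (None) are skipped.

    Assumption: weights are renormalized over the entries that are present.
    """
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    if not pairs:
        return None  # nothing to combine, so the result stays missing
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight

def impute_column_means(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]

# Hypothetical toy data: three recoded variables feeding one indicator.
indicator = weighted_combine([1.0, None, -0.5], [0.5, 0.3, 0.2])

# Hypothetical toy data: three cases, two factors, two missing values.
cases = [[0.5, None], [-0.5, 1.0], [None, 0.0]]
filled = impute_column_means(cases)
```

Mean imputation keeps every case in the data set at the price of slightly shrinking the variance of the affected factors; with 38 of 6,575 values missing, that effect is presumably small.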

Method of analysis
We analyzed the data with artificial neural networks (NN). They are a well-known nonparametric tool for pattern recognition, data mining, and the prediction of complex systems. Their strength lies in their ability to cope with nonlinear dependencies in data sets that other tools, e.g., multivariate linear regressions or principal component analysis, cannot (Knutti et al. 2003, Shlens 2009).
A weakness of NN is that they remain 'black boxes' to some extent because they do not supply us with an explicit model, i.e., a set of formulas, of the dependencies of the factors. Therefore, an accurately predicting NN has to be regarded as an implicit model. However, in recent years it has become possible to open the black box (Gevrey et al. 2003, Thrush et al. 2008, Yeh and Cheng 2010) by running a series of analyses on trained networks. These methods can extract estimates of the relative overall importance of each input variable for the output. However, because it has already been established that there is no single most important success factor for CPR use, but a network of interwoven factors, such estimates could only lend additional empirical support for the 'no panacea' verdict regarding CPR governance. There is indeed support for this verdict: if sets of factors are manipulated manually, no factor on its own is able to alter the outcome decisively if several other success factors do not point in the same direction.
Nonetheless, our implicit model can be used to simulate the impact of changes in parameter values on the resulting outcome. For example, we could simulate the impact of changes in the success factors on the predicted ecological outcome of the CPR systems analyzed. To our knowledge, this is the first quantitative model for CPR systems. Constructing such an implicit model is therefore a first step toward understanding the relationships between variables and factors, which then can be used to formulate an explicit model. Until then, our implicit model, for example, allows for the testing of certain sets of success factors suggested in the literature by manipulating them one by one or in combination, and then observing the changes in prediction. These yield results that indicate which factors in which combinations are likely to be influential and which theoretical suggestions may not be supported by our empirically adjusted implicit model.

Fig. 1. Structure of an artificial neural network used in the analysis.
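The idea of manipulating one factor and observing the change in prediction can be sketched in a few lines of Python. The `what_if` helper and the `toy_predict` stand-in are hypothetical; in practice `predict` would be a trained network, which is not reproduced here.

```python
def what_if(predict, case, factor_index, new_value):
    """Change in predicted success when one factor is set to new_value.

    `predict` is any function mapping a list of factor values to a number;
    the original case is left unmodified.
    """
    modified = list(case)
    modified[factor_index] = new_value
    return predict(modified) - predict(case)

def toy_predict(factors):
    """Hypothetical stand-in for a trained net: clipped mean of the inputs."""
    raw = sum(factors) / len(factors)
    return max(-1.0, min(1.0, raw))

# A neutral case: all 24 success factors at 0.
case = [0.0] * 24

# Effect of setting the first factor to its most favorable value.
delta = what_if(toy_predict, case, 0, 1.0)

# The same manipulation applied to each factor in turn.
sensitivities = [what_if(toy_predict, case, i, 1.0) for i in range(len(case))]
```

With a real nonlinear network the entries of `sensitivities` would differ from factor to factor and would depend on the values of the other factors, which is exactly the interdependence the text describes.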
For a complete introduction to using NN, see Reed and Marks (1999). Our data analysis procedure can be summarized as follows. First, an appropriate network design is chosen. We decided to test a broad variety of perceptrons with a single hidden layer. Such a feed-forward network consists of one layer of input neurons, which read the data for the 24 success factors; one layer of hidden neurons, i.e., neurons not directly in touch with input or output data, connected to the input neurons; and one output neuron, connected to the hidden neurons, that represents the net's prediction of a value for ecological success.
Figure 1 shows a simple net with 24 input and 9 hidden neurons connected to 1 output neuron. Information is processed from the input neurons through the hidden layer to the output neuron, hence the name feed-forward network.
The neurons are abstract and simplified versions of their biological counterparts. They are connected pairwise by links that can have different weights, which determine the strength of a connection between two neurons. These weights are adjusted according to a specific learning algorithm during the learning phase, in which the net's predictions, which are random initially, are optimized step-by-step to fit the data through repeated trial, error, and error correction. From a technical point of view, input can be any information. Hidden patterns in this information, which determine the output, can be found by the network. In our case, each input unit represents one potential success factor, e.g., clear boundaries of the resource. A value of 1 for that neuron would then indicate very clear boundaries, a value of 0 boundaries that are neither particularly clear nor unclear, and a value of -1 boundaries that are very unclear.
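To make the 24-9-1 structure concrete, the following Python sketch computes a single forward pass through an untrained network. The tanh activation, the bias terms, and the range of the random initial weights are assumptions made for illustration; the text does not specify them.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random initial weights, one row per neuron; the last entry is a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layer, inputs):
    """Weighted sum of the inputs plus bias for each neuron, squashed by tanh."""
    outputs = []
    for weights in layer:
        total = weights[-1]  # bias term
        for w, x in zip(weights, inputs):  # zip stops before the bias entry
            total += w * x
        outputs.append(math.tanh(total))
    return outputs

def predict(hidden_layer, output_layer, factors):
    """Feed the 24 factor values through the hidden layer to the output neuron."""
    hidden = forward(hidden_layer, factors)
    return forward(output_layer, hidden)[0]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
hidden_layer = init_layer(24, 9, rng)  # 24 inputs -> 9 hidden neurons
output_layer = init_layer(9, 1, rng)   # 9 hidden neurons -> 1 output neuron

# One case: very clear resource boundaries (1.0), all other factors neutral (0.0).
example_case = [1.0] + [0.0] * 23
prediction = predict(hidden_layer, output_layer, example_case)
```

Because the weights are random, `prediction` is meaningless at this stage; training would adjust the weights so that predictions approach the coded values for ecological success.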
In the learning phase, the network is trained on a subset of the available data, i.e., the training set. Because there are many ways of dividing the data into training and test sets, i.e., different data splits, which influence the goodness of prediction, we tested several ways of splitting. For example, all cases were first ordered by size and then split in 80:20 proportions, so that sizes were equally distributed in both sets.
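One simple way to obtain an 80:20 split in which sizes are spread evenly over both sets is to sort the cases by size and send every fifth case to the test set. The sketch below assumes this particular scheme and a hypothetical `size` field; the exact splitting procedure is not given in the text.

```python
def split_by_size(cases, size_of, test_every=5):
    """Order cases by size, then put every `test_every`-th case in the test set.

    With test_every=5 this yields roughly an 80:20 split in which small and
    large systems appear in both sets.
    """
    ordered = sorted(cases, key=size_of)
    train, test = [], []
    for position, case in enumerate(ordered):
        if position % test_every == test_every - 1:
            test.append(case)
        else:
            train.append(case)
    return train, test

# Hypothetical toy cases with made-up sizes.
sizes = [40, 5, 300, 12, 75, 150, 8, 60, 220, 30]
cases = [{"id": i, "size": s} for i, s in enumerate(sizes)]
train, test = split_by_size(cases, size_of=lambda c: c["size"])
```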
Once training is completed, e.g., when a given number of repetitions of the training data set or a previously defined error size has been reached, the net is validated by letting the trained net predict the outcome of cases on which it has not been trained, i.e., the test set, and evaluating the accuracy of this prediction. We chose the mean absolute error (MAE) as our primary measure for prediction accuracy. However, because this measure does not cover all features of interest, we also state other measures in the results section.
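For reference, the mean absolute error is the average of the absolute differences between predicted and observed values. A minimal Python version, with made-up numbers, might look as follows.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between predictions and observed values."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical predictions and observed success values for three test cases.
mae = mean_absolute_error([0.5, -0.2, 0.9], [0.4, 0.1, 0.5])
```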
Finally, a serious challenge of data analysis using NN is that there is no algorithm for finding the best net architecture for a specific task (Sarle 1997, Reed and Marks 1999). Indeed, the task of determining the best network design for a given problem is itself an NP-complete problem (Rojas 1993).

To assess the performance of the best nets, it is useful to compare them to competing predictors. The first benchmark is blind guessing. On the interval [-1.0, 1.0], the expected average error made when blindly guessing uniformly distributed random values is exactly 2/3 (proof available on request). A second benchmark is the MAE that results when simply using the mean of the data for success in the learning set as predictor for the data in the test set. The third benchmark is a multivariate linear regression (MLR). It is adjusted to the learning data set and has to predict success in the test data set. The details of these comparisons can be found in Table 4.
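The value 2/3 for blind guessing can be checked with a short calculation. The derivation below is a sketch under one added assumption that the text does not state: the observed success value $S$ is itself uniformly distributed on $[-1, 1]$ and independent of the guess $G$.

```latex
% Expected error of a uniform guess G for a fixed observed value s in [-1, 1]:
\mathbb{E}\,|G - s|
  = \frac{1}{2}\int_{-1}^{1} |g - s| \, dg
  = \frac{1}{2}\left[\frac{(s+1)^2}{2} + \frac{(1-s)^2}{2}\right]
  = \frac{1 + s^2}{2}.

% Averaging over S uniform on [-1, 1]:
\mathbb{E}\,|G - S|
  = \frac{1}{2}\int_{-1}^{1} \frac{1 + s^2}{2} \, ds
  = \frac{1}{4}\left(2 + \frac{2}{3}\right)
  = \frac{2}{3}.
```

Under this assumption the first line also shows that the expected error of a blind guess ranges from 1/2 for cases with $s = 0$ to 1 for cases at the extremes $s = \pm 1$.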
The difference between the MLR prediction and the average-of-best-5 prediction was not very big. However, the NN reduced the prediction error (MAE) by about 18% compared to the MLR and increased R² by about 0.15.
We are confident that there is room for improvement. First, N might still be too small to get any better predictions. Second, the 24 + 1 factors might be interlinked in such a complex way that the current network architectures are too simple. We are currently working on both issues. Another method for measuring the model performance is to see how it performs in classifying the systems. Table 5 shows five categories of success of the systems. Although the mean absolute error of predictions still leaves room for improvement, the classifications obtained by our best five nets are already noteworthy. When the data on success are categorized into five discrete groups, as shown in Table 5, the nets are able to classify about 64% of all test cases correctly and are off by only one category in the remaining 36%.
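The classification check can be sketched in Python as follows. Because the category boundaries of Table 5 are not reproduced here, the sketch assumes five equal-width bins on [-1, 1]; this binning, the function names, and the toy values are hypothetical.

```python
def to_category(value, n_categories=5, low=-1.0, high=1.0):
    """Map a value in [low, high] to a category index 0 .. n_categories - 1.

    Assumption: equal-width bins; the actual boundaries may differ.
    """
    width = (high - low) / n_categories
    index = int((value - low) / width)
    return min(max(index, 0), n_categories - 1)

def category_agreement(predicted, observed):
    """Share of cases classified exactly right and share that is off by one."""
    diffs = [abs(to_category(p) - to_category(o)) for p, o in zip(predicted, observed)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    off_by_one = sum(d == 1 for d in diffs) / len(diffs)
    return exact, off_by_one

# Hypothetical predictions and observed values for four test cases.
exact, off_by_one = category_agreement([0.9, 0.1, -0.7, 0.3], [0.7, 0.3, -0.9, -0.1])
```

Reporting the off-by-one share alongside the exact share distinguishes near misses from predictions that land in an entirely wrong category.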

DISCUSSION
Until now, the complexity and idiosyncrasy of CPR problems made it impossible to generalize from one of them to others in a meaningful way. Most conventional analytical tools cannot cope with the number of factors involved or with their many nonlinear interactions. Neural networks, however, are able to overcome some obstacles toward a general quantitative model of success factors in CPR problems. Because the quality of generalization of neural networks depends critically on the number of cases, the accuracy of prediction may rise when more cases are added. At least two other data sets, consisting of 409 case studies in forestry and 123 fishery and irrigation cases, will be added to the model. Such data coming from different commons will allow us to answer the question of how general the success factors are. If individual models for different types of resources, e.g., irrigation, forestry, fishery, are more precise than one general model, despite the larger number of cases in the latter, this would clearly indicate a different set of success factors for each resource.
However, our first nets trained on irrigation data in Nepal are already able to predict values for ecological success of these systems rather accurately and better than, for example, multivariate regressions when fed data of a number of potential success factors.

CONCLUSION
The methodology exemplified might be able to cope with the real-life complexity of CPR problems. Today many attempts at CPR management fail without a clear understanding of the reasons, which makes quantitative and precise analysis a pressing cause. The benefits would be manifold. Among them are the ability to predict successes and failures and the probable results of rule changes or other policy measures. Better management of CPR problems worldwide could lessen the environmental burden by a substantial amount, and poor living conditions caused by mismanagement of CPR problems could be improved as well.
The methodology presented may result in a model that can be used as a free tool in many CPR projects. Each factor could be manipulated to simulate changes in the SES situation resulting in an immediate change in the prediction of success. The analysis itself should not take more than a few days and requires almost no expert knowledge, given that an independent data collection has taken place already. In short, our methodology may contribute to making CPR management more successful, to optimizing existing projects, or even help in salvaging failures.
Responses to this article can be read online at: http://www.ecologyandsociety.org/issues/responses.php/5202

Table 2. Synthesis of success factors; note that each success factor like "resource size" has another level (indicators) below it (not shown).

Table 3. Relevance of success factors to success.

Table 4. Comparison of goodness of fit (prediction results).

Table 5. Categorization schema of performance of common pool resource (CPR) systems.