A framework for conceptualizing and assessing the resilience of essential services produced by socio-technical systems

Essential services such as electricity are critical to human well-being and the functioning of modern society. These services are produced by complex adaptive socio-technical systems and emerge from the interplay of technical infrastructure with people and governing institutions. Ongoing global changes such as urbanization and increasing prevalence of extreme weather events are generating much interest in strategies for building the resilience of essential services. However, much of the emphasis has been on reliable and resilient technical infrastructure. This focus is insufficient; resilience also needs to be built into the human and institutional processes within which these technical systems are embedded. Here, we propose a conceptual framework, based on a complex adaptive systems perspective, that identifies four key domains that require investment to build the resilience of essential services. This framework addresses both the technical and social components of the socio-technical systems that underlie essential services and incorporates specified and general resilience considerations. The framework can be used to guide resilience assessments and to identify strategies for building resilience across different organizational levels.


INTRODUCTION
Modern society depends on a wide range of services being resilient in the face of disruption and rapid global change (Holling 2001, UNISDR 2015). These services include ecosystem services produced by social-ecological systems, as well as technologically mediated essential services such as electricity, water, and sanitation. As with ecosystem services, disruption in essential services can cause ripple effects, with considerable social consequence (Schulman et al. 2004, Rose et al. 2007, Pescaroli and Alexander 2015), and can escalate to disaster if it exceeds the ability of the affected community to cope (UNISDR 2009, 2015). Along with efforts to foster resilience of ecosystem services, building resilience of essential services is critically needed (La Porte 2006), accompanied by practical frameworks and approaches to better understand and assess the resilience of such services. Essential services are produced by complex adaptive socio-technical systems (Varga 2015), which are embedded within broader social-ecological systems (Folke 2006, STAP 2015). Essential services are coproduced through the interplay of technology and social institutions, or hard and soft infrastructure, that compose socio-technical systems. Hard infrastructure refers to physical technical assets and systems, whereas soft infrastructure refers to social systems such as institutions, users, rules, and regulations (UNESCAP 2013). Most of the current resilience emphasis around essential services focuses on development, maintenance, and protection of the hard infrastructure, rather than assurance of the service itself (Auerswald et al. 2006, La Porte 2006). Investments in hard infrastructure ought to be accompanied by investments in soft infrastructure to ensure resilient service delivery.
In the emergency preparedness and disaster management communities, it is increasingly recognized that continuity of essential services requires a focus on the broad-based resilience capabilities of communities, the private sector, and all levels of government (DHS 2010, NIAC 2010, FEMA 2015). Ensuring the resilience of electricity supply is of particular interest to government administrators (Grid Resiliency Task Force 2012, Executive Office of the President 2013, City of New York 2013, NAS 2017). Electricity supply is considered a foundational service because many other layers of critical infrastructure and the essential services derived from them (such as water supply) depend on electricity (Koester and Cohen 2012, Jeschonnek et al. 2016). Like the socio-technical systems that produce other essential services, the electricity supply system is a complex adaptive system susceptible to disruption (Amin 2015). To ensure resilience, the interlinked social and technical parts of the system continuously have to rebound from, adapt to, and transform amid the many environmental, technical, and social risk factors that can disrupt supply.
In common usage, resilience refers to the ability to bounce back or spring back into shape following a disruption. As a systems-level characteristic, resilience is an emergent property of complex adaptive systems (Cork 2011, Aldunce et al. 2015) and refers to the capacity of a system to sustain core functions in the face of disruption and change (Folke et al. 2010). Resilience can be used in either a descriptive or a normative sense. From a descriptive perspective, the concept is neutral and refers to the persistence of the core functions and identity of a system (Walker et al. 2004, Cumming et al. 2005), which can be either desirable or undesirable. Examples of undesirable resilient systems include poverty traps and organized crime (Constas 2014, Dahlberg 2015). More recently, there has been a groundswell of interest in the normative use of resilience as an approach for managing complex adaptive systems toward desirable outcomes (Seville et al. 2015, Folke 2016). From a normative perspective, resilience is not merely the ability to sustain core functions, but to sustain specific outcomes such as continued production of specific ecosystem (Folke et al. 2016) or essential services. This ability may entail bouncing back after a disruption, but could also involve systemic transformation and bouncing forward to a position better than before (Boin and Van Eeten 2013, Weichselgartner and …). Here, we propose a framework to conceptualize and assess the resilience of essential services using a complex adaptive systems perspective. For our purposes, we apply resilience normatively and define resilience of essential services as the capacity of complex adaptive socio-technical systems to sustain the production of essential services in the face of disruption and ongoing social, technological, and environmental change.
The framework we propose draws on and integrates work on resilience from several different disciplinary traditions, particularly work on social-ecological systems (Folke et al. 2016), research on the resilience of engineered systems (Madni and Jackson 2009, Park et al. 2013), and organizational resilience (Weick et al. 1999, Linnenluecke and Griffiths 2012), as well as practical policy guidance that focuses on critical social responses from community resilience (Cabinet Office 2011, NIST 2016a). We integrate these different strands of work based on a common underlying view of these problems as complex adaptive systems problems.
The framework also draws on an interdisciplinary synthesis of literature, as well as practical experience of conducting resilience assessments of electricity supply at Eskom Holdings, the South African national electrical utility. The South African experience is emblematic of the challenges facing electric utility providers, particularly in developing countries. By focusing on a clearly defined system, we aim to explore how the resilience of essential services that underpin key functions in modern societies can be enhanced. We suggest that this framework can be applied to other essential services and, with some modification, can also advance the understanding of social-ecological resilience more generally.

ELECTRICITY SUPPLY AS A COMPLEX ADAPTIVE SYSTEMS PROBLEM: THE CASE OF SOUTH AFRICA
Globally, electricity supply systems face an increase in the number and severity of large-scale emergencies, often triggered by severe weather (Abi-Samra et al. 2014, Cabinet Office 2015). In emerging economies, this trend is aggravated by rapid growth in electricity demand, posing challenges for reliable service provision and constraining opportunities for social and economic development (Bocca and Mehlum 2012). In the case of South Africa, 95% of the electricity used in the country is supplied by Eskom, a national vertically integrated generation, transmission, and distribution utility (Eskom 2016a). In a relatively short period of time, Eskom went from global power company of the year in 2001 (Khoza and Adam 2006) to no longer being able to maintain the national supply-demand balance in 2008, resulting in three weeks of nationwide rotational load shedding to deal with the shortfall (Chettiar et al. 2009). By 2014, the South African energy profile had become comparable to that of China, India, and Mexico at the time, where energy shortfalls significantly constrain economic growth to meet human development needs (Bocca and Mehlum 2012).
Eskom initiated a resilience strategy in 2008 in response to growing electricity shortfalls and to deal with the new reality of regular load shedding. Initially, the focus was only on power system resilience, but it expanded to the whole enterprise in 2013 to deal with wider business risks that were emerging. The purpose of a resilience focus is to prepare the organization to deal with "business unusual." The expanded enterprise resilience focus aims to ensure an integrated overview of risks and to facilitate an integrated emergency response capability to deal with systems-level emergencies and special events such as the FIFA World Cup and national elections (Koch et al. 2013). There is a realization that traditional reductionist approaches, widely used to manage technology in the organization, are inadequate to deal with the complexity of emerging systemic problems (Guckenheimer and Ottino 2008), particularly the low-probability high-consequence risk of blackouts that Eskom has to manage.
The dynamics of complex adaptive power systems cause the systems to drift toward a critical point at which their apparent stability can abruptly change state (Dobson et al. 2007, Viejo et al. 2015). The complex intertwining of unforeseeable coincidences may cause rapidly cascading failure in the power system and, in the worst case, result in a blackout (Bo et al. 2015), i.e., a wide-area outage of long duration (NAS 2017). A blackout, in turn, normally results in further cascading failure across other interconnected and interdependent infrastructures such as water or telecommunications (Rinaldi et al. 2001, Mukhopadhyay and Hastak 2016). Widespread blackouts are low-probability high-consequence events that often result in significant social and economic impact (Bo et al. 2015). In most developed nations with their highly interconnected grids, supply is rapidly restored after a blackout through interconnections from neighboring areas that still have power (Bo et al. 2015). However, in the case of a national blackout, none of Eskom's neighboring electricity utilities have the capacity to restart the South African power system, which highlights the importance of resilience in general, and a black-start capability in particular. Even so, a well-developed technical black-start plan is insufficient to ensure national resilience to a blackout incident; institutional arrangements and integrated response plans are required in partnership with priority national role players (such as fuel, water, telecommunications, and security) to effectively respond to, and deal with, the consequences of a national blackout.
Given the described situation, it is clear that a fundamental, deliberate, and transformative change is required within and among institutions at national, regional, and local levels to establish the necessary preparedness across multiple sectors. We draw on the emerging body of work on complex systems problems (Cilliers 2000, Westley et al. 2006, Allenby and Sarewitz 2011) that indicates that such transformative change can be facilitated by recognizing that problems such as sustaining electricity supply in the face of disruption and change are fundamentally complex, rather than mere technical problems. Contingency planning and response strategies need to be implemented. The capacity to prepare and respond in a coordinated fashion requires complex adaptive systems thinking (Cilliers 2007, Bohensky et al. 2015), which emphasizes the interlinked nature of technical and human systems, their processes of interaction, and their tendency to self-organize into different regimes or result in disorder associated with critical stability points (Folke 2006). The difference between complicated and complex adaptive systems and problems is a difference of type, not of degree (Poli 2013). It is necessary to draw a clear distinction between these types of problems because the methods and approaches for understanding and managing them differ vastly (Snowden and Boone 2007, Poli 2013; Table 1). Reductionist approaches rely on problem-solving strategies that delimit reality into smaller parts and apply methodologies that aim toward predictability and control (Ramalingam et al. 2008). Such approaches assume that the nature of the problem is complicated.

Table 1 (excerpt).
Complicated problems: The relevant systems can be controlled.
Complex problems: The relevant systems cannot be controlled; the best one can do is to influence them. These problems have to be engaged directly; one must learn to "dance with them" (Meadows 2009:70, Poli 2013).
Reductionist approaches are inadequate to address complex problems. Complex problems require ongoing engagement and adaptation because apparent solutions often give rise to new problems (Poli 2013). Complex adaptive systems thinking explicitly considers unintended consequences, the agency of people, and unpredictable novelty (Juarrero 1999, Kurtz and Snowden 2003, Allenby and Sarewitz 2011). In reality, most problem situations contain both complicated and complex phenomena. It is essential for decision makers to make sense of the problem composition so as to apply solutions compatible with the nature of the problem at hand (Snowden and Boone 2007).
The system boundaries described by Allenby and Sarewitz (2011) are a useful guide to distinguishing between complicated and complex problems in socio-technical systems. Level 1 system boundaries are defined in terms of specific technological solutions such as electrical transformers or switchgear that aim to address a particular problem. Level 1 problems generally correspond to complicated problems that focus on hard infrastructure. However, for level 1 solutions to function, they are always embedded in level 2 systems, which incorporate the wider psychological, social, and cultural contexts that are inseparable from the technology (Allenby and Sarewitz 2011). Level 2 systems are complex adaptive systems that are susceptible to nonlinear risks and catastrophic disruption. Technical components in the power system are typically analyzed at level 1, whereas the overall electricity supply system should be recognized as a level 2 complex adaptive socio-technical system. The different types of problems described in Table 1 are thus correlated with boundary definition.
Eskom recognizes resilience as a strategic imperative (Eskom 2016a). By design, Eskom has multiple layers of defence to prevent a blackout, which are actively maintained to ensure their integrity. Even though the probability of such high-consequence events is low, Eskom is committed to establishing response preparedness and employing risk reduction measures to reduce the fallout from such eventualities (Eskom 2016b).

RESILIENCE THINKING
Resilience thinking is an application of complex adaptive systems thinking that pays specific attention to enhancing resilience. Building resilience has arisen as a response to uncertainty and external risk, limited control, deep disruption, and an unpredictable future (DuPlessis VanBreda 2001, Sheffi 2005, Bhamra et al. 2011, Caldwell 2014). Resilience refers to the innate ability of complex adaptive systems to absorb disturbances or surprise and to adapt to dynamic change without losing their identity or function (Folke et al. 2002, Walker et al. 2004, Berkes 2007). The concept of resilience therefore includes interrelated aspects of persistence, adaptability, and transformability (Walker et al. 2004, Folke et al. 2010). Following this line of thinking, we define a resilient socio-technical electricity supply system from a normative perspective as one that has the emergent capability to absorb large shocks, even for low-probability high-consequence events such as a national blackout, and to continue to adapt amid ongoing changes such as climate change and urbanization while continuing to ensure reliable electricity supply in an affordable and sustainable manner. Literature on the application of resilience distinguishes between two different types of resilience that need to be established simultaneously: specified and general resilience (Folke et al. 2010, O'Connell et al. 2015a). Specified resilience refers to the resilience of a specified part of the system to identified disruptions, whereas general resilience refers to the capacity of a system to withstand all hazards, including novel and unforeseen ones, while continuing to provide essential functions (Walker et al. 2009; Table 2). General resilience is a generic capability to cope with uncertainty and surprise and to endure novelty and instability, including multiple shocks and cascading failure (Folke et al. 2010, Walker and Salt 2012). General resilience emerges when predetermined plans are inadequate to deal with the situation at hand, and new capabilities are dynamically developed to respond (Lee et al. 2013). Resilience literature cautions that resilience investments have to be balanced across specified and general resilience because effort channeled into developing only one kind of resilience may reduce the other kind (Folke et al. 2010, Resilience Alliance 2010, Cork 2011).

Table 2.
Specified resilience: The ability to persist within a stability zone (Folke et al. 2010) through anticipation strategies, being prepared, and applying prevention (Comfort et al. 2001).
General resilience: An intangible emergent capacity for adaptation and transformation (Folke et al. 2010) across multiple equilibria (North 1993, Caldwell 2014).
How to build it
Specified resilience: Can be established by following best practice, through managing foreseeable risks (Garred 2013), and by how infrastructure is designed, built, and maintained (NIAC 2010).
General resilience: Is nurtured through the capacity for abductive thinking and sense making (Grøtan 2013) and evolutionary self-organization (Allan and Bryant 2014, Scolobig et al. 2015, De Coning 2016).
How to sustain it
Specified resilience: Employs single-loop learning and aims to strengthen negative feedback loops (Antonacopoulou and Chiva, unpublished manuscript †): to return conditions toward a predetermined target, to remove deviations, and to keep operations within deterministic boundaries (Weick and Sutcliffe 2007).
General resilience: Employs double-loop learning and aims to strengthen positive feedback loops (Antonacopoulou and Chiva, unpublished manuscript †): to self-reinforce, amplify, enhance, and stimulate behaviors that enhance resilience, which includes modifying the rules that drive behavior (Holman 2010).
† https://warwick.ac.uk/fac/soc/wbs/conf/olkc/archive/oklc6/papers/antonacopoulou__chiva.pdf
Here, we apply the bifocal lens of complicated and complex problems to clarify the operational implications for building specified and general resilience. To establish specified resilience, decomposition of the system and its environment is required to determine "what" internal parts should be resilient, and against "what" external aspects of the environment this resilience is required (Carpenter et al. 2001). Although this reductionist approach is pragmatic, it employs a complicated approach to a complex system. Resilience associated with technical components can be engineered in a complicated fashion using classical reliability-oriented design (Holling 1996). Experts can follow best practice or good practice (Hummelbrunner and Jones 2013a) to establish resilience of specific parts of the system to specified shocks. However, these level 1 components can collapse when critical thresholds are exceeded in the level 2 systems context in which they are embedded (Pourbeik et al. 2006, Simone 2014). General resilience therefore needs to be established across multiple facets of the level 2 system and requires resilience practitioners to embrace complexity-based approaches. A key capability that enables leaders to make sense of inherent complexity and ambiguity is sensemaking (Weick 1995), i.e., the ability to comprehend, understand, and explain what is going on (Ancona 2012). Sensemaking is an integral part of learning and consists of an ongoing action-oriented cycle of acquisition, reflection, and action that people go through to integrate experiences into their understanding of the world to inform action (Kolko 2010). Sensemaking shapes organizational behavior, i.e., how the organization makes sense of where it is and what is going on, and directly affects how the agents in the system adapt and self-organize, which, in turn, influences how the system develops (Weick 1995).
Appropriate collective sensemaking is crucial to ensure resilient service delivery because it directly affects general resilience features through the effectiveness of organizational response to crisis or disruption (Casto 2014).

Resilience assessment
Along with the rapid rise in interest in fostering resilience, there has been great demand for improved approaches to assess resilience (Quinlan et al. 2015). Assessments can be distinguished based on purpose (why), target audience (for whom), level of assessment (of whom), and object of assessment (what; Terenzini 1989, Carpenter et al. 2001, Quinlan et al. 2015). Many different resilience assessment methods exist. Several approaches highlight the need for participatory approaches (Almedom et al. 2007, Pasteur 2011, O'Connell et al. 2015b, Quinlan et al. 2015). Other resilience assessment approaches distinguish between types of resilience, evaluate the actual resilience displayed in past incidents, or comprise indicators of adaptive management, adaptive governance, or transformative capacity (Cork 2011, Walker and Salt 2012, O'Connell et al. 2015a). A stated objective of many resilience assessments is to understand how to build resilience of some desired outcome. Drawing on the literature from educational assessments, we distinguish between "summative assessments" that primarily aim to evaluate current levels of resilience for external reporting and benchmarking, and "formative assessments" that aim to build resilience through the assessment process itself (Table 3). Although these two objectives are not mutually exclusive, clarification of the primary purpose of a particular resilience assessment exercise can help in selecting a suitable approach. Summative assessments seek to standardize indicators for the benefit of comparison and to aggregate toward national or regional reporting of resilience (Stephenson 2010, O'Connell et al. 2015a, RESILENS 2016). Formative assessments comprise an ongoing process, not a periodic product (Black et al. 2003, Nicol and Macfarlane-Dick 2006).
Such assessments entail a systematic and ongoing internal process of seeking and interpreting evidence, to participatively make sense of the current levels of system resilience, and to garner agreement to improve attainment of resilience outcomes. Formative assessments center on critical conversations among key actors in the system to enable collective sensemaking, promote commitment to resilience goals, and adaptively stimulate the emergence of resilience throughout the system. Care should be taken that the approach used does not undermine the intended outcome. When assessments for enhancing resilience are conducted as punitive compliance audits, they can lead to unintended consequences and erode resilience instead of building it (Dekker and Breakey 2016).

Table 3.
Formative assessment | Summative assessment
Can be an ongoing process | Can be scheduled periodically
"For" a resilience outcome | "Of" resilience
To facilitate a bottom-up dialogue among actors in the system | Against standardized indicators decided top-down
To diagnose where the system is in its levels of resilience | For the purpose of producing a report for a third party
To agree where resilience should be strengthened | To give an account of what has been achieved
Through collective action toward shared resilience goals | For comparison, aggregation, or benchmarking
Formative resilience assessment processes merge into a transformative assess-and-build cycle. Such assessments require direct engagement with the complex adaptive system to learn about the nature of the complex dynamics (Quinlan et al. 2015). Key actors probe the system interactively to make sense of dynamically changing feedback mechanisms, constraints, and patterns of emergence (Juarrero 1999, Walker and Salt 2006). Attention is paid to: what builds, maintains, and breaks down resilience; where undesirable resilience should be disrupted; and where desirable resilience can be enhanced (Cork 2009, Quinlan et al. 2015). The assessor is part of the complex adaptive system, and probing can affect emergence of the system in unpredictable ways. Therefore, all probes should be carefully designed as interventions to enhance resilience (Holman 2010), and every intervention to build resilience can be used as a probe to better understand the system and its resilience dynamics. This ongoing process can adaptively transform the system's resilience over time.

A FRAMEWORK FOR CONCEPTUALIZING THE RESILIENCE OF ESSENTIAL SERVICES
Building on the emerging theoretical ideas outlined above, of resilience as the emergent outcome of complex adaptive systems, and practical experiences in operationalizing resilience thinking and assessments in the context of electricity supply in South Africa, we present a framework for conceptualizing different aspects of resilience in complex adaptive socio-technical systems.
To conceptualize the resilience of essential services, we juxtapose the types of resilience (specified and general) and focus of resilience investment (technical or social; Fig. 1). Although the social and technical components are interdependent, the distinction here is based on the content (Rosen 2000) and the focus of the resilience strategy (NIAC 2010). The resulting four quadrants represent different resilience domains that can serve as a guide for how to assess and build resilience of essential services:
• The "specified technical resilience" quadrant represents areas where resilience to specific risks (e.g., storms) is built into technical infrastructure to ensure that it is adequate, reliable, and secure. This quadrant focuses on building robustness into level 1 systems.
• The "specified social resilience" quadrant represents areas where resilience to specific risks (e.g., disruption to critical business processes) is established through processes and institutions in the social domain. This quadrant focuses on building specific skills, response capabilities, and plans within level 2 systems.
• The "general technical resilience" quadrant represents areas where resilience to novel and unknown risks is established through network topology or adaptive technologies that offer systems-level flexibility to enable an agile response across the system in dealing with uncertainty. This quadrant focuses on connectivity and structure of level 2 systems to ensure systems-level flexibility.
• The "general social resilience" quadrant represents areas where resilience to novel and unknown risks is established through people, processes, and institutions. This quadrant focuses on collective human agency, agility, and volition in level 2 systems.
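For readers who want to use the four quadrants as a checklist in an assessment tool, the structure described above can be written down as a small lookup table. The following is only a minimal sketch: the names `ResilienceType`, `Focus`, `Quadrant`, `QUADRANTS`, and `quadrant_for` are hypothetical, and the `emphasis` strings are paraphrases of the quadrant descriptions rather than definitions from the framework.

```python
from dataclasses import dataclass
from enum import Enum


class ResilienceType(Enum):
    SPECIFIED = "specified"  # resilience to identified, specific risks
    GENERAL = "general"      # resilience to novel and unknown risks


class Focus(Enum):
    TECHNICAL = "technical"  # hard infrastructure
    SOCIAL = "social"        # people, processes, and institutions


@dataclass(frozen=True)
class Quadrant:
    resilience_type: ResilienceType
    focus: Focus
    system_level: int  # level 1 or level 2, after Allenby and Sarewitz (2011)
    emphasis: str      # paraphrase of the quadrant description (assumption)

    @property
    def name(self) -> str:
        return f"{self.resilience_type.value} {self.focus.value} resilience"


# Hypothetical lookup table keyed by (resilience type, investment focus).
QUADRANTS = {
    (q.resilience_type, q.focus): q
    for q in (
        Quadrant(ResilienceType.SPECIFIED, Focus.TECHNICAL, 1,
                 "robustness of technical infrastructure to specific risks"),
        Quadrant(ResilienceType.SPECIFIED, Focus.SOCIAL, 2,
                 "specific skills, response capabilities, and plans"),
        Quadrant(ResilienceType.GENERAL, Focus.TECHNICAL, 2,
                 "connectivity, structure, and systems-level flexibility"),
        Quadrant(ResilienceType.GENERAL, Focus.SOCIAL, 2,
                 "collective human agency, agility, and volition"),
    )
}


def quadrant_for(resilience_type: ResilienceType, focus: Focus) -> Quadrant:
    """Return the quadrant for a given resilience type and investment focus."""
    return QUADRANTS[(resilience_type, focus)]


if __name__ == "__main__":
    for quadrant in QUADRANTS.values():
        print(f"{quadrant.name}: level {quadrant.system_level}, {quadrant.emphasis}")
```

Keying the table on the two dimensions, rather than on four free-standing labels, mirrors the way the framework juxtaposes type of resilience with focus of investment.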

Fig. 1. A conceptual framework for building and assessing resilience of essential services produced by socio-technical systems.

Differentiated resilience roles
These different forms of resilience can be cultivated at different organizational levels (operational, tactical, and strategic). The organization has been conceptualized as a layered triangle, with the operations layer being the largest bottom stratum, the tactical layer representing the middle level, and the top strategic layer representing the executive level (Anthony 1988, Mumford et al., Ho 2015). The different interrelated aspects of resilience (persistence, adaptability, and transformability) can occur at multiple hierarchical levels in organizations and interact across temporal, spatial, and hierarchical scales. To foster resilient essential services, we argue that the primary role of operational leadership is to foster persistence of core operational functions, the role of tactical leadership is to develop adaptability, and the role of strategic leadership is to transform the organization in a timely manner to survive and thrive amid disruptive change (Fig. 2, Table 4). We also argue that specified resilience is crucial in the lower strata of organizations, whereas the significance of general resilience increases higher up.

Table 4.
Operational leadership: Leadership fosters persistence through operational control in daily operations to ensure that the system has the day-to-day ability to absorb a magnitude of disturbances and to anchor essential services with minimum disruption.
Tactical leadership: Leadership establishes integrated response capabilities, adaptability through management control, continuous improvement, and scenario-based exercises to enable the organization to adaptively manage risk, to bounce back better, and to embrace opportunities to bounce forward.
Strategic leadership: Leadership takes a long-term perspective to transform the organization in a timely manner through emergent strategic planning to survive and thrive amid uncertainty while navigating disruptive change, to intentionally transform its identity toward a more sustainable development trajectory.
Operational leaders need to be aware of external threats and mindful of internal vulnerabilities to persist. In contrast, strategic leaders need to be aware of external opportunities and mindful of internal well-being of employees to transform proactively.

APPLYING THE FRAMEWORK TO BUILD AND ASSESS RESILIENCE OF ELECTRICITY SUPPLY
The framework introduced above can be used to identify different strategies and interventions to build the resilience of essential services in different parts of socio-technical supply systems.
Applying the framework at different organizational levels can facilitate contextually appropriate assessments that help develop a deeper and shared understanding of the complex adaptive dynamics of a system in relation to the larger context in which it is embedded, a key objective of many resilience assessments (Quinlan et al. 2015). To achieve this objective, we argue that the assessment process should incorporate key resilience-building principles of facilitating broad participation, encouraging learning, and facilitating a deeper understanding of complex dynamics in the socio-technical system, while building trust and social capital.
In the following sections, we discuss how the framework can be applied specifically in the context of socio-technical electricity supply systems to build and assess resilience. The four resilience quadrants can be used as a guideline for the differentiated assessment of respective types of resilience at different organizational levels. We also suggest indicators of quadrant-specific resilience applicable to specific organizational levels (Table 5).

Specified technical resilience
Specified technical resilience represents areas where investments can be made in identified infrastructure and assets to ensure that they can withstand specified threats, in answer to "resilience of what and to what?" (Carpenter et al. 2001, Quinlan et al. 2015).
Although the timing and severity of these specified threats may be unknown, their potential future occurrence can be calculated probabilistically (O'Connell et al. 2015a). This quadrant draws on what Holling (1996) described as engineering resilience, or what is known in the electric utility world as utility resilience, reliability standards, electric power infrastructure resilience, or grid resilience (Madni and Jackson 2009, NIAC 2009, Park et al. 2013, DOE 2014, NERC 2015). The specified technical resilience domain represents level 1 technology solutions that enhance survivability and robustness (Pavard et al. 2006, Madni and Jackson 2009, Dahlberg 2015), following the laws of physics and using reductionist approaches.
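As a simple illustration of what calculating the occurrence of a specified threat probabilistically can look like, the sketch below estimates the chance that a threat of a given annual probability occurs at least once over a planning horizon. It is a minimal sketch under strong assumptions that are ours, not the source's: years are treated as independent, the annual probability is constant, and the function name `prob_at_least_one` and the example figures (a 1-in-100-year storm, a 50-year asset life) are hypothetical.

```python
def prob_at_least_one(annual_prob: float, years: int) -> float:
    """Probability of at least one occurrence over `years` years.

    Assumes independent years with a constant annual probability, so the
    chance of no occurrence at all is (1 - annual_prob) ** years.
    """
    if not 0.0 <= annual_prob <= 1.0:
        raise ValueError("annual_prob must lie between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1.0 - (1.0 - annual_prob) ** years


if __name__ == "__main__":
    # Illustrative figures only: a 1-in-100-year storm over a 50-year asset life.
    print(f"{prob_at_least_one(0.01, 50):.3f}")
```

With these illustrative inputs the result is about 0.395, which shows why a threat that is rare in any single year can still be likely over the lifetime of long-lived infrastructure.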

Building specified technical resilience
Given adequate resources, infrastructure resilience can be achieved to withstand anticipated hazards through good practice, which includes intelligent engineering design that implements adequate margins of safety, quality construction, and sufficient maintenance (UNESCAP 2013). In a utility such as Eskom, this translates into applying engineering standards (for example, reliability criteria, quality controls, and routine inspections). Consideration should be given to fail-to-safe design philosophies (i.e., designs in which a component reverts to a safe condition if it fails). Specified technical resilience can also be enhanced through a wider distribution of resources to increase redundancy. An example of increasing diversity and redundancy in electricity supply is the use of microgrids around critical facilities or the placement of critical spares such as spare towers or mobile transformers at select locations throughout the grid to speed up emergency response.
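One simple way to reason about redundancy against a specified threat is a single-contingency check, often called an N-1 criterion in power system planning. The sketch below tests whether demand can still be met after the loss of the largest single unit. The criterion is introduced here only as an illustration and is not named in the sources cited above; the function name and the capacity and demand figures are hypothetical.

```python
from typing import Sequence


def meets_n_minus_1(capacities_mw: Sequence[float], demand_mw: float) -> bool:
    """Return True if demand is still covered after losing the largest unit.

    `capacities_mw` lists the capacity of each unit (for example,
    transformers at a substation); all values are illustrative.
    """
    if not capacities_mw:
        return False
    # Worst single loss: remove the largest unit and sum what remains.
    remaining = sum(capacities_mw) - max(capacities_mw)
    return remaining >= demand_mw


# Hypothetical substation with three 100 MW transformers and 180 MW demand.
print(meets_n_minus_1([100, 100, 100], 180))  # True: 200 MW remains
# Two unequal units: losing the 150 MW unit leaves only 100 MW.
print(meets_n_minus_1([150, 100], 120))  # False
```

A check of this kind covers only the single failures it enumerates, which is why it sits in the specified rather than the general technical resilience quadrant.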

Assessing specified technical resilience
Specified technical resilience assessments can consist of quantitative measures (Quinlan et al. 2015), benchmarks, tests, and compliance with engineering standards and controls applied throughout the asset life cycle. Reliability assessments contribute toward technical resilience, but reliability is not enough to ensure resilience to low-probability high-consequence events (Stockton 2014, Panteli and Mancarella 2015). Because of an increase in severe weather events due to climate change, the resilience of technologies already deployed should be monitored (Savonis et al. 2014) to harden or reinforce existing infrastructure and modernize aging infrastructure to withstand severe climate events (Panteli and Mancarella 2015), and reliability design criteria for infrastructure should be revised to cater for new extremes.

Ecology and Society 23(2): 12 https://www.ecologyandsociety.org/vol23/iss2/art12/

Table 5 (partial). Indicators of quadrant-specific resilience.
• Competent in execution of standard operating procedures, emergency roles and responsibilities, ability to execute preapproved response plans, and ability to participate effectively in simulation exercises (Wybo 2008)
• Competent in semistructured decisions and ensuring efficient and effective use of resources through business planning, logistics coordination, and operational improvements †
• Contingency arrangements, response plans, and risk reduction strategies are systematically reviewed and adaptively revised to incorporate learning (Saurin et al. 2013)
• Response structures effectively integrate across functions
• Competent in unstructured decisions that are complex, ambiguous, and far-reaching in scope, entail high levels of uncertainty, and often pertain to nonlinear risks in the external environment †
• Commitment to resilience through visible leadership in good-practice disciplines such as emergency preparedness and business continuity management
• Ownership of contingency arrangements, knowing and testing established plans, and actively participating in emergency simulation exercises
• Ability to anticipate and avoid "foreseeable, predictable, avoidable surprises" ‡

General technical resilience
• Able to operate adaptive technology under pressure and maintain back-up and contingent systems components
• Technical capabilities that allow operational flexibility often beyond the infrastructure itself, e.g., demand response contracts
• Review asset condition monitoring practices and test results of deployed technologies that provide adaptive capacity and strengthen systems flexibility, e.g., unit islanding schemes and black-start tests performed
• Consider technology solutions beyond the infrastructure system
• Proactive investment in systems flexibility (in electricity supply, these include smart metering, smart grid, containerized mobile substations, demand-side products, and supply-side mix)

General social resilience
• Monitor whether people feel empowered to act in the interest of safety and resilience if contrary to what is expected
• Able to follow intuition based on deep experience in situations that necessitate that rules be broken
• During extreme events, be comfortable to apply an incident command system to perform emergency operations, even under great pressure
• Employ fail-to-safe scenarios in emergency exercises that stretch people beyond the plan
• Able to network and mobilize support through strong social networks, third-party agreements, and memorandums of understanding that have been established
• Monitor for signs of restorative or retributive justice exercised in supervision
• Identify heuristics used on the frontline, verify the validity to formalize and spread guiding heuristics to be used in crises
• During extreme events, be comfortable to coordinate planning, be able to integrate situational awareness during the incident to provide a common operational picture of unfolding events, execute tactical command, mobilize resources, and coordinate logistics to support operations
• Actively build a culture of resilience and safety, with restorative justice in word and deed; the ability to anticipate and avoid predictable surprises ‡
• Evidence that they value and actively build social and psychological capital in their networks and through their leadership, practice adaptive management, and encourage decentralized self-organization during disruption (Jones 2011, Pereira and Ruysenaar 2012, Everly et al. 2013)
• Strengthen external and internal connections in functions, across disciplines, and with other sectors (Stephenson 2010)
• During extreme events, be comfortable to fulfill the incident commander role, be able to see the big picture, prioritize objectives, take decisions in spite of incomplete information, and recognize when a phase change is evident or a regime shift has taken place
† See Anthony (1988), Mumford et al. (2007), Ho (2015).
‡ See Bazerman and Watkins (2008).

When infrastructure is damaged in disasters (for example, due to severe weather), the global Sendai Framework for Disaster Risk Reduction suggests that asset owners consider the option to build back better (UNISDR 2015) to enable "bouncing forward" (Kelman et al. 2015:22). In addition, adaptive assessment approaches can be employed to verify the reliability and resilience of current infrastructure in relation to the increased probability and intensity of severe weather events. A risk assessment of climate-resilient infrastructure can identify assets vulnerable to inundation or structural failure to inform an infrastructure resilience investment strategy for disaster risk reduction (NDMC 2005). Within Eskom, the systematic application of this approach is prescribed in the disaster management strategy in the form of disaster risk assessments and disaster risk reduction. This process demonstrates the cyclical nature of assessing resilience to build resilience.

Specified social resilience
Specified social resilience entails specific investments in people and processes to ensure that they can maintain the continuity of critical functions when subjected to identified threats. This quadrant draws on the management disciplines of emergency management, crisis management, business continuity management, and safety management, as well as literature from the fields of organizational resilience, climate resilience, and disaster management (Linnenluecke and Griffiths 2012, Miao et al. 2013, Mendonça and Wallace 2015). The adequacy of people's technical skills draws on the traditional reductionist approaches of socio-technical systems thinking and human-machine interface design (Dekker 2005, Qureshi 2007, Klein 2008). To ensure safety in high-risk operations, the literature on high-reliability organizations highlights cultivating resilience mindsets (Weick et al. 1999, Schulman et al. 2004, Lekka 2011).

Building specified social resilience
Specified social resilience can be built through the adoption of established disciplines of good practice (BSI 2014). The Eskom Resilience Programme is based on the adoption of emergency management, business continuity management, and disaster management at different scales across the organization, using risk management as a common basis, and incident management integrated at the time of response across functional and geographic boundaries (Koch et al. 2013). Through the adoption of these management systems, response preparedness and contingency arrangements are formally established. While these good-practice guidelines are aimed at specific response capabilities, the process can also contribute to general social resilience when people synthesize the wider context and recognize the purpose of these processes.
To develop the cognitive ability to deal with the disruption of extreme events, an effective response capability can be developed, but there is no substitute for experience (Cilliers 2000, Casto 2014, Doyle et al. 2015). Operators need the ability to recognize system failure conditions and arrest the collapse of technical infrastructure systems. Because real resilience tests seldom occur, this experience can be built up through being exposed to stretching scenarios in simulation exercises (Wybo 2008, Koch et al. 2013, Kellett and Peters 2014). The apprentice program for a new system operator to autonomously man a desk in Eskom National Control lasts longer than a decade and includes extensive time on the simulator. Participation in emergency exercises and simulations is vital to build and assess resilience (Wybo 2008).
Continuous learning is a vital resilience-enhancing principle. While incident investigations assess root causes, they also propose preventive measures. Collectively, these findings can be useful in facilitating adaptation requirements that build specified resilience. Highly reliable organizations cultivate collective mindfulness that pays attention to small signals, for example, when incidents result in responses at a systemic level that are outside of the expected norms (Weick et al. 1999). Such organizations learn from their and others' mistakes to "fail forward." At a wider scale, specified social resilience can be enhanced by changing the rules of the game such as by redesigning the regulatory framework to support resilience (NIAC 2010, Keogh and Cody 2013), increasing the range of options (e.g., having critical load specifications for the utility or diversifying the energy options for customers), and increasing the size of buffers through energy demand management programs.

Assessing specified social resilience
Specified social resilience assessments can entail a verification of established preparedness against predefined objectives in the form of authorized contingency arrangements, response and recovery plans, and standard operating procedures. Such assessments can be done based on the guidelines of good-practice disciplines such as emergency preparedness, business continuity management, and disaster management. Various indicators of specified social capabilities have been recommended to enable repeatable and comparable resilience assessments (McManus et al. 2007, Stephenson 2010, Lee et al. 2013, Matzenberger et al. 2015). Within Eskom, divisional and provincial progress is monitored against key deliverables as part of an enterprise resilience program. The role of exercises in specified social resilience assessments is to test execution against these predefined plans and to verify the effectiveness of the preparedness at a disaggregated level in organizations. Such integrated provincial and national exercises are conducted annually in Eskom.

General technical resilience
General technical resilience refers to the generic ability of manmade systems to withstand any threat or disruption amid the complexity of the level 2 systems in which they are embedded. This quadrant draws on network topology, resilience engineering, systems resilience, systems of systems, and critical infrastructure systems literature (Hollnagel et al. 2006, Janssen et al. 2006, Dekker et al. 2008, McDaniels et al. 2008, Gopalakrishnan and Peeta 2010, Stockton 2014, Amin 2015, Gao et al. 2016). The field of resilience engineering should be distinguished from engineering resilience described by Holling (1996). Resilience engineering applies a complexity perspective to the safety of manmade systems by ensuring that the overall socio-technical system has the capacity to withstand a threat, the flexibility to restructure itself in the face of a threat, the tolerance to degrade gracefully following an encounter with a threat, and the cohesion to operate before, during, and after an encounter with a threat (Dekker et al. 2008, Jackson 2008).

Building general technical resilience
Building general technical resilience requires increasing systems-level flexibility that allows bending rather than breaking (Longstaff et al. 2014, Dahlberg 2015). It entails optimizing network topology for resilience to maintain connectivity amid disruption, although there can be a trade-off with network efficiency (Gutfraind 2012, Gao et al. 2016). General technical resilience can be strengthened through technology that enables emergent and adaptive approaches that support novel self-service capabilities through, for example, built-in fail-to-safe modes and just-in-case contingency capacities that accommodate systems failure and manage failure and recovery (Park et al. 2013, Seville et al. 2015). Measures that increase system adaptation under system failure conditions include systems-level flexibility, increased observability and controllability, permeable systems boundaries that are less brittle under pressure (Rumbaitis del Rio 2015), and tools that support rapid response and recovery (Schneider and Somers 2006, Francis and Bekera 2014, Panteli and Mancarella 2015). By extrapolating from resilience in social-ecological systems, general technical resilience can be enhanced by paying attention to energy flows, systems-level feedback loops, slow variables, thresholds, and interdependencies in the system. In the electricity industry, general technical resilience is a key consideration in the focus on smart grid technology. For example, smart metering enables connectivity with improved information flow, controllability, and dynamic reconfigurability of the system; self-healing networks enable technical systems to self-organize following disruption; and microgrids enable modularity, diversity, and redundancy (Lacey 2014, Ye 2014, Zarakas et al. 2014).
Regulatory requirements that enable the flexible management of real-time electricity demand reduction in the event of a range of scenarios in South Africa include the establishment of critical and essential load requirements as well as interruptible load contracts (SABS 2010). General technical resilience can also be built into communities, for example, by diversifying energy options such as solar-powered traffic lights to prevent gridlock when power supply fails and through the use of peak-day pricing, stimulating energy efficiency that improves peak demand reduction and contributes to overall systems efficiency.

Assessing general technical resilience
Assessments of general technical resilience need to appraise the critical infrastructure system by evaluating the flexibility of the overall system when under strain or under failure conditions that may not yet be apparent. Metrics are available for the resilience of complex networks based on network topology and system dynamics (Zhao et al. 2011, Gao et al. 2016). Indicators of general technical resilience identified for socio-technical systems include safety margins, buffers, and levels of redundancy built into the design and operations of the system (Madni and Jackson 2009). Potential indicators, inferred from social-ecological systems, include systems-level connectivity and barriers. Drawing on Cork's (2011) work on resilient ecosystems, general resilience indicators applicable to assessment of technical systems include: modularity in the connections of components in the network to ensure that the overall system continues to function even if one part of the system has collapsed (referred to as redundancy and diversity by Woods [2005]); tight feedback mechanisms through which information about change is gathered and transmitted through the system (referred to as observability by Savulescu [2014]) to ensure adequate, timely, and scale-appropriate response (referred to as controllability by Panteli and Mancarella [2015]); and levels of just-in-case economic and system reserves that can be drawn from if something untoward happens (Seville et al. 2015).
The cost of general technical resilience investments is high, and there is no certainty about when it is enough. We therefore propose balancing investments in this quadrant with resilience investments in general social resilience because the uniquely human strength to adjust and improvise enhances the adaptability of complex level 2 socio-technical systems (Dekker 2005, Heese et al. 2014).
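To make the idea of a topology-based indicator concrete, the sketch below computes one simple and widely used measure: the fraction of nodes that remain in the largest connected component after some nodes are removed. This is offered as an illustration only and is not the specific metric of Zhao et al. (2011) or Gao et al. (2016); the function name and the two toy networks are assumptions made for this example.

```python
from collections import deque
from typing import Hashable, Iterable, Tuple


def largest_component_fraction(
    nodes: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
    removed: Iterable[Hashable] = (),
) -> float:
    """Share of the original nodes still in the largest connected component."""
    all_nodes = set(nodes)
    if not all_nodes:
        return 0.0
    alive = all_nodes - set(removed)
    # Build adjacency lists that ignore edges touching removed nodes.
    adjacency = {node: set() for node in alive}
    for a, b in edges:
        if a in alive and b in alive:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen = set()
    largest = 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        largest = max(largest, size)
    return largest / len(all_nodes)


# Toy networks with six nodes each (hypothetical, for illustration).
ring = [(i, (i + 1) % 6) for i in range(6)]   # every node has two neighbours
star = [(0, leaf) for leaf in range(1, 6)]    # node 0 is a central hub

print(f"ring, node 0 lost: {largest_component_fraction(range(6), ring, [0]):.2f}")
print(f"star, hub lost:    {largest_component_fraction(range(6), star, [0]):.2f}")
```

In this toy comparison the ring keeps five of its six nodes connected after a single loss, whereas the star fragments completely when its hub fails, which illustrates how modularity and redundancy in connections can show up in a topological indicator.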

General social resilience
General social resilience refers to investments in people and processes to ensure that the overall socio-technical system has continuity and a general ability to cope with dynamic change in the face of novel and unanticipated disruptions. This quadrant focuses on learning to adapt to change, preparing the system for emergent self-organization, and using complexity leadership thinking to renew the system should large shocks occur (Comfort et al. 2001, Marion and Uhl-Bien 2001, Walker et al. 2002, Kaufmann 2013). This quadrant draws on psychology, behavioral and social sciences, community resilience literature (DuPlessis VanBreda 2001, Youssef and Luthans 2007, Armitage et al. 2012, Carpenter et al. 2012), the fields of ergonomics and human factors (Qureshi 2007, Klein 2008, Dekker 2012, NIST 2016b), as well as the side of resilience engineering that helps people who operate within complex socio-technical systems to cope with complexity under pressure and endure (Hollnagel et al. 2006, Righi et al. 2015).

Building general social resilience
Eskom has identified five generic social capabilities of a resilient essential service system, namely: (1) anticipate, identify, and adapt rapidly to threats, vulnerabilities, and opportunities arising from changes in the internal and external environment; (2) operate at elevated levels of stress without failure for extended periods of time; (3) respond rapidly to a shock to contain the impact (severity and duration) of the event or threat; (4) recover rapidly in a coordinated manner; and (5) deliberately evolve to a higher state of resilience in response to changes in the environment by implementing learning from near misses and incidents (Koch et al. 2013). These general social resilience capabilities can be nurtured through investment in social, cultural, and educational competencies (PwC 2013).
An organizational culture of resilience can be fostered through behaviors that help employees to be agile and adaptive in the face of disruption and change (Luthans et al. 2006, Everly et al. 2013). Organizations can encourage purposive self-organization (Pavard et al. 2006, Shaw et al. 2014, De Coning 2016). For instance, a standard incident command system offers a flexible and highly adaptive management system that enables dynamic self-organization, yet ensures coordination toward common incident objectives (Maitlis and Christianson 2014). Empowering leadership that explicitly gives people permission to act in a high-trust environment (Jones 2011) makes space for personal commitment that unlocks determination and willpower (Conway et al. 1974) and can contribute significantly to resilient organizational response to disruption (Nguyen et al. 2016).

Assessing general social resilience
Sense of coherence has arisen as a significant indicator of individual and societal resilience (DuPlessis VanBreda 2001, Almedom et al. 2007, Overland 2011). It refers to how people make sense of everyday reality and whether they view life and the world as comprehensible, manageable, and meaningful (Lindström and Eriksson 2006, Almedom et al. 2007). A healthy sense of coherence provides the ability to cope with stressful situations (Eriksson and Lindström 2005); contributes to preventive, protective, and restorative capacity in people subjected to disruption; and influences survival and recovery (DuPlessis VanBreda 2001, Overland 2011). Furthermore, cultivating a restorative safety culture that is just (rather than retributive) significantly contributes to resilience because it enables an organization to learn from mistakes rather than focusing on attributing blame, which can result in covering up incidents or tampering with evidence (Dekker and Breakey 2016). Effective learning processes can be facilitated through adaptive management (Hummelbrunner and Jones 2013b) and adaptive governance systems (Folke et al. 2005, Garschagen 2013, Seeliger and Turok 2014).
The general social resilience quadrant represents a highly sought-after resilience advantage but is the most difficult to establish or assess. Assessments of general social resilience require sense making that engages with contextual complexity. General resilience assessment indicators adapted from Cork (2011) include monitoring for change in: (1) levels of openness in the system for the movement of people and ideas into, through, and out of the system; (2) levels of social reserves; and (3) levels of social and relational capital such as leadership, networks, community, and trust exhibited in the system (Pereira and Ruysenaar 2012). General social resilience can also potentially be assessed by measuring and monitoring collective sense of coherence (Ghoshal and Bruch 2003, Lindström and Eriksson 2006); evaluating the presence and effectiveness of the seven generic principles proposed by Biggs et al. (2015); and evaluating the nature of the culture, informal institutions, and heuristics used to make judgements under uncertainty (Tversky and Kahneman 1974, North 1991, Pereira and Ruysenaar 2012).

CONCLUSION
The resilience of technologically mediated essential services is critical to human well-being. These essential services are produced by complex adaptive socio-technical systems that consist of layers of critical infrastructure embedded within people and processes in organizations responsible for delivering these services. Here, we make a novel contribution by conceptualizing the resilience of essential services in terms of both specified parts and the whole of the complex adaptive socio-technical system that produces essential services. The framework we propose juxtaposes and distinguishes between specified and general resilience investments in (1) people and institutions as a social infrastructure investment, and (2) infrastructure and assets as a technology infrastructure investment (Fig. 1). This four-quadrant framework provides a guide to a differentiated but integrated set of resilience strategies and assessment indicators that can be applied across different organizational levels.
We suggest that all four quadrants of the proposed framework should be applied at all organizational levels. However, the relative importance of specified and general resilience varies across these levels: specified resilience is more pertinent at the operational level, whereas general resilience is more pertinent at the strategic level (Fig. 2). This difference partly explains why reductionist approaches have been dominant in considering resilience of infrastructure systems because the emphasis is on continuity of technical operations amid disruption. However, as the concept of resilience thinking matures in essential service provision, we expect that complex adaptive systems thinking will increasingly permeate resilience practice. All four dimensions of resilience are important, but general social resilience in essential service systems in particular has generally been neglected.
Specified resilience can be built in a linear fashion based on good practice, but general resilience needs to be built in an emergent fashion, drawing on approaches from complex adaptive systems thinking. Technological resilience investments generally reduce vulnerability and mitigate failure, whereas social resilience investments increase available options and enhance collective adaptability. Both forms of resilience are necessary to safeguard essential services against systems failure. Both reductionist and complexity-based approaches to resilience add value and should be employed in a complementary, rather than competitive or exclusive, fashion. When either approach is used exclusively, it might erode resilience.
We argue that formative resilience assessments can be conducted for building resilience of essential services based on social and technical indicators of specified and general resilience. To stimulate the emergence of social resilience across the system, a key aspect of formative resilience assessments is identifying and conducting critical conversations at different organizational levels. By stimulating appropriate discussions at multiple levels, resilience assessments can promote adaptation and transformation of the system and stimulate the emergence of resilience across the system.
More work is required to understand the options to assess and build resilience of socio-technical systems and, in particular, the social dynamics required to ensure resilient essential service delivery. Humans can be both the weakest link and the strongest resource to ensure resilience of essential services. More research is required on how to build a culture of resilience in key service providers and to develop and understand techniques that foster social resilience. Although we have focused on the case of socio-technical systems, we suggest that the approach we have adopted in our framework may be useful for advancing thinking and indicator development in social-ecological systems more broadly, for instance, by overlaying specified and general resilience against societies and ecosystems. We suggest that this approach can support the operationalization of resilience assessments that can identify and integrate a diverse portfolio of resilience-enhancing initiatives and investment strategies.