October 2023 Reproduction - William Procter

Abstract

Malcomb, Weaver and Krakowka (2014) published one of the first sub-national geographic climate change vulnerability models for a developing country (1.4). The authors intended for the study to be replicable across space (other African countries with similar data available) (7.1), time (when new survey data is published) (4.5 and 7.1), and vulnerability stimuli (7.1). The study’s social impacts are to address extreme vulnerability to climate change (1.3) and to assist in the allocation and evaluation of foreign aid (1.2). The methodology was designed to be “transparent and easily replicable” (2.1) in its use of “locally derived indicators and granular data” (2.1). The study was designed to address critiques of vulnerability models concerning their uncertainty and sensitivity (due to problems of scale and spatial aggregation, normative and subjective modelling decisions, and data availability) and challenges in model comparability (2.1). The model uses household adaptive capacity data from the United States Agency for International Development (USAID) Demographic and Health Surveys (DHS) (1.4 and 4.1) available in 44 African countries (7.1), livelihood sensitivity data from the USAID / Famine Early Warning Systems Network (FEWSnet) livelihood zones baseline surveys available in 23 African countries (3.6), and global physical exposure data from the United Nations Environment Programme (UNEP) Global Risk Data Platform.

This replication study is motivated by three factors. First, there is an urgent need to evaluate the reproducibility of research in human-environment and geographical sciences (HEGS) and to establish protocols and infrastructure for conducting and publishing reproduction/replication studies and reproducible research in HEGS. Second, a fully reproducible publication can be more readily replicated in new geographic, temporal, and thematic contexts, and tested for uncertainty due to data constraints and subjective modelling decisions. Third, climate change is causing increasingly severe impacts in Africa. Improving the reproducibility and replicability of climate vulnerability research will hopefully enhance the potential for research to inform policy and reduce harm caused by climate change.

Malcomb et al (2014) produce two models of interest for Malawi. Figure 4, labelled “Malawi Household Resilience”, visualizes the average adaptive capacity score of households in each traditional authority. Figure 5, labelled “Malawi Composite Vulnerability Index”, visualizes vulnerability scores by locations (cells) in a continuous raster grid. In this study, we will attempt to identically reproduce figure 4 (adaptive capacity by traditional authority) and figure 5 (vulnerability grid) using The R Project for Statistical Computing and the same data sources cited in the original publication. We will visually compare our resulting reproduction figures with the original figures. Comparison will be aided by digitizing and joining the original figure results to the reproduction results for each model, and then calculating any differences between them. Differences will be visualized with thematic maps for both models, a confusion matrix for figure 4 (adaptive capacity by traditional authority), and a scatterplot for figure 5 (vulnerability grid). An exact reproduction should produce exact replicas of the rank order of traditional authorities by adaptive capacity and grid cells by vulnerability. We will test this with the Spearman’s Rho Correlation Coefficient, expecting values of 1 for perfect correlation.

The original study is a descriptive geographic multi-criteria analysis based on local expert opinion, and therefore has no testable hypotheses or effects.

The replication study data and code will be made available in a GitHub repository to the greatest extent that licensing and file sizes permit. The repository will be made public at github.com/HEGSRR/RPr-Malcomb-2014

Malcomb, D. W., E. A. Weaver, and A. R. Krakowka. 2014. Vulnerability modeling for sub-Saharan Africa: An operationalized approach in Malawi. Applied Geography 48:17–30. https://doi.org/10.1016/j.apgeog.2014.01.004

Keywords

Reproducibility, Vulnerability, GIS, Climate Change, Africa

Study design

The reproduction study design will first implement the original study as closely as possible to reproduce the 2010 Household Resilience map (F4) and Malawi Vulnerability Map (F5). Our two confirmatory hypotheses are that we will be able to independently reproduce results for both maps.

The working hypotheses are therefore:

H1: There is no perfect positive correlation between Malcomb et al’s ranking of traditional authorities by household resilience and our reproduction study’s ranking of traditional authorities by household resilience.

H2: There is no perfect positive correlation between Malcomb et al’s ranking of locations by climate vulnerability and our reproduction study’s ranking of locations by climate vulnerability.

We will evaluate each of these hypotheses using a Spearman’s Rho Correlation. A failure to reject these hypotheses would indicate that our results do not exactly match those of the original authors. A positive correlation approaching 1 would indicate a partial reproduction.
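As a minimal sketch of this test, assuming two hypothetical vectors of class values for the same units (the names `original`, `reproduction`, and `shuffled` and all values are illustrative, not study data), Spearman's rho can be computed in base R with `cor()`:

```r
# Hypothetical class values for the same five traditional authorities
# (illustrative values only, not study data)
original     <- c(1, 2, 2, 3, 4)
reproduction <- c(1, 2, 2, 3, 4)

# Spearman's rho compares rank order; identical rankings give rho = 1
rho_identical <- cor(original, reproduction, method = "spearman")

# Changing the order of two units breaks perfect rank agreement, so rho < 1
shuffled <- c(2, 1, 2, 3, 4)
rho_shuffled <- cor(original, shuffled, method = "spearman")

print(c(identical = rho_identical, shuffled = rho_shuffled))
```

Only an exact match in rank order yields a coefficient of 1, which is why any value below 1 is treated here as, at best, a partial reproduction.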

Original study design

The original study is observational and descriptive, with no hypotheses or effect sizes. The study is a multi-criteria analysis using geographic information systems (GIS) to implement a hierarchical geographic model of climate change vulnerability in Malawi.

The spatial extent of the study was the country of Malawi. The spatial scale of the study was the third administrative level (traditional authorities) and a raster grid of unknown spatial resolution. The temporal extent of the study was explicitly 2004 to 2010 (4.5), but the study contains secondary data collected earlier (3.6 and F5).

The model themes, indicators, and weights were selected based upon 70 interviews and 11 village focus groups from field trips to Malawi in March and August of 2011 (1.4, 4.2 and A1). Themes and indicators were also contextualized in literature (3.3 through 3.7) and adjusted based on redundancy and representativeness across the country (4.3). The model and weights were adjusted through “several iterations of the model using alternative weighting schemes” (4.3) to produce a “final product that reflects Malawi’s contextual and perceptual vulnerability” (4.3). Each theme was constructed of indicators from a single data provider: adaptive capacity is measured with USAID DHS surveys, livelihood sensitivity is measured with FEWSnet/Malawi Vulnerability Assessment Committee (MVAC) livelihood zones baseline data, and physical exposure is measured with UNEP Global Risk Data Platform data (T1 and T2). Although the authors emphasize a grounded local evidence-based selection of indicators and weights (2.1, 4.2, 5.1 and 7.1), other evidence in the publication suggests a model design based on a more pragmatic combination of factors including expert local opinion, deductive theory, and the availability and characteristics of data.

The study did not use any randomization.

The original study was conducted using STATA™ (4.4) and ArcGIS™ (4.6, F3 and F4) with unspecified software versions, by 2012 according to creation dates on map figures (F3, F4 and F5).

Computational environment

The study was originally conducted using ArcGIS and unspecified statistical software. This reproduction study uses R, including the rdhs package for DHS survey data, the sf package for vector analysis, the stars package for raster analysis, and the tmap package for cartography.

# load the here package for project-relative file paths
library(here)

# set up default knitr parameters
knitr::opts_chunk$set(
  echo = FALSE,
  fig.width = 8,
  fig.path = paste0(here("results", "figures"), "/")
)

# project-relative paths to raw (private and public), derived, and scratch data
private_r <- here("data", "raw", "private")
public_r <- here("data", "raw", "public")
public_d <- here("data", "derived", "public")
scratch <- here("data", "scratch")

Data

Lakes

Major lakes were downloaded from MASDAP, the Malawi Spatial Data Platform.

Lakes data transformations

Dissolve lakes into a single multi-part feature with one field EA containing the value Lake.

Livelihood zones

Livelihood zones geographic data may be downloaded from the FEWS NET Data Center at https://fews.net/fews-data/335.

Livelihood sensitivity data is derived from household economic analysis (HEA) baseline surveys of livelihood zones created by MVAC in collaboration with USAID and FEWSnet (3.6). Livelihood zones are distinct from traditional authorities (5.6). They are “geographic areas where populations share characteristics of farming practices, labor, and environmental coping strategies” (3.6). Eleven zones were surveyed in 2003 (3.6). An MVAC 2005 report on livelihood zones appears in the references with an expired URL (R).

Livelihood sensitivity is measured with the following variables from FEWSnet livelihood zone data.

  • 6%: percent of food from own farm (T2)
    • ability to meet food needs (T1 theory)
    • % food intake from personal farm (T1 indicator)
    • % of food that poor households receive independently from their own farm, an indication of sustainability of livelihoods (3.6)
  • 6%: percent income from wage labor (T2)
    • % of income that poor households receive from wage labor (3.6)
    • income source (T1 theory)
    • % poor income from labor (T1 indicator)
  • 4%: percent income from cash crops (T2)
    • % of labor income that is susceptible to market shocks (i.e. tobacco, sugar, tea, & coffee) (3.6)
    • cash crop exposure (T1 theory)
    • % non-food crop (cotton, tobacco, tea) (T1 indicator)
  • 4%: disaster coping strategy (T2)
    • ecological destruction associated with livelihood coping strategies during time of crisis (3.6)
    • ecological coping effect (T1 theory)
    • access to alternative form of income (T1 indicator)

Livelihood zones attribute data was provided by FEWS NET in the form of three spreadsheets describing typical livelihood profiles for each zone, with one sheet for poor households, one for middle income households, and one for rich households. This data was based on focus groups with stakeholders in each livelihood zone. The authors have summarized the individual poor household spreadsheets into one comprehensive table of variables relevant to the study.

Livelihood zone data transformations

In order to prepare geographic livelihood zone data for analysis, geometry errors are fixed, national parks are removed, and the coordinate reference system is transformed to EPSG:4326 (WGS 1984) geographic coordinates. Livelihood zone attribute data is then joined to the geographic data by livelihood zone code LZCODE.

Traditional authorities

The adaptive capacity analysis is conducted in traditional authorities, which may have been sourced from the “GADM administrative boundaries for Africa” cited on maps of household resilience (F3 and F4). No date, version, or formal citation for this data is provided in the original study. Traditional authorities (TAs) data can be downloaded from the Database of Global Administrative Areas (GADM) version 2.8 at https://gadm.org/download_country_v2.html and unzipped. This data must be downloaded directly from GADM. While the data license permits free use of data for research purposes and publication, it does not permit redistribution.

Traditional authorities (TAs)

Load traditional authorities (TA) data, fix geometry data, and count types of areas.

Type N
City 4
Headquarter 16
National Park 6
Reserve 8
Sub-chief 66
Town 6
Traditional Authority 134
Urban 3
Water body 13

Visualize Lakes, Livelihood Zones, and TAs

TA data transformations

TA data includes conservation areas (reserves and national parks) and water bodies which do not contain populated villages. Extract conservation areas (forests and parks) to a new ta_cons_v layer.

Several of the Lake Malawi water body features in TA data erroneously include populated areas of land. Extract these features as ta_lake_malawi. Likoma island is incorrectly labelled as Lake Malawi, so do not include it as an error for extraction.

Remove conservation areas and water bodies from TAs.

## [1] "256 features in original traditional authorities"
## [1] "230 features after removing conservation areas and water bodies"

Find areas of Lake Malawi features that are actually land by buffering lakes by 500 meters and clipping the Lake Malawi TA features. Calculate new unique second level IDs as 1000 times the row number. Remove splinter polygons by selecting polygons over 4 km^2 with centroids intersecting livelihood zones.

Merge fixed TA errors back into TA data and save results as derived ta_v.gpkg.

## [1] 9 features created by fixing errors on Lake Malawi shore
## [1] 239 features in final corrected traditional authorites

Drought risk and flood risk

Physical exposure data is derived from the United Nations Environment Programme (UNEP) Global Risk Data Platform (1.4) as global (3.7) continuous raster data (5.6). The climate vulnerability map also cites the Dartmouth Flood Observatory (1999-2007) (F5). According to the references to Peduzzi (2011, 2012), the data for flood risk and drought exposure is available from UNEP/DEWA/GRID-Europe at preview.grid.unep.ch/. The drought risk data is based on “a global monthly gridded precipitation dataset obtained from the Climatic Research Unit (University of East Anglia)” and “a global Standardized Precipitation Index based on Brad Lyon (IRI, Columbia University) methodology” (3.7).

Physical exposure is measured with the following two indicators.

  • 20%: estimated risk for flood hazard (T2)
    • floods & rain variability (T1 theory)
    • flood events (T1 indicator)
    • risks of flood (3.7)
    • global estimated risk index for flood hazard (R)
  • 20%: exposition to drought events (T2)
    • drought & dry spells (T1 theory)
    • drought indices (T1 indicator)
    • (risks of) drought exposure (3.7)
    • physical exposition to drought events 1980 - 2001 (R)

The UNEP Global Risk Data Platform used for this research is no longer available online. The data is provided with the research compendium.

Household DHS data

Household adaptive capacity data is derived from USAID DHS Surveys conducted in 2004 and 2010 (1.4). Readers are referred to the DHS website for an “explanation on using survey data with GPS information” (4.4). The website, www.measuredhs.com, is provided in the references, and forwards to dhsprogram.com. There were 24,850 household surveys in 2010 (5.2), providing data for 203 traditional authorities (F3).

Adaptive capacity is composed of assets and access with the following DHS survey variables.

Assets

  • 6%: Arable land (hectares) (T2)
    • amount of arable land (T1 theory) per household (T1 indicator)
    • larger landholders can diversify crops and sell food (3.4)
  • 4%: Number of livestock units (T2)
    • livestock (T1 theory)
    • number of animals per household by type (T1 indicator)
    • animals used as coping strategy (3.4)
  • 4%: Wealth index score (T2)
    • money (T1 theory)
    • wealth index (based on owned assets) (T1 indicator)
    • wealth (disposable capital assets) (3.4)
    • income is discussed separately from wealth (3.4) but is not included as an indicator
  • 3%: Number in household sick in past 12 months (T2)
    • good health (T1 theory and 3.4)
    • sick in the past 12 months (T1 indicator)
  • 3%: Number of orphans in household (T2)
    • orphan care (T1 theory)
    • number of orphans or vulnerable children (T1 indicator)
    • orphans… are a highly socially vulnerable subset of the population (3.4)
    • orphan care adds tremendous burden to families that are… poor and food insecure (3.4)

Access

  • 4%: time to water source (T2)
    • basics (T1 theory)
    • water (time to source) (T1 indicator)
    • burden that often falls to women and can consume large amounts of time… in a time of shock or drought, water collection time can be protracted causing even greater hardship and vulnerability (3.5)
  • 4%: own a cell phone (T2)
    • media and information (T1 theory)
    • own a cell phone (Y/N) (T1 indicator)
    • households were better prepared, informed and warned about disasters through being well-connected through radio, mobile technology, or tribal networks (3.5)
  • 3%: own a radio (T2)
    • technology sharing (T1 theory)
    • own a radio (Y/N) (T1 indicator)
    • Radio programs are powerful tools for reaching previously inaccessible populations (3.5)
  • 3%: electricity (T2)
    • basics (T1 theory)
    • electricity (Y/N) (T1 indicator)
    • access to the electrical grid (3.5)
  • 2%: type of cooking fuel (T2)
    • basics (T1 theory)
    • cooking fuel type (T1 indicator)
    • selling of charcoal is one of the top coping strategies during periods of food insecurity and market shocks (3.5)
  • 2%: house setting (urban/rural) (T2)
    • market access (T1 theory)
    • rural, peri-urban, urban (T1 indicator)
    • nearest vehicle-accessible road can be several kilometers and the nearest paved road for public transportation to urban centers might be a days or more journey by foot (3.5)
  • 2%: sex of head of household (T2)
    • power and decision-making (T1 theory)
    • female-headed HH (Y/N) (T1 indicator)
    • households headed by females are more vulnerable based on less access to sources of power, land, and resources (3.5)
    • households headed by one parent or by children (encompassed in the variable family structure) were seen as more vulnerable (3.5)

Geographic USAID Demographic and Health Survey (DHS) data requires pre-approved access clearance and login credentials from the DHS Program. For this reproduction study, the following procedure was used to gain access:

  1. Go to https://dhsprogram.com/Data/
  2. Create an account, ideally with an education or government e-mail address
  3. Within the Datasets menu, Create a new project
  4. Enter the following information:
    • Project Title: Reproducing a Vulnerability Model of Malawi
    • Description of Study: The purpose of this study is to reproduce the methods of a published research article: Malcomb, D. W., E. A. Weaver, and A. R. Krakowka. 2014. Vulnerability modeling for sub-Saharan Africa: An operationalized approach in Malawi. Applied Geography 48:17–30. https://doi.org/10.1016/j.apgeog.2014.01.004. The authors of this paper used geocoded DHS surveys for Malawi in 2004 and 2010, in combination with FEWSnet livelihood data and UNEP flood and drought risk data. Following the author’s methodology, we plan to download the data using the rdhs package for R and aggregate the data at Malawi’s 2nd administrative level: districts. We will be working with a GitHub repository that stores the raw data locally in a directory ignored by the .gitignore file, and only moves the data into a shared and version-controlled directory once it has been aggregated to the District level. This will ensure that the privacy of survey respondents and requirements of data partners are protected, because all of the data will be aggregated into district polygons, as already shown and published in Malcomb et al (2014).
  5. Choose Region: Sub-Saharan Africa
  6. Click Show GPS Datasets at the top-left of the country tables
  7. Check Survey and GPS data for Malawi
  8. Save selection
  9. Read and agree to the conditions of use for the DHS Program datasets and save these conditions for your metadata records.
  10. Enter a Justification for using DHS Program Geographic Datasets: The research aim is to reproduce Malcomb et al (2014) in which GPS Datasets are used to spatially join DHS Survey data to Malawi’s Districts for the purpose of sub-national climate change vulnerability mapping. Therefore, the research will not be reproducible without the geographic datasets.

The rdhs package can be used to download the data once a login email and project name are supplied via the console and a password is supplied via a pop-up dialogue.

Download the Malawi 2010 survey data and geographic points.

Load tabular data of household surveys

Load geographic data of household survey clusters. Some household survey points are erroneously placed at the WGS 1984 coordinate reference system origin (Equator and Prime Meridian).
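One way to screen out such misplaced points is sketched below. The data frame and the column names `LATNUM` and `LONGNUM` are assumptions made for illustration, and the coordinates are invented rather than taken from the DHS data.

```r
# Hypothetical cluster points (invented coordinates, not DHS data)
clusters <- data.frame(
  cluster_id = 1:4,
  LATNUM     = c(-13.9, 0, -15.4, 0),
  LONGNUM    = c(33.8, 0, 35.0, 0)
)

# Flag points sitting exactly at the coordinate origin (0, 0)
at_origin <- clusters$LATNUM == 0 & clusters$LONGNUM == 0

# Keep only clusters with a usable location
valid_clusters <- clusters[!at_origin, ]
print(valid_clusters)
```

Points flagged this way cannot be assigned to a traditional authority, so they would need to be excluded before any spatial join.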

DHS Data Transformations

In order to simultaneously maximize reproducibility while avoiding direct redistribution of DHS GPS data, we spatially join the GPS data to the Traditional Authority enumeration areas. Adaptive capacity is ultimately mapped by traditional authority, but the data comes from household-level surveys. Surveys are grouped into clusters with one geographic point. Therefore, the traditional authority to which each survey will be assigned must be spatially joined to the cluster point, and then joined by attribute to the household survey. The adaptive capacity calculation at the household level also requires urban/rural status, which is stored in the cluster.
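The attribute-join half of this procedure can be sketched in base R as follows. The sketch assumes the spatial join has already attached a `ta_id` to each cluster point; all table and column names here are hypothetical and the values are invented.

```r
# Hypothetical cluster table: assumes a prior spatial join already
# assigned each cluster point to a traditional authority (ta_id)
clusters <- data.frame(
  cluster     = c(1, 2, 3),
  ta_id       = c(10, 10, 20),
  urban_rural = c("U", "R", "R")
)

# Hypothetical household surveys, each belonging to one cluster
households <- data.frame(
  hh_id   = 1:5,
  cluster = c(1, 1, 2, 3, 3)
)

# Join cluster attributes (TA and urban/rural status) to each household
hh_ta <- merge(households, clusters, by = "cluster", all.x = TRUE)
print(hh_ta)
```

Using `all.x = TRUE` keeps every household even when its cluster has no match, so unmatched surveys show up as missing values instead of silently disappearing.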

Many household surveys contain inconclusive answers (e.g. “I don’t know”) or are missing data for survey questions used in the adaptive capacity calculation. The livestock variable will be calculated as a sum of four livestock types, so we remove any household with uncertain answers about any of the livestock types and remove households with missing data for all livestock types. Households with answers about some livestock types and missing data for others are still included in the data.
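The filtering rule for livestock can be illustrated with a small base R sketch. The column names, the use of 98 as the code for an uncertain answer, and all values are assumptions for illustration only.

```r
# Assumed code for an uncertain ("don't know") answer; hypothetical
unknown_code <- 98

# Hypothetical household livestock counts (NA = missing)
hh <- data.frame(
  hh_id  = 1:5,
  cattle = c(2, NA, 98, NA, 0),
  goats  = c(1, 3, 0, NA, NA),
  pigs   = c(0, NA, 1, NA, NA),
  sheep  = c(0, 1, 0, NA, 4)
)
types <- c("cattle", "goats", "pigs", "sheep")
vals <- as.matrix(hh[, types])

# Households with an uncertain answer for any livestock type
any_unknown <- rowSums(vals == unknown_code, na.rm = TRUE) > 0
# Households with missing data for every livestock type
all_missing <- rowSums(!is.na(vals)) == 0

# Keep the rest, treating partially missing types as zero in the sum
hh_clean <- hh[!any_unknown & !all_missing, ]
hh_clean$livestock <- rowSums(hh_clean[, types], na.rm = TRUE)
print(hh_clean)
```

In this example, household 3 is dropped for an uncertain answer and household 4 for having no livestock data at all, while households 2 and 5 are kept despite partial gaps.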

We remove incomplete household surveys.

Prior observations

Some of the authors had already examined the data and attempted a reproduction study prior to writing the preregistered analysis plan.

Bias and threats to validity

The spatial extent of the study was the country of Malawi (OSM relation 195290), excluding large bodies of water, national parks or similarly reserved land, and areas missing data (4.5). 203 traditional authority areas were included in the original study (F4).

The authors suggest that the scale of the phenomena of vulnerability dynamics in the context of climate change is at the household level (1.4, 2.2, 3.1 and 4.4). The authors use the third administrative level (traditional authorities) as the spatial scale and units of analysis of household resilience (4.4 and F4). The spatial support for the final analysis of climate vulnerability is a raster grid (4.6, F5) with unknown spatial resolution—appearing finer than the size of traditional authorities and the smallest unit on the scale bar, which is 12.5 kilometers (F5). We presume that the spatial resolution may be identical to at least one of the gridded physical exposure raster inputs.

Edge effects and neighboring countries will not be addressed in the analysis (4.2). The spatial analysis techniques in this study are not sensitive to edge effects.

The analysis does not include creation of any spatial subgroups and does not measure or account for any spatial autocorrelation, spatial heterogeneity, or spatial anisotropies.

Analysis

Planned differences from the original study

The replication study will focus on reproducing 2010 household resilience (F4) and climate vulnerability (F5), excluding the 2004 household resilience analysis (F3). The aim of this reproduction is to produce results identical to the original study. Therefore, we will not collect new interview or focus group data. Additionally, qualitative interview and focus group data was not provided with the original study. Therefore, we will not attempt to reinterpret any qualitative data or determine new themes, indicators or weights for the models. The reproduction study will use the indicators and weights as they are described in the original study.

The replication study will use a different software environment, using replicable open source software over proprietary software. Specifically, the study will be completed using The R Project for Statistical Computing version 3.6.1 or later using RStudio version 1.3.1 or later, and the research will be completed in full on both Windows 10 and MacOS operating systems. A complete list of required R packages is not known at the time of preregistration, but will be reported with the final publication.

The study will attempt to reproduce the original methods exactly, but some differences may be inevitable due to ambiguous or conflicting information in the original article. We will plan to make the following reasonable decisions, which may differ from the authors’ intentions:

  1. Figure 4 represents adaptive capacity, composed of assets and access.
  2. Adaptive capacity scores will be calculated for each household, and then household scores will be spatially joined by traditional authority and averaged.
  3. Figure 5 represents vulnerability, composed of adaptive capacity, livelihood sensitivity, and physical exposure.
  4. Every indicator will be rescaled to a 0 to 4 scale using the formula: percent rank * 4. This method is a compromise from the uncertainty caused by a 0 to 5 scale, quintiles, and nominal indicators.
  5. High ranks (4) will be assigned to better and safer conditions for each indicator.
  6. Weighted aggregation will be formulated so that the aggregate scores have a theoretical minimum of 0 and maximum of the assigned percentage for the thematic concept.
    • Assets = ([land] * 0.06 + [livestock units] * 0.04 + [wealth] * 0.04 - [number sick] * 0.03 - [orphans] * 0.03) * 25
    • Access = ([water] * 0.04 + [cell phone] * 0.04 + [radio] * 0.03 + [electricity] * 0.03 + [cooking fuel] * 0.02 + [urban/rural] * 0.02 + [female household] * 0.02) * 25
    • Livelihood sensitivity = ([subsistence food] * 0.06 + [wage income] * 0.06 + [cash crop income] * 0.04 + [disaster coping] * 0.04) * 25
    • Physical exposure = (flood risk * 0.2 + drought exposure * 0.2) * 50
  7. Each thematic indicator will be rasterized or resampled to the UNEP/GRID data input most closely resembling the spatial resolution of figure 5.
  8. Vulnerability will be calculated so that the aggregate scores have a theoretical minimum of 0 and maximum of 100. This is achieved by inverting physical exposure.
    • Vulnerability = Assets + Access + Livelihood sensitivity + (40 - Physical Exposure)
  9. Any traditional authority missing adaptive capacity data from DHS surveys will be removed / masked from the final vulnerability analysis.
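The composite vulnerability formula can be checked against its stated theoretical bounds with a short sketch. It assumes each component has already been scaled to its assigned percentage (0 to 20 for assets, access, and livelihood sensitivity, and 0 to 40 for physical exposure); the function name `vulnerability_score` is hypothetical.

```r
# Composite score as written in the planned formula; assumes inputs are
# already scaled to their assigned percentages (20, 20, 20, and 40)
vulnerability_score <- function(assets, access, livelihood, exposure) {
  assets + access + livelihood + (40 - exposure)
}

# Theoretical extremes under the assumed component ranges
score_max <- vulnerability_score(20, 20, 20, 0)   # best conditions, no exposure
score_min <- vulnerability_score(0, 0, 0, 40)     # worst conditions, full exposure

print(c(min = score_min, max = score_max))
```

Inverting physical exposure in this way makes all four terms point in the same direction, which is what allows the sum to span the full 0 to 100 range.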

Adaptive Capacity

The variables for adaptive capacity are aggregated into thematic concepts and referenced in the original paper as outlined below:

  • 40%: Adaptive capacity (T2)
    • “adaptive capacity” defined as “household-level assets to recover from disasters and access to resources” (2.2) and referred to as:
      • “adaptive capacity”, “capacity score”, or “adaptive capacity score” (3.3, 4.6 formula, 5.2, 5.4, 6.3)
      • “assets” and “access” (3.3, 5.2, F3 and F4)
      • “assets” and “access” included, but not “adaptive capacity” (1.4, T1 theory, F5)
      • “resilience”, “household(-level) resilience” or “resilience scores” (5.2, 5.3, F3, F4 and F5, 6.4)
      • “vulnerability” (4.1, 4.4, 4.5, 5.1, 5.3, 5.4, 5.5, 6.1, 6.2, 6.3, 6.4, 7.2)
    • measured as a positive condition (4.6)
  • 20%: Assets (T2)
    • defined only as a component of adaptive capacity: “assets to recover from disasters” (2.2) and referred to as:
      • “assets” (1.4, 3.3, 3.4, T1 theory, F5)
    • measured as a positive condition (4.6)
  • 20%: Access (T2)
    • defined only as a component of adaptive capacity: “access to resources” (2.2) and referred to as:
      • “access” (1.4, 3.3, 3.5, T1 theory, F5)
    • measured as a positive condition (4.6)

Rescale adaptive capacity indicators

Calculate percent rank for each component of household adaptive capacity. We had to make many assumptions about calculating individual components, e.g. about how to aggregate different forms of livestock, and which values to invert such that high numbers correspond to low capacity (e.g. number of orphans or sick members of the household). Rescaling to a quintile rank as described in the original study is unclear, especially considering the number of discrete or even binary inputs. We have made a judgement call to do this by calculating percent rank and multiplying by 4, producing a theoretical domain of 0 to 4 similar to that of quintiles.
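A minimal sketch of this rescaling is given below. It assumes one common definition of percent rank, (minimum rank - 1) / (n - 1), which is the definition used by dplyr's `percent_rank()`; the helper names and input values are hypothetical, and negating an indicator is only one possible way to invert it.

```r
# Percent rank: (minimum rank - 1) / (number of non-missing values - 1)
percent_rank <- function(x) {
  (rank(x, ties.method = "min", na.last = "keep") - 1) / (sum(!is.na(x)) - 1)
}

# Rescale an indicator to a theoretical domain of 0 to 4
rescale_0_4 <- function(x) percent_rank(x) * 4

# Hypothetical land holdings: more land is treated as higher capacity
land <- c(0, 0, 1, 3, 10)
land_score <- rescale_0_4(land)

# Hypothetical orphan counts: negate first so that more orphans
# produce a lower score (one possible inversion, an assumption)
orphans <- c(0, 2, 1, 0, 3)
orphan_score <- rescale_0_4(-orphans)

print(land_score)
print(orphan_score)
```

Tied values share the lowest applicable rank, so a discrete or binary indicator may never reach the top of the 0 to 4 domain, which is one reason the quintile description in the original study is hard to apply literally.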

Household adaptive capacity

Calculate household-level adaptive capacity scores based on original study Table 2 weights. The indicators have already been rescaled to a possible domain of 0 to 4, and the weights sum to 0.4, giving a possible domain of adaptive capacity scores from 0.0 to 1.6.
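The arithmetic behind this domain can be verified with a short sketch using the Table 2 weights listed earlier. The indicator names are shorthand chosen for illustration, the three households are hypothetical, and all indicators are assumed to be oriented so that higher values mean higher capacity.

```r
# Table 2 weights for assets and access indicators (shorthand names)
weights <- c(
  land = 0.06, livestock = 0.04, wealth = 0.04, sick = 0.03, orphans = 0.03,
  water = 0.04, cellphone = 0.04, radio = 0.03, electricity = 0.03,
  cooking = 0.02, urban = 0.02, femalehh = 0.02
)

# Hypothetical households with every indicator at 4, 0, or 2
hh <- rbind(best = rep(4, 12), worst = rep(0, 12), mixed = rep(2, 12))
colnames(hh) <- names(weights)

# Household adaptive capacity as the weighted sum of rescaled indicators
capacity <- drop(hh %*% weights)
print(sum(weights))
print(capacity)
```

Because the weights sum to 0.4 and each indicator tops out at 4, no household can score above 1.6.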

Summary statistics of adaptive capacity and its components at the household level.

Traditional authority adaptive capacity

Aggregate household adaptive capacity scores to traditional authorities. The original paper found adaptive capacity scores for 203 TAs, of which we found 6 TAs were conservation areas, leaving 197 meaningful TA scores. We created an additional 9 TAs from errors from three features on Lake Malawi, so if the original authors did not notice those errors, we could expect scores for 206 TAs.
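The aggregation step itself can be sketched in base R as follows, using invented household scores; the column names mirror those in the derived data but the values are hypothetical.

```r
# Hypothetical household scores already assigned to traditional authorities
hh <- data.frame(
  ta_id    = c(1, 1, 2, 2, 2),
  capacity = c(0.5, 0.7, 0.3, 0.4, 0.5)
)

# Average household adaptive capacity within each traditional authority
ta <- aggregate(capacity ~ ta_id, data = hh, FUN = mean)
names(ta)[2] <- "capacity_avg"

# Count the households contributing to each average
ta$n_hh <- as.vector(table(hh$ta_id)[as.character(ta$ta_id)])
print(ta)
```

Keeping the household count alongside each mean makes it easier to spot traditional authorities whose scores rest on very few surveys.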

Now that household adaptive capacity data has been aggregated, it may be saved to the data/derived/public directory.

The steps requiring private DHS data (the DHS “black box”) end here. The remaining analysis uses the pre-aggregated, publicly available adaptive capacity data.

Load aggregated public adaptive capacity data.

## Reading layer `ta_v' from data source 
##   `/Users/williamprocter/GitHub/RPr-Malcomb-2014/data/derived/public/ta_v.gpkg' 
##   using driver `GPKG'
## Simple feature collection with 239 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 32.67152 ymin: -17.12721 xmax: 35.91505 ymax: -9.363796
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 18
##   ta_id capacity_avg capacity_min capacity_max capacity_sd  n_hh livestock_avg
##   <dbl>        <dbl>        <dbl>        <dbl>       <dbl> <int>         <dbl>
## 1     1        0.667       0.384         0.855       0.139    20         0.120
## 2     2        0.380       0.100         0.846       0.146   315         0.982
## 3     3        0.390       0.0960        0.846       0.152   450         1.05 
## 4     4        0.635       0.227         0.993       0.163   301         0.404
## 5     5        0.370       0.164         0.761       0.158    24         1.41 
## 6     6        0.431       0.203         0.741       0.137    54         0.524
## # ℹ 11 more variables: sick_avg <dbl>, land_avg <dbl>, wealth_avg <dbl>,
## #   orphans_avg <dbl>, water_avg <dbl>, electricity_avg <dbl>,
## #   cooking_avg <dbl>, femalehh_avg <dbl>, cellphone_avg <dbl>,
## #   radio_avg <dbl>, urban_avg <dbl>

Count TAs with adaptive capacity data.

## [1] 215 TAs have adaptive capacity data

Finding scores for 215 traditional authorities is surprising, and most likely relates to differences in discovery and treatment of geometry errors and missing data. The reason(s) for these differences cannot be determined with the content of the original manuscript.

Mapping adaptive capacity

Join adaptive capacity data to geographic TAs and rescale in an attempt to match the original publication. The original publication figure 4 shows ranges from 11.48 to 25.77, but after rescaling indicators to domains of 0 to 4 and multiplying by percentages in table 2 (which sum to 0.4), the theoretical domain is only 0 to 1.6. We might suppose that the authors had rescaled adaptive capacity to a possible domain of 0 to 40 in accordance with the 40% weight of adaptive capacity in the overall vulnerability model. Therefore, we may multiply our possible domain of 0 to 1.6 by 25 to achieve a possible domain of 0 to 40.
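The supposed rescaling is a single multiplication, sketched here with a hypothetical helper `rescale_capacity` to confirm that the factor of 25 maps the theoretical domain onto 0 to 40:

```r
# Rescale adaptive capacity from a 0 to 1.6 domain to a 0 to 40 domain
rescale_capacity <- function(capacity_unscaled) capacity_unscaled * 25

# Theoretical bounds before and after rescaling
bounds <- rescale_capacity(c(0, 1.6))
print(bounds)
```

The factor follows from 40 / 1.6 = 25; it is an inference about the original authors' method, not something the original article states.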

statistic rpac_unscaled rpac
nbr.val 215.00 215.00
nbr.na 24.00 24.00
min 0.30 7.41
max 0.68 16.90
range 0.38 9.48
median 0.43 10.66
mean 0.44 10.99
std.dev 0.07 1.80

The original publication uses the Jenks Natural Breaks method to classify the data.

rpac_class n
1 67
2 80
3 53
4 15
NA 24

Reproduction figure 4

Map reproduction results for comparison to figure 4.

Evaluate adaptive capacity reproduction

In order to test the adaptive capacity results, we will georeference the original figure 4 map using the QGIS3 georeferencer plugin. Using a vector dataset of traditional authorities and the georeferenced map, we will then use zonal statistics to extract the average brightness values (which represent four classes of adaptive capacity) for each traditional authority. We will use an interior buffer of the traditional authority polygons, optimized to avoid summarizing border symbols in zonal statistics while capturing as much of the choropleth color symbol as possible. After inspecting a histogram of the mean brightness values, we will reclassify the values as closely to the four classes on the original figure 4 as possible and then manually adjust the attribute values for any misclassified traditional authorities. We will compare original and reproduction adaptive capacity results by creating a confusion matrix, calculating the Spearman’s Rho correlation coefficient (expecting a value of 1 for perfect positive correlation), and creating a thematic map of the difference between the original results and reproduction results.

Digitize original study figure 4

Ordinal data from figure 4 was digitized in QGIS with the following procedure:

  1. Copy image from the original publication pdf file using Adobe Acrobat Pro
  2. Paste the image and save as a .png file with pixel dimensions 1982 by 2811
  3. Use QGIS 3.26.3 to georeference the map image to match ta_v.gpkg using WGS 84 geographic coordinates (epsg:4326). Use linear georeferencing with points in metadata\malcomb_fig4.png.points
  4. Make internal buffer to reduce the noise from boundary line symbology.
    1. Project ta_v to UTM 36S epsg:32736: ta_v_fig4.gpkg:utm36s.
    2. Calculate an internal buffer of -600m: ta_v_fig4.gpkg:utm36s.
    3. Project back to WGS 84 epsg:4326: ta_v_fig4.gpkg:buffer_wgs84.
  5. Extract the average and standard deviation of the original map’s red, green, and blue bands for each traditional authority using the zonal statistics algorithm: ta_v_fig4.gpkg:r, ta_v_fig4.gpkg:rb and ta_v_fig4.gpkg:rbg
  6. Join the zonal statistics results to the ta_v layer by the ID_2 attribute: ta_v_fig4.gpkg:ta_v_fig4
  7. Classify the results in a new field orac (original adaptive capacity) using the field calculator and CASE statements, choosing break points that classify most traditional authorities correctly.
  8. Visually inspect results and edit the orac attribute for any mis-classified area.
  9. The original map contains data in six conservation areas, noted with digitized point features in ta_v_fig4.gpkg:fig4_errors. Other areas are coded as follows:
code description
-3 polygon too small to discern color or pattern fill
-2 white fill not matching any legend item
-1 pattern fill for “missing DHS data”
1 lowest adaptive capacity
2
3
4 highest adaptive capacity

Original study figure 4

Load digitized figure 4 data and display counts of results. Convert all forms of missing data to NA to be excluded from mapping and statistics. Join original figure 4 adaptive capacity results to ta_v.

orac n
-3 3
-2 30
-1 3
1 38
2 56
3 72
4 37

Map original figure 4.

Compare adaptive capacity result

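Calculate and map the difference between the two maps. In the confusion matrix below, rows are reproduction classes (rpac_class) and columns are original classes (orac); traditional authorities with missing data in either map are excluded.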

##    
##      1  2  3  4
##   1 34 27  6  0
##   2  4 26 44  5
##   3  0  0 19 29
##   4  0  0  0  3
## 
##  Spearman's rank correlation rho
## 
## data:  ta_v$rpac_class and ta_v$orac
## S = 268637, p-value < 2.2e-16
## alternative hypothesis: true rho is greater than 0
## sample estimates:
##       rho 
## 0.7891711
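The comparison above can be sketched without any statistics library. This is a minimal Python illustration, not the code that produced the output: the function names and the toy data are hypothetical, and Spearman's rho is computed as the Pearson correlation of tie-averaged ranks, which is the standard definition when ties are present.

```python
from collections import Counter


def confusion_matrix(rows, cols, classes=(1, 2, 3, 4)):
    """Cross-tabulate paired class labels, skipping pairs with missing data."""
    counts = Counter((r, c) for r, c in zip(rows, cols)
                     if r is not None and c is not None)
    return [[counts[(r, c)] for c in classes] for r in classes]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    rx = average_ranks([a for a, _ in pairs])
    ry = average_ranks([b for _, b in pairs])
    n = len(pairs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: reproduction classes vs. original classes.
reproduced = [1, 1, 2, 2, 3, 4, None]
original = [1, 2, 2, 3, 3, 4, 2]
matrix = confusion_matrix(reproduced, original)
rho = spearman_rho(reproduced, original)
```

Counts on the diagonal of `matrix` are exact class matches; counts above the diagonal are cases where the original class is higher than the reproduction class.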

Vulnerability

  • 40%: Adaptive Capacity
  • 20%: Livelihood Sensitivity (T2)
    • “sensitivity” defined as “degree to which a system will respond to an external disturbing force” (2.2) and referred to as:
      • “livelihood sensitivity” (1.4, 3.3, 3.6, T1 theory, 4.6 formula, F5)
    • measured as a positive condition (4.6)
  • 40%: Physical exposure (T2)
    • “exposure” defined as the “magnitude and frequency of forces that could stress a system” (2.2) and referred to as:
      • “physical exposure” (1.4, 3.3, 3.7, 4.6 formula, T2)
      • “biophysical exposure” (T1 theory)
      • “exposure to floods and droughts” (F5)
    • measured as a negative condition (4.6)
  • 100%: Household Resilience (T2)
    • “resilience” defined as “ability of a household to prepare for, respond to and recover from complex drivers of vulnerability” (2.2, 5.6) and referred to as:
      • “household resilience” calculated as “Adaptive Capacity + Livelihood Sensitivity - Physical Exposure” (4.6 formula)
      • “vulnerability to climate change” calculated as “assets + access + livelihood sensitivity - physical exposure” (F5)
      • “vulnerability” (title, 3.3, 3.6, 4.3, 4.5, 6.5, 7.1, 7.2)

Extent and spatial resolution

Create bounding box representing the spatial extent of Malawi. Create a raster grid frame matching the extent of the bounding box and the spatial resolution of the drought exposure raster, which is 0.041667 decimal degrees. Although the flood risk raster has a coarser spatial resolution, visual inspection of the original figure 5 suggests that the finer spatial resolution of drought exposure was used for the original analysis.
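As a rough illustration of how a grid frame follows from an extent and a cell size, the Python sketch below derives the number of columns and rows. The bounding box coordinates are hypothetical round numbers that roughly cover Malawi, not the values used in this analysis; only the resolution comes from the text.

```python
import math

RES = 0.041667  # drought raster resolution in decimal degrees (from the text)

# Hypothetical bounding box roughly covering Malawi; not the report's values.
XMIN, YMIN, XMAX, YMAX = 32.5, -17.25, 36.0, -9.25


def grid_frame(xmin, ymin, xmax, ymax, res):
    """Return (ncols, nrows, snapped_xmax, snapped_ymin) for a grid anchored
    at the top-left corner that fully covers the bounding box."""
    ncols = math.ceil((xmax - xmin) / res)
    nrows = math.ceil((ymax - ymin) / res)
    return ncols, nrows, xmin + ncols * res, ymax - nrows * res


ncols, nrows, grid_xmax, grid_ymin = grid_frame(XMIN, YMIN, XMAX, YMAX, RES)
```

Rounding the cell counts up means the grid extends slightly past the right and bottom edges of the box, so no part of the extent is cut off.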

Adaptive capacity

Convert adaptive capacity to raster grid.

Drought exposure

Clip and warp drought exposure to match our extent and spatial resolution.

Create a mask with the adaptive capacity results so that lakes, conservation areas, and traditional authorities with no data will not skew the classification / rescaling of drought exposure. Apply this mask to drought exposure. Masking is our own decision based on intuition: it is not specified in the original publication.

Classify drought exposure into quintile classes (0 to 4), then rescale to 20% by multiplying by 4.
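A minimal Python sketch of quintile classification followed by rescaling is given below. The rank-based quantile rule and the sample values are assumptions (quantile conventions differ between software packages). The multiplier is left as a parameter because classes of 0 to 4 multiplied by 4 reach a maximum of 16, whereas a factor of 5 would be needed to reach 20.

```python
import bisect


def quintile_classes(values):
    """Assign each value a class from 0 to 4 by its position in the sorted data.

    Assumed rule: class = floor(5 * rank / n), where rank is the number of
    strictly smaller values. Tied values therefore share a class.
    """
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for v in values:
        rank = bisect.bisect_left(ordered, v)
        classes.append(min(4, rank * 5 // n))
    return classes


def rescale(classes, factor):
    """Multiply class values by a weight factor."""
    return [c * factor for c in classes]


# Hypothetical drought exposure values for ten grid cells.
drought = [0.1, 0.5, 0.2, 0.9, 0.7, 0.3, 0.8, 0.4, 0.6, 1.0]
classes = quintile_classes(drought)
scores = rescale(classes, 4)
```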

Flood risk

Clip and warp flood risk to match our extent and spatial resolution.

Mask and rescale flood. Since flood is already on scale from 0 to 4, simply multiply by 5 to achieve the 20% weight.

Livelihood sensitivity

Calculate livelihood sensitivity indicators from FEWSnet livelihood zone baseline profiles of poor households according to table 2.

Rescale livelihood sensitivity indicators into quantiles.

##              pctOwnCrop pctIncWage pctIncCashCrops pctDisasterCope ownCrop
## nbr.val            18.0       18.0            18.0            18.0    18.0
## nbr.null            0.0        0.0            13.0             1.0     1.0
## nbr.na              0.0        0.0             0.0             0.0     0.0
## min                29.4        9.7             0.0             0.0     0.0
## max                88.0       50.3            75.1            71.9     4.0
## range              58.6       40.6            75.1            71.9     4.0
## sum              1059.3      489.6           171.8           236.5    36.0
## median             55.0       24.7             0.0             8.8     2.0
## mean               58.9       27.2             9.5            13.1     2.0
## SE.mean             3.1        2.6             5.3             3.7     0.3
## CI.mean.0.95        6.6        5.5            11.2             7.9     0.6
## var               176.8      121.3           507.8           251.0     1.6
## std.dev            13.3       11.0            22.5            15.8     1.3
## coef.var            0.2        0.4             2.4             1.2     0.6
##              wageIncome cashCropIncome disasterCope
## nbr.val            18.0           18.0         18.0
## nbr.null            1.0            1.0          1.0
## nbr.na              0.0            0.0          0.0
## min                 0.0            0.0          0.0
## max                 4.0            1.2          4.0
## range               4.0            1.2          4.0
## sum                36.0           17.6         36.0
## median              2.0            1.2          2.0
## mean                2.0            1.0          2.0
## SE.mean             0.3            0.1          0.3
## CI.mean.0.95        0.6            0.2          0.6
## var                 1.6            0.1          1.6
## std.dev             1.3            0.4          1.3
## coef.var            0.6            0.4          0.6

Calculate aggregate livelihood sensitivity score

##              sensitivity
## nbr.val            18.00
## nbr.null            0.00
## nbr.na              0.00
## min                 5.88
## max                14.00
## range               8.12
## sum               161.65
## median              8.65
## mean                8.98
## SE.mean             0.56
## CI.mean.0.95        1.18
## var                 5.65
## std.dev             2.38
## coef.var            0.26

Convert livelihood sensitivity into raster grid

Vulnerability score

Calculate an aggregated vulnerability score by adding low adaptive capacity (invert adaptive capacity by subtracting from the maximum score of 40), livelihood sensitivity, drought exposure, and flood risk.

\[ \text{Vulnerability} = (40 - \text{Adaptive Capacity}) + \text{Livelihood Sensitivity} + \text{Drought Exposure} + \text{Flood Risk} \]
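The formula can be expressed per grid cell as in the Python sketch below. The function name and the treatment of missing inputs (any missing component yields a missing score, mirroring the masking described earlier) are assumptions for illustration.

```python
def vulnerability(adaptive_capacity, sensitivity, drought, flood):
    """Aggregate vulnerability for one cell, or None if any input is missing."""
    parts = (adaptive_capacity, sensitivity, drought, flood)
    if any(p is None for p in parts):
        return None
    # Invert adaptive capacity so that low capacity adds to vulnerability.
    return (40 - adaptive_capacity) + sensitivity + drought + flood


# Hypothetical cell values on the 40 / 20 / 20 / 20 scales.
example = vulnerability(10.0, 9.0, 8, 15)
```

With these hypothetical inputs, the inverted adaptive capacity contributes 30 and the cell scores 62 out of a possible 100.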

Reproduction figure 5

Polygonize Figure 5 Reproduction to show vulnerability by TAs

## tmap mode set to interactive viewing
##     vulnerability                 NAME_2
## 210      74.88749          Luchenza Town
## 18       73.40314               TA Lundu
## 191      70.47000            Rumphi Boma
## 218      69.76477             TA Kapichi
## 36       68.55306              SC Chauma
## 207      68.11785             TA Maganga
## 56       67.93421            TA Kilupula
## 102      67.90067          Mangochi Town
## 20       67.86956              TA Maseya
## 17       66.49447             TA Katunga
## 200      65.97462            Salima Town
## 169      65.83453               TA Ngabu
## 87       65.67543             TA Kalumbu
## 167      65.60438               TA Mlolo
## 86       64.89906             TA Kalumba
## 125      64.52626              TA Nkanda
## 209      63.93380               TA Pemba
## 19       63.93202            TA Makhwira
## 217      63.92973           TA Chimaliro
## 5        63.89426             TA Chigaru
## 59       63.71255           Kasungu Boma
## 236      63.52184            Lake Malawi
## 124      63.09310              TA Mabuka
## 25       63.04508             TA Likoswe
## 27       62.55026              TA Nchema
## 54       62.45810           SC Mwakaboko
## 23       62.44943             TA Chitera
## 214      62.31684             SC Thukuta
## 26       62.23818               TA Mpama
## 146      61.92832          SC Fukamalaza
## 90       61.83397           TA Mazengera
## 53       61.59489           Karonga Town
## 22       61.56654        Chiradzulu Boma
## 91       61.43003           Liwonde Town
## 28       60.98727               TA Nkalo
## 126      60.80534         TA Nthiramanja
## 24       60.79577            TA Kadewere
## 88       60.39819            TA Khongoni
## 212      60.35019             SC Mbawela
## 223      59.91867               SC Mbiza
## 225      59.83121             TA Chikowi
## 215      59.67893             TA Bvumbwe
## 229      59.46331              TA Mwambo
## 12       59.35587               TA Somba
## 120      59.32395           Mulanje Boma
## 6        59.25960              TA Kapeni
## 213      59.11343              SC Mphuka
## 166      58.90130             TA Malemia
## 228      58.40261              TA Mlumbe
## 75       58.17479          Lilongwe City
## 58       58.11425             TA Wasambo
## 94       58.07124             SC Chiwalo
## 168      58.05344             TA Ndamera
## 219      58.01133         TA Nchilamwela
## 119      58.00610                TA Zulu
## 216      57.76417            TA Changata
## 194      57.48298            SC Mwahenga
## 123      57.47580            TA Chikumbu
## 122      57.31542        SC Laston Njema
## 115      57.20259             SC Mavwere
## 80       56.91485              TA Chadza
## 226      56.76168          TA Kuntumanji
## 221      56.56635              TA Thomas
## 85       56.42342              TA Kalolo
## 121      56.19820                SC Juma
## 152      56.19259       TA Malenga Mzoma
## 9        56.15886               TA Lundu
## 100      56.08372             TA Liwonde
## 109      55.94069              TA Katuli
## 35       55.81042             Dedza Boma
## 211      55.76873          SC Kwethemule
## 11       55.27498              TA Makata
## 45       54.82391          Mponela Urban
## 206      54.81473            TA Kuluunda
## 76       54.58220          SC Chitekwele
## 235      54.44070            Lake Malawi
## 89       54.43742              TA Malili
## 40       54.30595             TA Kaphuka
## 98       54.28279              SC Sitola
## 7        54.22746             TA Kuntaja
## 145      54.20499               TA Symon
## 180      54.15146           TA Njolomole
## 130      54.09096            Mzimba Boma
## 106      54.02623             SC Namabvi
## 230      53.86458             Zomba City
## 71       53.68579               TA Mwase
## 172      53.64496            Ntcheu Boma
## 65       53.57297             SC Njombwa
## 37       53.55159       SC Chilikumwendo
## 66       53.46895            SC Simlemba
## 114      53.27869               SC Dambe
## 81       53.25236             TA Chimutu
## 108      53.23626              TA Jalasi
## 72       53.09319              TA Santhe
## 127      53.04996            Mwanza Boma
## 69       53.03479              TA Kaomba
## 3        52.98554             TA Nsamala
## 195      52.94149            SC Mwalweni
## 163      52.77137              SC Makoka
## 4        52.74460          Blantyre City
## 116      52.70460               SC Mduwa
## 171      52.67748             TA Tengani
## 31       52.48693         TA Mwabulambya
## 157      52.33348             SC Mphonde
## 46       52.31622            SC Chakhaza
## 55       52.28510        SC Mwirang'ombe
## 118      51.80475            TA Mlonyeni
## 177      51.77004            TA Kwataine
## 96       51.75683               SC Mposa
## 43       51.59991             TA Tambala
## 78       51.06876               SC Njewa
## 60       51.01491      SC Chilowamatambe
## 97       50.64223              SC Ngokwe
## 199      50.50884            TA Mwamlowe
## 227      50.50788             TA Malemia
## 201      50.47142           SC Kambalame
## 141      50.46917           TA Mzukuzuku
## 47       50.16975             SC Kayembe
## 208      50.15420              TA Ndindi
## 154      50.01755             TA Timbiri
## 83       50.00911           TA Chitukula
## 2        49.88832             TA Kalembo
## 136      49.81532            TA M'Mbelwa
## 57       49.77020              TA Kyungu
## 101      49.60914              TA Nyambi
## 147      49.49666             SC Malanda
## 186      49.48667              TA Kalumo
## 196      49.47585       SC Mwankhunikira
## 173      49.42627            SC Champiti
## 175      49.20521         SC Makwangwala
## 73       49.08606               TA Wimbe
## 77       49.04273               SC Mtema
## 50       48.94071             TA Chiwere
## 93       48.87266             SC Chikweo
## 202      48.67615            SC Kambwiri
## 92       48.64308              SC Chamba
## 84       48.56950            TA Kabudula
## 95       48.55069              SC Mlomba
## 187      48.53755            TA Kasakula
## 176      48.25062         TA Chakhumbira
## 139      48.15588              TA Mtwalo
## 41       47.85752             TA Kasumbu
## 103      47.81872       Monkey Bay Urban
## 137      47.70679            TA Mabulabo
## 237      47.58378            Lake Malawi
## 183      47.49863            SC Chilooko
## 117      47.44227              TA Mkanda
## 174      47.27630       SC Goodson Ganya
## 234      47.08236            Lake Malawi
## 64       46.98682            SC M'nyanja
## 32       46.76167         TA Mwenemisuku
## 70       46.58720            TA Kapelula
## 204      46.48960             TA Karonga
## 128      46.26805             TA Kanduku
## 135      45.98441              TA Chindi
## 205      45.94304           TA Khombedza
## 107      45.89017            TA Chimwala
## 51       45.78555              TA Dzoole
## 178      45.68986              TA Masasa
## 150      45.61832          TA Fukamapiri
## 153      45.60334             TA Musisya
## 38       45.48548       SC Kamenya Gwaza
## 184      45.45104             SC Nthondo
## 148      45.38836          SC Nyaluwanga
## 63       45.34869               SC Lukwa
## 185      45.19532              TA Chikho
## 129      45.11431             TA Nthache
## 179      44.95430              TA Mpando
## 67       44.92747               TA Chulu
## 30       44.89904              TA Kameme
## 68       44.76613            TA Kaluluma
## 151      44.57588           TA Kabunduli
## 161      44.54433            TA Mwadzama
## 140      44.45509           TA Mzikubola
## 160      43.84254      TA Malenga Chanzi
## 48       43.80120             SC Mkukula
## 156      43.27700            SC Kafuzila
## 143      43.11324              TA Mlauli
## 34       42.82582            TA Nthalire
## 49       42.81496             SC Mponela
## 52       42.67091          TA Msakambewa
## 134      42.55345  SC Khosolo Gwaza Jere
## 192      42.48289          SC Chapinduka
## 181      42.17665            TA Phambala
## 133      41.93592    SC Kampingo Sibande
## 158      40.67678           SC Mwansambo
## 144      38.74298               TA Ngozi
## 188      35.91677            TA Kasukula
## 1              NA            Balaka Town
## 8              NA          TA Kunthembwe
## 10             NA          TA Machinjili
## 13             NA          Chikwawa Boma
## 14             NA            Ngabu Urban
## 15             NA          TA Chapananga
## 16             NA              TA Kasisi
## 21             NA               TA Ngabu
## 29             NA           Chitipa Boma
## 33             NA          TA Mwenewenya
## 39             NA        TA Kachindamoto
## 42             NA               TA Pemba
## 44             NA              Dowa Boma
## 61             NA            SC Chisikwa
## 62             NA             SC Kawamba
## 74             NA            Lake Malawi
## 79             NA            SC Tsabango
## 82             NA             TA Chiseka
## 99             NA             TA Kawinga
## 104            NA               SC Chowe
## 105            NA       SC Mbwana Nyambi
## 110            NA           TA Makanjila
## 111            NA              TA Mponda
## 112            NA            TA Nankumba
## 113            NA           Mchinji Boma
## 131            NA             Mzuzu City
## 132            NA SC Jaravikuba Munthali
## 138            NA           TA Mpherembe
## 142            NA               TA Dambe
## 149            NA            SC Zilakoma
## 155            NA        Nkhotakota Boma
## 159            NA            TA Kanyenda
## 162            NA            Nsanje Boma
## 164            NA              SC Mbenje
## 165            NA            TA Chimombo
## 170            NA         TA Nyachikadza
## 182            NA           Ntchisi Boma
## 189            NA             TA Mkhumba
## 190            NA             TA Nazombe
## 193            NA             SC Kachulu
## 197            NA      TA Chikulamayembe
## 198            NA             TA Katumbi
## 203            NA              SC Mwanza
## 220            NA              TA Nsabwe
## 222            NA            Thyolo Boma
## 224            NA            SC Mkumbira
## 231            NA            Lake Malawi
## 232            NA            Lake Malawi
## 233            NA            Lake Malawi
## 238            NA            Lake Malawi
## 239            NA            Lake Malawi

Evaluate vulnerability reproduction

In order to compare the Malawi vulnerability results, we will georeference the original figure 5 map using the QGIS georeferencer plugin. We will vectorize the UNEP-Grid raster input most closely matching the published map and summarize the red, green, and blue brightness values of the original map using zonal statistics. We will add the green and blue brightness values together to convert the original color ramp into a linear scale of continuous values. We will compare original and reproduction Malawi vulnerability results by creating a scatterplot, calculating the Spearman’s Rho correlation coefficient (expecting a value near 1 for a strong positive correlation), and creating a thematic map of the difference between the original results and reproduction results.

Original study figure 5

Comparing the reproduction of figure 5 with the original figure 5 requires first digitizing the original figure 5 (unclassified choropleth map with yellow to red gradient) in QGIS as follows:

  1. Copy image of figure 5 from the original publication pdf file using Adobe Acrobat Pro
  2. Paste the image and save as a .png file with pixel dimensions 1949 by 2811
  3. Use QGIS 3.26.3 to georeference the map image to match ta_v.gpkg using WGS 84 geographic coordinates (epsg:4326). Use linear georeferencing with points in ...
  4. Convert ta_capacity.tif raster to vector polygons
  5. Extract the average blue and green bands from the georeferenced map image using zonal statistics
  6. Save results as georef_bg.gpkg.

To approximate data values from the yellow to red gradient of the original map, the blue and green bands are then added, inverted, and rescaled to a range from 0 to 100.
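The band transformation described above can be sketched in Python as follows. The function name and sample band values are hypothetical, and the rescaling is assumed to be a min-max stretch over the observed sums, which the text does not confirm.

```python
def gradient_to_score(green, blue):
    """Add green and blue band means, invert, and rescale to 0 to 100.

    Lighter (more yellow) polygons have larger green + blue sums, so after
    inversion the darkest red polygon scores 100 and the lightest scores 0.
    """
    sums = [g + b for g, b in zip(green, blue)]
    lo, hi = min(sums), max(sums)
    if hi == lo:
        # All polygons have the same color; no gradient to rescale.
        return [0.0] * len(sums)
    return [100 * (hi - s) / (hi - lo) for s in sums]


# Hypothetical zonal mean band values for four polygons, light to dark.
green = [250.0, 180.0, 90.0, 10.0]
blue = [120.0, 60.0, 20.0, 5.0]
scores = gradient_to_score(green, blue)
```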

As a planned deviation for reanalysis, a z-scored histogram is added to visualize the spread of the vulnerability range.

Compare vulnerability result

## 
##  Spearman's rank correlation rho
## 
## data:  vulnerability_p$orv and vulnerability_p$rpv
## S = 7087504387, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.1974578

Map differences in Figure 5. As a planned deviation for reanalysis, a z-scored histogram is added to visualize the spread of the vulnerability range.

## Warning: Removed 10752 rows containing non-finite values (`stat_bin()`).

Discussion of the original study

Data collection and spatial sampling

The original study included over 70 qualitative semi-structured interviews with officials from development agencies, government, and non-governmental organizations (1.4). 47 specific interviewee titles are listed in a “full list of the interviews” (A1), leaving uncertainty around the remaining 23 (or more) interviewees. The study included 11 village focus groups (1.4). The interviews and focus groups were selected to focus on villages and organizations connected at different levels of organization to “externally designed adaptation projects” (3.2). We presume that the interviewees and focus groups were selected based upon their association with externally designed and funded climate change adaptation projects, which are neither equally nor justly distributed (Barrett 2014). No minimum sample size or stopping criteria were defined.

The original authors used exclusively secondary data for the vulnerability models and did not subset or sample from the secondary data. The published results based on DHS surveys include data for 203 traditional authorities in 2010 (F4), whereas the authors state that there are more than 250 populated traditional authorities in Malawi (4.4). The 2010 household resilience data is based upon 24,850 DHS household surveys (5.2). Furthermore, “not every traditional authority had surveys conducted within its administrative boundaries” (5.2). This suggests possible spatial sampling problems in the DHS survey data in the context of its application in this study. The authors’ rationales for using the third level administrative units of traditional authorities include matching the political level at which many projects are planned and assessed, and identifying hotspots of vulnerability that are lost in the aggregation to second level districts (4.4). The original paper does not contain any further detail on the spatial sampling or distribution of DHS surveys vis-à-vis traditional authorities.

Aggregate thematic concepts

The authors use the terms and definitions of “adaptive capacity”, “vulnerability” and “resilience” inconsistently in the paper. Depending on context, both “vulnerability” and “resilience” may refer only to the assets and access portion of the model or to the full model inclusive of adaptive capacity (assets and access), livelihood sensitivity, and exposure. In some sections and formulas, “adaptive capacity” is used, while in others it appears as its components (assets and access) or even as “household resilience”. The data mapped in figures 3 and 4 are described in different sections of the original paper as resilience, adaptive capacity, and/or vulnerability. The data mapped in figure 5 are described in different sections of the article as vulnerability or resilience. Confounding matters further, the authors’ definition for resilience is typical for resilience theory, but the formula for resilience is more typical of the well-known Intergovernmental Panel on Climate Change (IPCC) operationalization of “vulnerability” (Gallopín 2006, Smit and Wandel 2006).

For the purposes of the reproduction study, the data visualized in figure 4 of the original study and referred to as 40% Adaptive Capacity in table 2 will be referred to as adaptive capacity while the data visualized in figure 5 of the original study and referred to as 100% Household Resilience in table 2 will be referred to as vulnerability.

Attribute variable transformations

The original study’s description of attribute variable transformations is confusing, and we attempt to present all of the evidence in the original study clearly below. All variables are normalized between zero and five (4.3 and 5.6) with zero representing the worst or poorest condition and five representing the best or richest condition (4.3). The normalization method is not described, but the poorest and richest conditions are described as “quintiles” with values of zero and five (4.3). “Quintiles” suggests classification into five classes with equal counts or frequencies. “Normalization” suggests transforming each variable into a normal distribution. A minimum of zero and maximum of five suggests rescaling the data to a range from zero to five or classifying the data into six quantiles assigned integer values zero through five. Four variables were described as (Y/N) nominal data (T1), for which it is not clear how to transform the data into ordinal data with more than two classes. A similar concern arises for the market access variable with three classes (Rural, Peri-urban, and Urban). Regardless, there is ambiguity in the method of normalizing, scaling, or classifying each variable.
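To make the ambiguity concrete, the Python sketch below applies two of the readings discussed above to the same hypothetical indicator: min-max rescaling onto 0 to 5, and classification into six quantile classes coded 0 through 5. Neither is claimed to be the original authors' method; the function names, the rank-based quantile rule, and the data are illustrative assumptions.

```python
import bisect


def rescale_min_max(values, new_max=5):
    """Linearly stretch values so the minimum maps to 0 and the maximum to new_max."""
    lo, hi = min(values), max(values)
    return [new_max * (v - lo) / (hi - lo) for v in values]


def classify_quantiles(values, n_classes=6):
    """Assign integer classes 0..n_classes-1 with roughly equal counts per class."""
    ordered = sorted(values)
    n = len(ordered)
    return [min(n_classes - 1, bisect.bisect_left(ordered, v) * n_classes // n)
            for v in values]


# Hypothetical skewed indicator, e.g. livestock counts for six households.
indicator = [0, 1, 2, 3, 4, 50]
rescaled = rescale_min_max(indicator)
classed = classify_quantiles(indicator)
```

With a skewed variable the two readings diverge sharply: min-max rescaling compresses the five small values into the range 0 to 0.4, while quantile classification spreads them evenly across classes 0 to 4.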

Adaptive capacity is analyzed with the weighted combination of all asset theme indicators and access theme indicators. Weights are given in the Variables section. The formula for this combination is not specified. For 2004, the authors report a minimum household adaptive capacity score of -0.80 and maximum of 39.33 (5.2). A maximum score near 40 is intuitive as a theoretical maximum of 40% for the adaptive capacity category. It is not clear whether the calculation should be a weighted average, weighted sum, weighted combination specified to achieve a maximum of 40%, or some other form of weighted combination. Results of the average adaptive capacity score for traditional authorities in 2010 are visualized in figure 4.

Livelihood sensitivity is presumably calculated with the weighted combination of its four indicators. Weights are given in the Variables section. The formula for this combination is not specified. Results for this theme are not presented.

Physical exposure may be calculated with the weighted combination of its two indicators. Weights are given in the Variables section. The formula for this combination is not specified. It is not specified whether any geographic transformation is required to combine these two variables. Results for this theme are not presented.

Geographic transformations

Adaptive capacity is analyzed in the spatial units of traditional authorities (4.4, 5.2, F3 and F4). This aggregation from households to traditional authorities is accomplished with a spatial join (5.2) with an average of the individual household resilience scores (5.2, 6.3, F3 and F4), and classified into four classes with the Natural Breaks Jenks classification method (F3 and F4). In the methodology section, the authors state that “DHS Indicators were disaggregated to the village level” (4.4). Since the DHS data is described as using the household level of aggregation, it is not clear how or why the data would be “disaggregated” to the village level.

Vulnerability is analyzed in the spatial units of gridded cells (F5). Each theme was converted to the raster grid data format (4.6). The paper does not specify the parameters for raster conversion, including the relationship between vector polygons and raster grid cells or the spatial resolution of the grid cells. Presumably, only adaptive capacity and livelihood sensitivity were converted, since physical exposure is already in a raster grid format. The paper does not specify if or how any geographic transformation is required for the biophysical risk grids from the UNEP Global Risk Data Platform. It may be possible to infer the resolution and methods from close inspection of the final climate vulnerability map (F5).

Analyses

The final climate vulnerability analysis is calculated with map algebra on a raster grid for each theme using the formula:

household resilience = adaptive capacity + livelihood sensitivity - physical exposure (4.6).

The results are presented as a continuous raster grid with a continuous color gradient. No descriptive statistics of the results are provided.

Geographical characteristics

The coordinate reference system(s) used in the original study are not specified. However, the study does not appear to include any distance or area calculations: therefore the analysis should not be sensitive to the coordinate reference system used as long as each layer is stored or reprojected into one consistent system. Visual map distortion is the only relevant threat in this sense.

Temporal characteristics

The temporal extent of the original study was stated as “2004–2010 based on the availability of the Malawi DHS datasets with GPS data” (4.5). The study also references Dartmouth Flood Observatory Data from 1999 to 2007 (F5) used to indicate flood risk and a Malawi Vulnerability Assessment Committee (MVAC) Household Economy Approach (HEA) baseline survey conducted between May and July 2003 used to indicate livelihood sensitivity (3.6). Therefore, it appears that the 2004–2010 temporal extent applies strictly to the household resilience analysis, and not to the climate vulnerability analysis.

The temporal support for the household resilience analysis was longitudinal DHS Survey data collected in 2004 and 2010. The temporal support for the climate vulnerability analysis is an aggregation of data from different sources, ranging from 1999 to 2010 (F5). Temporal effects are not measured or accounted for, although the authors discuss differences between household resilience in 2004 and 2010 (5.3 and 5.4).

Data exclusion

Some traditional authorities are missing data for DHS surveys because no DHS surveys were conducted within their administrative boundaries. These traditional authorities were excluded from the analysis of household resilience and labelled as ‘Areas Missing DHS Data’ (5.2). Additionally, some subjective decisions were made in the context of survey responses to the “cattle” question in particular. Many household surveys contain inconclusive answers (e.g. “I don’t know”) or are missing data for survey questions used in the adaptive capacity calculation. The livestock variable (“assets”) is calculated as a sum of four livestock types, and they remove any household with uncertain answers about any of the livestock types and remove households with missing data for all livestock types. Households with answers about some livestock types and missing data for others are still included in the data.
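The livestock exclusion rule described above can be sketched in Python as follows. The livestock type names other than cattle, the marker used for uncertain answers, and the function name are hypothetical; only the three-part rule comes from the text.

```python
# Hypothetical field names; the text only states that four types are summed.
LIVESTOCK_TYPES = ("cattle", "goats", "sheep", "pigs")
UNCERTAIN = "unknown"  # hypothetical code for answers such as "I don't know"


def livestock_total(household):
    """Return the livestock sum for a household, or None if it is excluded.

    Rules from the text:
    - any uncertain answer excludes the household;
    - missing data (None) for all four types excludes the household;
    - partial missing data is tolerated and the known counts are summed.
    """
    answers = [household.get(name) for name in LIVESTOCK_TYPES]
    if any(a == UNCERTAIN for a in answers):
        return None
    if all(a is None for a in answers):
        return None
    return sum(a for a in answers if a is not None)


partial = {"cattle": 2, "goats": None, "sheep": 1, "pigs": 0}
uncertain = {"cattle": 2, "goats": UNCERTAIN, "sheep": 1, "pigs": 0}
all_missing = {"cattle": None, "goats": None, "sheep": None, "pigs": None}
```

Summing only the known counts treats a missing livestock type as zero for households with partial data, which is one of the subjective decisions noted above.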

No inferences were made to fill the missing data (5.2). Some traditional authorities are clearly symbolized with the diagonal hashmarks representing missing data for household resilience (F4), and appear to be similarly excluded from the analysis of climate vulnerability (F5). However, several other traditional authorities appear white—not one of the four categories of household resilience (F4). Some of these areas appear to be excluded from the climate vulnerability analysis while others are not (F5).

The study does not analyze the presence of outliers or exclude them. The study does not weight samples.

Abstract

The original study is a multi-criteria analysis of vulnerability to climate change in Malawi, and is one of the earliest sub-national geographic models of climate change vulnerability for an African country. The study aims to be replicable, and had 40 citations in Google Scholar as of April 8, 2021.

Original Study Information

The study region is the country of Malawi. The spatial support of input data includes DHS survey points, Traditional Authority boundaries, and raster grids of flood risk (0.0833 degree resolution) and drought exposure (1/24 ≈ 0.0417 degree resolution).

The original study was published without data or code, but has a detailed narrative description of the methodology. The methods used are feasible for undergraduate students to implement following completion of one introductory GIS course. The study states that its data is available for replication in 23 African countries.

Data Description and Variables

### Access & Assets Data

Demographic and Health Survey data are a product of the United States Agency for International Development (USAID). Variables contained in this dataset are used to represent adaptive capacity (access + assets) in Malcomb et al.’s (2014) study. These data come from survey questionnaires with large sample sizes. The DHS data used in our study were collected in 2010. In Malawi, the provenance of the DHS data dates back as far as 1992, but the data have not been collected consistently every year. Each point in the household dataset represents a cluster of households, with each cluster corresponding to some form of census enumeration unit, such as villages in rural areas or city blocks in urban areas (DHS GPS Manual). This means that each household in each cluster has the same GPS data. This data is collected by trained USAID staff using GPS receivers. Missing data is a common occurrence in this dataset as a result of negligence or incorrect naming. However, according to the DHS GPS Manual, these issues are easily rectified, and sites for which data does not exist are typically recollected. Sometimes, however, missing information is coded as such or assigned a proxy location. The DHS website acknowledges the high potential for inconsistent or incomplete data in such broad and expansive survey sets. Missing survey data (responses) are never estimated or made up; they are instead coded as a special response indicating the absence of data. There are also clear policies in place to ensure the data’s accuracy. More information about data validity can be found on the DHS’s Data Quality and Use site. In this analysis, we use the variables listed in Table 1 to determine the average adaptive capacity of each TA area. Data transformations are outlined below.

Table 1: DHS Variables used in Analysis

| Variable Code | Definition |
| --- | --- |
| HHID | “Case Identification” |
| HV001 | “Cluster number” |
| HV002 | “Household number” |
| HV246A | “Cattle own” |
| HV246D | “Goats own” |
| HV246E | “Sheep own” |
| HV246G | “Pigs own” |
| HV248 | “Number of sick people 18-59” |
| HV245 | “Hectares for agricultural land” |
| HV271 | “Wealth index factor score (5 decimals)” |
| HV251 | “Number of orphans and vulnerable children” |
| HV207 | “Has Radio” |
| HV243A | “Has a Mobile Telephone” |
| HV219 | “Sex of Head of Household” |
| HV226 | “Type of Cooking Fuel” |
| HV206 | “Has electricity” |
| HV204 | “Time to get to Water Source” |

Variable Transformations

  1. Eliminate households with null and/or missing values
  2. Join TA and LHZ ID data to the DHS clusters
  3. Eliminate NA values for livestock
  4. Sum counts of all different kinds of livestock into a single variable
  5. Apply weights to normalized indicator variables to get scores for each category (assets, access)
  6. Find the stats of the capacity of each TA (min, max, mean, sd)
  7. Join ta_capacity to TA based on ta_id
  8. Prepare breaks for mapping
  9. Class intervals based on capacity_2010 field
  10. Take the values and round them to 2 decimal places
  11. Put data in 4 classes based on break values

### Livelihood Zones Data

The Livelihood zone data is created by aggregating general regions where similar crops are grown and similar ecological patterns exist. This data exists originally at the household level and was aggregated into Livelihood Zones. To construct the aggregation used for “Livelihood Sensitivity” in this analysis, we use the household points from the FEWSnet data that had previously been aggregated into livelihood zones. The four Livelihood Sensitivity categories are 1) Percent of food from own farm (6%); 2) Percent of income from wage labor (6%); 3) Percent of income from cash crops (4%); and 4) Disaster coping strategy (4%). In the original R script, household data from the DHS survey was used as a proxy for the specific data points in the livelihood sensitivity analysis (transformation: Join with DHS clusters to apply LHZ FNID variables). With this additional FEWSnet data at the household level, we can construct these four livelihood sensitivity categories using existing variables (Table 2).

Table 2: Constructing Livelihood Sensitivity Categories

| Livelihood Sensitivity Category (LSC) | Percent Contributing | How LSC was constructed |
| --- | --- | --- |
| Percent of food from own farm | 6% | Sources of food: crops + livestock |
| Percent of income from wage labor | 6% | Sources of cash: labour etc. / total * 100 |
| Percent of income from cash crops | 4% | Sources of cash (crops): (tobacco + sugar + tea + coffee) / total sources of cash * 100 |
| Disaster coping strategy | 4% | Self-employment & small business and trade: (firewood + sale of wild food + grass + mats + charcoal) / total sources of cash * 100 |
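As a worked illustration of the percentage constructions in Table 2, the sketch below computes two of the shares for a single livelihood zone. It is written in Python rather than the study's R. The income figures and dictionary keys are hypothetical and chosen only to make the arithmetic easy to follow; they are not FEWSnet values.

```python
def percent_share(parts, total):
    """Share of `total` accounted for by the sum of `parts`, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return sum(parts) / total * 100


# Hypothetical sources of cash for one livelihood zone (illustrative only).
cash = {
    "tobacco": 30.0,
    "sugar": 0.0,
    "tea": 5.0,
    "coffee": 5.0,
    "labour": 40.0,
    "other": 20.0,
}
total_cash = sum(cash.values())

# Percent of income from cash crops: (tobacco + sugar + tea + coffee) / total * 100
cash_crop_pct = percent_share(
    [cash[k] for k in ("tobacco", "sugar", "tea", "coffee")], total_cash
)
# Percent of income from wage labor: labour / total * 100
wage_labour_pct = percent_share([cash["labour"]], total_cash)
```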

Variable Transformations

  1. Join with DHS clusters to apply LHZ FNID variables
  2. Clip TA boundaries to Malawi (st_buffer of LHZ to .01 m)
  3. Create ecological areas: LHZ boundaries intersected with TA boundaries to clip out park/conservation boundaries and rename those park areas with the park information from TA data), combined with lake data to remove environmental areas from the analysis

### Physical Exposure: Floods + Droughts

Floods: This dataset stems from work collected by multiple agencies and funneled into the PREVIEW Global Risk Data Platform, “an effort to share spatial information on global risk from natural hazards.” The dataset was designed by UNEP/GRID-Europe for the Global Assessment Report on Risk Reduction (GAR), using global data. A flood estimation value is assigned via an index of 1 (low) to 5 (extreme).

Drought: This dataset uses the Standardized Precipitation Index to measure annual drought exposure across the globe. The Standardized Precipitation Index draws on data from a “global monthly gridded precipitation dataset” from the University of East Anglia’s Climatic Research Unit, and was modeled in GIS using methodology from Brad Lyon at Columbia University. The dataset draws on 2010 population information from the LandScan™ Global Population Database at the Oak Ridge National Laboratory. Drought exposure is reported as the expected average annual (2010) population exposed. The data were compiled by UNEP/GRID-Europe for the Global Assessment Report on Risk Reduction (GAR). The data use the WGS 1984 datum, span the years 1980-2001, and are reported in raster format with spatial resolution 1/24 degree x 1/24 degree.

Analytical Specification

The original study was conducted using ArcGIS and STATA, but does not state which versions of these software were used. This updated replication study will use R 4.3.1.

Materials and Procedure

Process Adaptive Capacity

  1. Bring in DHS Data [Households Level] (vector)
  2. Bring in TA (Traditional Authority boundaries) and LHZ (livelihood zones) data
  3. Get rid of unsuitable households (eliminate NULL and/or missing values)
  4. Join TA and LHZ ID data to the DHS clusters
  5. Pre-process the livestock data: filter out NA livestock data, then update the livestock data (summing the different kinds)
  6. FIELD CALCULATOR: Normalize each indicator variable and rescale from 1-5 (real numbers) based on percent rank
  7. FIELD CALCULATOR / ADD FIELD: Apply weights to normalized indicator variables to get scores for each category (assets, access)
  8. SUMMARIZE/AGGREGATE: find the stats of the capacity of each TA (min, max, mean, sd)
  9. Join ta_capacity to TA based on ta_id. The script also multiplies the score by 20 at this step (ln. 216); the purpose of this multiplication is unclear.
  10. Prepare breaks for mapping: set class intervals based on the capacity_2010 field, round the values to 2 decimal places, and put the data in 4 classes based on the break values
  11. Save the adaptive capacity scores
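Step 6 above (normalizing each indicator by percent rank and rescaling to 1-5) can be sketched as follows. This is a Python illustration under stated assumptions, not the reproduction's R code: the percent rank is taken as (minimum rank - 1) / (n - 1), and the linear mapping `1 + 4 * percent_rank` is one plausible reading of "rescale from 1-5", since the original paper does not give the formula.

```python
from bisect import bisect_left


def percent_rank(values):
    """Percent rank in [0, 1]: (minimum rank - 1) / (n - 1); ties share the lowest rank."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to rank")
    ordered = sorted(values)
    # bisect_left counts how many values are strictly smaller than v.
    return [bisect_left(ordered, v) / (n - 1) for v in values]


def rescale_1_to_5(values):
    """Map percent ranks linearly onto the real interval [1, 5] (assumed mapping)."""
    return [1 + 4 * p for p in percent_rank(values)]


# Hypothetical indicator values for five households.
indicator = [10, 20, 20, 40, 50]
scores = rescale_1_to_5(indicator)
```

Because the result depends only on rank order, extreme values in an indicator do not stretch the scale, but tied households receive identical scores.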

Process Livelihood Sensitivity

  1. Load in LHZ geometries into R
  2. Join non-geometry LHZ sensitivity data into R code
  3. Read in processed LHZ dataset
  4. Join the data to the LHZ geometries (polygons)
  5. Rescale the LHZ data into quintiles
  6. Assign weights
  7. Rasterize the livelihood data
  8. Calculate capacity score based on values in Malcomb et al. (2014)

Process Physical Exposure

  1. Load in UNEP rasters and set CRS for drought
  2. Set CRS for flood
  3. Clean and reproject rasters
  4. Create a bounding box at the extent of Malawi (the source of the extent values is not documented)
  5. For Drought: use bilinear to avg continuous population exposure values
  6. For Flood: use nearest neighbor to preserve integer values
  7. Create mask and CLIP the traditional authorities with the LHZs to cut out the lake and national parks
  8. RASTERIZE the ta_capacity data with pixel data corresponding to capacity_2010 field
  9. RASTERIZE the livelihood sensitivity score with pixel data corresponding to capacity_2010 field

Adaptive Capacity (40%) + Livelihood Sensitivity (20%) - Physical Exposure (40%) = Vulnerability Layer
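The weighted combination stated above can be sketched cell by cell. The Python below is an illustration only. It assumes the three inputs have already been rescaled to a common 0-1 range, which the source does not specify, and it follows the signs exactly as written in the formula. The function and variable names are hypothetical.

```python
# Weights as stated in the formula above.
WEIGHTS = {
    "adaptive_capacity": 0.40,
    "livelihood_sensitivity": 0.20,
    "physical_exposure": 0.40,
}


def combine(ac, ls, pe):
    """Weighted sum with the signs as written: AC + LS - PE.

    Inputs are assumed to be on a common 0-1 scale (an assumption, not from the paper).
    """
    return (
        WEIGHTS["adaptive_capacity"] * ac
        + WEIGHTS["livelihood_sensitivity"] * ls
        - WEIGHTS["physical_exposure"] * pe
    )


# Hypothetical values for three raster cells: (AC, LS, PE).
cells = [(0.5, 0.5, 0.25), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
layer = [combine(ac, ls, pe) for ac, ls, pe in cells]
```

Under these assumptions the combined score ranges from -0.4 to 0.6, so a final rescaling step would be needed to reach a 0-100 style index. Where that rescaling happens is one of the ambiguities discussed later in this report.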

Raster Calculations

  1. Create a mask
  2. Reclassify the flood layer (quintiles, currently binary)
  3. Reclassify the drought values (quantile [from 0 - 1 in intervals of 0.2 =5])
  4. AGGREGATE: Create final vulnerability layer using environmental vulnerability score and ta_capacity.
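The quantile reclassification in step 3 (five classes at probability intervals of 0.2) can be illustrated with the Python standard library. This is a sketch, not the reproduction's R code. The `inclusive` interpolation method and the rule that a value equal to a break falls in the lower class are assumptions, and the cell values are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles


def quintile_classes(values):
    """Assign each value a class from 1 (lowest fifth) to 5 (highest fifth)."""
    # Four cut points at the 20th, 40th, 60th and 80th percentiles.
    # method="inclusive" treats the data as the full population (assumption).
    cuts = quantiles(values, n=5, method="inclusive")
    # A value equal to a cut point is placed in the lower class.
    return [bisect_left(cuts, v) + 1 for v in values]


# Hypothetical drought exposure values for ten unmasked raster cells.
drought_cells = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
classes = quintile_classes(drought_cells)
```

Masked cells (lake and park areas) would need to be dropped before computing the breaks. Otherwise they shift the cut points and change the class of every remaining cell.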

We then georeferenced maps from the original study using QGIS in order to compare the results generated by our R script to those found in Malcomb et al. (2014). We ran a Spearman’s Rho correlation test between the two maps of Figs. 4 and 5 to determine the differences in results.
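For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. The Python sketch below shows the calculation on hypothetical scores. It is not the code used to produce the results reported in this study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


# Hypothetical original and reproduced scores for five areas.
original = [1, 2, 3, 4, 5]
reproduced = [2, 1, 4, 3, 5]
rho = spearman_rho(original, reproduced)
```

Because adaptive capacity is mapped in only four classes, ties are common, which is why the sketch uses average ranks rather than the shortcut formula based on squared rank differences.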

Replication Results

For each output from the original study (mainly figure 4 and figure 5), present separately the results of the replication attempt.

  1. State whether the original study was or was not supported by the replication
  2. State whether any hypothesis linked to a planned deviation from the original study was supported. Provide key statistics and related reasoning.

Figures to Include:

  • map of resilience by traditional authority in 2010, analogous to figure 4 of the original study
  • map of vulnerability in Malawi, analogous to figure 5 of the original study
  • map of difference between your figure 4 and the original figure 4
  • map of difference between your figure 5 and the original figure 5

Reproduction of Figure 4 (map of adaptive capacity by traditional authority in 2010):

Our Spearman’s rank correlation of original versus reproduced adaptive capacity, together with a choropleth map of the difference (reproduced minus original), reveals that our reproduced values are consistently too large. There are dozens of TAs where the reproduced value is 1 unit too large (e.g. 2 where the original is 1, 3 where the original is 2). The difference map shows that the mismatch between adaptive capacity scores is widespread across Malawi, particularly in the northern part of the country along the lake. The Spearman’s rank correlation rho value is 0.789, indicating a moderately strong positive, but imperfect, correlation between reproduced and original scores. The confusion matrix below presents original as rows and reproduction as columns. The differences are large enough to threaten the reproducibility and internal validity of the Malcomb study. The reproduced AC scores are also more spatially discontinuous, with abrupt changes between adjacent TAs, whereas the original scores vary more smoothly across space.

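| Original \ Reproduction | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| 1 | 34 | 27 | 6 | 0 |
| 2 | 4 | 26 | 44 | 5 |
| 3 | 0 | 0 | 19 | 29 |
| 4 | 0 | 0 | 0 | 3 |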
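A quick tally of the confusion matrix above makes the direction of the disagreement concrete. The Python sketch assumes the flattened values read row by row, with original classes 1 to 4 as rows and reproduced classes 1 to 4 as columns. The variable names are illustrative.

```python
# Rows: original class 1-4; columns: reproduced class 1-4 (assumed layout).
matrix = [
    [34, 27, 6, 0],
    [4, 26, 44, 5],
    [0, 0, 19, 29],
    [0, 0, 0, 3],
]

n = len(matrix)
total = sum(sum(row) for row in matrix)
# Diagonal: TAs whose reproduced class equals the original class.
exact = sum(matrix[i][i] for i in range(n))
# First superdiagonal: reproduced class is one step higher than the original.
one_higher = sum(matrix[i][i + 1] for i in range(n - 1))
# First subdiagonal: reproduced class is one step lower than the original.
one_lower = sum(matrix[i + 1][i] for i in range(n - 1))

exact_share = exact / total
```

Under this reading, 82 of 197 TAs match exactly and 100 are one class higher in the reproduction, while only 4 are one class lower, which is consistent with the overestimation described above.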

Reproduction of Figure 5 (map of vulnerability in Malawi):

Again, our reproduction vulnerability values are about 20% less than what we expect them to be based on the original values. The original vulnerabilities range from 20-100, while the reproduced vulnerabilities range from 0-80. Our Spearman’s rank correlation rho is weak at 0.1974. While the vulnerability differences are in the same direction as the adaptive capacity differences, the reproduced vulnerabilities match the original far less closely. Because adaptive capacity is a component of the vulnerability score, we can expect that if reproduced adaptive capacity is too low, then reproduced vulnerability will also be too low. However, the reproduced vulnerability is significantly lower than the reproduced adaptive capacity was, meaning that the additional undercounting comes at least in part from the Livelihood Sensitivity scores or the Physical Exposure layers. It could come from numerous sources, including georeferencing problems, unaccounted-for lake/park areas, normalization errors, differences in the computational environment, or human error.

Planned deviation - map of difference between reproduction Figure 4 and original Figure 4 (adaptive capacity):

The difference map indicates a moderate level of discrepancy between reproduced adaptive capacity and original adaptive capacity. For much of Malawi, original is higher than reproduced (i.e. reproduced results are lower than the original results). Since AC is measured on a 1-4 scale, and because much of the country has reproduced adaptive capacity that is 1 unit lower than the original AC, this discrepancy is rather significant. The southernmost part of the country (Zomba and Blantyre) has the greatest discrepancy (reproduced is significantly lower than original), but also some pockets of very high variability. The narrow part of Malawi on the northern side of the lake has some of the most consistently accurate adaptive capacity, with no difference between reproduced and original. However, the “-1” difference for much of Malawi is concerning. It indicates that something in our reproduction made it too low, or something in the original authors’ study made it too high. It could be due to the incorrect sign on the number of sick household members and number of orphans in calculating the AC score. Other potential sources of uncertainty include incorrect georeferencing of the original study to create the difference map, as well as subjective data exclusions arising from the DHS data. Although the weights for the AC index are provided in the paper, there is no way to know with certainty exactly how these weights were used, or whether they were changed, in the original study.

Planned deviation - map of difference between reproduction Figure 5 and original Figure 5 (vulnerability):

As with the difference map for figure 4, this difference raster map indicates a moderate level of discrepancy between reproduced vulnerability and original vulnerability. The greatest amount of difference is found in the south-central portion of Malawi. Across the country, there is a majority negative difference value, meaning the reproduction vulnerability is less than the original vulnerability. The few instances of the opposite - a positive difference value (reproduction vulnerability is higher than original vulnerability) - are found in the northernmost (along the lake) and southernmost parts of the country.

Unplanned Deviations from the Protocol

Summarize changes and uncertainties between - your interpretation and plan for the workflow based on reading the paper - your final workflow after accessing the data and code and completing the code

Overall, the original paper itself did not effectively outline a clear workflow. Specifically, it did a poor job distinguishing between “vulnerability,” “adaptive capacity,” and “resilience.” The authors did not do a good job using consistent terminology, which made interpreting their workflow in the paper rather confusing. For instance, in Table 2, they seem to imply that assets + access = adaptive capacity, and that adaptive capacity + livelihood sensitivity + physical exposure = household resilience. However, figures 3 and 4 are titled as maps of “Average Resilience Scores” by TA, which would imply adaptive capacity + livelihood sensitivity + physical exposure. However, they note in the captions that these maps are just the “socioeconomic resilience” which is the same thing as adaptive capacity which is the same thing as assets + access. Clearly, they do not do a consistent job using the terminology that they defined, both in the figures and even in some of the discussion.

I was not able to download any of the DHS adaptive capacity data myself because access was effectively black-boxed, so I had to deviate and load in the pre-aggregated, publicly accessible adaptive capacity data that came as part of the forked repository. The source data is USAID DHS Surveys conducted in 2004 and 2010, but it required a complicated access process of signing up for accounts, etc. There are still many uncertainties with the subjective weighting of the DHS data. Malcomb stated that “Vulnerability, like happiness, is a human state or condition that cannot be measured directly or in an objective manner. Therefore, finding meaningful variables to form statistical relationships represents a major challenge for a data-driven approach. As a result, expert opinion formed the basis of selecting the most important indicators and weighting them appropriately in this research” (22). Thus, the whole formation of the adaptive capacity score, which included both this DHS data (some of which was subjectively removed; see the “Data exclusion” section) and subjective weighting, was very much black-boxed. This reproduction is our best understanding of the workflow utilized by the original authors. The FEWSNET livelihood zones and the biophysical flood/drought steps, however, were all quite simple and easy to reproduce.

Additionally, the original authors regularly rescaled many of the metrics, but did a poor job specifying exactly where in their workflows. Based on my best efforts and intuition, the correct places to rescale include the following:

  • For adaptive capacity, rescale the DHS data to a [0,4] rank before assigning weights but after assigning them a geometry
  • Also for adaptive capacity, rescale the adaptive capacity score itself after grouping by TA and joining by attribute
  • For flood and drought, rescale the rasters to quantiles after clipping/masking them to the topography but before assigning weights to calculate overall vulnerability
  • For FEWSNET livelihood zones, rescale the rasters to quantiles after adding geometries but before combining and rasterizing them (all before calculating overall vulnerability/adding weights)

Discussion

Provide a summary and interpretation of the key findings of the replication vis-a-vis the original study results. If the attempt was a failure, discuss possible causes of the failure.

In conclusion, this reproduction study aimed to replicate the original multi-criteria analysis of vulnerability to climate change in Malawi. Despite facing several challenges, including the lack of access to the original DHS data and ambiguities in the original study’s methodology, the reproduction study produced insights and raised important considerations.

Key findings of the reproduction study include discrepancies in the adaptive capacity and vulnerability scores, especially in the spatial distribution of values. The Spearman’s rank correlation indicated a moderately strong positive correlation for adaptive capacity but a much weaker correlation for vulnerability. The differences in vulnerability were particularly pronounced, with reproduced values consistently lower than the original ones.

Several unplanned deviations from the protocol were necessary due to the lack of access to the original DHS data. The reproduction study had to rely on pre-aggregated, publicly accessible adaptive capacity data. Additionally, uncertainties arose from subjective weighting in the original study and the lack of clarity in the methodology regarding variable transformations.

The discussion highlighted the spatial patterns of vulnerability and adaptive capacity, emphasizing the relevance of the outputs from Figures 4 and 5. The central and southern regions of Malawi exhibited high levels of vulnerability, with evidence of both spatial diffusion across traditional authority borders and clustering of high-vulnerability areas.

Despite the challenges and uncertainties in the reproduction study, the findings contribute valuable insights into the spatial distribution of vulnerability to climate change in Malawi. Generally, the same areas that were high vulnerability in the original study roughly also have high vulnerability in the reproduction. However, there are plenty of areas where this is not the case, and the exact values of the AC and vulnerability scores do not line up. The discrepancies observed between the reproduced and original results underscore the importance of transparency in methodology and the need for data accessibility in climate vulnerability research.

Future research in this area should focus on improving data availability and transparency. Access to the original DHS data would significantly enhance the reproducibility and robustness of vulnerability assessments. Moreover, efforts should be made to standardize terminology and methodologies in climate vulnerability studies to facilitate clearer communication and comparison of results.

In summary, while the reproduction study faced challenges and uncertainties, it provides valuable reflections on the complexities of replicating climate vulnerability research. The findings emphasize the need for open access to data, clear methodological documentation, and standardization in future studies to enhance the credibility and reliability of vulnerability assessments.

The lack of access to the DHS data, along with the subjectivity and lack of transparency of the authors in the adaptive capacity score section, was by far the biggest roadblock to a successful reproduction.

In this replication, any failure is probably due to practical causes, which may include:

  • lack of data
  • lack of code
  • lack of details in the original analysis
  • uncertainties due to the manner in which data has been used

Lack of DHS data (having to rely on a pre-aggregated form) certainly skews our adaptive capacity score, which only snowballs into an even more skewed vulnerability score. Lack of code from the original authors leaves us without guidance on how to clip/mask out lakes and conservation areas, what coordinate reference system to use, and where and when to introduce weighting. Together with the ambiguity around normalizing, rescaling, weighting, and quantiles, this adds uncertainty about whether the order of our steps is the same as the original authors’. If we had their code and could run it exactly as they ran it, we would more easily be able to isolate sources of error and uncertainty, such as from computational environments and data processing.

Conclusion

Restate the key findings and discuss their broader societal implications or contributions to theory. Do the research findings suggest a need for any future research?

Key Findings:

Data Collection and Spatial Sampling: The original study conducted over 70 interviews and 11 focus groups, but uncertainties exist regarding the identity of some interviewees. Spatial sampling issues are noted in the use of DHS surveys, with potential problems in coverage and aggregation at the traditional authority level.

Thematic Concepts and Definitions: The authors inconsistently used terms like “adaptive capacity,” “vulnerability,” and “resilience.” The definitions and usage of these terms varied, creating confusion in the interpretation of results.

Attribute Variable Transformations: The original study lacks clarity in the normalization and transformation of variables, particularly in adaptive capacity calculations. The weighted combinations lack specified formulas, contributing to ambiguity.

Geographic Transformations: Spatial units like traditional authorities and gridded cells were used for analysis, but the paper lacks details on specific transformations, raster conversion parameters, and spatial resolutions.

Temporal Characteristics: The study’s temporal extent and support varied for different analyses, introducing potential temporal inconsistencies. Temporal effects were not measured or addressed.

Data Exclusion and Subjective Decisions: Some traditional authorities lacked DHS survey data, and subjective decisions were made in handling inconclusive survey responses, potentially introducing bias.

Reproduction of Figures 4 and 5: The reproduction study revealed discrepancies in adaptive capacity and vulnerability scores. Spatial patterns were inconsistent, indicating potential issues in data processing or methodology.

Broader Societal Implications or Contributions to Theory:

The study contributes to the discourse on climate change vulnerability in Malawi, emphasizing the importance of clear terminology, transparent methodologies, and standardized practices in vulnerability assessments.

The findings underscore the challenges of replicating studies with limited data accessibility and highlight the need for open access to enhance the credibility of climate vulnerability research.

The spatial patterns of vulnerability and adaptive capacity reveal regions in Malawi that may face higher climate-related risks, emphasizing the need for targeted adaptation and resilience strategies.

Future Research Suggestions:

  • Future research should focus on improving data accessibility, especially for critical datasets like DHS surveys, to enhance the reproducibility and reliability of vulnerability assessments.

  • Standardization of terminology and methodologies in climate vulnerability studies is crucial for comparison and synthesis of results across different studies.

  • Exploration of temporal dynamics and the inclusion of more recent data could provide insights into how vulnerability has evolved over time, allowing for more dynamic adaptation strategies.

  • Research on the societal implications of climate vulnerability patterns could inform policy decisions and resource allocations to address the specific needs of vulnerable regions.

  • Consider fringe effects and border interactions to determine how neighboring countries relate to Malawi’s high vulnerability levels. Would considering factors in bordering countries increase or decrease vulnerability scores in Malawi?

Conclusion: This reproduction attempt clearly demonstrates the pressing need for greater transparency of original studies. It sheds light on the complexities of replicating climate vulnerability research and highlights the critical role of transparency and data accessibility. The findings, despite some discrepancies, contribute valuable insights into vulnerability patterns in Malawi, emphasizing the need for further research and standardized practices in the field. The outputs from Figures 4 and 5 are certainly relevant. There is clear evidence of spatial diffusion of vulnerability across TA borders, as well as spatial clustering of high-vulnerability areas. Future reproductions should consider ways to better handle DHS data. Is there a better way than just removing all of the incomplete or “I don’t know” responses? Is there a better way to systematize how DHS data is weighted as inputs into the adaptive capacity score, or a better geographic unit to use to link DHS households to a spatial scale? Both the original and the reproduction figures indicate high levels of vulnerability in the central and south regions of Malawi, which warrants international attention to build resilience against climate change and other socio-physical hazards for residents in these areas.

Discussion of specific contributions to this reproduction study

This reproduction study was chock-full of frictions and barriers to reproduction.

In addition to some minor formatting, commenting, and typo correction, there are a handful of additions to this reproduction study.

The first was to visualize the original figure 5 in the R script. Figure 5 (original vulnerability raster layer) was brought in as a data object and was correctly used in creating the difference raster layer in the prior reproduction, but it was never visualized for the user to see.

The second was to amend the Figure 5 output (vulnerability map) to show vulnerability by TA, rather than just a raster grid. I polygonized/aggregated the vulnerability raster grid and joined it with TA names to produce this new map, titled ta_vuln_map. Showing vulnerability by TA allows us to easily include natural parks/wildlife refuges, as well as identify areas where there was missing data for the TAs (stemming from HH surveys).

Additionally, since I managed to join TA names to the vulnerability by TA layer, I created a sorted data frame that lists the names of TAs in descending order of their vulnerability scores. This provides policymakers and stakeholders with a direct list of the most vulnerable TAs and could be useful in terms of directing aid.

I believe I found a possible error in the code for calculating the adaptive capacity score and weights. In the existing code, number of orphans and number of sick people in the HH are positive contributions to the adaptive capacity score. I switched the sign to negative to reflect the fact that more sick people and more orphans should not increase one’s “adaptive capacity” in the same way that having more livestock or cooking fuel would.

Lastly, I added z-scored histograms to several of the raster figure outputs (maps) to better visualize the original vulnerability raster’s spread of values, as well as the reproduction - original vulnerability (difference) raster’s spread of values.

List of things that this updated reproduction/replication/reanalysis accomplishes:

  • Visualized original figure 5 in the R script. Figure 5 (original vulnerability raster layer) was brought in as a data object and was correctly used in creating the difference raster layer, but it was never visualized for the user to see.
  • Polygonized/aggregated the vulnerability raster grid to get vulnerability scores by TA, and mapped them (both an interactive and a static map)
  • Sorted TA vulnerability table
  • Corrected a possible error in the adaptive capacity score creation and weighting, where number of orphans and number of sick people in the HH are positive contributions to adaptive capacity. I switched the sign to negative to reflect the fact that more sick people and more orphans do not increase one’s “adaptive capacity”
  • Added z-scored histograms to better visualize the vulnerability raster’s spread of values, as well as the reproduction - original vulnerability (difference) raster’s spread of values
  • Minor commenting and formatting

Integrity Statement

This report and its preregistration were written after already attempting the reproduction study, including acquisition and analysis of all of the secondary data sources required. However, the preregistered analysis plan was written as if we had no prior knowledge of the data other than what is documented in the study. Holler has previously reviewed and compared other climate vulnerability models for Malawi, and conducted a scoping study in the Lilongwe and Mangochi districts of Malawi in 2015, including meeting with the Regional Centre for Mapping of Resources for Development (RCMRD) consultants who created the Malawi Hazards and Vulnerability Atlas (2015).

References

Referencing the original paper

Malcomb, D. W., E. A. Weaver, and A. R. Krakowka. 2014. Vulnerability modeling for sub-Saharan Africa: An operationalized approach in Malawi. Applied Geography 48:17–30. DOI: 10.1016/j.apgeog.2014.01.004.

Sections

  1. Introduction
  2. Complex vulnerability
  3. Evidence-based Indicators
  4. Methodology
  5. Results
  6. Discussion
  7. Conclusion

Tables, figures, other elements

  • T1 Evidence-based complex vulnerability indicators
  • T2 Weighted indicators by metatheme
  • F1 Map of Malawi
  • F2 Vulnerability web
  • F3 Malawi Household Resilience (2004)
  • F4 Malawi Household Resilience (2010)
  • F5 Malawi Composite Vulnerability Index
  • A1 Appendix 1
  • R References

Replication of Vulnerability modeling for sub-Saharan Africa

Original study by Malcomb, D. W., E. A. Weaver, and A. R. Krakowka. 2014. Vulnerability modeling for sub-Saharan Africa: An operationalized approach in Malawi. Applied Geography 48:17–30. DOI: 10.1016/j.apgeog.2014.01.004.

Replication Authors: Joseph Holler, Kufre Udoh, Open Source GIScience students of Fall 2019 and Spring 2021, William Procter

Replication Materials Available at: Forked RP-Malcomb Repository

Created: 14 April 2021 Revised: 12 November 2023

References

Malcomb, D. W., E. A. Weaver, and A. R. Krakowka. 2014. Vulnerability modeling for sub-Saharan Africa: An operationalized approach in Malawi. Applied Geography 48:17–30. DOI: 10.1016/j.apgeog.2014.01.004.

Report Template References & License

This template was developed by Peter Kedron and Joseph Holler with funding support from HEGS-2049837. This template is an adaptation of the ReScience Article Template developed by N. P. Rougier, released under a GPL version 3 license and available here: https://github.com/ReScience/template. Copyright © Nicolas Rougier and coauthors. It also draws inspiration from the pre-registration protocol of the Open Science Framework and the replication studies of Camerer et al. (2016, 2018). See https://osf.io/pfdyw/ and https://osf.io/bzm54/

Camerer, C. F., A. Dreber, E. Forsell, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, J. Almenberg, A. Altmejd, T. Chan, E. Heikensten, F. Holzmeister, T. Imai, S. Isaksson, G. Nave, T. Pfeiffer, M. Razen, and H. Wu. 2016. Evaluating replicability of laboratory experiments in economics. Science 351 (6280):1433–1436. https://www.sciencemag.org/lookup/doi/10.1126/science.aaf0918.

Camerer, C. F., A. Dreber, F. Holzmeister, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, G. Nave, B. A. Nosek, T. Pfeiffer, A. Altmejd, N. Buttrick, T. Chan, Y. Chen, E. Forsell, A. Gampa, E. Heikensten, L. Hummer, T. Imai, S. Isaksson, D. Manfredi, J. Rose, E.-J. Wagenmakers, and H. Wu. 2018. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour 2 (9):637–644. http://www.nature.com/articles/s41562-018-0399-z.

Other References

Barrett, S. 2014. Subnational Climate Justice? Adaptation Finance Distribution and Climate Vulnerability. World Development 58:130–142. DOI: 10.1016/j.worlddev.2014.01.014.

Gallopín, G. C. 2006. Linkages Between Vulnerability, Resilience, and Adaptive Capacity. Global Environmental Change 16 (3):293–303. DOI: 10.1016/j.gloenvcha.2006.02.004.

Rufat, S., E. Tate, C. G. Burton, and A. S. Maroof. 2015. Social vulnerability to floods: Review of case studies and implications for measurement. International Journal of Disaster Risk Reduction 14:470–486. DOI: 10.1016/j.ijdrr.2015.09.013.

Smit, B., and J. Wandel. 2006. Adaptation, adaptive capacity and vulnerability. Global Environmental Change 16 (3):282–292. DOI: 10.1016/j.gloenvcha.2006.03.008.