Maize Loss Climate Experiment

Study looking at how climate change may impact crop insurance loss rates by simulating maize outcomes via a neural network and Monte Carlo simulation. This repository includes interactive tools for exploring the results and a pipeline that builds a paper discussing the findings.


Purpose

This repository contains three components for a study looking at how crop insurance claims rates may change in the future within the US Corn Belt using SCYM and CHC-CMIP6.

  • Pipeline: Contained within the root of this repository, this Luigi-based pipeline trains neural networks and runs Monte Carlo simulations to project future insurance claims under various parameters, outputting data to a workspace directory.
  • Tools: Within the paper/viz subdirectory, the source code for an explorable explanation built using Sketchingpy both creates the static visualizations for the paper and offers the web-based interactive tools released at ag-adaptation-study.pub, which allow users to iteratively engage with these results.
  • Paper: Within the paper subdirectory, a manuscript is built from the output data of the pipeline. It describes these experiments in detail with visualizations.

These are described in detail below.


Usage

The easiest way to engage with these results is through the web-based interactive explorable explanation hosted publicly at ag-adaptation-study.pub. The paper preprint is available at https://arxiv.org/abs/2408.02217. We also publish our raw pipeline output. Otherwise, see local setup below.


Local setup

For those wishing to extend this work, you can execute this pipeline locally by checking out this repository (git clone git@github.com:SchmidtDSE/maize-loss-climate-experiment.git).
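For example:

$ git clone git@github.com:SchmidtDSE/maize-loss-climate-experiment.git
$ cd maize-loss-climate-experiment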

Dev Container (Recommended)

The easiest way to get started with development is using the provided dev container, which automatically sets up all dependencies for pipeline, paper, and visualization development:

  1. GitHub Codespaces: Click the "Code" button and select "Open with Codespaces" for instant cloud-based development.
  2. VS Code with Dev Containers: Open the repository in VS Code and click "Reopen in Container" when prompted (requires Docker and the Dev Containers extension).
  3. Local Docker: Clone the repository, then build and run the container manually as shown below.
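The manual Docker commands mirror option 3 above, using the image tag maize-experiment and the Dockerfile in .devcontainer:

$ docker build -t maize-experiment .devcontainer
$ docker run -it -v $(pwd):/workspaces/maize-loss-climate-experiment maize-experiment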

The dev container includes:

  • Python 3.11 with all project dependencies pre-installed
  • LaTeX and Pandoc for paper building
  • Sample data for visualization development
  • Development tools (linting, testing)
  • All system dependencies configured

After the container starts, you'll have a fully configured environment ready for development on any component.

Local pipeline

First, get access to the SCYM and CHC-CMIP6 datasets and download all of the GeoTIFFs to an AWS S3 bucket or another location which can be accessed via the file system. This gives you a choice between two execution options:

  • Setup for AWS: This will execute if the USE_AWS environment variable is set to 1. This assumes data are hosted remotely in an AWS bucket defined by the SOURCE_DATA_LOC environment variable, and Coiled is used to execute the computation. After setting the environment variables for access credentials (AWS_ACCESS_KEY and AWS_ACCESS_SECRET) and setting up Coiled, simply execute the Luigi pipeline as described below (see the example after this list).
  • Setup for local: If the USE_AWS environment variable is set to 0, this will run using a local Dask cluster. This assumes that SOURCE_DATA_LOC is a path to the directory housing the input GeoTIFFs. Then, simply execute the Luigi pipeline as described below.
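For example, an AWS run might be configured as follows; the bucket URI is hypothetical and the exact SOURCE_DATA_LOC format for remote data is an assumption:

$ export USE_AWS=1
$ export SOURCE_DATA_LOC=s3://your-bucket/geotiffs
$ export AWS_ACCESS_KEY=your-key
$ export AWS_ACCESS_SECRET=your-secret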

You can then execute either by:

  • Run directly: First, install the Python requirements (pip install -r requirements.txt), optionally within a virtual environment. Then, simply execute bash run.sh to run the pipeline from start to finish (see the sketch after this list). See also breakpoint_tasks.py for Luigi targets for running subsets of the pipeline.
  • Run through Docker: Simply execute bash run_docker.sh to run the pipeline from start to finish. See also breakpoint_tasks.py for Luigi targets for running subsets of the pipeline, and update run.sh, which is executed within the container. Note that this will operate on the workspace directory.
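As a sketch, a direct local run could look like the following, assuming the input GeoTIFFs live at /path/to/data:

$ export USE_AWS=0
$ export SOURCE_DATA_LOC=/path/to/data
$ pip install -r requirements.txt
$ bash run.sh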

A summary of the pipeline is created in stats.json. See local package below for use in other repository components such as the interactive tools or paper rendering. Users may optionally skip some expensive steps by placing the files from https://zenodo.org/records/14533227 into the workspace directory.

Interactive tools

Written in Sketchingpy, the tools can be executed locally on your computer, in a static context for building the paper, or through a web browser. First, one needs to get data from the pipeline or download prior results:

  • Download prior results: Retrieve the latest results and move them into the viz directory (paper/viz/data). Simply use wget to gather model outputs when in the paper/viz directory like so: wget https://ag-adaptation-study.pub/archive/data.zip; unzip data.zip. If using prior sweep results, download the full sweep information like so: cd data; wget http://ag-adaptation-study.pub/data/sweep_ag_all.csv; cd ..
  • Use your own results: Update the output data per instructions regarding local package below.

There are two options for executing the tools:

  • Docker: You can run the web-based visualizations through a simple Docker file in the paper/viz directory (bash run_docker.sh), as sketched below.
  • Local apps: You can execute the visualizations manually by running them directly as Python scripts. The entry points are hist_viz.py, history_viz.py, rates_viz.py, results_viz_entry.py, and sweep_viz.py. Simply run them without any command line arguments for defaults. Note that you may need to install the Python dependencies (pip install -r requirements.txt).
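For instance, the Docker route:

$ cd paper/viz
$ bash run_docker.sh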

Note that the visualizations are also invoked through paper/viz/render_images.sh for the paper.

Paper

Due to the complexity of the software installation, the only officially supported way to build the paper is through the Docker image. First, update the data:

  • Download prior results: Retrieve the latest results and move them into the paper directory (paper/outputs).
  • Use your own results: Update the output data per instructions regarding local package below.

Then, execute render_docker.sh to drop the results into the paper_rendered directory.
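As a sketch, assuming render_docker.sh is invoked from the paper directory (its exact working directory is not specified above):

$ cd paper
$ bash render_docker.sh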

Local package

Instead of retrieving data from https://ag-adaptation-study.pub, you can use your own pipeline data outputs by running bash package.sh. This will produce the data and outputs sub-directories inside of a new package directory which can be used for the interactive tools and paper rendering respectively.
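For example, one might wire the packaged outputs into the other components like so; the cp destinations follow the tools and paper instructions above but are a sketch rather than part of the official scripts:

$ bash package.sh
$ cp -r package/data/. paper/viz/data/
$ cp -r package/outputs/. paper/outputs/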

Alternative: Manual setup

If you prefer not to use the dev container, you can manually set up each component following the individual setup instructions below, though this requires more configuration steps.


Examples

The following section provides a "cookbook" of examples showing how to use these tools in the most common scenarios.

Review existing results

If you want to review the current outputs, simply navigate to https://ag-adaptation-study.pub. No additional software is required.

Run interactive tools

To use the existing outputs and run the interactive tools locally after cloning this repository, gather the data and run the visualization scripts.

$ cd paper/viz
$ wget https://ag-adaptation-study.pub/archive/data.zip
$ unzip data.zip
$ cd data
$ wget http://ag-adaptation-study.pub/data/sweep_ag_all.csv
$ cd ..
$ pip install -r requirements.txt
$ python hist_viz.py

This runs the histogram visualization, but hist_viz.py, history_viz.py, rates_viz.py, results_viz_entry.py, and sweep_viz.py are all available.

Execute the pipeline locally via Docker

The following will execute the entire pipeline locally after having placed SCYM and CHC-CMIP6 in a local directory (assumed to be path/to/data below).

$ export USE_AWS=0
$ export SOURCE_DATA_LOC=path/to/data
$ bash run_docker.sh

Testing

As part of CI / CD and for local development, the following are required to pass for both the pipeline in the root of this repository and the interactives written in Python at paper/viz:

  • pyflakes: Run pyflakes *.py to check for likely non-style code issues.
  • pycodestyle: Run pycodestyle *.py to enforce project coding style guidelines.

The pipeline also offers unit tests (run nose2 in the repository root). For the visualizations, tests happen by running the interactives headless (bash render_images.sh; bash script/check_images.sh).
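Putting these checks together, a full local test pass might look like the following, with the visualization scripts run from paper/viz per the notes above:

$ pyflakes *.py
$ pycodestyle *.py
$ nose2
$ cd paper/viz
$ pyflakes *.py
$ pycodestyle *.py
$ bash render_images.sh
$ bash script/check_images.sh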


Deployment

To deploy changes to production, CI / CD will automatically release to ag-adaptation-study.pub once merged to main.


Development standards

Where possible, please follow the Google Python Style Guide unless an override is provided in setup.cfg. Docstrings and type hints are required for all top-level or public members but are not currently enforced for private members. JSDoc is required for top-level members. Docstrings / JSDoc are not required for "utility" code.


Data

We make publicly available both inputs and outputs to our modeling. Due to size, some of these are archived at ag-adaptation-study.pub while others are deposited into Zenodo.

CHC-CMIP6

Our derivative dataset from CHC-CMIP6 is available at climate.csv and in our Zenodo record. These data are aggregated and preprocessed as used within our modeling.

USDA RMA SOB

Note that an archive of USDA Risk Management Agency (RMA) Summary of Business (SOB) data is also provided at our usda_rma_sob.zip. As a relatively large supplemental dataset, this is not currently in Zenodo. In addition to the original format, all SOB datasets are given in Avro format where possible with standardized formatting / encoding. A subset of these data are considered within our paper as supporting evidence. See the README within the data archive for further details.

Yield estimations (SCYM)

Our derivative SCYM yield estimations at the neighborhood level are available at Zenodo.

Model outputs

The following model outputs are made available through our website:

  • export_claims.csv: Information about the claims rate under different conditions.
  • sim_hist.csv: Information about simulation-wide yield distributions under different conditions.
  • sweep_ag_all.csv: Information about sweep outcomes and model performance.
  • tool.csv: Geographically specific information about simulation outcomes at the 4-character geohash level.

As smaller payloads, these are also included in our Zenodo.


Open source

The pipeline, interactives, and paper can be executed independently and have segregated dependencies. We thank all of our open source dependencies for their contribution.

Pipeline dependencies

The pipeline uses the following open source dependencies:

Use of Coiled is optional.

Tools and visualizations

Both the interactives and static visualization generation use the following:

The web version also uses:

Paper

The paper uses the following open source dependencies to build the manuscript:

Users may optionally leverage Pandoc as an executable (not linked) under the GPL, but any tool converting Markdown to other formats is acceptable; alternatively, the paper can be built as Markdown only without Pandoc. That said, for those using Pandoc, scripts may also use pandoc-fignos and pandoc-tablenos, both under the GPL License.

Other runtime dependencies

Some executions may also use:

Other sources

We also use:

GitHub Copilot was used for some post-publication steps, largely to prepare materials for presentations.


License

Code is released under BSD 3-Clause and data under CC-BY-NC 4.0 International. Please see LICENSE.md.
