Organizing a project

In the previous section, we explained why it is often useful to develop the final analysis versions (production runs) by starting from a simple version and adding data and analysis components, preferably one at a time (so-called development variants). We also gave some guidance on the order in which data and analysis components are typically added to the development variants. Zonation imposes no strict requirements on how you name and arrange your input files, but over the years we have settled on some best practices that we will go through next. For a detailed description of what these input files are, see Chapter 3.3 in the Zonation manual or Chapter 8 in the “A quick introduction to Zonation” document.

As a starting point, it is convenient to organize all your Zonation input files under a single root folder. This is your project folder (see “project root folder” in Figure 2), and it will house all the input files you need. Write down plain text descriptions of your development variants and devise a coding scheme you can use to create informative file and folder names for them. The input rasters are best organized in a separate subfolder under the project folder (see “data” in Figure 2). You can then reference the same input rasters from all analysis variants using relative paths. We do not recommend replicating input rasters under individual analysis variants, because space requirements multiply and data maintenance can become awkward.

Figure 2. An illustration of how to organize the files for your development variants in a systematic way in your Zonation project. Subdividing your data into appropriate subfolders may also be a good idea.

Figure 2 displays a coding scheme based on a numeric prefix and 3-letter codes that describe the input data and analysis components used. For example, “01_abf” refers to the 1st development variant, which uses the additive benefit function in Zonation as the ranking method. This coding scheme does not specify which input biodiversity features are used, but you can include this in the name as well (e.g. “hab” for habitat data and “spp” for species data if you have both). “02_abf_wgt” builds on the 1st development variant and adds weights (“wgt”). The next variant could be called e.g. “03_abf_wgt_cnd”, where “cnd” stands for taking into account the ecological condition of habitats. The benefit of this kind of naming scheme is that it makes it easier to remember what is included in each development variant. It has at least one obvious disadvantage: variant names can get unwieldy if you have many development variants. Note also that not every input data type and analysis component needs to go into the name. We recommend including only the critical ones that help you understand how the variants develop towards the production runs.
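The naming scheme above can be sketched as a small helper. This is only an illustration: the function name `variant_name` and the rule that every code is exactly three lowercase letters are our assumptions, not something Zonation requires.

```python
import re


def variant_name(index: int, codes: list[str]) -> str:
    """Build a name such as '02_abf_wgt' from a running number and short codes.

    Hypothetical helper: the 3-letter lowercase rule mirrors the examples
    in the text and is an assumption, not a Zonation requirement.
    """
    for code in codes:
        if not re.fullmatch(r"[a-z]{3}", code):
            raise ValueError(f"expected a 3-letter lowercase code, got {code!r}")
    # Zero-pad the running number so that folders sort in development order.
    return "_".join([f"{index:02d}", *codes])


print(variant_name(1, ["abf"]))
print(variant_name(3, ["abf", "wgt", "cnd"]))
```

Zero-padding the prefix keeps the variants in development order in a file browser, at least up to 99 variants.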

After you have decided which coding scheme serves your purpose the best, create a subfolder for each of your development variants using the coding scheme. Again for convenience, the development variant folders are best placed in a subfolder (see “setup” in Figure 2). If you have all your variant folders in a subfolder, it is easier to move or copy them if need be, or place them in a version control system without the need to necessarily include the data folder.
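As a minimal sketch, the folder layout described so far could be created as follows. The folder names “data” and “setup” follow Figure 2. The function name `create_project` and the example project name are hypothetical.

```python
import tempfile
from pathlib import Path


def create_project(root: Path, variants: list[str]) -> list[Path]:
    """Create <root>/data and one <root>/setup/<variant> folder per variant."""
    # Input rasters shared by all variants live under "data".
    (root / "data").mkdir(parents=True, exist_ok=True)
    created = []
    for name in variants:
        # Each development variant gets its own subfolder under "setup".
        variant_dir = root / "setup" / name
        variant_dir.mkdir(parents=True, exist_ok=True)
        created.append(variant_dir)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary folder so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "my_zonation_project"  # hypothetical project name
        for folder in create_project(root, ["01_abf", "02_abf_wgt"]):
            print(folder.relative_to(root).as_posix())
```

Because `exist_ok=True` is used, the function can be run again later to add folders for new variants without touching the existing ones.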

For each of your development variants, as for any Zonation analysis, you need at the absolute minimum the input biodiversity features (input rasters, not shown in Figure 2), a biodiversity feature list file (spp-file) and a run settings file (dat-file). In addition, you will need a Windows batch file (bat-file) that defines which spp- and dat-files are used, where the outputs are placed and some other parameters. Having a large number of development variants means that you can end up with a large number of input files. Therefore, it is useful to place all the configuration files related to a specific development variant into that variant’s subfolder (Figure 2). Some development variants will share identical configuration files, but in our experience some redundancy caused by using identical files with different names is a small price to pay for clearer organization. Outputs produced by a development variant are also best placed in that variant’s subfolder. Note also that once you have the simple development variants set up, you can usually create a new variant by copying the previous development variant’s subfolder, renaming the input files, editing the existing configuration files and, where needed, adding new configuration files.
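The copy-and-rename step can be sketched as follows. The sketch assumes that configuration files are named after their variant (for example “01_abf.dat”), which is our assumption rather than a Zonation rule. The function name `clone_variant` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def clone_variant(setup_dir: Path, old: str, new: str) -> Path:
    """Copy setup_dir/old to setup_dir/new and rename files that start with old."""
    src = setup_dir / old
    dst = setup_dir / new
    # copytree raises FileExistsError if dst already exists,
    # so an existing variant is never overwritten.
    shutil.copytree(src, dst)
    # Collect the matches first, then rename, so the walk is not disturbed.
    for path in sorted(dst.rglob(f"{old}*")):
        if path.is_file():
            path.rename(path.with_name(path.name.replace(old, new, 1)))
    return dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        setup_dir = Path(tmp) / "setup"
        (setup_dir / "01_abf").mkdir(parents=True)
        # Empty stand-ins for the dat-, spp- and bat-files of the first variant.
        for suffix in (".dat", ".spp", ".bat"):
            (setup_dir / "01_abf" / f"01_abf{suffix}").touch()
        new_dir = clone_variant(setup_dir, "01_abf", "02_abf_wgt")
        print(sorted(p.name for p in new_dir.iterdir()))
```

The sketch only renames files. References to the old file names inside the copied configuration files still have to be edited, and any outputs copied over from the previous variant should be deleted.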

Should the input data go into your project folder as well? All else being equal, we believe this is preferable. As with all files, Zonation does not care where the input data reside on your system as long as the path definitions in the configuration files are correct. However, if you place all your input files, including the biodiversity feature rasters, into subfolders under your project folder and use relative file paths instead of absolute file paths, your project is self-contained and portable. This does not matter if you plan to work exclusively on a single computer, but if you ever need to move your project to other computers, say to use more computational resources, a self-contained project folder with relative file paths will save you a lot of trouble.
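To illustrate the difference, the sketch below computes the relative path from a variant folder to a raster under “data” and checks whether a given path is relative. It assumes that paths are resolved relative to the variant folder, i.e. that the run is started from there. The function names and the raster name “species1.tif” are hypothetical.

```python
import os
from pathlib import Path, PurePosixPath, PureWindowsPath


def relative_raster_path(variant_dir: str, raster: str) -> str:
    """Return the path to raster as seen from variant_dir, with forward slashes."""
    # On Windows, os.path.relpath raises ValueError if the two paths are on
    # different drives; a self-contained project folder avoids that case.
    return Path(os.path.relpath(raster, start=variant_dir)).as_posix()


def is_relative(path_text: str) -> bool:
    """True if path_text is absolute under neither Windows nor POSIX rules."""
    return not (
        PureWindowsPath(path_text).is_absolute()
        or PurePosixPath(path_text).is_absolute()
    )


print(relative_raster_path("project/setup/01_abf", "project/data/species1.tif"))
print(is_relative("../../data/species1.tif"))
print(is_relative(r"C:\zonation\data\species1.tif"))
```

A check such as `is_relative` could be run over the raster paths listed in your configuration files before you copy the project to another computer.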