Design Choices

In designing RasterIndex, we faced a few thorny questions. Below we discuss these considerations, and the approach we’ve taken. Ultimately, there are no easy answers and tradeoffs to be made.

CRS handling

Why do we want CRS handling?

  1. We want Xarray to disallow alignment of two Xarray objects with different CRS e.g. da1 + da2 should fail if da1, and da2 have different CRS.

  2. Support wraparound indexing along the longitude dimension (GH#26)

  3. Assign appropriate attributes to the created coordinate variables (GH#22). (e.g. choose between standard_name: latitude and standard_name: projection_y_coordinate)

  4. more?

xproj.CRSIndex is an attempt at providing a building block for CRS handling in the Xarray ecosystem that solved problem (1). Thus (1) can be handled by assigning xproj.CRSIndex to the spatial_ref variable. How might RasterIndex integrate with xproj.CRSIndex? Our options are:

  1. fully encapsulate xproj.CRSIndex, or

  2. satisfy the “CRS-aware” protocol provided by xproj, or

  3. simply handle the affine transform and ignore the CRS altogether.

Why should RasterIndex be aware of the CRS?

RasterIndex handles indexing and the creation of coordinate variables. Thus it is the natural place to support (2) and (3) in the list above.

Why not encapsulate CRSIndex?

If RasterIndex must track CRS in some form, one way to do that would be to have RasterIndex internally build a CRSIndex for the spatial_ref variable. Thus, RasterIndex would be associated with 3 variables instead of 2: x, y, and spatial_ref, for example.

The downside of this approach is that it doesn’t compose well with any other Index that would also like to handle the CRS (e.g. xvec.GeometryIndex). For example, xr.merge([geometries, raster]) where geometries has xvec.GeometryIndex[geometry, spatial_ref] (square brackets list associated coordinate variable names) and raster has RasterIndex[x, y, spatial_ref], would fail because the variable spatial_ref is associated with two Indexes of different types. This fails because the Xarray model enforces that one Variable is only associated with only one Index, in order to prevent different Indexes modifying the same Variable.

CRS-aware Index

Therefore, we have chosen to experiment with the “CRS-aware” approach described in the xproj docs. Here RasterIndex tracks it’s own optional copy of a CRS object (not an Index) and defines the hooks needed for CRSIndex to communicate with RasterIndex. The downside here is that CRS information is duplicated in two places explicitly, and requires explicit handling to ensure consistency

Don’t like it?

We chose this approach to enable experimentation. It is entirely possible to experiment with other approaches. Please reach out if you have opinions on this topic.

Handling the GeoTransform attribute

GDAL chooses to track the affine transform using a GeoTransform attribute on a spatial_ref variable. The "spatial_ref" is a “grid mapping” variable (as termed by the CF-conventions). It also records CRS information. Currently, our design is that xproj.CRSIndex controls the CRS information and handles the creation of the "spatial_ref" variable, or more generally, the grid mapping variable. Thus, RasterIndex cannot keep the "GeoTransform" attribute on "spatial_ref" up-to-date because it does not control it.

Thus, assign_index() will delete the "GeoTransform" attribute on the grid mapping variable if it is detected, after using it to construct the affine matrix.

If you wish to extract the GeoTransform attribute to write it to a location of your choosing use RasterIndex.as_geotransform().

Attributes on coordinate variables

Currently assign_index will first drop any existing coordinate variables and then recreate them as lazy coordinate variables. This loses any existing attributes, and new attributes appropriate to the coordinate system (generated by pyproj.CRS.cs_to_cf()) are added.