User API¶
Core classes¶
- class xagg.classes.weightmap(agg, source_grid, geometry, weights='nowghts')¶
Class for mapping from pixels to polygons, output from
xagg.wrappers.pixel_overlaps()
Methods
- diag_fig(poly_idx): (NOT YET IMPLEMENTED) a diagnostic figure of the aggregation
- diag_fig(poly_idx)¶
(NOT YET IMPLEMENTED) a diagnostic figure of the aggregation
- class xagg.classes.aggregated(agg, source_grid, geometry, ds_in, weights='nowghts')¶
Class for aggregated data, output from
xagg.core.aggregate()
Methods
- to_csv(fn): Save as csv
- to_dataframe(): Convert to pandas dataframe.
- to_dataset([loc_dim]): Convert to xarray dataset.
- to_netcdf(fn[, loc_dim]): Save as netcdf
- to_shp(fn): Save as shapefile
- to_csv(fn)¶
Save as csv
- to_dataframe()¶
Convert to pandas dataframe.
- to_dataset(loc_dim='pix_idx')¶
Convert to xarray dataset.
- to_netcdf(fn, loc_dim='pix_idx')¶
Save as netcdf
- to_shp(fn)¶
Save as shapefile
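A minimal end-to-end sketch of how these export methods are typically used, assuming a gridded netcdf file and a polygon shapefile are available (the file names below are placeholders, not part of the package):

```python
import xarray as xr
import geopandas as gpd
from xagg.wrappers import pixel_overlaps
from xagg.core import aggregate

# Placeholder inputs: any gridded dataset and polygon GeoDataFrame will do
ds = xr.open_dataset('tas_example.nc')        # hypothetical filename
gdf = gpd.read_file('regions_example.shp')    # hypothetical filename

# Map pixels to polygons, then aggregate the gridded variable(s)
wm = pixel_overlaps(ds, gdf)
agg = aggregate(ds, wm)

# Export the aggregated object in whichever format is most convenient
ds_out = agg.to_dataset()        # xarray.Dataset
df_out = agg.to_dataframe()      # pandas DataFrame
agg.to_csv('aggregated.csv')     # csv on disk
agg.to_netcdf('aggregated.nc')   # netcdf on disk
agg.to_shp('aggregated.shp')     # shapefile on disk
```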
Primary (wrapper) functions¶
- xagg.wrappers.pixel_overlaps(ds, gdf_in, weights=None, weights_target='ds', subset_bbox=True)¶
Wrapper function for determining overlaps between grid and polygon
For a geodataframe gdf_in, takes an xarray structure ds (Dataset or DataArray) and for each polygon in gdf_in provides a list of pixels given by the ds grid which overlap that polygon, in addition to their relative area of overlap with the polygon.
The output is then ready to be fed into
xagg.core.aggregate()
, which aggregates the variables in ds to the polygons in gdf_in using area- (and optionally other) weights. (NB: the wrapper uses subset_bbox = True in xagg.core.create_raster_polygons().)
- Parameters
- ds : xarray.Dataset, xarray.DataArray
an xarray Dataset or DataArray containing at least grid variables (“lat”/”lon”, though several other names are supported; see docs for xagg.aux.fix_ds()) and at least one variable on that grid
- gdf_in : geopandas.GeoDataFrame
a geopandas GeoDataFrame containing polygons (and any other fields, for example fields from shapefiles)
- weights : xarray.DataArray or None, optional, default = None
(by default, None) if additional weights are desired (for example, weighting pixels by population in addition to by area overlap), weights is an xarray.DataArray containing that information. It does not have to be on the same grid as ds - grids will be homogenized (see below).
- weights_target : str, optional
if ‘ds’, then weights are regridded to the grid in [ds]; if ‘weights’, then the ds variables are regridded to the grid in ‘weights’ (LATTER NOT SUPPORTED YET, raises a NotImplementedError)
- Returns
- wm_out : dict
the output of xagg.core.get_pixel_overlaps(), which gives the mapping of pixels to polygon aggregation; to be input into xagg.core.aggregate().
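A brief sketch of the weights argument, assuming an additional weight grid (e.g. population) is available as an xarray.DataArray; the file names are placeholders, and only the call signature above is taken from the package:

```python
import xarray as xr
import geopandas as gpd
from xagg.wrappers import pixel_overlaps

ds = xr.open_dataset('precip_example.nc')       # hypothetical filename
gdf = gpd.read_file('counties_example.shp')     # hypothetical filename

# Area weighting only
wm = pixel_overlaps(ds, gdf)

# Area weighting combined with an additional weight grid (e.g. population);
# the weight grid does not have to share ds's grid, since weights_target='ds'
# regrids the weights onto the ds grid
pop = xr.open_dataarray('population_example.nc')  # hypothetical filename
wm_pop = pixel_overlaps(ds, gdf, weights=pop, weights_target='ds')
```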
- xagg.core.aggregate(ds, wm)¶
Aggregate raster variable(s) to polygon(s)
Aggregates (N-D) raster variables in ds to the polygons in gdf_out - in other words, gives the weighted average of the values in [ds] based on each pixel’s relative area overlap with the polygons.
The values will be additionally weighted if a weight was provided to xagg.core.create_raster_polygons().
The code checks whether the input lat/lon grid in ds is equivalent to the linearly indexed grid in wm, or if it can be cropped to that grid.
- Parameters
- ds : xarray.Dataset
an xarray.Dataset containing one or more variables with dimensions lat, lon (and possibly more). The dataset’s geographic grid has to include the lat/lon coordinates used in determining the pixel overlaps in xagg.core.get_pixel_overlaps() (and saved in wm['source_grid'])
- wm : xagg.classes.weightmap
the output of xagg.core.get_pixel_overlaps(); an xagg.classes.weightmap object containing
['agg']: a dataframe, with one row per polygon, and the columns pix_idxs and rel_area, giving the linear indices and the relative area of each pixel over the polygon, respectively
['source_grid']: the lat/lon grid on which the aggregating parameters were calculated (and on which the linear indices are based)
- Returns
- agg_out : xagg.classes.aggregated
an xagg.classes.aggregated object with the aggregated variables
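Because aggregate() checks the ds grid against the grid stored in wm (and can crop ds to it), a weightmap computed once can, in principle, be reused for other datasets on the same grid. A hedged sketch of that pattern (file names are placeholders):

```python
import xarray as xr
import geopandas as gpd
from xagg.wrappers import pixel_overlaps
from xagg.core import aggregate

gdf = gpd.read_file('regions_example.shp')     # hypothetical filename

# Build the weightmap once from a dataset on the target grid...
ds_tas = xr.open_dataset('tas_example.nc')     # hypothetical filename
wm = pixel_overlaps(ds_tas, gdf)

# ...then aggregate any variable that lives on that grid (or on a grid
# that can be cropped to it) without recomputing the overlaps
agg_tas = aggregate(ds_tas, wm)
agg_pr = aggregate(xr.open_dataset('pr_example.nc'), wm)   # hypothetical filename
```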
Auxiliary functions¶
- xagg.aux.fix_ds(ds, var_cipher={'Lat': {'Lat': 'lat', 'Lon': 'lon'}, 'Latitude': {'Latitude': 'lat', 'Longitude': 'lon'}, 'Y': {'X': 'lon', 'Y': 'lat'}, 'latitude': {'latitude': 'lat', 'longitude': 'lon'}, 'latitude_1': {'latitude_1': 'lat', 'longitude_1': 'lon'}, 'nav_lat': {'nav_lat': 'lat', 'nav_lon': 'lon'}, 'y': {'x': 'lon', 'y': 'lat'}}, chg_bnds=True)¶
Puts the input ds into a format compatible with the rest of the package
- grid variables are renamed “lat” and “lon”
- the lon dimension is made -180:180 to be consistent with most geographic data (as well as any lon_bnds variable if chg_bnds=True (by default))
- the dataset is sorted in ascending order in both lat and lon
NOTE: there probably should be a safeguard in case “y” and “x” are multiindex dimension names instead of lat/lon names… maybe a warning for now… (TO DO)
- Parameters
- ds : xarray.Dataset
an input xarray.Dataset, which may or may not need adjustment to be compatible with this package
- var_cipher : dict, optional
a dict of dicts for renaming lat/lon variables to “lat”/”lon”. The form is {search_str:{lat_name:'lat',lon_name:'lon'},...}; the code looks for search_str in the dimensions of the ds and, based on that, renames lat_name to ‘lat’ and lon_name to ‘lon’. Common names for these variables (‘latitude’, ‘Latitude’, ‘Lat’, ‘latitude_1’, ‘nav_lat’, ‘Y’) are included out of the box.
- chg_bnds : bool, optional, default = True
if True, the names of variables with “_bnd” in their names are assumed to be dimension bound variables, and are changed as well if the rest of their name matches ‘o’ (for lon) or ‘a’ (for lat). ## DOES THIS WORK FOR “X” and “Y”?
- Returns
- ds : xarray.Dataset
a dataset with lat/lon variables in the format necessary for this package to function
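As an illustration of the renaming and longitude handling described above, a small synthetic example (the coordinate values are arbitrary; only fix_ds itself comes from the package):

```python
import numpy as np
import xarray as xr
from xagg.aux import fix_ds

# A toy dataset with non-standard coordinate names, descending latitudes,
# and 0:360 longitudes
ds = xr.Dataset(
    {'tas': (('Latitude', 'Longitude'), np.random.rand(3, 4))},
    coords={'Latitude': [10.0, 0.0, -10.0],
            'Longitude': [0.0, 90.0, 180.0, 270.0]},
)

ds_fixed = fix_ds(ds)
# ds_fixed should now use "lat"/"lon", with longitudes expressed as
# -180:180 and both coordinates sorted in ascending order
```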
- xagg.aux.get_bnds(ds, edges={'lat': [- 90, 90], 'lon': [- 180, 180]}, wrap_around_thresh=5)¶
Builds vectors of lat/lon bounds if not present in ds
Assumes a regular rectangular grid - so each lat/lon bound is 0.5*(gap between pixels) over to the next pixel.
- Parameters
- ds : xarray.Dataset
an xarray dataset that may or may not contain variables “lat_bnds” and “lon_bnds”
- wrap_around_thresh : numeric, optional, default = 5
(degrees) the minimum distance between the last pixel edge and the ‘edges’ of the coordinate system for which the pixels are ‘wrapped around’. For example, given ‘lon’ edges of [-180,180] and a wrap_around_thresh of 5 (default), if the calculated edges of pixels match the edge on one side but not the other (i.e. -180 and 179.4) and this gap (180-179.4) is less than 5, the -180 edge is changed to 179.4 to allow the pixel to ‘wrap around’ the edge of the coordinate system.
- Returns
- ds : xarray.Dataset
the same dataset as inputted, unchanged if “lat/lon_bnds” already existed, or with new variables “lat_bnds” and “lon_bnds” if not.
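A sketch of get_bnds on a regular grid that lacks bounds variables. The grid here is a made-up 1-degree example, and the dataset is assumed to already be in the “lat”/”lon”, -180:180 form produced by fix_ds:

```python
import numpy as np
import xarray as xr
from xagg.aux import get_bnds

# A regular 1-degree global grid with no "lat_bnds"/"lon_bnds"
ds = xr.Dataset(
    coords={'lat': ('lat', np.arange(-89.5, 90.0, 1.0)),
            'lon': ('lon', np.arange(-179.5, 180.0, 1.0))},
)

ds = get_bnds(ds)
# ds should now contain "lat_bnds" and "lon_bnds", with each pixel edge
# placed halfway to the neighbouring pixel centre
```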
- xagg.aux.normalize(a, drop_na=False)¶
Normalizes the vector a
The vector a is divided by its sum.
- Parameters
- a : array_like
A vector to be normalized
- drop_na : bool, optional, default = False
If drop_na = True and there are nans in the vector a, then the normalization is calculated using only the non-nan locations in a, and the vector is returned with the nans in their original locations. In other words, np.nansum(normalize(a,drop_na=True)) == 1.0.
If drop_na = False and nans are present in a, then normalize just returns a vector of np.nan the same length as a.
- Returns
- a : vector
a, but normalized.
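A short sketch of the two drop_na behaviours described above:

```python
import numpy as np
from xagg.aux import normalize

normalize(np.array([1.0, 3.0]))
# -> array([0.25, 0.75])

a = np.array([1.0, 3.0, np.nan])
normalize(a)
# nans present and drop_na=False -> a vector of np.nan the same length as a

normalize(a, drop_na=True)
# -> array([0.25, 0.75, nan]); np.nansum of the result is 1.0
```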
- xagg.aux.subset_find(ds0, ds1)¶
Finds the grid of ds1 in ds0, and subsets ds0 to the grid in ds1
- Parameters
- ds0 : xarray.Dataset
an xarray Dataset to be subset based on the grid of ds1; must contain grid variables “lat” or “lon” (could add a fix_ds call)
- ds1 : xarray.Dataset, xarray.DataArray
either an xarray structure (Dataset, DataArray) with “lat” and “lon” variables, or a dictionary with DataArrays [‘lat’] and [‘lon’]. IMPORTANT: ds1 HAS TO BE BROADCAST - i.e. one value of lat, lon each coordinate, with lat and lon vectors of equal length. This can be done e.g. using ds1.stack(loc=('lat','lon')).
- Returns
- ds0 : xarray.Dataset
The input ds0, subset to the locations in ds1.
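A hedged sketch of subset_find with a synthetic global grid and a smaller target grid, using the stack() broadcasting mentioned above (the coordinate values are arbitrary):

```python
import numpy as np
import xarray as xr
from xagg.aux import subset_find

# A coarse "global" grid...
ds_global = xr.Dataset(
    {'tas': (('lat', 'lon'), np.random.rand(4, 4))},
    coords={'lat': [-30.0, -10.0, 10.0, 30.0],
            'lon': [-20.0, 0.0, 20.0, 40.0]},
)

# ...and the smaller grid we want to subset it to
ds_region = xr.Dataset(coords={'lat': [-10.0, 10.0], 'lon': [0.0, 20.0]})

# ds1 must be broadcast to paired lat/lon vectors of equal length,
# e.g. via stack(), before being passed in
ds_sub = subset_find(ds_global, ds_region.stack(loc=('lat', 'lon')))
```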