regi0.geographic

regi0.geographic.find_value_outliers(gdf: geopandas.GeoDataFrame, species_col: str, value_col: str, method: str = 'std', threshold: float = 2.0) → pandas.Series

Finds outlier records based on values of a specific column.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
species_col (str) – Column name with the species name for each record.
value_col (str) – Column name with values to find outliers from.
method (str) – Method to find outliers. Can be “std” for Standard Deviation, “iqr” for Interquartile Range or “zscore” for Z Score.
threshold (float) – For the “std” method, the value by which the standard deviation is multiplied. For the “zscore” method, the lower (negative) and upper (positive) limit that Z Scores are compared against.

Returns

Boolean Series indicating whether values are outliers.

Return type

pd.Series
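The three methods can be illustrated with a minimal standard-library sketch. It is not regi0's implementation: the helper name flag_outliers is hypothetical, plain lists stand in for the GeoDataFrame columns, and the sketch assumes that limits are computed separately for each species, that the population standard deviation is used, and that the “iqr” method uses the conventional 1.5 × IQR fences. Under those assumptions “std” and “zscore” yield the same limits, so they share one branch.

```python
from collections import defaultdict
from statistics import mean, pstdev, quantiles


def flag_outliers(species, values, method="std", threshold=2.0):
    """Return one boolean per record (hypothetical helper, not part of regi0)."""
    # Group record positions by species so limits are computed per species
    # (assumption: the source does not spell out the grouping).
    groups = defaultdict(list)
    for i, name in enumerate(species):
        groups[name].append(i)

    flags = [False] * len(values)
    for idx in groups.values():
        group = [values[i] for i in idx]
        if method in ("std", "zscore"):
            # |x - mean| > threshold * std is the same test as |z| > threshold.
            center, spread = mean(group), pstdev(group)
            low, high = center - threshold * spread, center + threshold * spread
        elif method == "iqr":
            # Assumed convention: fences at 1.5 * IQR beyond the quartiles.
            q1, _, q3 = quantiles(group, n=4, method="inclusive")
            low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
        else:
            raise ValueError(f"unknown method: {method!r}")
        for i in idx:
            flags[i] = values[i] < low or values[i] > high
    return flags


# Two species; the value 50 is far from the rest of species "a".
species = ["a"] * 6 + ["b"] * 3
values = [10, 11, 9, 10, 10, 50, 1, 2, 3]
std_flags = flag_outliers(species, values, method="std", threshold=2.0)
iqr_flags = flag_outliers(species, values, method="iqr")
```

With these inputs both methods flag only the sixth record.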

regi0.geographic.find_grid_duplicates(gdf: geopandas.GeoDataFrame, species_col: str, resolution: float, bounds: Optional[Union[list, tuple]] = None, keep: Union[bool, str] = False) → pandas.Series

Finds records of the same species that fall in the same cell of a specific grid.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
species_col (str) – Column name with the species name for each record.
resolution (float) – Grid resolution.
bounds (list or tuple) – Grid bounds (xmin, ymin, xmax, ymax). If no bounds are passed, the bounds from gdf will be taken.
keep (bool or str) –
Which duplicates to mark. Can be:
- False: mark all duplicates as True.
- 'first': mark duplicates as True except for the first occurrence.
- 'last': mark duplicates as True except for the last occurrence.

Returns

Boolean Series indicating whether records are spatial duplicates.

Return type

pd.Series

Notes

bounds and resolution should be expressed in the units of the gdf coordinate reference system.
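The idea of a grid duplicate, and the effect of keep, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than regi0's code: grid_duplicates is a hypothetical helper, plain coordinate lists replace the GeoDataFrame, and cells are assumed to be assigned by flooring the offset from (xmin, ymin) divided by the resolution.

```python
import math
from collections import defaultdict


def grid_duplicates(species, xs, ys, resolution, bounds=None, keep=False):
    """Flag records sharing a species and a grid cell (hypothetical helper)."""
    if bounds is None:
        # Fall back to the extent of the records themselves.
        bounds = (min(xs), min(ys), max(xs), max(ys))
    xmin, ymin = bounds[0], bounds[1]

    # Collect record positions per (species, cell) pair.
    groups = defaultdict(list)
    for i, (name, x, y) in enumerate(zip(species, xs, ys)):
        cell = (
            math.floor((x - xmin) / resolution),
            math.floor((y - ymin) / resolution),
        )
        groups[(name, cell)].append(i)

    flags = [False] * len(xs)
    for idx in groups.values():
        if len(idx) < 2:
            continue  # a single record in a cell is not a duplicate
        if keep == "first":
            marked = idx[1:]
        elif keep == "last":
            marked = idx[:-1]
        else:
            marked = idx
        for i in marked:
            flags[i] = True
    return flags


# Records 0 and 1 share species "a" and cell (0, 0); record 3 is another species.
species = ["a", "a", "a", "b"]
xs = [0.1, 0.4, 1.2, 0.2]
ys = [0.1, 0.3, 0.1, 0.2]
all_dups = grid_duplicates(species, xs, ys, 1.0, bounds=(0, 0, 2, 2))
keep_first = grid_duplicates(species, xs, ys, 1.0, bounds=(0, 0, 2, 2), keep="first")
keep_last = grid_duplicates(species, xs, ys, 1.0, bounds=(0, 0, 2, 2), keep="last")
```

The sketch does not handle records that fall outside the given bounds.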

regi0.geographic.get_layer_field(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], field: str, layer: Optional[str] = None) → pandas.Series

Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and a GeoDataFrame representing a vector layer.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
other (str, Path or GeoDataFrame) – Path to a file with the target layer, or a GeoDataFrame with the target layer.
field (str) – Name of the field to extract values from.
layer (str) – Layer name. Only has effect when other is a geopackage file.

Returns

Extracted values.

Return type

pd.Series
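What the spatial join produces can be illustrated without geopandas. The sketch below is a conceptual stand-in, not how regi0 performs the join: point_in_polygon and lookup_field are hypothetical helpers, records are assumed to be points, features are plain dictionaries with a "ring" of vertices, and a record that matches no feature is given None.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (hypothetical helper)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lookup_field(points, features, field):
    """Return, per point, the field value of the first feature containing it."""
    values = []
    for x, y in points:
        match = next(
            (f[field] for f in features if point_in_polygon(x, y, f["ring"])),
            None,
        )
        values.append(match)
    return values


# Two adjacent square features and three records; the last lies outside both.
features = [
    {"name": "A", "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]},
    {"name": "B", "ring": [(2, 0), (4, 0), (4, 2), (2, 2)]},
]
points = [(1, 1), (3, 1), (5, 5)]
names = lookup_field(points, features, "name")
```

Here names holds one value per record, in the same order as the records.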

regi0.geographic.get_layer_field_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, field: str, **kwargs) → Union[pandas.Series, tuple]

Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and multiple historical vector layers.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.
date_col (str) – Name of the date column in gdf to match historical layers with.
field (str) – Name of the field to extract values from. Must exist in all the historical layers.
**kwargs – Keyword arguments accepted by the _historical function.

Returns

values (pd.Series) – Extracted values.
source (pd.Series) – Corresponding source. Only provided if return_source is True.
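The shape of the return value, with and without return_source, can be sketched as follows. The matching rule is purely an assumption, since this page does not state how regi0 pairs dates with layers: here each record uses the most recent layer that is not newer than the record's year. The helper lookup_historical is hypothetical, and each layer is represented by a callable that stands in for the spatial join against that layer.

```python
from bisect import bisect_right


def lookup_historical(points, years, layers, return_source=False):
    """Pick one layer per record by year, then look up its value (hypothetical)."""
    layer_years = sorted(layers)
    values, source = [], []
    for point, year in zip(points, years):
        # Assumed rule: latest layer whose year is <= the record's year.
        pos = bisect_right(layer_years, year)
        if pos == 0:
            # The record predates every layer, so nothing can be extracted.
            values.append(None)
            source.append(None)
            continue
        chosen = layer_years[pos - 1]
        values.append(layers[chosen](point))
        source.append(chosen)
    return (values, source) if return_source else values


# Two made-up layers whose boundary between "North" and "South" moved over time.
layers = {
    1990: lambda p: "North" if p[1] > 0 else "South",
    2010: lambda p: "North" if p[1] > 5 else "South",
}
points = [(0, 3), (0, 3), (0, 3)]
years = [1985, 2000, 2015]
values, source = lookup_historical(points, years, layers, return_source=True)
values_only = lookup_historical(points, years, layers)
```

The same location receives a different value depending on which layer its date selects, and source records which layer that was.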

regi0.geographic.intersects_layer(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], layer: Optional[str] = None) → pandas.Series

Checks whether records from gdf intersect any feature of other.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
other (str, Path or GeoDataFrame) – Path to a file or GeoDataFrame with the features to intersect records with.
layer (str) – Layer name. Only has effect when other is a geopackage file.

Returns

Boolean Series indicating whether each record intersects other.

Return type

pd.Series

regi0.geographic.intersects_layer_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, **kwargs) → Union[pandas.Series, tuple]

Checks whether records from gdf intersect any feature of their corresponding historical vector layer.

Parameters

gdf (GeoDataFrame) – GeoDataFrame with records.
others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.
date_col (str) – Name of the date column in gdf to match historical layers with.
**kwargs – Keyword arguments accepted by the _historical function.

Returns

intersects (pd.Series) – Boolean Series indicating whether each record intersects its corresponding historical layer.
source (pd.Series) – Corresponding source. Only provided if return_source is True.

regi0.geographic.arcgis