regi0.geographic

regi0.geographic.find_value_outliers(gdf: geopandas.GeoDataFrame, species_col: str, value_col: str, method: str = 'std', threshold: float = 2.0) → pandas.Series

Finds outlier records based on values of a specific column.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • species_col (str) – Column name with the species name for each record.

  • value_col (str) – Column name with values to find outliers from.

  • method (str) – Method to find outliers. Can be “std” for Standard Deviation, “iqr” for Interquartile Range or “zscore” for Z Score.

  • threshold (float) – For the “std” method, the value by which the standard deviation is multiplied. For the “zscore” method, the limit that Z Scores are compared against: the lower limit is its negative and the upper limit is its positive.

Returns

Boolean Series indicating whether values are outliers.

Return type

pd.Series
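The three methods can be illustrated with a small standard-library sketch. It is not regi0's implementation: the helper name flag_outliers is hypothetical, grouping by species is an assumption based on the species_col parameter, the population standard deviation is an arbitrary choice, and the 1.5 multiplier for “iqr” is the conventional value rather than something stated above.

```python
from collections import defaultdict
from statistics import mean, pstdev, quantiles


def flag_outliers(species, values, method="std", threshold=2.0):
    """Flag outliers within each species group (illustrative only)."""
    # Collect the positions of the records belonging to each species.
    groups = defaultdict(list)
    for idx, name in enumerate(species):
        groups[name].append(idx)

    flags = [False] * len(values)
    for indices in groups.values():
        group = [values[i] for i in indices]
        if method in ("std", "zscore"):
            mu, sigma = mean(group), pstdev(group)
            for i in indices:
                if method == "std":
                    # Outside mean +/- threshold * standard deviation.
                    flags[i] = abs(values[i] - mu) > threshold * sigma
                else:
                    # Z score outside [-threshold, threshold].
                    flags[i] = sigma > 0 and abs((values[i] - mu) / sigma) > threshold
        elif method == "iqr":
            if len(group) < 2:
                continue  # quartiles are undefined for a single value
            q1, _, q3 = quantiles(group, n=4)
            # Assumption: conventional 1.5 * IQR fences.
            low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
            for i in indices:
                flags[i] = values[i] < low or values[i] > high
        else:
            raise ValueError(f"unknown method: {method!r}")
    return flags


species = ["a"] * 6 + ["b"] * 3
values = [10, 11, 9, 10, 10, 50, 5, 5, 5]
flags = flag_outliers(species, values, method="std", threshold=2.0)
```

In this toy data only the value 50 lies far from the rest of species “a”, so it is the only record flagged.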

regi0.geographic.find_grid_duplicates(gdf: geopandas.GeoDataFrame, species_col: str, resolution: float, bounds: Optional[Union[list, tuple]] = None, keep: Union[bool, str] = False) → pandas.Series

Find records of the same species that are in the same cell of a specific grid.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • species_col (str) – Column name with the species name for each record.

  • resolution (float) – Grid resolution.

  • bounds (list or tuple) – Grid bounds (xmin, ymin, xmax, ymax). If no bounds are passed, the bounds from gdf will be taken.

  • keep (bool or str) –

    Which duplicates to mark. Can be:

    • False: mark all duplicates as True.

    • 'first': mark duplicates as True except for the first occurrence.

    • 'last': mark duplicates as True except for the last occurrence.

Returns

Boolean Series indicating whether records are spatial duplicates.

Return type

pd.Series

Notes

bounds and resolution should be expressed in the units of the coordinate reference system of gdf.
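The idea of grid duplicates and the effect of keep can be sketched with plain Python. This is an illustration under assumptions, not the library's code: the helper grid_duplicates is hypothetical, records are plain (x, y) tuples, and cells are indexed from the lower-left corner (xmin, ymin).

```python
import math
from collections import defaultdict


def grid_duplicates(species, points, resolution, bounds=None, keep=False):
    """Flag records of the same species that share a grid cell (illustrative)."""
    if bounds is None:
        # Fall back to the extent of the points themselves.
        xs, ys = zip(*points)
        bounds = (min(xs), min(ys), max(xs), max(ys))
    xmin, ymin = bounds[0], bounds[1]

    # Group record positions by (species, cell index).
    members = defaultdict(list)
    for idx, (name, (x, y)) in enumerate(zip(species, points)):
        cell = (
            math.floor((x - xmin) / resolution),
            math.floor((y - ymin) / resolution),
        )
        members[(name, cell)].append(idx)

    flags = [False] * len(points)
    for indices in members.values():
        if len(indices) < 2:
            continue  # a lone record is never a duplicate
        if keep == "first":
            marked = indices[1:]
        elif keep == "last":
            marked = indices[:-1]
        else:
            marked = indices  # keep=False marks every duplicate
        for i in marked:
            flags[i] = True
    return flags


species = ["a", "a", "a", "b"]
points = [(0.1, 0.1), (0.4, 0.3), (1.2, 0.2), (0.2, 0.2)]
all_marked = grid_duplicates(species, points, 1.0, bounds=(0, 0, 2, 2))
first_kept = grid_duplicates(species, points, 1.0, bounds=(0, 0, 2, 2), keep="first")
last_kept = grid_duplicates(species, points, 1.0, bounds=(0, 0, 2, 2), keep="last")
```

The first two records of species “a” fall in the same cell; the record of species “b” shares that cell but is not a duplicate because the species differs.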

regi0.geographic.get_layer_field(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], field: str, layer: Optional[str] = None) → pandas.Series

Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and a GeoDataFrame representing a vector layer.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • other (str, Path or GeoDataFrame) – GeoDataFrame with the target layer, or path to a file containing it.

  • field (str) – Name of the field to extract values from.

  • layer (str) – Layer name. Only has effect when other is a geopackage file.

Returns

Extracted values.

Return type

pd.Series
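Conceptually, the spatial join looks up, for each record, the feature that contains it and reads one attribute from that feature. A minimal standard-library sketch of that idea follows. It is only an illustration: the helpers point_in_polygon and lookup_field are hypothetical, features are simple polygons given as vertex lists, and a record that matches no feature is assumed to yield None.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon described by ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lookup_field(points, features, field):
    """Return the value of field from the first feature containing each point."""
    values = []
    for x, y in points:
        match = None
        for ring, attributes in features:
            if point_in_polygon(x, y, ring):
                match = attributes[field]
                break
        values.append(match)
    return values


# Two adjacent squares, each carrying a hypothetical "name" attribute.
features = [
    ([(0, 0), (2, 0), (2, 2), (0, 2)], {"name": "A"}),
    ([(2, 0), (4, 0), (4, 2), (2, 2)], {"name": "B"}),
]
points = [(1, 1), (3, 1), (5, 5)]
names = lookup_field(points, features, "name")
```

The third point lies outside both squares, so no value is found for it.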

regi0.geographic.get_layer_field_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, field: str, **kwargs) → Union[pandas.Series, tuple]

Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and multiple historical vector layers.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.

  • date_col (str) – Name of the date column in gdf to match historical layers with.

  • field (str) – Name of the field to extract values from. Must exist in all the historical layers.

  • **kwargs – Keyword arguments accepted by the _historical function.

Returns

  • values (pd.Series) – Extracted values.

  • source (pd.Series) – Corresponding source. Only provided if return_source is True.
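How each record is paired with a historical layer is not described in this section. Purely as an assumption for illustration, the sketch below pairs each record year with the most recent layer year that does not come after it; the helper match_layer_year is hypothetical and is not part of regi0.

```python
from bisect import bisect_right


def match_layer_year(record_years, layer_years):
    """Pick, for each record year, the latest layer year not after it.

    Assumed matching rule for illustration only; returns None when every
    layer is more recent than the record.
    """
    ordered = sorted(layer_years)
    matched = []
    for year in record_years:
        # Number of layer years that are <= the record year.
        pos = bisect_right(ordered, year)
        matched.append(ordered[pos - 1] if pos else None)
    return matched


sources = match_layer_year([1995, 2003, 2010, 1980], [1990, 2000, 2010])
```

Under this rule a record from 1980 has no matching layer, because the earliest layer dates from 1990.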

regi0.geographic.intersects_layer(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], layer: Optional[str] = None) → pandas.Series

Checks whether records from gdf intersect any feature of other.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • other (str, Path or GeoDataFrame) – GeoDataFrame with features to intersect records with, or path to a file containing them.

  • layer (str) – Layer name. Only has effect when other is a geopackage file.

Returns

Boolean Series indicating whether each record intersects other.

Return type

pd.Series

regi0.geographic.intersects_layer_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, **kwargs) → Union[pandas.Series, tuple]

Checks whether records from gdf intersect any feature of their corresponding historical vector layer.

Parameters
  • gdf (GeoDataFrame) – GeoDataFrame with records.

  • others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.

  • date_col (str) – Name of the date column in gdf to match historical layers with.

  • **kwargs – Keyword arguments accepted by the _historical function.

Returns

  • intersects (pd.Series) – Boolean Series indicating whether each record intersects its corresponding historical layer.

  • source (pd.Series) – Corresponding source. Only provided if return_source is True.