regi0.geographic
- regi0.geographic.find_value_outliers(gdf: geopandas.GeoDataFrame, species_col: str, value_col: str, method: str = 'std', threshold: float = 2.0) → pandas.Series
Finds outlier records based on values of a specific column.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
species_col (str) – Column name with the species name for each record.
value_col (str) – Column name with values to find outliers from.
method (str) – Method to find outliers. Can be “std” for Standard Deviation, “iqr” for Interquartile Range or “zscore” for Z Score.
threshold (float) – For the “std” method, the value to multiply the standard deviation by. For the “zscore” method, the lower (negative) and upper (positive) limit to compare Z Scores to.
- Returns
Boolean Series indicating whether values are outliers.
- Return type
pd.Series
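As an illustration of the “std” and “iqr” options, the sketch below flags outliers per species on plain (species, value) tuples using only the standard library. It is not regi0's implementation: the helper name flag_outliers, the use of the population standard deviation, and the 1.5 × IQR fences are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev, quantiles


def flag_outliers(records, method="std", threshold=2.0):
    """Return one boolean per (species, value) record; True marks an outlier.

    Hypothetical helper: limits are computed separately for each species.
    """
    by_species = defaultdict(list)
    for species, value in records:
        by_species[species].append(value)

    limits = {}
    for species, values in by_species.items():
        if method == "std":
            # Assumption: population standard deviation around the mean.
            center, spread = mean(values), pstdev(values)
            limits[species] = (center - threshold * spread,
                               center + threshold * spread)
        elif method == "iqr":
            # Assumption: conventional fences at 1.5 times the IQR.
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
            iqr = q3 - q1
            limits[species] = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
        else:
            raise ValueError(f"unsupported method: {method!r}")

    return [not (limits[s][0] <= v <= limits[s][1]) for s, v in records]


records = [("a", 10), ("a", 11), ("a", 9), ("a", 10), ("a", 50),
           ("b", 1), ("b", 2)]
print(flag_outliers(records, method="iqr"))
```

With very few values per species, a single extreme value inflates the standard deviation enough to hide itself, so the “std” rule needs a lower threshold (or more records) than the “iqr” rule to flag the same point.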
- regi0.geographic.find_grid_duplicates(gdf: geopandas.GeoDataFrame, species_col: str, resolution: float, bounds: Optional[Union[list, tuple]] = None, keep: Union[bool, str] = False) → pandas.Series
Find records of the same species that are in the same cell of a specific grid.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
species_col (str) – Column name with the species name for each record.
resolution (float) – Grid resolution.
bounds (list or tuple) – Grid bounds (xmin, ymin, xmax, ymax). If no bounds are passed, the bounds from gdf will be taken.
keep (bool or str) –
Which duplicates to mark. Can be:
False: mark all duplicates as True.
’first’: mark duplicates as True except for the first occurrence.
’last’: mark duplicates as True except for the last occurrence.
- Returns
Boolean Series indicating whether records are spatial duplicates.
- Return type
pd.Series
Notes
bounds and resolution should match gdf coordinate reference system.
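The idea of grid duplicates and the three keep options can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: the helper grid_duplicates is hypothetical, records are (species, x, y) tuples, and cells are indexed from the (xmin, ymin) corner of the bounds.

```python
import math
from collections import defaultdict


def grid_duplicates(records, resolution, bounds=None, keep=False):
    """Flag records of the same species that fall in the same grid cell."""
    if bounds is None:
        # Fall back to the extent of the records themselves.
        xs = [x for _, x, _ in records]
        ys = [y for _, _, y in records]
        bounds = (min(xs), min(ys), max(xs), max(ys))
    xmin, ymin, _, _ = bounds

    # Group record positions by (species, cell index).
    groups = defaultdict(list)
    for i, (species, x, y) in enumerate(records):
        cell = (math.floor((x - xmin) / resolution),
                math.floor((y - ymin) / resolution))
        groups[(species, cell)].append(i)

    flags = [False] * len(records)
    for indices in groups.values():
        if len(indices) < 2:
            continue  # a lone record in its cell is never a duplicate
        if keep == "first":
            marked = indices[1:]
        elif keep == "last":
            marked = indices[:-1]
        else:
            marked = indices  # keep=False: mark every duplicate
        for i in marked:
            flags[i] = True
    return flags


records = [("a", 0.1, 0.1), ("a", 0.4, 0.3), ("b", 0.2, 0.2), ("a", 1.5, 0.2)]
print(grid_duplicates(records, resolution=1.0, bounds=(0, 0, 2, 2)))
```

The first two records share both species and cell, so they are the only ones flagged; the third is in the same cell but belongs to another species.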
- regi0.geographic.get_layer_field(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], field: str, layer: Optional[str] = None) → pandas.Series
Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and a GeoDataFrame representing a vector layer.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
other (str, Path or GeoDataFrame) – Target layer, given as a GeoDataFrame or as a path to a vector file.
field (str) – Name of the field to extract values from.
layer (str) – Layer name. Only takes effect when other is a GeoPackage file.
- Returns
Extracted values.
- Return type
pd.Series
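To make the spatial-join idea concrete, the following sketch looks up a field value for each point by testing it against simple polygon rings with a ray-casting check. It uses no GIS library and is not how regi0 performs the join; the helpers point_in_polygon and layer_field, the dictionary-based features, and the use of None for unmatched points are all assumptions for illustration.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def layer_field(points, features, field):
    """Return, for each point, the field value of the first feature containing it."""
    values = []
    for x, y in points:
        match = None  # assumption: unmatched points get None
        for feature in features:
            if point_in_polygon(x, y, feature["ring"]):
                match = feature[field]
                break
        values.append(match)
    return values


features = [
    {"ring": [(0, 0), (2, 0), (2, 2), (0, 2)], "name": "A"},
    {"ring": [(2, 0), (4, 0), (4, 2), (2, 2)], "name": "B"},
]
print(layer_field([(1, 1), (3, 1), (5, 5)], features, "name"))
```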
- regi0.geographic.get_layer_field_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, field: str, **kwargs) → Union[pandas.Series, tuple]
Gets the corresponding values of a specific field by performing a spatial join between a GeoDataFrame with records and multiple historical vector layers.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.
date_col (str) – Name of the date column in gdf to match historical layers with.
field (str) – Name of the field to extract values from. Must exist in all the historical layers.
**kwargs – Keyword arguments accepted by the _historical function.
- Returns
values (pd.Series) – Extracted values.
source (pd.Series) – Corresponding source. Only provided if return_source is True.
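Matching records to historical layers by date can be pictured as follows. The rule used here (pick the most recent layer that is not newer than the record) is only one plausible choice and is an assumption, as are the helper name match_layers and the representation of layers by their year; the actual behavior depends on the keyword arguments of the _historical function.

```python
from bisect import bisect_right


def match_layers(record_years, layer_years):
    """For each record year, return the year of the layer it is matched with.

    Assumed rule: the latest layer whose year is <= the record year,
    or None when every layer is newer than the record.
    """
    layer_years = sorted(layer_years)
    sources = []
    for year in record_years:
        pos = bisect_right(layer_years, year)
        sources.append(layer_years[pos - 1] if pos > 0 else None)
    return sources


print(match_layers([1995, 2003, 2010, 1980], [1990, 2000, 2010]))
```

The returned list plays the role of the source output described above: it records which historical layer each value would have been extracted from.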
- regi0.geographic.intersects_layer(gdf: geopandas.GeoDataFrame, other: Union[str, pathlib.Path, geopandas.GeoDataFrame], layer: Optional[str] = None) → pandas.Series
Checks whether records from gdf intersect any feature of other.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
other (str, Path or GeoDataFrame) – Layer with the features to intersect records with, given as a GeoDataFrame or as a path to a vector file.
layer (str) – Layer name. Only takes effect when other is a GeoPackage file.
- Returns
Boolean Series indicating whether each record intersects other.
- Return type
pd.Series
- regi0.geographic.intersects_layer_historical(gdf: geopandas.GeoDataFrame, others_path: str, date_col: str, **kwargs) → Union[pandas.Series, tuple]
Checks whether records from gdf intersect any feature of their corresponding historical vector layer.
- Parameters
gdf (GeoDataFrame) – GeoDataFrame with records.
others_path (str) – Path of a .gpkg file or a folder containing .shp files with the historical layers.
date_col (str) – Name of the date column in gdf to match historical layers with.
**kwargs – Keyword arguments accepted by the _historical function.
- Returns
intersects (pd.Series) – Boolean Series indicating whether each record intersects its corresponding historical layer.
source (pd.Series) – Corresponding source. Only provided if return_source is True.