regi0
- regi0.match(left: pandas.Series, right: pandas.Series, preprocess: bool = False, fuzzy: bool = False, threshold: float = 0.8) → pandas.Series
Compares values between two different Series to check if they match.
- Parameters
left (Series) – Left Series.
right (Series) – Right Series.
preprocess (bool) – Whether to clean and standardize values before comparing them.
fuzzy (bool) – Whether to compare values using fuzzy logic.
threshold (float) – Similarity threshold used to decide whether two values count as equal. Only has an effect when fuzzy is True.
- Returns
Series with booleans indicating whether the values match.
- Return type
Series
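The interplay between preprocess, fuzzy and threshold can be sketched with pandas and the standard library. This is not regi0's implementation: the cleaning steps (strip and lowercase), the similarity measure (difflib's SequenceMatcher ratio), the >= comparison against the threshold and the helper name match_sketch are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

import pandas as pd


def match_sketch(left, right, preprocess=False, fuzzy=False, threshold=0.8):
    """Hypothetical stand-in for regi0.match; not the library's code."""
    if preprocess:
        # Assumed cleaning: trim surrounding whitespace and lowercase.
        left = left.str.strip().str.lower()
        right = right.str.strip().str.lower()
    if not fuzzy:
        # Exact element-wise comparison.
        return left == right
    # Assumed similarity measure: difflib ratio in the range [0, 1].
    scores = [SequenceMatcher(None, a, b).ratio() for a, b in zip(left, right)]
    return pd.Series(scores, index=left.index) >= threshold


left = pd.Series(["Panthera onca", " Puma concolor ", "Tapirus"])
right = pd.Series(["Panthera onca", "puma concolor", "Tapirus terrestris"])

exact = match_sketch(left, right)
cleaned = match_sketch(left, right, preprocess=True)
fuzzy = match_sketch(left, right, preprocess=True, fuzzy=True, threshold=0.5)
```

In this sketch only the first pair matches exactly, cleaning additionally reconciles the second pair, and a permissive threshold of 0.5 also accepts the partial name in the third pair.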
- regi0.read_geographic_table(path: Union[str, pathlib.Path], lon_col: str, lat_col: str, crs: str = 'epsg:4326', drop_empty_coords: bool = False, reset_index: bool = True) → geopandas.GeoDataFrame
Reads tabular data (csv, txt, xls or xlsx) and converts it to a GeoDataFrame.
- Parameters
path (str or Path) – Filename with extension. Can be a relative or absolute path.
lon_col (str) – Name of the longitude column.
lat_col (str) – Name of the latitude column.
crs (str) – Coordinate reference system with the corresponding EPSG code. Must be in the form epsg:code.
drop_empty_coords (bool) – Whether to remove rows with missing or incomplete coordinates.
reset_index (bool) – Whether to reset the result’s index after removing rows with missing or incomplete coordinates. Only has an effect when drop_empty_coords is True.
- Returns
GeoDataFrame with the records.
- Return type
GeoDataFrame
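The effect of drop_empty_coords and reset_index can be illustrated with plain pandas. The sketch below only mimics the row filtering described above and does not build any geometries; the helper name drop_incomplete_coords, the column names and the use of reset_index(drop=True) are assumptions, not regi0's code.

```python
import pandas as pd


def drop_incomplete_coords(df, lon_col, lat_col, reset_index=True):
    """Hypothetical helper: drop rows missing either coordinate."""
    result = df.dropna(subset=[lon_col, lat_col])
    if reset_index:
        # Assumed behaviour: renumber rows 0..n-1 and discard the old labels.
        result = result.reset_index(drop=True)
    return result


records = pd.DataFrame(
    {
        "species": ["a", "b", "c", "d"],
        "lon": [-74.1, None, -75.6, -73.0],
        "lat": [4.6, 5.0, None, 6.2],
    }
)

# Rows "b" and "c" each lack one coordinate and are removed in both cases.
kept = drop_incomplete_coords(records, "lon", "lat", reset_index=False)
renumbered = drop_incomplete_coords(records, "lon", "lat", reset_index=True)
```

Without the reset, the surviving rows keep their original labels (0 and 3 here), which matters if the result is later aligned with other data by index.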
- regi0.read_table(path: Union[str, pathlib.Path], **kwargs) → pandas.DataFrame
Reads tabular data (csv, txt, xls or xlsx).
- Parameters
path (str or Path) – Filename with extension. Can be a relative or absolute path.
**kwargs – Keyword arguments accepted by the pandas read_csv, read_table and read_excel functions.
- Returns
DataFrame with the tabular data.
- Return type
DataFrame
- regi0.verify(df: pandas.DataFrame, observed_col: str, expected: pandas.Series, flag_name: str, add_suggested: bool = False, suggested_name: Optional[str] = None, add_source: bool = False, source: Optional[pandas.Series] = None, source_name: Optional[str] = None, drop: bool = False, **kwargs) → pandas.DataFrame
Verifies that the values in a specific column from df match some expected values.
- Parameters
df (DataFrame) – DataFrame with values.
observed_col (str) – Name of the column in df with the values to verify.
expected (Series) – Series with expected values. Must have the same length as df.
flag_name (str) – Name of the resulting column indicating whether the observed values match the expected values.
add_suggested (bool) – Whether to add a column to the result with suggested values for those rows where the observed values do not match the expected values.
suggested_name (str) – Name of the column for the suggested values. Only has an effect when add_suggested is True.
add_source (bool) –
source (Series) –
drop (bool) – Whether to drop the rows where the observed values do not match the expected values.
**kwargs – Keyword arguments accepted by the match function.
- Returns
Copy of df with extra columns.
- Return type
DataFrame
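The flag, suggestion and drop options can be sketched with plain pandas. This is a simplified illustration and not regi0's implementation: it uses exact comparison instead of the match function, it leaves out add_source, source and source_name, and it assumes that the suggested column holds the expected value on mismatched rows and is empty elsewhere. The helper name verify_sketch and the example columns are hypothetical.

```python
import pandas as pd


def verify_sketch(df, observed_col, expected, flag_name,
                  add_suggested=False, suggested_name=None, drop=False):
    """Hypothetical stand-in for regi0.verify using exact comparison."""
    result = df.copy()  # the input DataFrame is left untouched
    flag = result[observed_col] == expected
    result[flag_name] = flag
    if add_suggested:
        # Assumed behaviour: show the expected value only on mismatched rows.
        result[suggested_name] = expected.where(~flag)
    if drop:
        # Keep only the rows whose observed value matched.
        result = result[flag]
    return result


df = pd.DataFrame({"id": [1, 2, 3], "country": ["Colombia", "Peru", "Brasil"]})
expected = pd.Series(["Colombia", "Ecuador", "Brazil"])

flagged = verify_sketch(df, "country", expected, "country_ok",
                        add_suggested=True, suggested_name="country_suggested")
clean = verify_sketch(df, "country", expected, "country_ok", drop=True)
```

Here flagged keeps all three rows and marks the last two as mismatches with a suggestion, while clean keeps only the first row.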
- regi0.write_table(df: pandas.DataFrame, path: Union[str, pathlib.Path], **kwargs) → None
Writes tabular data (csv, txt, xls or xlsx) to disk.
- Parameters
df (DataFrame) – DataFrame to write to disk.
path (str or Path) – Filename with extension. Can be a relative or absolute path.
**kwargs – Keyword arguments accepted by the pandas DataFrame.to_csv and DataFrame.to_excel methods.
- Return type
None
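One way a writer like this could choose the pandas method from the file extension is sketched below. The dispatch logic, the tab delimiter for .txt files and the helper name write_table_sketch are assumptions for illustration; regi0's actual behaviour may differ.

```python
import pathlib
import tempfile

import pandas as pd


def write_table_sketch(df, path, **kwargs):
    """Hypothetical extension-based dispatch; not regi0's code."""
    path = pathlib.Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        df.to_csv(path, **kwargs)
    elif suffix == ".txt":
        # Assumption: .txt files are tab-delimited.
        df.to_csv(path, sep="\t", **kwargs)
    elif suffix in (".xls", ".xlsx"):
        # Requires an Excel engine to be installed (e.g. openpyxl for .xlsx).
        df.to_excel(path, **kwargs)
    else:
        raise ValueError(f"Unsupported extension: {suffix}")


df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
with tempfile.TemporaryDirectory() as tmp:
    out = pathlib.Path(tmp) / "records.csv"
    # index=False is forwarded unchanged to DataFrame.to_csv.
    write_table_sketch(df, out, index=False)
    round_trip = pd.read_csv(out)
```

Because the keyword arguments are forwarded as-is, an option that only one writer understands (such as sheet_name for Excel) would raise an error if used with a different extension.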