regi0.taxonomic

regi0.taxonomic.get_canonical_name(names: pandas.Series) → pandas.Series

Extracts the canonical name (genus and specific epithet) from each scientific name in a Series. It does so by removing special characters, numbers and Open Nomenclature qualifiers (such as aff. or cf.) and then keeping the first two words.

Parameters

names (Series) – Series with the scientific names.

Returns

Series with the extracted canonical names.

Return type

Series
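
The cleaning steps described for get_canonical_name can be approximated in plain Python. This is a minimal sketch for a single string, not regi0's implementation: the function name canonical_name_sketch, the QUALIFIERS tuple and the ASCII-only character filter are all assumptions made for illustration.

```python
import re

# Hypothetical, non-exhaustive list of Open Nomenclature qualifiers.
QUALIFIERS = ("aff.", "cf.")


def canonical_name_sketch(name: str) -> str:
    """Approximate the described cleaning steps for one scientific name."""
    # Drop qualifier tokens first, while their trailing dot is still present.
    words = [w for w in name.split() if w.lower() not in QUALIFIERS]
    # Replace anything that is not an ASCII letter, hyphen or whitespace
    # (assumption: the real function may treat non-ASCII letters differently).
    cleaned = re.sub(r"[^A-Za-z\s-]", " ", " ".join(words))
    # Keep the first two words: genus and specific epithet.
    return " ".join(cleaned.split()[:2])


print(canonical_name_sketch("Panthera cf. onca (Linnaeus, 1758)"))
```

To apply the same idea to a whole Series, the function could be passed to pandas.Series.map.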

regi0.taxonomic.get_checklist_fields(names: Union[list, pandas.Series, str], checklist: Union[str, pathlib.Path, pandas.DataFrame], name_field: str, fields: Union[list, str, tuple], add_supplied_names: bool = False, expand: bool = True) → pandas.DataFrame

Retrieves values for one or multiple fields from a checklist given some species names.

Parameters
  • names (list, Series or str) – Species names to look up.

  • checklist (str, Path or DataFrame) – Path to a table, or a DataFrame, with checklist information.

  • name_field (str) – Name of the column in checklist with species names.

  • fields (list, str or tuple) – Field (column) or list of fields to retrieve from checklist.

  • add_supplied_names (bool) – Whether to add names as an extra column in the result.

  • expand (bool) – Whether to expand result rows to match names size. If False, the number of rows will correspond to the number of unique names in names.

Returns

DataFrame with the values retrieved from checklist.

Return type

DataFrame
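
The effect of expand on the shape of the result can be illustrated with plain pandas. This is a sketch under stated assumptions rather than regi0's code: the checklist contents and the column names scientificName and threatStatus are hypothetical, and leaving unmatched names as NaN is an assumption, since the behaviour for missing names is not described here.

```python
import pandas as pd

# Hypothetical checklist; column names are made up for illustration.
checklist = pd.DataFrame(
    {
        "scientificName": ["Panthera onca", "Tremarctos ornatus"],
        "threatStatus": ["NT", "VU"],
    }
)
names = pd.Series(["Panthera onca", "Ara macao", "Panthera onca"])

# One row per unique name (the shape described for expand=False).
unique_names = pd.Series(names.unique(), name="scientificName")
compact = unique_names.to_frame().merge(checklist, on="scientificName", how="left")

# One row per supplied name, in the original order (the shape for expand=True).
expanded = names.to_frame("scientificName").merge(
    checklist, on="scientificName", how="left"
)

print(len(compact), len(expanded))
```

In this sketch the left merge keeps every supplied name, so names absent from the checklist show up with missing values instead of being dropped.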

regi0.taxonomic.get_checklist_fields_multiple(names: Union[list, pandas.Series, str], filenames: list, name_field: str, fields: Union[list, str], add_supplied_names: bool = False, expand: bool = True, keep_first: bool = True, add_source: bool = False, source_name: str = 'source') → pandas.DataFrame

Retrieves values for one or multiple fields from multiple checklists given some species names. If a species name is found in more than one checklist, only the field values from one of them are kept (see keep_first).

Parameters
  • names – Species names, given as a list, Series or str.

  • filenames – List of checklist file names.

  • name_field – Name of the column in checklist with species names.

  • fields – List of fields (columns) to retrieve from checklist.

  • add_supplied_names – Whether to add names as an extra column in the result.

  • expand – Whether to expand result rows to match names size. If False, the number of rows will correspond to the number of unique names in names.

  • keep_first – Whether to keep the values from the first checklist in which a name is found. If False, the values from the latest match are used instead.

  • add_source – Whether to add a column with the name of the checklist the values were retrieved from.

  • source_name – Name of the column with the source.

Returns

DataFrame with the values retrieved from the checklists.

Return type

DataFrame
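
One way to picture keep_first and add_source is to stack the checklists in order and keep a single row per name. The sketch below uses in-memory DataFrames instead of files and is not regi0's implementation: the helper combine, the labels "national" and "global", and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical checklists, keyed by a made-up source label.
checklists = {
    "national": pd.DataFrame(
        {"scientificName": ["Panthera onca"], "threatStatus": ["VU"]}
    ),
    "global": pd.DataFrame(
        {
            "scientificName": ["Panthera onca", "Ara macao"],
            "threatStatus": ["NT", "LC"],
        }
    ),
}


def combine(checklists: dict, keep_first: bool = True) -> pd.DataFrame:
    # Stack the checklists in order, tagging each row with its source label.
    stacked = pd.concat(
        [df.assign(source=label) for label, df in checklists.items()],
        ignore_index=True,
    )
    # For names present in several checklists, keep a single row.
    keep = "first" if keep_first else "last"
    deduplicated = stacked.drop_duplicates(subset="scientificName", keep=keep)
    return deduplicated.reset_index(drop=True)


first = combine(checklists, keep_first=True)
latest = combine(checklists, keep_first=False)
print(first)
print(latest)
```

With keep_first=True the row for Panthera onca comes from the "national" table; with keep_first=False it comes from "global", the last table in which the name appears.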

regi0.taxonomic.is_in_checklist(names: Union[list, pandas.Series, str], checklist: pandas.DataFrame, name_field: str, add_supplied_names: bool = False, expand: bool = True) → pandas.DataFrame

Checks whether some species names are found in a given checklist.

Parameters
  • names – Species names, given as a list, Series or str.

  • checklist – DataFrame with checklist information.

  • name_field – Name of the column in checklist with species names.

  • add_supplied_names – Whether to add names as an extra column in the result.

  • expand – Whether to expand result rows to match names size. If False, the number of rows will correspond to the number of unique names in names.

Returns

DataFrame with a Boolean column indicating whether each name is present in checklist. If add_supplied_names=True, the result has an extra column with the supplied names.

Return type

DataFrame
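
The membership check itself can be sketched with pandas.Series.isin. This only illustrates the idea and is not regi0's implementation; the column names scientificName, supplied_name and in_checklist are hypothetical and may differ from what the library returns.

```python
import pandas as pd

# Hypothetical checklist with a single name column.
checklist = pd.DataFrame(
    {"scientificName": ["Panthera onca", "Tremarctos ornatus"]}
)
names = pd.Series(["Panthera onca", "Ara macao"])

# True where the supplied name appears in the checklist's name column.
found = names.isin(checklist["scientificName"])

# Roughly the shape described for add_supplied_names=True.
result = pd.DataFrame({"supplied_name": names, "in_checklist": found})
print(result)
```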

regi0.taxonomic.is_in_checklist_multiple(names: Union[list, pandas.Series, str], filenames: list, name_field: str, add_supplied_names: bool = False, expand: bool = True, keep_first: bool = True, add_source: bool = False, source_name: str = 'source') → Union[pandas.DataFrame, pandas.Series]

Checks whether some species names are found in multiple checklists.

Parameters
  • names – Species names, given as a list, Series or str.

  • filenames – List of checklist file names.

  • name_field – Name of the column in checklist with species names.

  • add_supplied_names – Whether to add names as an extra column in the result.

  • expand – Whether to expand result rows to match names size. If False, the number of rows will correspond to the number of unique names in names.

  • keep_first – Whether to keep the first checklist in which a name is found. If False, the latest match is used instead.

  • add_source – Whether to add a column with the name of the checklist in which each name was found.

  • source_name – Name of the column with the source.

Returns

DataFrame with a Boolean column indicating whether each name is present in the checklists. If add_supplied_names=True or add_source=True, the result has extra columns.

Return type

DataFrame