species module

species module API:

  • name_backbone

  • name_suggest

  • name_usage

  • name_lookup

  • name_parser

Example usage:

from pygbif import species
species.name_suggest(q='Puma concolor')

species API

species.name_backbone(taxonRank=None, usageKey=None, kingdom=None, phylum=None, class_=None, order=None, superfamily=None, family=None, subfamily=None, tribe=None, subtribe=None, genus=None, subgenus=None, species=None, taxonID=None, taxonConceptID=None, scientificNameID=None, scientificNameAuthorship=None, genericName=None, specificEpithet=None, infraspecificEpithet=None, verbatimTaxonRank=None, exclude=None, strict=None, verbose=None, checklistKey=None, **kwargs)

Match names to the GBIF backbone taxonomy.

Parameters:
  • scientificName – [str] Full scientific name potentially with authorship. (Required)

  • taxonRank – [str], optional Filter by taxonomic rank. See API reference for available values.

  • usageKey – [str], optional The usage key to look up. When provided, all other fields are ignored.

  • kingdom – [str], optional Kingdom to match.

  • phylum – [str], optional Phylum to match.

  • class_ – [str], optional Class to match. The trailing underscore avoids a clash with the Python keyword class.

  • order – [str], optional Order to match.

  • superfamily – [str], optional Superfamily to match.

  • family – [str], optional Family to match.

  • subfamily – [str], optional Subfamily to match.

  • tribe – [str], optional Tribe to match.

  • subtribe – [str], optional Subtribe to match.

  • genus – [str], optional Genus to match.

  • subgenus – [str], optional Subgenus to match.

  • species – [str], optional Species to match.

  • taxonID – [str], optional The taxon ID to look up. Matches to a taxonID will take precedence over scientificName values supplied. A comparison of the matched scientific name and taxonID is performed to check for inconsistencies.

  • taxonConceptID – [str], optional The taxonConceptID to match. Matches to a taxonConceptID will take precedence over scientificName values supplied. A comparison of the matched scientific name and taxonConceptID is performed to check for inconsistencies.

  • scientificNameID – [str], optional The scientificNameID to match. Matches to a scientificNameID will take precedence over scientificName values supplied. A comparison of the matched scientific name and scientificNameID is performed to check for inconsistencies.

  • scientificNameAuthorship – [str], optional The scientific name authorship to match against.

  • genericName – [str], optional Generic part of the name to match when given as atomised parts instead of the full name.

  • specificEpithet – [str], optional Specific epithet to match.

  • infraspecificEpithet – [str], optional Infraspecific epithet to match.

  • verbatimTaxonRank – [str], optional Filters by free text taxon rank.

  • exclude – [str], optional An array of usage keys to exclude from the match.

  • strict – [bool], optional If set to True, fuzzy matches only the given name, but never a taxon in the upper classification.

  • verbose – [bool], optional If set to True, shows alternative matches which were considered but then rejected.

  • checklistKey – [str], optional The key of a checklist to use. Default is the GBIF Backbone taxonomy.
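The precedence rule for usageKey can be made explicit on the client side. The sketch below is not part of pygbif: build_backbone_query is a hypothetical helper that drops unset arguments and, reading the usageKey description literally, keeps only that key when it is supplied.

```python
def build_backbone_query(**fields):
    """Return the keyword arguments worth passing to name_backbone().

    Hypothetical helper for illustration; it is not part of pygbif.
    """
    # Drop arguments the caller left unset.
    query = {k: v for k, v in fields.items() if v is not None}
    # Documented rule: when usageKey is provided, all other fields are ignored.
    # Assumption: we take that literally and send nothing else.
    if "usageKey" in query:
        return {"usageKey": query["usageKey"]}
    return query


query = build_backbone_query(scientificName="Poa", taxonRank="GENUS", family=None)
print(query)
# With network access, species.name_backbone(**query) would perform the match.
```

Building the arguments separately from the call keeps the precedence rule visible and easy to check without contacting the API.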

name_backbone() returns a dictionary with the following keys: ['usage', 'classification', 'diagnostics', 'synonym']

  • usage: Returns the matched name and some details, such as the usage key.

  • classification: Returns the upper classification of the matched name.

  • diagnostics: Returns information about the match, such as match type, issues, and confidence.

  • synonym: Indicates if the matched name is a synonym.

The default is to return the best match for the given name. If there are “multiple equal matches”, name_backbone() will return a note in the diagnostics section: res["diagnostics"]["note"] = "Multiple equal matches for name".

This note usually appears when a binomial name is provided without authorship. Providing authorship will almost always fix the “multiple equal matches” issue. If verbose=True, the function will also return the alternative matches. These are accessible via res['diagnostics']['alternatives'].

If your name does not get a match, the GBIF API will return matchType = 'NONE'. If the species-level name is not found, the API will sometimes return matchType = 'HIGHERRANK'. With higher rank matches, the name is matched to a higher taxonomic rank, such as genus or family. Supplying authorship often improves matching results.

If strict=True, then higher taxon ranks will not be returned when there is a “fuzzy match”. Higher rank matches will still be returned if the match is exact.
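The diagnostics described above can be checked programmatically. The following is a minimal sketch that works on any dictionary shaped like a name_backbone() result. It assumes that matchType sits inside the diagnostics section next to note and alternatives; describe_match is a hypothetical helper, and the sample dictionary is hand-written rather than real API output.

```python
def describe_match(res):
    """Summarise the diagnostics of a name_backbone()-style result dict."""
    diagnostics = res.get("diagnostics", {})
    # Assumption: matchType is reported inside the diagnostics section.
    match_type = diagnostics.get("matchType", "NONE")
    note = diagnostics.get("note") or ""
    return {
        "matched": match_type != "NONE",
        "higher_rank_only": match_type == "HIGHERRANK",
        "ambiguous": "Multiple equal matches" in note,
        "n_alternatives": len(diagnostics.get("alternatives", [])),
    }


# Hand-written stand-in for an API response; the values are illustrative only.
sample = {
    "diagnostics": {
        "matchType": "HIGHERRANK",
        "note": "Multiple equal matches for name",
        "alternatives": [{}, {}],
    }
}
print(describe_match(sample))
```

When ambiguous is true, retrying with authorship included in scientificName is the first thing to try.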

To match names against a specific checklist, you can use the checklistKey parameter. This allows you to specify a checklist from which the name should be matched. If no checklistKey is provided, the GBIF Backbone Taxonomy is used by default.

For more information, see the GBIF API documentation: https://techdocs.gbif.org/en/openapi/v1/species#/Searching%20names/matchNames

Usage:

from pygbif import species
species.name_backbone(scientificName="Helianthus annuus", kingdom="Plantae")
species.name_backbone(scientificName="Poa", taxonRank="GENUS", family="Poaceae")

# Verbose - gives back alternatives
species.name_backbone(scientificName="Helianthus annuus", kingdom="Plantae", verbose=True)

# Strictness
# If strict=True, then higher taxon ranks will not be returned  when there is a "fuzzy match".
# Higher rank matches will still be returned if the match is exact.
species.name_backbone(scientificName="Poa", kingdom="Plantae", verbose=True, strict=False)
species.name_backbone(scientificName="Helianthus annuus", kingdom="Plantae", verbose=True, strict=True)

# Multiple equal matches
species.name_backbone(scientificName="Oenante")
species.name_backbone(scientificName="Oenante", verbose=True)
species.name_backbone(scientificName="Calopteryx")
species.name_backbone(scientificName="Calopteryx", verbose=True)

# Including authorship in scientificName fixes "Multiple equal matches" note
species.name_backbone(scientificName="Calopteryx splendens (Harris, 1780)")
species.name_backbone(scientificName="Oenanthe L.")

# Match using an alternative checklist 
species.name_backbone(scientificName="Calopteryx splendens", checklistKey="7ddf754f-d193-4cc9-b351-99906754a03b")

species.name_suggest(datasetKey=None, rank=None, limit=100, offset=None, **kwargs)

A quick and simple autocomplete service that returns up to 20 name usages by doing prefix matching against the scientific name. Results are ordered by relevance.

Parameters:
  • q – [str] Simple search parameter. The value for this parameter can be a simple word or a phrase. Wildcards can be added to the simple word parameters only, e.g. q=*puma* (Required)

  • datasetKey – [str] Filters by the checklist dataset key (a uuid, see examples)

  • rank – [str] A taxonomic rank. One of class, cultivar, cultivar_group, domain, family, form, genus, informal, infrageneric_name, infraorder, infraspecific_name, infrasubspecific_name, kingdom, order, phylum, section, series, species, strain, subclass, subfamily, subform, subgenus, subkingdom, suborder, subphylum, subsection, subseries, subspecies, subtribe, subvariety, superclass, superfamily, superorder, superphylum, suprageneric_name, tribe, unranked, or variety.

  • limit – [int] Number of records to return. Maximum: 1000. (optional)

  • offset – [int] Record number to start at. (optional)
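Because rank accepts only the values listed above and limit has a documented maximum, arguments can be checked before a request is sent. This is a sketch only: check_suggest_args is a hypothetical helper, not a pygbif function, and accepting upper-case rank names is an assumption.

```python
# Rank values copied from the parameter description above.
SUGGEST_RANKS = frozenset("""
class cultivar cultivar_group domain family form genus informal
infrageneric_name infraorder infraspecific_name infrasubspecific_name
kingdom order phylum section series species strain subclass subfamily
subform subgenus subkingdom suborder subphylum subsection subseries
subspecies subtribe subvariety superclass superfamily superorder
superphylum suprageneric_name tribe unranked variety
""".split())

MAX_LIMIT = 1000  # documented maximum for limit


def check_suggest_args(rank=None, limit=100):
    """Validate rank and limit, returning them as keyword arguments."""
    # Assumption: rank names are compared case-insensitively.
    if rank is not None and rank.lower() not in SUGGEST_RANKS:
        raise ValueError(f"unknown rank: {rank!r}")
    if not 0 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 0 and {MAX_LIMIT}, got {limit}")
    return {"rank": rank, "limit": limit}


print(check_suggest_args(rank="genus", limit=2))
```

The returned dictionary can be unpacked into a call such as species.name_suggest(q='Puma', **args).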

Returns:

A dictionary

References: http://www.gbif.org/developer/species#searching

Usage:

from pygbif import species

species.name_suggest(q='Puma concolor')
x = species.name_suggest(q='Puma')
species.name_suggest(q='Puma', rank="genus")
species.name_suggest(q='Puma', rank="subspecies")
species.name_suggest(q='Puma', rank="species")
species.name_suggest(q='Puma', rank="infraspecific_name")
species.name_suggest(q='Puma', limit=2)

species.name_lookup(rank=None, higherTaxonKey=None, status=None, isExtinct=None, habitat=None, nameType=None, datasetKey=None, nomenclaturalStatus=None, limit=100, offset=None, facet=False, facetMincount=None, facetMultiselect=None, type=None, hl=False, verbose=False, **kwargs)

Lookup names in all taxonomies in GBIF.

This service uses fuzzy lookup so that you can put in partial names and you should get back those things that match. See examples below.

Parameters:
  • q – [str] Query term(s) for full text search (optional)

  • rank – [str] CLASS, CULTIVAR, CULTIVAR_GROUP, DOMAIN, FAMILY, FORM, GENUS, INFORMAL, INFRAGENERIC_NAME, INFRAORDER, INFRASPECIFIC_NAME, INFRASUBSPECIFIC_NAME, KINGDOM, ORDER, PHYLUM, SECTION, SERIES, SPECIES, STRAIN, SUBCLASS, SUBFAMILY, SUBFORM, SUBGENUS, SUBKINGDOM, SUBORDER, SUBPHYLUM, SUBSECTION, SUBSERIES, SUBSPECIES, SUBTRIBE, SUBVARIETY, SUPERCLASS, SUPERFAMILY, SUPERORDER, SUPERPHYLUM, SUPRAGENERIC_NAME, TRIBE, UNRANKED, VARIETY (optional)

  • verbose – [bool] If True, show alternative matches that were considered but rejected.

  • higherTaxonKey – [str] Filters by any of the higher Linnean rank keys. Note this is within the respective checklist and not searching nub keys across all checklists (optional)

  • status

    [str] (optional) Filters by the taxonomic status as one of:

    • ACCEPTED

    • DETERMINATION_SYNONYM Used for unknown child taxa referred to via spec, ssp, …

    • DOUBTFUL Treated as accepted, but doubtful whether this is correct.

    • HETEROTYPIC_SYNONYM More specific subclass of SYNONYM.

    • HOMOTYPIC_SYNONYM More specific subclass of SYNONYM.

    • INTERMEDIATE_RANK_SYNONYM Used in nub only.

    • MISAPPLIED More specific subclass of SYNONYM.

    • PROPARTE_SYNONYM More specific subclass of SYNONYM.

    • SYNONYM A general synonym, the exact type is unknown.

  • isExtinct – [bool] Filters by extinction status (e.g. isExtinct=True)

  • habitat – [str] Filters by habitat. One of: marine, freshwater, or terrestrial (optional)

  • nameType

    [str] (optional) Filters by the name type as one of:

    • BLACKLISTED surely not a scientific name.

    • CANDIDATUS Candidatus is a component of the taxonomic name for a bacterium that cannot be maintained in a Bacteriology Culture Collection.

    • CULTIVAR a cultivated plant name.

    • DOUBTFUL doubtful whether this is a scientific name at all.

    • HYBRID a hybrid formula (not a hybrid name).

    • INFORMAL a scientific name with some informal addition like “cf.” or indetermined like Abies spec.

    • SCINAME a scientific name which is not well formed.

    • VIRUS a virus name.

    • WELLFORMED a well formed scientific name according to present nomenclatural rules.

  • datasetKey – [str] Filters by the dataset’s key (a uuid) (optional)

  • nomenclaturalStatus – [str] Not yet implemented, but will eventually allow for filtering by a nomenclatural status enum

  • limit – [int] Number of records to return. Maximum: 1000. (optional)

  • offset – [int] Record number to start at. (optional)

  • facet – [str] A list of facet names used to retrieve the 100 most frequent values for a field. Allowed facets are: datasetKey, higherTaxonKey, rank, status, isExtinct, habitat, and nameType. Additionally threat and nomenclaturalStatus are legal values but not yet implemented, so data will not yet be returned for them. (optional)

  • facetMincount – [str] Used in combination with the facet parameter. Set facetMincount={#} to exclude facets with a count less than {#}, e.g. http://bit.ly/1bMdByP only shows the type value ACCEPTED because the other statuses have counts less than 7,000,000 (optional)

  • facetMultiselect – [bool] Used in combination with the facet parameter. Set facetMultiselect=True to still return counts for values that are not currently filtered, e.g. http://bit.ly/19YLXPO still shows all status values even though status is being filtered by status=ACCEPTED (optional)

  • type – [str] Type of name. One of occurrence, checklist, or metadata. (optional)

  • hl – [bool] Set hl=True to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class gbifH1 e.g. q='plant', hl=True. Fulltext search fields include: title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response. (optional)
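facetMincount is applied by the server, but its effect is easy to reproduce on a facet list, which helps when reasoning about a threshold. The sketch below assumes each facet is a dictionary with a field name and a list of name/count entries; that shape is an assumption about the response, apply_mincount is a hypothetical helper, and the sample numbers are made up.

```python
def apply_mincount(facets, mincount):
    """Keep only facet values whose count is at least mincount."""
    return [
        {
            "field": facet["field"],
            "counts": [c for c in facet["counts"] if c["count"] >= mincount],
        }
        for facet in facets
    ]


# Made-up counts, for illustration only.
sample_facets = [
    {
        "field": "STATUS",
        "counts": [
            {"name": "ACCEPTED", "count": 900},
            {"name": "SYNONYM", "count": 400},
            {"name": "DOUBTFUL", "count": 50},
        ],
    }
]
print(apply_mincount(sample_facets, 500))
```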

Returns:

A dictionary

References:

http://www.gbif.org/developer/species#searching
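Since limit is capped at 1000, larger result sets have to be collected page by page with offset. A minimal sketch of such a loop follows. It takes any callable that accepts limit and offset and returns a dictionary with a results list; the endOfRecords flag is assumed to be present in the response, and iter_records and fake_fetch are hypothetical names used so the example runs without network access.

```python
def iter_records(fetch, page_size=1000, max_records=None):
    """Yield records one by one, requesting pages via limit and offset."""
    if max_records is not None and max_records <= 0:
        return
    offset = 0
    yielded = 0
    while True:
        page = fetch(limit=page_size, offset=offset)
        results = page.get("results", [])
        for record in results:
            yield record
            yielded += 1
            if max_records is not None and yielded >= max_records:
                return
        # Assumption: the response carries an endOfRecords flag.
        # An empty page also ends the loop, so the flag is not strictly required.
        if not results or page.get("endOfRecords", False):
            return
        offset += len(results)


def fake_fetch(limit, offset):
    """Stand-in for a lookup call: serves five records in pages."""
    data = list(range(5))
    return {
        "results": data[offset:offset + limit],
        "endOfRecords": offset + limit >= len(data),
    }


print(list(iter_records(fake_fetch, page_size=2)))  # [0, 1, 2, 3, 4]
```

With pygbif installed, fetch could be something like lambda **kw: species.name_lookup(q='mammalia', **kw).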

Usage:

from pygbif import species

# Look up names like mammalia
species.name_lookup(q='mammalia')

# Paging
species.name_lookup(q='mammalia', limit=1)
species.name_lookup(q='mammalia', limit=1, offset=2)

# large requests, use offset parameter
first = species.name_lookup(q='mammalia', limit=1000)
second = species.name_lookup(q='mammalia', limit=1000, offset=1000)

# Verbose output
species.name_lookup('Helianthus annuus', rank="species", verbose=True)

# Get all data and parse it, removing descriptions field which can be quite long
out = species.name_lookup('Helianthus annuus', rank="species")
res = out['results']
for z in res:
    z.pop('descriptions', None)
res

# Fuzzy searching
species.name_lookup(q='Heli', rank="genus")

# Limit records to certain number
species.name_lookup('Helianthus annuus', rank="species", limit=2)

# Query by habitat
species.name_lookup(habitat="terrestrial", limit=2)
species.name_lookup(habitat="marine", limit=2)
species.name_lookup(habitat="freshwater", limit=2)

# Using faceting
species.name_lookup(facet='status', limit=0, facetMincount='70000')
species.name_lookup(facet=['status', 'higherTaxonKey'], limit=0, facetMincount='700000')

species.name_lookup(facet='nameType', limit=0)
species.name_lookup(facet='habitat', limit=0)
species.name_lookup(facet='datasetKey', limit=0)
species.name_lookup(facet='rank', limit=0)
species.name_lookup(facet='isExtinct', limit=0)

# text highlighting
species.name_lookup(q='plant', hl=True, limit=30)

# Lookup by datasetKey
species.name_lookup(datasetKey='3f8a1297-3259-4700-91fc-acc4170b27ce')

species.name_usage(name=None, data='all', language=None, datasetKey=None, uuid=None, sourceId=None, rank=None, shortname=None, limit=100, offset=None, **kwargs)

Lookup details for specific names in all taxonomies in GBIF.

Parameters:
  • key – [int] A GBIF key for a taxon

  • name – [str] Filters by a case insensitive, canonical namestring, e.g. ‘Puma concolor’

  • data – [str] The type of data to get. Default: all. Options: all, verbatim, name, parents, children, related, synonyms, descriptions, distributions, media, references, speciesProfiles, vernacularNames, typeSpecimens, root

  • language – [str] Language. Expects an ISO 639-1 language code (two lowercase letters). Languages returned are 3-letter codes. The language parameter only applies to the /species, /species/{int}, /species/{int}/parents, /species/{int}/children, /species/{int}/related, /species/{int}/synonyms routes (the route is determined by the data parameter).

  • datasetKey – [str] Filters by the dataset’s key (a uuid)

  • uuid – [str] A uuid for a dataset. Should give exact same results as datasetKey.

  • sourceId – [int] Filters by the source identifier.

  • rank – [str] Taxonomic rank. Filters by taxonomic rank as one of: CLASS, CULTIVAR, CULTIVAR_GROUP, DOMAIN, FAMILY, FORM, GENUS, INFORMAL, INFRAGENERIC_NAME, INFRAORDER, INFRASPECIFIC_NAME, INFRASUBSPECIFIC_NAME, KINGDOM, ORDER, PHYLUM, SECTION, SERIES, SPECIES, STRAIN, SUBCLASS, SUBFAMILY, SUBFORM, SUBGENUS, SUBKINGDOM, SUBORDER, SUBPHYLUM, SUBSECTION, SUBSERIES, SUBSPECIES, SUBTRIBE, SUBVARIETY, SUPERCLASS, SUPERFAMILY, SUPERORDER, SUPERPHYLUM, SUPRAGENERIC_NAME, TRIBE, UNRANKED, VARIETY

  • shortname – [str] A short name. Its exact meaning is not documented here.

  • limit – [int] Number of records to return. Default: 100. Maximum: 1000. (optional)

  • offset – [int] Record number to start at. (optional)
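The relationship between data, the route that is queried, and the language parameter can be written down as a small lookup. This is an illustrative sketch, not how pygbif builds its requests: usage_route and language_applies are hypothetical helpers, treating data='all' as the plain /species and /species/{int} routes is an assumption, and data='root' is left out because its route is not described here.

```python
# Values of data whose routes honour the language parameter, per the list above.
# Assumption: "all" stands for the plain /species and /species/{int} routes.
LANGUAGE_AWARE = {"all", "parents", "children", "related", "synonyms"}


def usage_route(key=None, data="all"):
    """Return the route path implied by key and data (sketch only)."""
    if data == "root":
        raise ValueError("data='root' is not covered by this sketch")
    if key is None:
        if data != "all":
            raise ValueError("a key is needed when data is not 'all'")
        return "/species"
    if data == "all":
        return f"/species/{key}"
    return f"/species/{key}/{data}"


def language_applies(data="all"):
    """Tell whether the language parameter has any effect for this data value."""
    return data in LANGUAGE_AWARE


print(usage_route(2435099, "references"))
print(language_applies("children"), language_applies("vernacularNames"))
```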

References: See http://www.gbif.org/developer/species#nameUsages for details

Usage:

from pygbif import species

species.name_usage(key=1)

# Name usage for a taxonomic name
species.name_usage(name='Puma', rank="GENUS")

# All name usages
species.name_usage()

# References for a name usage
species.name_usage(key=2435099, data='references')

# Species profiles, descriptions
species.name_usage(key=5231190, data='speciesProfiles')
species.name_usage(key=5231190, data='descriptions')
species.name_usage(key=2435099, data='children')

# Vernacular names for a name usage
species.name_usage(key=5231190, data='vernacularNames')

# Limit number of results returned
species.name_usage(key=5231190, data='vernacularNames', limit=3)

# Search for names by dataset with datasetKey parameter
species.name_usage(datasetKey="d7dddbf4-2cf0-4f39-9b2a-bb099caae36c")

species.name_parser(**kwargs)

Parse taxon names using the GBIF name parser.

Parameters:
  • name – [str or list] A scientific name, or a list of scientific names, to parse. (Required)

References: http://www.gbif.org/developer/species#parser

Usage:

from pygbif import species
species.name_parser('x Agropogon littoralis')
species.name_parser(['Arrhenatherum elatius var. elatius',
  'Secale cereale subsp. cereale', 'Secale cereale ssp. cereale',
  'Vanessa atalanta (Linnaeus, 1758)'])