species module
species module API:
- name_backbone
- name_suggest
- name_usage
- name_lookup
- name_parser
Example usage:
from pygbif import species
species.name_suggest(q='Puma concolor')
species API
species.name_backbone(name, rank=None, kingdom=None, phylum=None, clazz=None, order=None, family=None, genus=None, strict=False, verbose=False, offset=None, limit=100, **kwargs)
Lookup names in the GBIF backbone taxonomy.
Parameters:
- name – [str] Full scientific name potentially with authorship (required)
- rank – [str] The rank given as our rank enum. (optional)
- kingdom – [str] If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- phylum – [str] If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- clazz – [str] The taxonomic class, spelled clazz because class is a reserved word in Python. If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- order – [str] If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- family – [str] If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- genus – [str] If provided default matching will also try to match against this if no direct match is found for the name alone. (optional)
- strict – [bool] If True it (fuzzy) matches only the given name, but never a taxon in the upper classification (optional)
- verbose – [bool] If True show alternative matches considered which had been rejected.
- offset – [int] Record to start at. Default: 0
- limit – [int] Number of results to return. Default: 100
If you are looking for behavior similar to the GBIF website when you search for a name, name_backbone may be what you want. For example, a search for Lantanophaga pusillidactyla on the GBIF website and with name_backbone will give back as a first result the correct name Lantanophaga pusillidactylus.
Returns: With verbose=False (the default), a single taxon record with many fields. With verbose=True, a result with two parts: the suggested taxon match, and the alternative name suggestions resulting from fuzzy matching. If you don’t get a match, GBIF gives back three fields: synonym, confidence, and matchType='NONE'.
Reference: https://www.gbif.org/developer/species#searching
Usage:
from pygbif import species
species.name_backbone(name='Helianthus annuus', kingdom='plants')
species.name_backbone(name='Helianthus', rank='genus', kingdom='plants')
species.name_backbone(name='Poa', rank='genus', family='Poaceae')
# Verbose - gives back alternatives
species.name_backbone(name='Helianthus annuus', kingdom='plants', verbose=True)
# Strictness
species.name_backbone(name='Poa', kingdom='plants', verbose=True, strict=False)
species.name_backbone(name='Helianthus annuus', kingdom='plants', verbose=True, strict=True)
# Non-existent name
species.name_backbone(name='Aso')
# Multiple equal matches
species.name_backbone(name='Oenante')
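The no-match case described above can be tested in code before the result is used. The following is a minimal sketch, assuming the result is a plain dict that carries the matchType field mentioned above. has_match is a hypothetical helper, not part of pygbif, and the two sample dicts are hand-written for illustration rather than real GBIF responses.

```python
def has_match(result):
    """Return True unless the result reports matchType 'NONE'."""
    # Assumption: the result is dict-like and exposes a 'matchType' key.
    # A missing key is treated the same as 'NONE'.
    return result.get("matchType", "NONE") != "NONE"


# Illustrative, hand-written dicts (not real GBIF responses).
no_match = {"synonym": False, "confidence": 100, "matchType": "NONE"}
some_match = {"matchType": "EXACT", "confidence": 98}

print(has_match(no_match))    # False
print(has_match(some_match))  # True
```

Under the same assumption, has_match(species.name_backbone(name='Aso')) would be one way to apply the check to a live lookup.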
species.name_suggest(q=None, datasetKey=None, rank=None, limit=100, offset=None, **kwargs)
A quick and simple autocomplete service that returns up to 20 name usages by doing prefix matching against the scientific name. Results are ordered by relevance.
Parameters:
- q – [str] Simple search parameter. The value for this parameter can be a simple word or a phrase. Wildcards can be added to the simple word parameters only, e.g. q=*puma* (required)
- datasetKey – [str] Filters by the checklist dataset key (a uuid, see examples)
- rank – [str] A taxonomic rank. One of class, cultivar, cultivar_group, domain, family, form, genus, informal, infrageneric_name, infraorder, infraspecific_name, infrasubspecific_name, kingdom, order, phylum, section, series, species, strain, subclass, subfamily, subform, subgenus, subkingdom, suborder, subphylum, subsection, subseries, subspecies, subtribe, subvariety, superclass, superfamily, superorder, superphylum, suprageneric_name, tribe, unranked, or variety.
- limit – [int] Number of records to return. Maximum: 1000. (optional)
- offset – [int] Record number to start at. (optional)
Returns: A dictionary
References: http://www.gbif.org/developer/species#searching
Usage:
from pygbif import species
species.name_suggest(q='Puma concolor')
x = species.name_suggest(q='Puma')
species.name_suggest(q='Puma', rank="genus")
species.name_suggest(q='Puma', rank="subspecies")
species.name_suggest(q='Puma', rank="species")
species.name_suggest(q='Puma', rank="infraspecific_name")
species.name_suggest(q='Puma', limit=2)
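The rank parameter takes one of a fixed set of values, so a typo can be caught before a request is sent. This page does not say whether the service rejects or ignores an unlisted rank, so the sketch below is only a client-side check. RANKS and check_rank are hypothetical names, not part of pygbif; the values are copied from the rank list for name_suggest.

```python
# Rank values copied from the name_suggest parameter list.
RANKS = frozenset("""
class cultivar cultivar_group domain family form genus informal
infrageneric_name infraorder infraspecific_name infrasubspecific_name
kingdom order phylum section series species strain subclass subfamily
subform subgenus subkingdom suborder subphylum subsection subseries
subspecies subtribe subvariety superclass superfamily superorder
superphylum suprageneric_name tribe unranked variety
""".split())


def check_rank(rank):
    """Return the rank in lower case, or raise ValueError if it is not listed."""
    value = rank.strip().lower()
    if value not in RANKS:
        raise ValueError("unknown rank: %r" % rank)
    return value


print(check_rank("Genus"))  # genus
```

A call such as species.name_suggest(q='Puma', rank=check_rank(user_input)) would then fail early on a misspelled rank; user_input here stands for whatever string the caller supplies.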
species.name_lookup(q=None, rank=None, higherTaxonKey=None, status=None, isExtinct=None, habitat=None, nameType=None, datasetKey=None, nomenclaturalStatus=None, limit=100, offset=None, facet=False, facetMincount=None, facetMultiselect=None, type=None, hl=False, verbose=False, **kwargs)
Lookup names in all taxonomies in GBIF.
This service uses fuzzy lookup so that you can put in partial names and you should get back those things that match. See examples below.
Parameters:
- q – [str] Query term(s) for full text search (optional)
- rank – [str] One of CLASS, CULTIVAR, CULTIVAR_GROUP, DOMAIN, FAMILY, FORM, GENUS, INFORMAL, INFRAGENERIC_NAME, INFRAORDER, INFRASPECIFIC_NAME, INFRASUBSPECIFIC_NAME, KINGDOM, ORDER, PHYLUM, SECTION, SERIES, SPECIES, STRAIN, SUBCLASS, SUBFAMILY, SUBFORM, SUBGENUS, SUBKINGDOM, SUBORDER, SUBPHYLUM, SUBSECTION, SUBSERIES, SUBSPECIES, SUBTRIBE, SUBVARIETY, SUPERCLASS, SUPERFAMILY, SUPERORDER, SUPERPHYLUM, SUPRAGENERIC_NAME, TRIBE, UNRANKED, VARIETY (optional)
- verbose – [bool] If True show alternative matches considered which had been rejected.
- higherTaxonKey – [str] Filters by any of the higher Linnean rank keys. Note this is within the respective checklist and not searching nub keys across all checklists (optional)
- status – [str] (optional) Filters by the taxonomic status as one of:
  - ACCEPTED
  - DETERMINATION_SYNONYM: Used for unknown child taxa referred to via spec, ssp, …
  - DOUBTFUL: Treated as accepted, but doubtful whether this is correct.
  - HETEROTYPIC_SYNONYM: More specific subclass of SYNONYM.
  - HOMOTYPIC_SYNONYM: More specific subclass of SYNONYM.
  - INTERMEDIATE_RANK_SYNONYM: Used in nub only.
  - MISAPPLIED: More specific subclass of SYNONYM.
  - PROPARTE_SYNONYM: More specific subclass of SYNONYM.
  - SYNONYM: A general synonym, the exact type is unknown.
- isExtinct – [bool] Filters by extinction status (e.g. isExtinct=True)
- habitat – [str] Filters by habitat. One of: marine, freshwater, or terrestrial (optional)
- nameType – [str] (optional) Filters by the name type as one of:
  - BLACKLISTED: surely not a scientific name.
  - CANDIDATUS: Candidatus is a component of the taxonomic name for a bacterium that cannot be maintained in a Bacteriology Culture Collection.
  - CULTIVAR: a cultivated plant name.
  - DOUBTFUL: doubtful whether this is a scientific name at all.
  - HYBRID: a hybrid formula (not a hybrid name).
  - INFORMAL: a scientific name with some informal addition like “cf.” or indetermined like Abies spec.
  - SCINAME: a scientific name which is not well formed.
  - VIRUS: a virus name.
  - WELLFORMED: a well formed scientific name according to present nomenclatural rules.
- datasetKey – [str] Filters by the dataset’s key (a uuid) (optional)
- nomenclaturalStatus – [str] Not yet implemented, but will eventually allow for filtering by a nomenclatural status enum
- limit – [int] Number of records to return. Maximum: 1000. (optional)
- offset – [int] Record number to start at. (optional)
- facet – [str] A list of facet names used to retrieve the 100 most frequent values for a field. Allowed facets are: datasetKey, higherTaxonKey, rank, status, isExtinct, habitat, and nameType. Additionally threat and nomenclaturalStatus are legal values but not yet implemented, so data will not yet be returned for them. (optional)
- facetMincount – [str] Used in combination with the facet parameter. Set facetMincount={#} to exclude facets with a count less than {#}, e.g. http://bit.ly/1bMdByP only shows the type value ACCEPTED because the other statuses have counts less than 7,000,000 (optional)
- facetMultiselect – [bool] Used in combination with the facet parameter. Set facetMultiselect=True to still return counts for values that are not currently filtered, e.g. http://bit.ly/19YLXPO still shows all status values even though status is being filtered by status=ACCEPTED (optional)
- type – [str] Type of name. One of occurrence, checklist, or metadata. (optional)
- hl – [bool] Set hl=True to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class gbifH1, e.g. q='plant', hl=True. Fulltext search fields include: title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response. (optional)
Returns: A dictionary
Usage:
from pygbif import species
# Look up names like mammalia
species.name_lookup(q='mammalia')
# Paging
species.name_lookup(q='mammalia', limit=1)
species.name_lookup(q='mammalia', limit=1, offset=2)
# Large requests, use offset parameter
first = species.name_lookup(q='mammalia', limit=1000)
second = species.name_lookup(q='mammalia', limit=1000, offset=1000)
# Get all data, with verbose output
species.name_lookup('Helianthus annuus', rank="species", verbose=True)
# Get all data and parse it, removing descriptions field which can be quite long
out = species.name_lookup('Helianthus annuus', rank="species")
res = out['results']
for z in res:
    z.pop('descriptions', None)
res
# Fuzzy searching
species.name_lookup(q='Heli', rank="genus")
# Limit records to certain number
species.name_lookup('Helianthus annuus', rank="species", limit=2)
# Query by habitat
species.name_lookup(habitat="terrestrial", limit=2)
species.name_lookup(habitat="marine", limit=2)
species.name_lookup(habitat="freshwater", limit=2)
# Using faceting
species.name_lookup(facet='status', limit=0, facetMincount='70000')
species.name_lookup(facet=['status', 'higherTaxonKey'], limit=0, facetMincount='700000')
species.name_lookup(facet='nameType', limit=0)
species.name_lookup(facet='habitat', limit=0)
species.name_lookup(facet='datasetKey', limit=0)
species.name_lookup(facet='rank', limit=0)
species.name_lookup(facet='isExtinct', limit=0)
# Text highlighting
species.name_lookup(q='plant', hl=True, limit=30)
# Lookup by datasetKey
species.name_lookup(datasetKey='3f8a1297-3259-4700-91fc-acc4170b27ce')
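The paging calls in the usage above (limit=1000, then offset=1000) generalise to a loop. Below is a minimal sketch, assuming the returned dictionary has a results list (as used above) and possibly an endOfRecords flag; the flag is an assumption about the response, not something this page documents. lookup_all and fake_lookup are hypothetical names. fake_lookup stands in for species.name_lookup so the sketch runs without network access.

```python
def lookup_all(lookup, page_size=1000, max_records=5000, **params):
    """Collect results page by page from a name_lookup-style callable."""
    records = []
    offset = 0
    while len(records) < max_records:
        # Never ask for more than is still needed to reach max_records.
        limit = min(page_size, max_records - len(records))
        page = lookup(limit=limit, offset=offset, **params)
        results = page.get("results", [])
        records.extend(results)
        # Stop on an empty page or when the response says it is the last one.
        if not results or page.get("endOfRecords", False):
            break
        offset += len(results)
    return records


def fake_lookup(q=None, limit=100, offset=0, **kwargs):
    """Stand-in for species.name_lookup that serves 25 made-up records."""
    data = [{"key": i, "descriptions": ["long text"]} for i in range(25)]
    chunk = data[offset:offset + limit]
    return {"results": chunk, "endOfRecords": offset + limit >= len(data)}


rows = lookup_all(fake_lookup, page_size=10, q="mammalia")
for row in rows:
    row.pop("descriptions", None)  # drop the long field, as in the usage above
print(len(rows))  # 25
```

With pygbif installed, lookup_all(species.name_lookup, q='mammalia') would be the corresponding call. The max_records cap is a design choice to keep an open-ended query from paging indefinitely.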
species.name_usage(key=None, name=None, data='all', language=None, datasetKey=None, uuid=None, sourceId=None, rank=None, shortname=None, limit=100, offset=None, **kwargs)
Lookup details for specific names in all taxonomies in GBIF.
Parameters:
- key – [int] A GBIF key for a taxon
- name – [str] Filters by a case insensitive, canonical namestring, e.g. ‘Puma concolor’
- data – [str] The type of data to get. Default: all. Options: all, verbatim, name, parents, children, related, synonyms, descriptions, distributions, media, references, speciesProfiles, vernacularNames, typeSpecimens, root
- language – [str] Language. Expects an ISO 639-1 language code using 2 lower case letters. Languages returned are 3 letter codes. The language parameter only applies to the /species, /species/{int}, /species/{int}/parents, /species/{int}/children, /species/{int}/related, and /species/{int}/synonyms routes (here routes are determined by the data parameter).
- datasetKey – [str] Filters by the dataset’s key (a uuid)
- uuid – [str] A uuid for a dataset. Should give exact same results as datasetKey.
- sourceId – [int] Filters by the source identifier.
- rank – [str] Taxonomic rank. Filters by taxonomic rank as one of: CLASS, CULTIVAR, CULTIVAR_GROUP, DOMAIN, FAMILY, FORM, GENUS, INFORMAL, INFRAGENERIC_NAME, INFRAORDER, INFRASPECIFIC_NAME, INFRASUBSPECIFIC_NAME, KINGDOM, ORDER, PHYLUM, SECTION, SERIES, SPECIES, STRAIN, SUBCLASS, SUBFAMILY, SUBFORM, SUBGENUS, SUBKINGDOM, SUBORDER, SUBPHYLUM, SUBSECTION, SUBSERIES, SUBSPECIES, SUBTRIBE, SUBVARIETY, SUPERCLASS, SUPERFAMILY, SUPERORDER, SUPERPHYLUM, SUPRAGENERIC_NAME, TRIBE, UNRANKED, VARIETY
- shortname – [str] A short name (not documented further).
- limit – [int] Number of records to return. Default: 100. Maximum: 1000. (optional)
- offset – [int] Record number to start at. (optional)
References: See http://www.gbif.org/developer/species#nameUsages for details
Usage:
from pygbif import species
species.name_usage(key=1)
# Name usage for a taxonomic name
species.name_usage(name='Puma', rank="GENUS")
# All name usages
species.name_usage()
# References for a name usage
species.name_usage(key=2435099, data='references')
# Species profiles, descriptions
species.name_usage(key=5231190, data='speciesProfiles')
species.name_usage(key=5231190, data='descriptions')
species.name_usage(key=2435099, data='children')
# Vernacular names for a name usage
species.name_usage(key=5231190, data='vernacularNames')
# Limit number of results returned
species.name_usage(key=5231190, data='vernacularNames', limit=3)
# Search for names by dataset with datasetKey parameter
species.name_usage(datasetKey="d7dddbf4-2cf0-4f39-9b2a-bb099caae36c")
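The language parameter is described above as applying only to certain routes, with the route chosen by the data parameter. The sketch below illustrates that rule and is not pygbif's implementation. It assumes that data='all' maps to /species without a key and to /species/{int} with one, which this page does not state. language_route and SUB_ROUTES are hypothetical names.

```python
# data values that the parameter list ties to a /species/{int}/... sub-route
SUB_ROUTES = {"parents", "children", "related", "synonyms"}


def language_route(data="all", key=None):
    """Return the route the language parameter applies to, or None."""
    if data == "all":
        # Assumption: 'all' means the collection route without a key,
        # and the single-record route with one.
        return "/species" if key is None else "/species/%d" % key
    if data in SUB_ROUTES and key is not None:
        return "/species/%d/%s" % (key, data)
    # Every other combination is outside the routes listed for language.
    return None


print(language_route())                     # /species
print(language_route("all", 5231190))       # /species/5231190
print(language_route("children", 2435099))  # /species/2435099/children
print(language_route("media", 5231190))     # None
```

A None result signals that passing language alongside that data value would, by the description above, have no effect.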
species.name_parser(name, **kwargs)
Parse taxon names using the GBIF name parser
Parameters:
- name – [str] A scientific name, or a list of scientific names. (required)
Reference: http://www.gbif.org/developer/species#parser
Usage:
from pygbif import species
species.name_parser('x Agropogon littoralis')
species.name_parser(['Arrhenatherum elatius var. elatius',
                     'Secale cereale subsp. cereale',
                     'Secale cereale ssp. cereale',
                     'Vanessa atalanta (Linnaeus, 1758)'])
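The usage above passes name_parser either a single string or a list. Code that builds the argument dynamically may prefer one consistent shape. The sketch below shows one way to get that. as_name_list is a hypothetical helper, not part of pygbif, and pygbif may already do something equivalent internally.

```python
def as_name_list(names):
    """Wrap a single name in a list; copy any other iterable into a list."""
    if isinstance(names, str):
        return [names]
    return list(names)


print(as_name_list('x Agropogon littoralis'))
print(as_name_list(('Secale cereale subsp. cereale',
                    'Vanessa atalanta (Linnaeus, 1758)')))
```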