Software Tools

A major goal of ORNIS is to develop a new suite of web-based, open-source software tools for validating georeferences, taxonomic identifications, and collection dates, or at least for flagging records with a high probability of error.

Taxonomic Standardization

ORNIS programmers will develop software for exploring and identifying taxonomic discrepancies in biocollections databases, based on an avian taxonomy compendium in which all synonymies and equivalencies have been established. Users will specify a target authority list from among the available choices, and will then be able to:

  • Detect and flag names not represented on the chosen authority list ("nonstandard names")
  • Identify nonstandard names represented on other authority lists for interpretation and correction
  • Confirm choice of name for those names that translate unambiguously into the chosen authority list
  • Flag names for which additional information (e.g., locality, phenotype) is necessary for translation
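
The four checks above can be sketched as a lookup against the compendium. This is a minimal illustration only: the authority list labels, the placeholder names (which are not real taxa), the `EQUIVALENTS` table, and the function names are all assumptions, not part of any ORNIS software.

```python
from enum import Enum


class Status(Enum):
    STANDARD = "on the chosen authority list"
    TRANSLATABLE = "translates unambiguously"
    AMBIGUOUS = "needs additional information"
    UNKNOWN = "not found in the compendium"


# Hypothetical authority lists; the names are placeholders, not real taxa.
AUTHORITY_LISTS = {
    "list_A": {"Aus albus", "Aus niger", "Bus major"},
    "list_B": {"Aus varius", "Cus major"},
}

# Hypothetical equivalencies: name -> candidate names on each authority list.
# "Aus varius" stands for a species that list_A splits into two.
EQUIVALENTS = {
    "Cus major": {"list_A": ["Bus major"]},
    "Aus varius": {"list_A": ["Aus albus", "Aus niger"]},
}


def lists_containing(name):
    """Return the authority lists on which a name appears."""
    return sorted(k for k, names in AUTHORITY_LISTS.items() if name in names)


def check_name(name, target):
    """Classify one name against the target authority list."""
    if name in AUTHORITY_LISTS[target]:
        return Status.STANDARD, [name]
    candidates = sorted(EQUIVALENTS.get(name, {}).get(target, []))
    if len(candidates) == 1:
        return Status.TRANSLATABLE, candidates
    if len(candidates) > 1:
        # More than one possible translation: locality or phenotype
        # would be needed to choose among the candidates.
        return Status.AMBIGUOUS, candidates
    return Status.UNKNOWN, []


for name in ["Bus major", "Cus major", "Aus varius", "Dus ignotus"]:
    status, names = check_name(name, "list_A")
    print(name, "->", status.value, names)
```

A name with exactly one candidate can be confirmed automatically, while a name with several candidates is only flagged, which mirrors the last two items in the list above.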

Locality Consistency

Consistency among locality fields in biocollections databases provides a first test for georeferencing error. ORNIS programmers will develop a workbench in which geographic references assigned via automated georeferencing will be compared with the data fields for country, state, and county. If the position indicated by the georeferenced coordinates does not fall within the polygon specified by the other fields (country, state, county), then a logical inconsistency exists.
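
The comparison described above amounts to a point-in-polygon test. The sketch below uses a plain ray-casting test on a made-up rectangular boundary; the county label, coordinates, record fields, and function names are hypothetical, and a real workbench would presumably use actual administrative boundary data.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical boundary: a simple rectangle standing in for a county polygon.
BOUNDARIES = {
    "County X": [(-122.0, 37.0), (-121.0, 37.0), (-121.0, 38.0), (-122.0, 38.0)],
}

# Hypothetical specimen records with georeferenced coordinates.
RECORDS = [
    {"id": "r1", "county": "County X", "lon": -121.5, "lat": 37.5},
    {"id": "r2", "county": "County X", "lon": -120.5, "lat": 37.5},
    {"id": "r3", "county": "County Z", "lon": -121.5, "lat": 37.5},
]


def find_inconsistent(records, boundaries):
    """Return ids of records whose coordinates fall outside the named county."""
    flagged = []
    for rec in records:
        polygon = boundaries.get(rec["county"])
        if polygon is None:
            continue  # no boundary available, so the record cannot be tested
        if not point_in_polygon(rec["lon"], rec["lat"], polygon):
            flagged.append(rec["id"])
    return flagged


print(find_inconsistent(RECORDS, BOUNDARIES))
```

The same test could be repeated at the state and country levels, so that a record is flagged at the coarsest level at which its coordinates and text fields disagree.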

Examples of locality inconsistencies for California data in the Museum of Vertebrate Zoology:

Species Ecology

Another error-checking tool uses ecological niche modeling to detect occurrence points that represent ecological outliers for a particular species. This approach combines the pool of specimen data for a species with environmental layers and a machine-learning algorithm for modeling the species' ecological niche. When a species' occurrence points are overlaid on the geographic predictions of the ecological niche model, outliers (i.e., points outside the predicted geographic range) have a high probability of error in either species identification or georeferencing. Implementing this tool through ORNIS will allow collections to flag particular specimens as potentially erroneous and in need of further checking.
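
The outlier idea can be illustrated with a much simpler stand-in than the machine-learning niche model described above: a robust environmental envelope that flags any occurrence whose environmental values lie far from the species' median. The variable names, the values, the cutoff `k`, and the function name are all assumptions made for this sketch.

```python
from statistics import median

# Hypothetical environmental values sampled at each occurrence point of one
# species: mean annual temperature (deg C) and annual precipitation (mm).
OCCURRENCES = [
    {"id": "s1", "temp": 14.0, "precip": 800},
    {"id": "s2", "temp": 15.0, "precip": 820},
    {"id": "s3", "temp": 15.5, "precip": 790},
    {"id": "s4", "temp": 16.0, "precip": 810},
    {"id": "s5", "temp": 14.5, "precip": 805},
    {"id": "s6", "temp": 15.2, "precip": 815},
    {"id": "s7", "temp": 29.0, "precip": 120},
]


def environmental_outliers(points, variables, k=5.0):
    """Flag points lying more than k median absolute deviations (MAD)
    from the median on any environmental variable."""
    flagged = set()
    for var in variables:
        values = [p[var] for p in points]
        centre = median(values)
        mad = median(abs(v - centre) for v in values)
        if mad == 0:
            continue  # no spread on this variable, so it cannot discriminate
        for p in points:
            if abs(p[var] - centre) > k * mad:
                flagged.add(p["id"])
    return sorted(flagged)


print(environmental_outliers(OCCURRENCES, ["temp", "precip"]))
```

A flagged point is only a candidate for review: it may reflect a misidentification, a georeferencing error, or a genuine but unusual occurrence.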

Collector Itineraries

The cross-institutional nature of the ORNIS network will enable an approach to error detection that includes both temporal and spatial elements (i.e., "collecting events"). A collector's specimen locality records can be ordered by date of collection, and the distances traveled over one day or a few days can be calculated. These distances can then be filtered to detect unexpectedly long distances that might indicate erroneous data records. Implementation of this tool requires a community architecture such as ORNIS because most collectors' specimens are scattered across multiple institutions.
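
A minimal sketch of the itinerary check follows, assuming great-circle (haversine) distances and a fixed daily travel limit. The records, the institution prefixes in the ids, the 500 km per day threshold, and the function names are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical records for one collector, pooled from three institutions.
RECORDS = [
    {"id": "A-2", "date": date(1910, 5, 3), "lat": 10.0, "lon": -84.0},
    {"id": "A-1", "date": date(1910, 5, 1), "lat": 37.0, "lon": -122.0},
    {"id": "C-3", "date": date(1910, 5, 4), "lat": 37.3, "lon": -121.9},
    {"id": "B-7", "date": date(1910, 5, 2), "lat": 37.2, "lon": -121.8},
]


def flag_long_hops(records, max_km_per_day=500.0):
    """Order records by date and flag consecutive pairs whose implied
    travel rate exceeds the assumed daily limit."""
    ordered = sorted(records, key=lambda r: r["date"])
    flagged = []
    for prev, curr in zip(ordered, ordered[1:]):
        # Treat same-day records as one day apart to avoid dividing by zero.
        days = max((curr["date"] - prev["date"]).days, 1)
        km = haversine_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
        if km / days > max_km_per_day:
            flagged.append((prev["id"], curr["id"], round(km)))
    return flagged


for prev_id, curr_id, km in flag_long_hops(RECORDS):
    print(prev_id, "->", curr_id, km, "km")
```

In this made-up itinerary a single suspect record produces two flagged hops, one into the distant locality and one back out, so the record shared by both pairs is the natural one to check first.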