Abstract
The Ecological Traitdata Standard (ETS) is a collection of terms for datasets on quantitative and qualitative organism properties (i.e. traits). It also includes recommendations on how to structure such data for upload to public databases.
Suggested Citation
To refer to this version of the ETS please cite:
Schneider et al. (2017) Ecological Traitdata Standard, v0.5, URL: https://ecologicaltraitdata.github.io/TraitDataStandard/v0.5/ , DOI: 10.5281/zenodo.1041733
Please also cite the methods paper for the rationale and general considerations of semantic standardization of trait data:
Florian D. Schneider, Malte Jochum, Gaetane LeProvost, Andreas Ostrowski, Caterina Penone, Nadja K. Simons (in preparation) Introducing an Ecological Trait-data Standard.
Contribute
The Ecological Traitdata Standard is under continuous open-source development, hosted on GitHub. Please refer to the GitHub Issues page for discussion and revision of individual terms, and settle the issue there before filing a pull request that implements an update.
Definition
Glossary of terms
This defined vocabulary aims to provide all essential columns for raw data of functional trait measurements in ecological research. Most terms relate to terms from the Darwin Core Standard (DWC) and its extensions; the corresponding DWC term is referenced in the field ‘Refines’. The full Darwin Core Standard can be found at http://rs.tdwg.org/dwc/terms/index.htm.
The glossary is organized into a core section with essential columns for trait data, extensions that provide additional layers of information, and a vocabulary for metadata of particular importance for trait data. A final section provides defined terms for lists of trait definitions, also termed a Trait Thesaurus.
Extensions:
We provide three extensions of the vocabulary that allow for additional information on the trait measurement.
- The Occurrence extension contains information at the level of individual specimens, such as date, location and method of sampling and preservation, or physiological specifications of the phenotype, such as sex, life stage or age.
- The MeasurementOrFact extension takes information at the level of single measurements or reported values, such as the original literature from which the value is cited, the method of measurement or the statistical method of aggregation.
- The BiodiversityExploratories extension provides columns for localisation of trait data from the Biodiversity Exploratories sites (www.biodiversity-exploratories.de).
Structure of trait data
The trait data standard implies that trait data should be stored in a long-table format containing one measurement per row, described by the terms provided in the core section (see our Methods paper for further considerations on data structure and format).
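As a minimal sketch of the long-table layout described above, the following Python snippet reshapes a wide table (one column per trait) into one measurement per row. The column names `traitName` and `traitValue`, the example specimens and the helper `wide_to_long` are assumptions made for illustration; they are not prescribed by this section.

```python
# Hypothetical wide table: one row per specimen, one column per trait.
wide_rows = [
    {"specimen": "s1", "species": "Ctenomys sociabilis", "body_length": 24.1, "femur_length": 5.2},
    {"specimen": "s2", "species": "Ctenomys sociabilis", "body_length": 22.8, "femur_length": 4.9},
]


def wide_to_long(rows, trait_columns, id_columns):
    """Return one record per (row, trait) pair, i.e. one measurement per row."""
    long_rows = []
    for row in rows:
        for trait in trait_columns:
            # Carry the identifying columns over unchanged.
            record = {key: row[key] for key in id_columns}
            record["traitName"] = trait       # assumed column name
            record["traitValue"] = row[trait]  # assumed column name
            long_rows.append(record)
    return long_rows


long_table = wide_to_long(
    wide_rows,
    trait_columns=["body_length", "femur_length"],
    id_columns=["specimen", "species"],
)
for record in long_table:
    print(record)
```

Two specimens with two traits each yield four rows, each of which can then carry its own unit, method and warning fields.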
There are two ways of integrating the information provided by the extensions:
- within the same data file: additional terms are provided to describe columns containing properties concerning the measurement or the occurrence of the specimen. The output file may be stored as a csv or txt table.
- in separate data files: the main file refers to additional data files via identifiers, contained in fields like measurementID or occurrenceID. This is usually the case if the occurrences are externally documented, for instance as specimens from a museum. This also applies if the data are stored in additional data tables, e.g. within an Excel spreadsheet or as a Darwin Core Archive (as proposed for TraitBank).
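The second option can be sketched as a lookup join on occurrenceID. The snippet below is an illustration only: the example rows, the columns `traitName`, `traitValue` and `sex`, and the helper `attach_occurrences` are assumptions, not part of the standard.

```python
# Core table: one measurement per row, referring to specimens by occurrenceID.
core = [
    {"measurementID": "m1", "occurrenceID": "occ1", "traitName": "body_length", "traitValue": 24.1},
    {"measurementID": "m2", "occurrenceID": "occ2", "traitName": "body_length", "traitValue": 22.8},
    # Literature value without an observed specimen: occurrenceID left blank.
    {"measurementID": "m3", "occurrenceID": "", "traitName": "body_length", "traitValue": 23.0},
]

# Separate occurrence table, e.g. loaded from a second file.
occurrences = [
    {"occurrenceID": "occ1", "sex": "female", "lifeStage": "adult"},
    {"occurrenceID": "occ2", "sex": "male", "lifeStage": "adult"},
]


def attach_occurrences(core_rows, occurrence_rows):
    """Add occurrence-level columns to each core row that has a matching occurrenceID."""
    lookup = {row["occurrenceID"]: row for row in occurrence_rows}
    merged = []
    for row in core_rows:
        extra = lookup.get(row["occurrenceID"], {})
        # Core values take precedence if a column exists in both tables.
        merged.append({**extra, **row})
    return merged


merged = attach_occurrences(core, occurrences)
```

Rows without a matching occurrence keep only their core columns, which corresponds to values compiled from literature or expert knowledge.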
The R package ‘traitdataform’ (https://www.github.com/fdschneider/traitdataform) provides tools to transfer heterogeneous datasets into a long-table format and to create standardized taxa and trait columns, based on public ontologies.
Core traitdata columns
For the essential primary data (trait value, taxon assignment, trait name), the trait data standard recommends reporting the original naming and value scheme as used by the data provider. However, to ensure compatibility with other datasets, the original data provider’s information should be duplicated into standardized columns, indexed by appending Std to the column name. This ensures compatibility on the provider’s side and transparency for data users on the reported measurements and facts, and enables checking for inconsistencies and misspellings in the complete dataset provided by the author. If provided, the standardized fields allow merging heterogeneous data sources into a single table to perform further analyses. This practice of double bookkeeping of trait data has been successfully established for the TRY database on plant traits, for instance (Kattge et al. 2011. TRY – a global database of plant traits. Global Change Biology, 17, 2905–2935).
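The double bookkeeping described above might look like the following sketch, in which the provider's original entries are kept and standardized copies are added in columns ending in Std. The lookup tables, the misspelled example entries and the names `traitNameStd` and `add_std_columns` are hypothetical; in practice the mappings would come from a reference taxonomy and a trait thesaurus.

```python
# Hypothetical lookup tables mapping provider entries to standardized entries.
taxon_lookup = {"Ctenomys socialbilis": "Ctenomys sociabilis"}
trait_lookup = {"bodylength": "body_length"}


def add_std_columns(row, taxon_lookup, trait_lookup):
    """Return a copy of the row with standardized columns added next to the originals."""
    out = dict(row)  # the provider's original columns stay untouched
    # Unmatched entries become None so that they are easy to spot and flag.
    out["scientificNameStd"] = taxon_lookup.get(row["scientificName"])
    out["traitNameStd"] = trait_lookup.get(row["traitName"])
    return out


raw_row = {"scientificName": "Ctenomys socialbilis", "traitName": "bodylength", "traitValue": "12"}
std_row = add_std_columns(raw_row, taxon_lookup, trait_lookup)
print(std_row)
```

Keeping both versions side by side lets data users trace every standardized entry back to what the provider originally reported.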
By linking to (public) ontologies via the field taxonID, further taxonomic information can be extracted for analysis. Alternatively, taxonID may also link to an accompanying datasheet that contains information on the taxonomic resolution or specification of the observation.
Similarly, linking to trait terminologies (a ‘Thesaurus’) via the field traitID allows an unambiguous interpretation of the trait measurement. If no online ontology is available, an accompanying dataset should specify the trait definition. For setting up such a Thesaurus, we propose the use of terms provided in section ‘Traitlist’ below.
traitUnit
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#traitunit
Refines | http://rs.tdwg.org/dwc/terms/measurementUnit
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | Reports the unit that the author’s raw data were measured in, if applicable (only for numeric values). For unitless numerical values, use ‘unitless’ in this field. For factorial values, leave empty or provide NA.
Comment | For numerical values, report the unit in a format such as “mm” for lengths, “mm3” or “mm^3” for volumes, “mm2” or “mm^2” for areas, “m/s” for movement, or “mm3/mm2” for volume-to-surface ratios.
scientificNameStd
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#scientificnamestd
Refines | http://rs.tdwg.org/dwc/terms/scientificName
Replaces | NA
Version | v0.5
DateIssued | 07/07/2017
DateModified | 01/11/2017
Definition | The full name, with authorship and date information if known. This should resolve to the currently valid (zoological) or accepted (botanical) taxon.
Comment | Provide reference taxonomy in metadata of the dataset. Examples: “Coleoptera” (order), “Vespertilionidae” (family), “Manis” (genus), “Ctenomys sociabilis” (genus + specificEpithet), “Ambystoma tigrinum diaboli” (genus + specificEpithet + infraspecificEpithet), “Roptrocerus typographi (Györfi, 1952)” (genus + specificEpithet + scientificNameAuthorship), “Quercus agrifolia var. oxyadenia (Torr.) J.T. Howell” (genus + specificEpithet + taxonRank + infraspecificEpithet + scientificNameAuthorship). For discussion see http://terms.tdwg.org/wiki/dwc:scientificName
taxonRank
valueType | factor
Identifier | http://rs.tdwg.org/dwc/terms/taxonRank
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The taxonomic rank of the most specific name in the scientificName. Recommended best practice is to use a controlled vocabulary.
Comment | This is to clarify cases where information is not given on a species level. Examples: “subspecies”, “varietas”, “forma”, “species”, “genus”. For discussion see http://terms.tdwg.org/wiki/dwc:taxonRank
measurementID
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#measurementid
Refines | http://rs.tdwg.org/dwc/terms/measurementID
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | An identifier for the MeasurementOrFact (information pertaining to measurements, facts, characteristics, or assertions). May be a global unique identifier or an identifier specific to the data set.
Comment | Links multi-value trait measurements, e.g. x-y-z coordinates of a morphometric landmark, or biochemical compound quantities for different chain lengths. In this case, the trait names must specify the sub-measurement, e.g. “landmark32_x”, and must be specified in a reference trait list, given in the field “measurementMethod”.
occurrenceID
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#occurrenceid
Refines | http://rs.tdwg.org/dwc/terms/occurrenceID
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.
Comment | For a specimen in the absence of a bona fide global unique identifier, for example, use the form “urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]”. Examples: “urn:lsid:nhm.ku.edu:Herps:32”, “urn:catalog:FMNH:Mammal:145732”. For discussion see http://terms.tdwg.org/wiki/dwc:occurrenceID. This is important for the analysis of co-variation of morphometric data or intraspecific variation. It also couples multiple measurements on a single specimen, which could also be a leaf or a single bone without an assignment to an individual organism. If available, upload a related dataset to describe specimens more precisely, e.g. environmental parameters or identity-related information.
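The “urn:catalog:” form recommended in the comment can be assembled as in this minimal sketch. The function name `build_occurrence_id` and its validation rule (rejecting empty parts) are assumptions added for illustration.

```python
def build_occurrence_id(institution_code, collection_code, catalog_number):
    """Build an occurrenceID of the form urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]."""
    parts = [str(institution_code).strip(), str(collection_code).strip(), str(catalog_number).strip()]
    # An empty component would make the identifier ambiguous, so refuse it.
    if any(part == "" for part in parts):
        raise ValueError("institutionCode, collectionCode and catalogNumber must all be non-empty")
    return "urn:catalog:" + ":".join(parts)


occurrence_id = build_occurrence_id("FMNH", "Mammal", 145732)
print(occurrence_id)  # urn:catalog:FMNH:Mammal:145732
```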
warnings
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#warnings
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | Warnings on the quality or reliability of the reported trait value.
Comment | Warnings from autogenerated data should be stored here, e.g. regarding a lack of match between the provided taxonID and the ontology, unmatched trait names or values, or a mismatch between the units provided and the unit expected according to the trait table. User-defined warnings and flags can be added as well, e.g. ‘NOTUSE’ to mark data that are unreliable or erroneous.
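The kinds of autogenerated warnings listed in the comment could be produced along these lines. This is a sketch under stated assumptions: the warning texts, the vertical-bar separator, the `traitName` column and the helper `collect_warnings` are all hypothetical, and the expected units would normally come from a trait table.

```python
def collect_warnings(row, expected_units, known_taxon_ids):
    """Return a single warnings string for one measurement row (empty if nothing was found)."""
    found = []
    if row["taxonID"] not in known_taxon_ids:
        found.append("taxonID not matched in ontology")
    if row["traitName"] not in expected_units:
        found.append("trait name not found in trait list")
    elif row["traitUnit"] != expected_units[row["traitName"]]:
        found.append(
            "unit '{}' differs from expected '{}'".format(
                row["traitUnit"], expected_units[row["traitName"]]
            )
        )
    # Assumed convention: several warnings are concatenated with a vertical bar.
    return " | ".join(found)


example_row = {"taxonID": "t1", "traitName": "body_length", "traitUnit": "cm"}
example_warning = collect_warnings(example_row, {"body_length": "mm"}, {"t1"})
print(example_warning)
```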
Extension: Measurement or Fact
This section provides additional information about a reported measurement or fact and in most cases can easily be included as extra columns to the core dataset. The columns would contain detail on the methodology of measuring and reporting of aggregated data.
In case of facts reported from literature or from expert knowledge, or cited from other databases, please include the bibliographic citation of the original data source in the field ‘references’.
basisOfRecordDescription
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#basisofrecorddescription
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | Adds more detail to the basisOfRecord.
Comment | If live specimens were sampled, where did they come from? Have they been reared in cultivation? If literature data or online database, provide type of literature, e.g. textbook, website, URL, etc. If preserved specimens were used, which method of preservation? In case of expert knowledge, give the name of the authority.
measurementResolution
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#measurementResolution
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | The hierarchical level to which the trait data originally refer, if the trait information was given at a taxonomic level other than species. Applies mainly to literature and expert-knowledge data, not to measured data.
Comment | For example, information given in literature could state ‘most species in this genus are winged’, but the trait data could be given for each species in this genus.
measurementMethod
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#measurementMethod
Refines | http://rs.tdwg.org/dwc/terms/measurementMethod
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | Applies primarily to measured data. The method, tools and scales used to measure a (numerical) trait value. A description of or reference to (publication, URI) the method or protocol used to determine the measurement, fact, characteristic, or assertion.
Comment | Should be a concise and standardised text entry or reference (publication, URI), referring to a particular method (e.g. ‘direct weighing’, ‘length-mass regression’, ‘intertegular span’, ‘length between node X and y’ ) and measurement conditions (e.g. certain temperature or humidity, name of device or scale used for measurement). To avoid repetition or lengthy entries, authors should use global identifiers of methodological terms if available, or enter dataset specific identifiers and provide a more detailed description of the method or protocol used to determine the measurement, fact, characteristic, or assertion in the metadata of the dataset.
measurementDeterminedBy
valueType | character
Identifier | http://rs.tdwg.org/dwc/terms/measurementDeterminedBy
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | A list (concatenated and separated) of names of people, groups, or organizations who determined the value of the measurement.
Comment | The recommended best practice is to separate the values with a vertical bar (‘|’). Examples: “Rob Guralnick”, “Julie Woodruff | Eileen Lacey”. Can be encoded by dataset-specific identifiers for reasons of privacy. This is kept as a co-factor for repeated measurements.
measurementDeterminedDate
valueType | Date
Identifier | http://rs.tdwg.org/dwc/terms/measurementDeterminedDate
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The date on which the MeasurementOrFact was made. Recommended best practice is to use an encoding scheme, such as ISO 8601:2004(E).
Comment | Examples: “1963-03-08T14:07-0600” is 8 Mar 1963 2:07pm in the time zone six hours earlier than UTC, “2009-02-20T08:40Z” is 20 Feb 2009 8:40am UTC, “1809-02-12” is 12 Feb 1809, “1906-06” is Jun 1906, “1971” is just that year, “2007-03-01T13:00:00Z/2008-05-11T15:30:00Z” is the interval between 1 Mar 2007 1pm UTC and 11 May 2008 3:30pm UTC, “2007-11-13/15” is the interval between 13 Nov 2007 and 15 Nov 2007. For discussion see http://terms.tdwg.org/wiki/dwc:measurementDeterminedDate
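As a rough illustration of how the single date-time examples above can be read programmatically, the sketch below tries a short list of ISO 8601 layouts with Python's standard library. It is not a complete ISO 8601 parser: intervals written with “/” are deliberately not handled, and the function name `parse_iso_date` and the choice of accepted layouts are assumptions. It assumes Python 3.7 or later, where `%z` also accepts “Z”.

```python
from datetime import datetime

# Accepted layouts, from most to least precise (an assumed subset of ISO 8601).
ISO_LAYOUTS = [
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
]


def parse_iso_date(text):
    """Parse one date or date-time; missing month or day default to 1."""
    for layout in ISO_LAYOUTS:
        try:
            return datetime.strptime(text, layout)
        except ValueError:
            continue  # try the next, less precise layout
    raise ValueError("not a supported ISO 8601 date: " + repr(text))


for example in ["1963-03-08T14:07-0600", "2009-02-20T08:40Z", "1809-02-12", "1906-06", "1971"]:
    print(example, "->", parse_iso_date(example).isoformat())
```

Entries that fail to parse, such as “08/03/1963”, raise a `ValueError` and could be flagged in the warnings field rather than silently reinterpreted.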
aggregateMeasure
valueType | logical
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#aggregateMeasure
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | Is measurementValue reporting an individual measurement or an aggregate measure? Takes a binary entry: TRUE or FALSE.
Comment | This flags aggregate data in an unambiguous way. Aggregate measures are often reported for repeated measures, e.g. replicate measurements of leaf size from a single plant individual, or for grouped measurements, e.g. weighings of a counted number of specimens (e.g. leaves or small organisms).
dispersion
valueType | numeric
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#dispersion
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | If the reported value is an aggregate measure of multiple individuals or specimens, the numeric value of dispersion (variance or standard deviation) for the mean value reported in measurementValue_user (no unit conversion is provided by the R package). Defaults to 0. If a value is provided, report the statistical method in the field statisticalMethod.
Comment |
statisticalMethod
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#statisticalMethod
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | For aggregated measures, the method for data aggregation or averaging as well as the variation or range.
Comment | E.g. ‘mean and standard deviation’, ‘median and 95% confidence interval’, ‘mean and variance’, ‘mean and range of values’, ‘median and 95% interquantile range’.
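The three fields aggregateMeasure, dispersion and statisticalMethod work together. The following sketch shows one way to fill them when replicate measurements are summarised; the column name `traitValue`, the helper `summarise_replicates` and the choice of the sample standard deviation are assumptions for illustration.

```python
from statistics import mean, stdev


def summarise_replicates(values):
    """Summarise replicate measurements into the fields of one reported row."""
    if not values:
        raise ValueError("at least one measurement is required")
    if len(values) == 1:
        # A single measurement is not an aggregate; dispersion keeps its default of 0.
        return {
            "traitValue": values[0],
            "aggregateMeasure": False,
            "dispersion": 0,
            "statisticalMethod": "",
        }
    return {
        "traitValue": mean(values),
        "aggregateMeasure": True,
        "dispersion": stdev(values),  # sample standard deviation
        "statisticalMethod": "mean and standard deviation",
    }


leaf_row = summarise_replicates([4.0, 5.0, 6.0])
print(leaf_row)
```

Reporting the method string alongside the number makes clear whether the dispersion is a standard deviation or a variance, which the numeric value alone cannot convey.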
Extension: Occurrence
The columns in this section identify the methodology and primary source of the data and keep the reference to the actual specimen (e.g. for museum collections or related data analysis). They may also add detail on the condition of the observed specimen, its sex, morphotype or developmental stage. These data are especially valuable for analyses of intra-specific trait variation.
For most trait data compiled from literature or expert knowledge, the level of information on an ‘occurrence’ would not apply, since no specific individual has been observed. In this case, the field ‘occurrenceID’ should be left blank in the core data. In cases where different aggregate ranges or averages are reported for male and female individuals, the columns sex or developmental stage may be used without the reference to an occurrence ID.
lifeStage
valueType | character
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#lifeStage
Refines | http://rs.tdwg.org/dwc/terms/lifeStage
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | The age class or life stage of the biological individual(s) at the time the Occurrence was recorded. Recommended best practice is to use a controlled vocabulary.
Comment | Recommended factor levels are: seed, seedling, sapling, adult, egg, larval_instar_1, larval_instar_2, larval_instar_3, … , pupa. For very taxon-specific life stages, it is recommended to provide detailed explanation in the metadata of the dataset.
preparations
valueType | character
Identifier | http://rs.tdwg.org/dwc/terms/preparations
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | For preserved specimens, a list (concatenated and separated) of preparations and preservation methods for the occurrence.
Comment | Do not report procedures for measurement or sampling here (see samplingProtocol and measurementMethod). The recommended best practice is to separate the values with a vertical bar (‘|’). Examples: “fossil”, “cast”, “photograph”, “DNA extract”, “skin | skull | skeleton”, “whole animal (ETOH) | tissue (EDTA)”. For discussion see http://terms.tdwg.org/wiki/dwc:preparations
samplingProtocol
valueType | character
Identifier | http://rs.tdwg.org/dwc/terms/samplingProtocol
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The name of, reference to, or description of the method or protocol used for obtaining the specimen.
Comment | Examples: “UV light trap”, “mist net”, “bottom trawl”, “ad hoc observation”, “point count”, “Penguins from space: faecal stains reveal the location of emperor penguin colonies, http://dx.doi.org/10.1111/j.1466-8238.2009.00467.x”, “Takats et al. 2001. Guidelines for Nocturnal Owl Monitoring in North America. Beaverhill Bird Observatory and Bird Studies Canada, Edmonton, Alberta. 32 pp.”, “http://www.bsc-eoc.org/download/Owl.pdf”. For discussion see http://terms.tdwg.org/wiki/dwc:samplingProtocol
eventDate
valueType | Date
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#eventDate
Refines | http://rs.tdwg.org/dwc/terms/eventDate
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified |
Definition | The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. Not suitable for a time in a geological context. Recommended best practice is to use an encoding scheme, such as ISO 8601:2004(E). For lower precision, use the year, month and day fields instead.
Comment | Note: this is not to record the date when the specimens were measured (use measurementDeterminedDate for this). If applicable, at least provide a year. Providing a date is highly valuable for studies analysing temporal variation in traits. Examples: “1963-03-08T14:07-0600” is 8 Mar 1963 2:07pm in the time zone six hours earlier than UTC, “2009-02-20T08:40Z” is 20 Feb 2009 8:40am UTC, “1809-02-12” is 12 Feb 1809, “1906-06” is Jun 1906, “1971” is just that year, “2007-03-01T13:00:00Z/2008-05-11T15:30:00Z” is the interval between 1 Mar 2007 1pm UTC and 11 May 2008 3:30pm UTC, “2007-11-13/15” is the interval between 13 Nov 2007 and 15 Nov 2007. For discussion see http://terms.tdwg.org/wiki/dwc:eventDate
locationID
valueType | factor
Identifier | http://rs.tdwg.org/dwc/terms/locationID
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | An identifier for the set of location information (data associated with dcterms:Location). May be a global unique identifier or an identifier specific to the data set.
Comment | Could report the plot within the experimental setting which would be further specified in the metadata or in a separate dataset. For discussion see http://terms.tdwg.org/wiki/dwc:locationID
geodeticDatum
valueType | factor
Identifier | http://rs.tdwg.org/dwc/terms/geodeticDatum
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. Recommended best practice is to use the EPSG code as a controlled vocabulary to provide an SRS, if known. Otherwise use a controlled vocabulary for the name or code of the geodetic datum, if known. Otherwise use a controlled vocabulary for the name or code of the ellipsoid, if known. If none of these is known, use the value “unknown”.
Comment | Examples: “EPSG:4326”, “WGS84”, “NAD27”, “Campo Inchauspe”, “European 1950”, “Clarke 1866”. For discussion see http://terms.tdwg.org/wiki/dwc:geodeticDatum
verbatimLocality
valueType | character
Identifier | http://rs.tdwg.org/dwc/terms/verbatimLocality
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The specific description of the place.
Comment | Less specific geographic information can be provided in other geographic terms of Darwin Core (higherGeography, continent, country, stateProvince, county, municipality, waterBody, island, islandGroup). This term may contain information modified from the original to correct perceived errors or standardize the description. Example: “25 km NNE Bariloche por R. Nac. 237”. For discussion see http://terms.tdwg.org/wiki/dwc:verbatimLocality
country
valueType | character
Identifier | http://rs.tdwg.org/dwc/terms/country
Refines |
Replaces | NA
Version |
DateIssued |
DateModified |
Definition | The name of the country or major administrative unit in which the Location occurs. Recommended best practice is to use a controlled vocabulary such as the Getty Thesaurus of Geographic Names.
Comment | Examples: “Germany”, “Denmark”, “Colombia”, “España”. For discussion see http://terms.tdwg.org/wiki/dwc:country
Extension: Biodiversity Exploratories
This section records location in the context of the exploratories. From ExploratoriesPlotID a detailed georeference can be inferred. Additional spatial resolution, e.g. on subplots, may be provided in locationID of the Occurrence extension.
ExploratoriesPlotID
valueType | factor
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#ExploratoriesPlotID
Refines |
Replaces | NA
Version | v0.3
DateIssued | 07/07/2017
DateModified | 01/11/2017
Definition | EP plot ID (or also any valid Gridplot ID or VIP ID) from which the measured specimen was extracted.
Comment | Only for specimens that were extracted from the exploratories directly (or their direct offspring, if hatched in the lab). Please report it even if this was not part of your research question, and provide a date (a year at least) if available.
Terms for Traitlists (a ‘Trait Thesaurus’)
A trait thesaurus or ontology assigns descriptive trait names with A) an unambiguous definition of the trait and B) an expected format of measured values or reported facts, and might additionally provide C) semantic relationships for deriving a hierarchical or tree-based classification of traits.
Traits are not only defined in terms of their interpretation, but are ideally also standardised in terms of numerical units and, even more important, the use of harmonized factor levels. This is challenging, given the range of data types that fall within datasets of functional traits. Numerical values represent measurements of length, volumes, ratios, rates or timespans. Integer values may apply to count data (e.g. eggs per clutch). Binary data (encoded as 0 or 1) or logical data (coded as TRUE or FALSE) may apply to qualitative traits such as specific behaviour during mating (e.g. are territories defended) or specialisation to a given habitat (e.g. species restricted to relicts of primeval forests). Many traits are categorical and allow for a constrained set of factor levels, such as sex differences in wing morphology (both sexes winged, both sexes unwinged, only males winged, only females winged) or unconstrained entries such as color. Categorical traits often are ordinal, i.e. they have a logical sequence as in the case of life stages or hibernation stages, or habitat preference traits such as horizontal stratum use. Finally there are specific formats of multivariate trait values, e.g. x-y-z coordinates of a landmark measured in a standardized 3D space [ref] or relative abundance of chain-lengths in biochemical compounds []. To cope with this variety of data types, definitions should refer to other well-defined terms of other ontologies that describe standard units, morphological body parts, protein characteristics (Protein Ontology) or chemical terms (e.g. the ChEBI, http://www.obofoundry.org/ontology/chebi.html).
Online ontologies extend into (machine readable) semantic web resources by providing a hierarchical classification of traits or a relational tree of functional traits. Each trait definition may link to a broader or narrower term. For instance, the definition of ‘femur length of first leg, left side’ is narrower than ‘femur length’ which is narrower than ‘leg trait’ which is narrower than ‘locomotion trait’. (Ref semantic database methods) This links traits of similar functional meaning and allows cross-taxon comparative studies at the level of broader terms.
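The broader and narrower relations in the femur example can be represented as a simple mapping from each term to its broader term. The sketch below, with the hypothetical helper `broader_terms`, walks that mapping upwards; real ontologies may allow several broader terms per trait, which this illustration does not cover.

```python
# Each trait points to its next broader term (taken from the example in the text).
broader = {
    "femur length of first leg, left side": "femur length",
    "femur length": "leg trait",
    "leg trait": "locomotion trait",
}


def broader_terms(trait, broader):
    """Return all broader terms of a trait, from the nearest to the most general."""
    chain = []
    seen = {trait}
    while trait in broader:
        trait = broader[trait]
        if trait in seen:
            # Guard against malformed hierarchies that loop back on themselves.
            raise ValueError("cycle in trait hierarchy at: " + trait)
        seen.add(trait)
        chain.append(trait)
    return chain


print(broader_terms("femur length of first leg, left side", broader))
```

Two datasets that report different narrow traits can then be compared at the first broader term their chains have in common.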
Ontologies for functional traits are being developed for different organism groups, mostly centered around certain research questions or subjects of study. To date, the TRY database takes the most inclusive approach on functional traits for vascular plants (Kattge et al. 2011). For some animal groups, similar approaches do exist, but few are available as an online ontology.
As a starting point for creating an ontology for functional traits, we propose the following terms for trait lists (also termed ‘Thesaurus’), to describe functional traits that are in the focus of the research project.
Using this standardized terminology will allow merging trait definitions from multiple sources. We encourage providing these lookup tables as an open resource on public terminology servers to enable global referencing. The benefit of such classifications will increase if open Application Programming Interfaces (APIs) provide a way to extract the definitions and higher-level trait hierarchies programmatically via software tools. To harmonize trait data across databases, future trait standard initiatives should provide this functionality. Online ontologies hosted with accredited ontology servers have the advantage of providing a persistent and direct link to the term on the internet (a Uniform Resource Identifier, URI). Terminology portals or registries, such as the GFBio Terminology Service or Ontobee, may provide a central host for trait ontologies.
valueType
valueType |
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#valueType
Refines |
Replaces | NA
Version | v0.4
DateIssued | 27/09/2017
DateModified |
Definition | Type of trait values. Defines the trait as numeric, integer, categorical, binary/logical, or character.
Comment | Numerical values represent measurements of length, volumes, ratios, rates or timespans. Integer values apply to count data (e.g. eggs per clutch). Binary data (encoded as 0 or 1) or logical data (coded as TRUE or FALSE) may apply to qualitative traits such as specific behaviour during mating (e.g. are territories defended) or specialisation to a given habitat (e.g. species restricted to relicts of primeval forests). Categorical traits should define a constrained set of factor levels, such as sex differences in wing morphology (both sexes winged, both sexes unwinged, only males winged, only females winged) or unconstrained entries such as color. Ordinal categorical traits may be better encoded as integer values, e.g. a logical sequence as in the case of life stages or hibernation stages, or habitat preference traits such as horizontal stratum use.
factorLevels
valueType |
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#factorLevels
Refines |
Replaces | NA
Version | v0.4
DateIssued | 27/09/2017
DateModified |
Definition | The constrained vocabulary for categorical traits or ordinal binary traits.
Comment | Ordinal traits may be encoded with numerically indexed factor levels, e.g. 1_egg, 2_larvae, 3_pupae, 4_adult. The field traitDescription should define the factor levels.
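Numerically indexed factor levels such as 1_egg can be split into their rank and label to recover the intended order. The helper `order_factor_levels` below is a hypothetical sketch that assumes every level starts with an integer followed by an underscore.

```python
def order_factor_levels(levels):
    """Sort levels like '2_larvae' by their numeric index and return the labels in order."""
    parsed = []
    for level in levels:
        # Split only at the first underscore so that labels may contain underscores.
        index, separator, label = level.partition("_")
        if separator == "" or not index.isdigit():
            raise ValueError("factor level is not numerically indexed: " + repr(level))
        parsed.append((int(index), label))
    return [label for _, label in sorted(parsed)]


ordered = order_factor_levels(["3_pupae", "1_egg", "4_adult", "2_larvae"])
print(ordered)  # ['egg', 'larvae', 'pupae', 'adult']
```

Sorting on the integer rather than on the raw string keeps levels in the right order even when there are ten or more of them.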
relationSource
valueType |
Identifier | http://ecologicaltraitdata.github.io/TraitDataStandard/#relationSource
Refines |
Replaces | NA
Version | v0.4
DateIssued | 24/10/2017
DateModified |
Definition | The relation to the original source.
Comment | equals: the trait is taken directly from the source without modification / takenfrom: the trait is based on a trait list which does not include formal descriptions or trait-specific references (e.g. the trait was used in a scientific publication or is provided in taxonomic literature) / refines: the trait gives a more precise definition of the original trait / broadens: the trait gives a more general definition of the original trait / inheritsfrom: the trait is a slight modification of the original trait, but changes in the definition of the original trait should be reflected in this trait as well (e.g. Body_length_max is a modification of Body_length).
Comments