Standardize trait names and harmonize measured values and reported facts
Source: R/standardize.R
standardize_traits.Rd
Adds columns to a traitdata table with standardized trait names and relates them to globally unique identifiers via URIs. Optionally converts units of values and renames factor levels into accepted terms.
Usage
standardize_traits(
  x,
  thesaurus = attributes(x)$thesaurus,
  rename = NULL,
  categories = c("No", "Yes"),
  output = "logical",
  ...
)
Arguments
- x
a traitdata object (as returned by as.traitdata()) or a data table containing at least the column `verbatimScientificName`.
- thesaurus
an object of class 'thesaurus' (as returned by as.thesaurus()).
- rename
a named vector to map user-provided names to thesaurus object names (see Details).
- categories
target categories for the harmonization of binary/logical traits.
- output
behaviour of fixlogical(); see fixlogical().
- ...
parameters to be ignored, forwarded from the wrapper function standardize().
Details
The function matches the trait names provided in 'verbatimTraitName' to the traits provided in the thesaurus (in field 'trait'). Matching must be exact (case sensitive). Fuzzy matching may be provided in a later version of the package.
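The effect of exact, case-sensitive matching can be illustrated with a minimal base R sketch. It does not call the package and is not its internal implementation; the vectors `verbatim` and `thesaurus_traits` are hypothetical stand-ins for the 'verbatimTraitName' column and the 'trait' field of a thesaurus.

```r
# Hypothetical trait names, used for illustration only
verbatim <- c("Body_length", "body_length", "wing_span")
thesaurus_traits <- c("body_length", "antenna_length")

# match() compares strings exactly, so "Body_length" (capital B)
# is not treated as equal to "body_length"
idx <- match(verbatim, thesaurus_traits)
matched <- !is.na(idx)

# Overview of which verbatim names found a counterpart in the thesaurus
result <- data.frame(
  verbatimTraitName = verbatim,
  matched = matched,
  traitName = thesaurus_traits[idx]
)
print(result)
```

In this sketch only the second entry finds a counterpart; names that differ in case or spelling would need an explicit mapping via 'rename'.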
The parameter 'rename' should be provided to map trait names where the user-provided names and the thesaurus names differ. In this case, 'rename' should be a named vector with the target names used in the thesaurus as names and the original names as provided in 'verbatimTraitName' as values.
E.g. rename = c()
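The direction of the mapping is easy to get backwards, so here is a minimal base R sketch of how such a vector could look. The trait names are borrowed from the Examples section purely for illustration, and the lookup shown is an assumption about how a name-to-value mapping of this kind can be resolved, not the package's own code.

```r
# Names  = target names as used in the thesaurus
# Values = original names as they appear in 'verbatimTraitName'
rename <- c(
  body_length = "Body_length",
  metafemur_length = "Hind.Femur_length"
)

# Hypothetical content of a 'verbatimTraitName' column
verbatim <- c("Body_length", "Hind.Femur_length", "Antenna_Seg1")

# Look up each verbatim name among the values of 'rename';
# names without an entry are kept unchanged
idx <- match(verbatim, rename)
resolved <- ifelse(is.na(idx), verbatim, names(rename)[idx])
print(resolved)
```

Under these assumptions, the vector would be handed to the function as `rename = rename`, alongside a thesaurus that defines `body_length` and `metafemur_length`.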
See also
Other standardize: standardize_taxa(), standardize()
Examples
pulldata("carabids")
#> Direct call to data source failed. Please check internet connectivity and re-load data!
#> The dataset 'carabids' has successfully been downloaded!
dataset1 <- as.traitdata(carabids,
  taxa = "name_correct",
  traits = c("body_length", "antenna_length", "metafemur_length"),
  units = "mm",
  keep = c(datasetID = "source_measurement", measurementRemark = "note"),
  metadata = list(
    bibliographicCitation = attributes(carabids)$citeAs,
    author = "Fons van der Plas",
    license = "http://creativecommons.org/publicdomain/zero/1.0/"
  )
)
#> Input is taken to be a species -- trait matrix. If this is not the case, please provide parameters!
traitlist <- as.thesaurus(
  body_length = as.trait("body_length", expectedUnit = "mm", valueType = "numeric",
    identifier = "http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Body_length"),
  antenna_length = as.trait("antenna_length", expectedUnit = "mm", valueType = "numeric",
    identifier = "http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Antenna_length"),
  metafemur_length = as.trait("metafemur_length", expectedUnit = "mm", valueType = "numeric",
    identifier = "http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Femur_length")
)
dataset1Std <- standardize_traits(dataset1, thesaurus = traitlist)
## Example: matching of original names to thesaurus
pulldata("heteroptera_raw")
#> The dataset 'heteroptera_raw' has successfully been downloaded!
dataset2 <- as.traitdata(heteroptera_raw,
  taxa = "SpeciesID",
  traits = c("Body_length", "Antenna_Seg1", "Antenna_Seg2",
    "Antenna_Seg3", "Antenna_Seg4", "Antenna_Seg5", "Hind.Femur_length"),
  units = "mm",
  keep = c(sex = "Sex", references = "Source", lifestage = "Wing_development"),
  metadata = list(
    bibliographicCitation = attributes(heteroptera_raw)$citeAs,
    license = "http://creativecommons.org/publicdomain/zero/1.0/"
  )
)
#> Input is taken to be an occurrence table/an observation -- trait matrix
#> (i.e. with individual specimens per row and multiple trait measurements in columns).
#> If this is not the case, please provide parameters!
traits2 <- as.thesaurus(
  Body_length = as.trait("Body_length",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "From the tip of the head to the end of the abdomen"),
  Antenna_Seg1 = as.trait("Antenna_Seg1",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of first antenna segment",
    broaderTerm = "http://ecologicaltraitdata.github.io/TraitDataList/Antenna_length"),
  Antenna_Seg2 = as.trait("Antenna_Seg2",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of second antenna segment",
    broaderTerm = "http://ecologicaltraitdata.github.io/TraitDataList/Antenna_length"),
  Antenna_Seg3 = as.trait("Antenna_Seg3",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of third antenna segment",
    broaderTerm = "http://ecologicaltraitdata.github.io/TraitDataList/Antenna_length"),
  Antenna_Seg4 = as.trait("Antenna_Seg4",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of fourth antenna segment",
    broaderTerm = "http://ecologicaltraitdata.github.io/TraitDataList/Antenna_length"),
  Antenna_Seg5 = as.trait("Antenna_Seg5",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of fifth antenna segment (only Pentatomoidea)",
    broaderTerm = "http://ecologicaltraitdata.github.io/TraitDataList/Antenna_length"),
  Hind.Femur_length = as.trait("Hind.Femur_length",
    expectedUnit = "mm", valueType = "numeric",
    traitDescription = "Length of the femur of the hind leg",
    broaderTerm = "http://t-sita.cesab.org/BETSI_vizInfo.jsp?trait=Femur_length")
)
dataset2Std <- standardize_traits(dataset2,
  thesaurus = traits2
)