Export data as Darwin Core #256
Comments
@jreubens @jonasmortelmansvliz, some questions:
I notice that the
vs
I think I'll include |
@peterdesmet regarding the questions:
Thanks!
Can archival tags be associated with an animal, or only acoustic tags? |
Yes, archival tags are also associated with an animal. It has a serial number which is linked to the animal-ID |
@peterdesmet do you mean an example of a tag serial number? You can take sensor ID |
@PieterjanVerhelst thanks, I see that is a G5 pop-off tag with pressure and temperature sensors. In Darwin Core we express occurrences, i.e. observations/detections of an organism at a place and time. Am I correct in understanding that is not what archival tags record, unless they are a combined archival-acoustic tag? |
@peterdesmet the archival tags indeed register pressure and temperature. They do not log positions. Positions are derived from the logged pressure and temperature data through a modelling method called 'geolocation'. So if you want tracking data obtained from the archival tags in Darwin Core, this will be processed data. Note that geolocation modelling requires certain assumptions and links to specific databases, so when these change, a slightly different trajectory can be obtained. In other words, there is some error in the position. It is not as accurate as acoustic telemetry.
Great, in that case I am keeping processed position data from archival data out of scope for Darwin Core. |
Regarding 7: this information goes through IMIS.
All fields are now mapped in #257. Remaining questions @jonpye @jreubens @jonasmortelmansvliz
@jdpye comments:
Here are all the possible manufacturers:
Are there some we should shorten? |
@jdpye, here's how the identifiers would differ if we include the manufacturer:
Note that within a dataset the identifiers are unique in both cases. Adding the manufacturer would only solve the use case where we want to combine multiple datasets where the |
Oops, I see these now, I'm @jdpye here and @jonpye on Google-side. I would say 1000m (or more) from a center point is reasonable. If you are looking for a flat upper bound for coordinateUncertaintyInMeters, the accepted range of a high-powered tag in open water is around that. We place gates 800m apart to cover most cases of degraded signal transmission. This is a highly variable situation, with range testing potentially able to provide a better answer, so with the caveat that we're going to try and do better if the research programme can tell us a better number, we can stick with 1000m as a generic 'acoustic telemetry' upper bound.

This paper that describes an open-water experiment has a nice figure and a more nuanced view of how reasonably likely detections at each distance would be before taking into account other factors: https://animalbiotelemetry.biomedcentral.com/articles/10.1186/s40317-017-0142-y

In river systems and turbid waters you get a lot less detectability and a lot of dependency on environmental conditions: https://link.springer.com/article/10.1007/s10750-021-04556-3

So, how much of this can we even capture with a single (high?) number? And where we can do better, a paragraph in the metadata about how the range estimation was done at each station could be included. There are examples in https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13322 of how to extrapolate conditions that drive variability once you can characterize them from a few exemplar stations, for example, and this could be one approach researchers use to quantify their predicted station range.

My take is, if they give us expected ranges, report those, and if they don't, shoot high on the detection variability, 1000m or even higher. And describe in the metadata what the source of the coordinateUncertainty was: a round high estimate based on the technologies in play, or a specific characterized estimate based on what the researchers were able to calculate.
Thanks @jdpye, I'll have the Any feedback on default coordinateUncertainty for human obs (capture, release, ...) and regarding those identifiers? |
I don't have any info that would overrule the standard commercial GPS precision being ~30m, I'd be happy to roll with that. |
For the standard vocabularies for lifestage, we at OTN hold a link to the NERC vocabulary: http://vocab.nerc.ac.uk/collection/S11/current/ |
coordinateUncertainty for human obs coordinateUncertainty for detections lifestage |
I like the idea of having a parameter for the function, but I can see the scope creeping. Would the user provide a different blanket value for the whole export, or a conditional blanket value, or be required to sub in their own per-instrument or per-event column of data? The default is what nearly all existing data will be published with, so we all definitely have to be happy with 1000m being an acceptable signal that 'we don't know better'.
@jdpye you're right, the user would just provide a blanket value, so I won't add a parameter. I'll set the value for the detections at 1000m (with the user always having the option to improve upon that before publishing). |
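For reference, the defaults agreed in this thread can be sketched as a small lookup. This is a rough illustration only, not the export function itself: the names `DEFAULT_UNCERTAINTY_M`, `coordinate_uncertainty` and `known_range_m` are hypothetical, and the override argument just mirrors the suggestion to report a researcher-supplied range when one exists.

```python
from typing import Optional

# Defaults discussed in this thread (values in meters).
# The dictionary and its keys are hypothetical names for illustration.
DEFAULT_UNCERTAINTY_M = {
    "detection": 1000,         # generic 'acoustic telemetry' upper bound
    "human_observation": 30,   # standard commercial GPS precision (~30m)
}


def coordinate_uncertainty(record_type: str,
                           known_range_m: Optional[float] = None) -> float:
    """Return a coordinateUncertaintyInMeters value for one record.

    If a characterized range is available it is used as is;
    otherwise the blanket default for the record type applies.
    """
    if known_range_m is not None:
        return known_range_m
    return DEFAULT_UNCERTAINTY_M[record_type]
```

With no characterized range, `coordinate_uncertainty("detection")` falls back to the 1000m blanket value; a publisher who has range-test results could replace it before publishing.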
Another item @jdpye and I discussed on slack is whether I'm personally not a fan of this idea for two reasons:
I would therefore suggest to include the original identifier as is in the published data. |
Thanks @peterdesmet I think it's reasonable to say that GBIF would currently (always?) have to assume the identifiers are local and make use of approaches that use If you cared to join across ETN datasets, then having a prefix may help (i.e. your Aside: For identifier schemes like DOI, ARK, LSID etc. it may be more reasonable to assert relationships across publishers as it's more likely they do indeed mean the same thing, but that isn't what you're considering here. Does that help? |
@timrobertson100 thanks, yes. It aligns with my thinking that using the original (non-prefixed)
We already make sure that the original (non-prefixed) |
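To make the trade-off concrete, here is a minimal sketch of what a prefix does and does not change. The dataset names, identifiers and helper functions are all hypothetical; the point is only that local identifiers stay unique within a dataset either way, and a prefix matters solely when datasets are combined.

```python
from typing import Dict, List, Set


def find_collisions(datasets: Dict[str, List[str]]) -> Set[str]:
    """Return identifiers that occur in more than one dataset."""
    first_seen: Dict[str, str] = {}
    collisions: Set[str] = set()
    for name, identifiers in datasets.items():
        for identifier in set(identifiers):
            owner = first_seen.setdefault(identifier, name)
            if owner != name:
                collisions.add(identifier)
    return collisions


def with_prefix(datasets: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Prefix every identifier with its (hypothetical) dataset name."""
    return {
        name: [f"{name}:{identifier}" for identifier in identifiers]
        for name, identifiers in datasets.items()
    }


# Hypothetical local identifiers in two datasets.
datasets = {
    "dataset_a": ["101", "102"],
    "dataset_b": ["102", "103"],
}
```

Here `find_collisions(datasets)` reports the shared local identifier, while `find_collisions(with_prefix(datasets))` reports none. A consumer that always scopes identifiers by dataset gets the same effect without changing the published values.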
All questions/issues (#256 (comment)), closing issue. |
Belatedly, I agree with reasoning from @peterdesmet and @timrobertson100 |
Add function to export data as Darwin Core
Human Observations
Detections
subsampled by hour: first of 3 record(s)
acoustic telemetry
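A note such as "subsampled by hour: first of 3 record(s)" can be produced along these lines. This is a sketch under assumptions: grouping per animal and station, the tuple layout, and the field name `generalization_note` are hypothetical and not taken from the actual export function.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Detection = Tuple[str, str, datetime]  # (animal_id, station, timestamp)


def subsample_by_hour(detections: List[Detection]) -> List[Dict]:
    """Keep the first detection per animal, station and clock hour.

    Each kept record carries a note stating how many records
    it stands for within that hour.
    """
    groups: Dict[tuple, Dict] = {}
    # Sort by time so the first record seen in a group is the earliest.
    for animal, station, ts in sorted(detections, key=lambda d: d[2]):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        key = (animal, station, hour)
        if key not in groups:
            groups[key] = {"animal": animal, "station": station,
                           "eventDate": ts, "count": 0}
        groups[key]["count"] += 1

    for record in groups.values():
        record["generalization_note"] = (
            f"subsampled by hour: first of {record['count']} record(s)"
        )
    return list(groups.values())
```

Three detections of one animal at one station within the same hour collapse to a single record dated at the earliest of the three.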