Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TG2-AMENDMENT_MINDEPTHMAXDEPTH_FROM_VERBATIM #55

Open
iDigBioBot opened this issue Jan 5, 2018 · 36 comments
Open

TG2-AMENDMENT_MINDEPTHMAXDEPTH_FROM_VERBATIM #55

iDigBioBot opened this issue Jan 5, 2018 · 36 comments
Labels
Amendment CODED Completeness CORE TG2 CORE tests SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2

Comments

@iDigBioBot
Copy link
Collaborator

iDigBioBot commented Jan 5, 2018

TestField Value
GUID c5658b83-4471-4f57-9d94-bf7d0a96900c
Label AMENDMENT_MINDEPTHMAXDEPTH_FROM_VERBATIM
Description Proposes amendments of the values of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they can be interpreted from dwc:verbatimDepth.
TestType Amendment
Darwin Core Class dcterms:Location
Information Elements ActedUpon dwc:minimumDepthInMeters
dwc:maximumDepthInMeters
Information Elements Consulted dwc:verbatimDepth
Expected Response INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters or dwc:maximumDepthInMeters are bdq:NotEmpty or dwc:verbatimDepth is bdq:Empty; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they can be unambiguously interpreted from dwc:verbatimDepth; otherwise NOT_AMENDED.
Data Quality Dimension Completeness
Term-Actions MINDEPTHMAXDEPTH_FROM_VERBATIM
Parameter(s)
Source Authority
Specification Last Updated 2024-08-30
Examples [dwc:minimumDepthInMeters="", dwc:maximumDepthInMeters="", dwc:verbatimDepth="10 feet": Response.status=FILLED_IN, Response.result=dwc:minimumDepthInMeters="3.048", dwc:maximumDepthInMeters="3.048", Response.comment="dwc:verbatimDepth contains interpretable values"]
[ dwc:minimumDepthInMeters="", dwc:maximumDepthInMeters="", dwc:verbatimDepth="x": Response.status=NOT_AMENDED, Response.result=, Response.comment="dwc:verbatimDepth does not contain an interpretable value"]
Source
References
Example Implementations (Mechanisms) Kurator/FilteredPush geo_ref_qc Library DOI: 10.5281/zenodo.14064324
Link to Specification Source Code https://github.com/FilteredPush/geo_ref_qc/blob/v2.0.1/src/main/java/org/filteredpush/qc/georeference/DwCGeoRefDQ.java#L1121
Notes If dwc:verbatimDepth has a single value rather than a range, the minimum and maximum values should be amended with the same value. When transforming units, the transformation should be reversible, not adjusting the number of significant digits or adjusting the rounding. For example, transform fathoms to meters by multiplying by 1.8288 and retaining added significant digits (verbatim depth of 10 fathoms to minimum and maximum depths in meters of 18.288). Implementations should be capable of interpreting verbatim data in at least meters, fathoms, and feet, in the form of either a single value or a range. The units must be specified in the verbatim data to be interpretable.
@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 16, 2018
@chicoreus
Copy link
Collaborator

Established general principle in discussion: When doing an amendment that performs a transformation, do so in a manner which is reversible (thus transform fathoms to meters with a different number of significant digits in the example).

@ArthurChapman
Copy link
Collaborator

Shouldn’t we add as a Prerequisite that dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are both EMPTY?

@Tasilee
Copy link
Collaborator

Tasilee commented Apr 18, 2022

Changed "AMENDED" to "FILLED_IN" in accordance with discussions April 16.

@Tasilee
Copy link
Collaborator

Tasilee commented Sep 16, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@chicoreus
Copy link
Collaborator

The form of this specification doesn't match more recent ones.

Propose reworing the dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are not EMPTY clause to produce NOT_AMENDED if either are populated (consistent with our more recent handling of FILLED_IN amendments).

Also, the Notes indicate that both min and max values should be set together, so make and/or just and. Thus changing from:

INTERNAL_PREREQUISITES_NOT_MET if dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable or dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are not EMPTY; FILLED_IN the value of dwc:minimumDepthInMeters and/or dwc:maximumDepthInMeters if they could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED

to

INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are EMPTY and either dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are not EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

The comment #55 (comment) should be reflected in the notes.

Something like:

If the dwc:verbatimDepth has a single value rather than a range, the minimum and maximum values should be amended with the same value. When transforming units, the transformation should be reversable, not adjusting the number of significant digits or adjusting the rounding. For example, transform fathoms to meters by multiplying by 1.8288 and retaining added significant digits (verbatim depth of 10 fathoms to minimum and maximum depths in meters of 1̅8.288).

If we can include the combining overline symbol (U+0305), then we can provide both reversibility and an indication of the number of significant digits (which will take calculation). This won't quite correctly represent the introduced error (10 meters being 10 meters +/- 0.5 meter, 18 fathoms being 18 +/- 0.5 fathom). Adding this character will probably add complexity for downstream consumers, but as we are moving in and out of darwin core as strings, it is a possibility.

@ArthurChapman
Copy link
Collaborator

@chicoreus - this doesn't look right to me. Your new wording is suggesting filling in dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are not EMPTY - BUT if they are not EMPTY - you don't want to replace what is there.

I think it should be

INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are not EMPTY and either dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

@ArthurChapman
Copy link
Collaborator

ArthurChapman commented Jul 30, 2024

With this one and #68 - this issue of "and" or "and/or" in the first line. If the verbatimDepth only has a value for depth - you don't want anything filled in if either dwc:minimumDepthInMeters and dwc:maximumDepthInMeters have a value

This issue arises where the verbatimDepth says either "minimumDepth" has a value or "maximumDepth" has a value - then you could fill in the dwc:minimumDepthInMeters and dwc:maximumDepthInMeters respectively as long as the relevant didn't already have a value. I guess that could be covered by the wording "unambiguously interpretable"

@chicoreus
Copy link
Collaborator

chicoreus commented Jul 31, 2024 via email

@chicoreus
Copy link
Collaborator

We should also note, when converting from fathoms to meters, use meters = fathoms * 1.8288 rather than meters = fathoms * 1.828804. The 1.828804 is for one fathom = 6 US State Plane Feet. This will introduce a trivial error on US chart data older than the most recent charts, but for biodiversity purposes, not relevant.

@chicoreus
Copy link
Collaborator

@ArthurChapman good catch. There is an extraneous not in there. It should read:

INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are EMPTY and either dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

@chicoreus
Copy link
Collaborator

And for the notes, another suggestion to augment:

If the dwc:verbatimDepth has a single value rather than a range, the minimum and maximum values should be amended with the same value. When transforming units, the transformation should be reversible, not adjusting the number of significant digits or adjusting the rounding. For example, transform fathoms to meters by multiplying by 1.8288 and retaining added significant digits (verbatim depth of 10 fathoms to minimum and maximum depths in meters of 1̅8.288). Implementations should be capable of interpreting verbatim data in at least meters, fathoms, and feet, in the form of either a single value or a range. The units must be specified in the verbatim data to be interpretable.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jul 31, 2024
…XDEPTH_FROM_VERBATIM following proposed specification rather than current, along with unit test.
@ArthurChapman
Copy link
Collaborator

Expected Response and Notes updated following discussion above, and Specification Last Updated - updated.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jul 31, 2024
…TION_FROM_VERBATIM and improving implementation of tdwg/bdq#55 AMENDMENT_MINDEPTH-MAXDEPTH_FROM_VERBATIM.  Implementations up to date with current specifications, along with unit tests.
@chicoreus
Copy link
Collaborator

Discussion of handling of validation cases in the form verbatmDepth="Min depth = 10m" and data observed in the wild in the form verbatimDepth="<10m" and verbatimDepth=">10m" leads to the suggestion that we change the specification to handle cases where only one bound is specified differently:

Thus interpreting verbatimDepth="<10m" as maximumDepthInMeters=10m, with no value in minimumDepthInMeters and verbatimDepth>10m as minimumDepthInMeters=10m with no value in maximumDepthInMeters

Current interpretation of the geo_ref_qc implementation is that verbatimDepth="<10m" is interpreted as minimum
DepthInMeters=0, maximumDepthInMeters=10, while verbatimDepth=">10m" is treated as uninterpretable as it does not provide for an interpretable maximum bound.

@Tasilee
Copy link
Collaborator

Tasilee commented Aug 3, 2024

No comments on my suggested Expected Response? I think it will work with scenarios that have been mentioned.

My only query would be if, for example dwc:verbatimDepth="100m", do we AMEND dwc:minimumDepthInMeters, or dwc:maximumDepthInMeters, or neither?

I should add this example as an edge case in the Test Data.

The same argument will apply to #68

@tucotuco
Copy link
Member

tucotuco commented Aug 3, 2024

Amend both, no?

@Tasilee
Copy link
Collaborator

Tasilee commented Aug 3, 2024

Thanks @tucotuco , hmm, yes. Why not.

My Expected Response will still work I guess, so changing

INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters and dwc:maximumDepthInMeters are EMPTY and either dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

to

INTERNAL_PREREQUISITES_NOT_MET if dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

and updated Specification Last Updated.

@Tasilee
Copy link
Collaborator

Tasilee commented Aug 3, 2024

@chicoreus - your amended Notes crashed my Python dump program: "UnicodeEncodeError: 'charmap' codec can't encode character '\u0305'" so edited out, unless there was some hidden meaning?

@chicoreus
Copy link
Collaborator

chicoreus commented Aug 3, 2024

Text @Tasilee replaced was:

If dwc:verbatimDepth has a single value rather than a range, the minimum and maximum values should be amended with the same value. When transforming units, the transformation should be reversible, not adjusting the number of significant digits or adjusting the rounding. For example, transform fathoms to meters by multiplying by 1.8288 and retaining added significant digits (verbatim depth of 10 fathoms to minimum and maximum depths in meters of 1̅8.288). Implementations should be capable of interpreting verbatim data in at least meters, fathoms, and feet, in the form of either a single value or a range. The units must be specified in the verbatim data to be interpretable.

The u0305 is the combining overline (overhead bar) character, indicating when over a digit not at the right most end of a number the number of significant digits (or if over the rightmost digit, indicating that digit is repeating to infinity (e.g 1.33̅3)). Thus the example is suggesting converting 10 fathoms to 18.288 meters, as a reversible transformation, but with the number of significant digits (two) retained by including the overhead bar, thus: 1̅8.288 See the notation in: https://en.wikipedia.org/wiki/Significant_figures#Multiplication_and_division

This needs discussion. One alternative is to convert 10 fathoms to 18 meters, retaining the two significant digits, but not being reversible. Another alternative is to convert 10 fathoms to 18.288 meters, reversible, but implying additional precision. Another alternative would be to convert 10 fathoms to the range 9.5 to 10.5 fathoms and from there to minDepthInMeters=17.3736 maxDepthInMeters=19.2024. The removed text is making the proposal that if we allow for the overhead bar (standard notation for significant digits), then we could have things both ways, a reversible transformation and retention of the number of significant digits.

As with @Tasilee 's conversion script, adding the overhead bar unicode character is likely to cause problems for downstream consumers of the amended data, as this isn't a typical convention in biodiversity informatics data. Each of the possibilities has tradeoffs, and we should have a good rationale for what decision we make. At this point, we've asserted that reversibility is an important principle for amendments that make numeric transformations.

@chicoreus
Copy link
Collaborator

chicoreus commented Aug 3, 2024 via email

@tucotuco
Copy link
Member

tucotuco commented Aug 4, 2024

From the moment I first encountered significant digits I did not believe in them. I still don't. They are a great way to propagate and augment uncertainty. If that is what one is after, by all means... By the way, 10 has one significant digit, so the significant digit-based amendment would be to 20, not to 18. That is outside the upper bound of the conservative range method (9.5 - 10.5). Horrible.

In the absence of uncertainty measures for depth I am in favor of recording the exact result of the transformation, because it is reversible, and because it is supported by the original verbatim value, which would hopefully accompany the amended values.

@chicoreus
Copy link
Collaborator

chicoreus commented Aug 4, 2024 via email

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 12, 2024
@ArthurChapman
Copy link
Collaborator

ArthurChapman commented Aug 21, 2024

Changed
INTERNAL_PREREQUISITES_NOT_MET if dwc:verbatimDepth is EMPTY or the value is not unambiguously interpretable; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they are EMPTY and could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

To
INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters or dwc:maximumDepthInMeters are not EMPTY or dwc:verbatimDepth is EMPTY; FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they could be unambiguously determined from dwc:verbatimDepth; otherwise NOT_AMENDED.

and updated Specification Last Updated

@ArthurChapman
Copy link
Collaborator

Changed "or" to "and" in the INTERNAL_PREREQUISITES_NOT_MET to allow for either latitude or longitude to be filled in independantly

updated Specification Last Updated

@chicoreus
Copy link
Collaborator

chicoreus commented Aug 21, 2024

Rationalle for @ArthurChapman's choice of "and" instead of "or" is that if one of multiple targets to be filled in contains a value, but others do not, then a test proposing the amendment needs to evaluate the consistency of the conclusion with the value that exists in the target with the source value that is being considered. For example if minimumDepthInMeters=6.4, maximumDepthInMeters="", and verbatimDepth="3.5-10 fathoms", the test must assess the consistency of 3.5 fathoms with 6.4 meters as well as filling in 10 fathoms as the maximum. This mixes concerns of validation of internal consistency with concerns about amending to fill in values. This was more obvious to us when examining filling in coordinates from verbatim values.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 22, 2024
…was disabled, fixes and enhancements to better conform implementations of tdwg/bdq#55 and tdwg/bdq#69 to expectations.  Metadata not updated yet.
@Tasilee
Copy link
Collaborator

Tasilee commented Aug 26, 2024

I agree with "INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters or dwc:maximumDepthInMeters are NOT EMPTY". This simplifies the implementation, no mixing and matching.

As has been reiterated, it is the "and" in "FILLED_IN the value of dwc:minimumDepthInMeters and dwc:maximumDepthInMeters if they can be unambiguously interpreted.." that raises issues with some dwc:verbatim text. If we have a single candidate value for dwc:minimumDepthInMeters or dwc:maximumDepthInMeters for example, dwc:verbatim="min depth=100m" or "dwc:verbatim="max depth=100m" do we fill the single candidate. Currently this scenario will produce NOT_AMENDED.

@chicoreus
Copy link
Collaborator

From an email thread on this issue:

NOT_AMENDED is an appropriate response here (not necessarily the
easiest to implement). We can't unambiguously interpret the minimum
and maximum depths from just a maximum depth in verbatim depth.

From verbatimDepth="Maximum depth 100m":

We could interpret any of the following three possibilities:

(1) minimumDepth=0, maximumDepth=100
(2) minimumDepth=100, maximumDepth=100
(3) minimumDepth=?, maximumDepth=100

The presence of "and" in the expected response means we need to provide both values, thus we can't (without changing the specification) return just maximumDepth=100 without a minimumDepth.

Since there is more than one possible interpretation, we can't provide both unambiguously, and the NOT_AMENDED response would fit with the current specification.

@chicoreus
Copy link
Collaborator

Note that the explicit setting of a lower bound differs from verbatimDepth=100m, which would be interpreted as maximumDepthInMeters=100, minimumDetphInMeters=100.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 27, 2024
…ta, including updates and additions of backing methods, unit tets, and some cleanup. Includes bugfixes and improvements for tdwg/bdq#55 and substantive bugfixes to tdwg/bdq#201.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 27, 2024
@chicoreus chicoreus changed the title TG2-AMENDMENT_MINDEPTH-MAXDEPTH_FROM_VERBATIM TG2-AMENDMENT_MINDEPTHMAXDEPTH_FROM_VERBATIM Aug 30, 2024
@chicoreus
Copy link
Collaborator

Removing hyphen from name to make TERM_ACTON consistent.

@Tasilee
Copy link
Collaborator

Tasilee commented Sep 24, 2024

Could I suggest a slightly more flexible Expected Response given the edge case on current DataID 321

dwc:minimumDepthInMeters="", dwc:maximumDepthInMeters="", dwc:verbatimDepth="Maximum depth 100m"

INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumDepthInMeters or dwc:maximumDepthInMeters are bdq:NotEmpty or dwc:verbatimDepth is bdq:Empty; FILLED_IN the value of dwc:minimumDepthInMeters and/or dwc:maximumDepthInMeters if they can be unambiguously interpreted from dwc:verbatimDepth; otherwise NOT_AMENDED.

This would result in

dwc:minimumDepthInMeters="", dwc:maximumDepthInMeters="100" ?

@ArthurChapman
Copy link
Collaborator

Seems reasonable to me

@chicoreus
Copy link
Collaborator

chicoreus commented Sep 24, 2024 via email

@Tasilee
Copy link
Collaborator

Tasilee commented Sep 24, 2024

@chicoreus. The Expected Response/Specification (if clear and concise) must inform the Notes, not the reverse. That said, I will bow to your position to move this along, and edit the Test Data accordingly.

@chicoreus
Copy link
Collaborator

chicoreus commented Sep 24, 2024 via email

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Nov 10, 2024
@chicoreus chicoreus added the CODED label Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Amendment CODED Completeness CORE TG2 CORE tests SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2
Projects
None yet
Development

No branches or pull requests

5 participants