Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make schemaKey required and improve validation and migration functions #77

Merged
merged 15 commits into from
Sep 21, 2021

Conversation

satra
Copy link
Member

@satra satra commented Aug 14, 2021

@yarikoptic - this is an initial attempt to handle schemaKey, but this requires a weird thing from a python perspective.

in pydantic, a class is initialized with kwargs. this PR adds schemaKey to kwargs to get the validator to trigger a mismatch. this works for passing data from an existing document. however, this also means that any initialization of the class will also require the appropriate schemaKey. see the test_missing_schemakey in the PR. this is quite weird.

the alternative is a function that checks every object (i.e. dictionary) in a json record for the presence of schemaKey, but then this function would need to be called before initializing the class, which also seems weird.

things addressed

  • adding schemaKey as a requirement to JSON schema

@codecov
Copy link

codecov bot commented Sep 16, 2021

Codecov Report

Merging #77 (a586e89) into master (7d89226) will increase coverage by 0.25%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #77      +/-   ##
==========================================
+ Coverage   89.19%   89.44%   +0.25%     
==========================================
  Files          13       14       +1     
  Lines        1342     1355      +13     
==========================================
+ Hits         1197     1212      +15     
+ Misses        145      143       -2     
Flag Coverage Δ
unittests 89.44% <100.00%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
dandischema/consts.py 100.00% <100.00%> (ø)
dandischema/exceptions.py 100.00% <100.00%> (ø)
dandischema/metadata.py 96.89% <100.00%> (+0.09%) ⬆️
dandischema/models.py 94.15% <100.00%> (+0.37%) ⬆️
dandischema/tests/test_metadata.py 100.00% <100.00%> (ø)
dandischema/tests/test_models.py 95.89% <100.00%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d89226...a586e89. Read the comment docs.

@yarikoptic
Copy link
Member

@yarikoptic - this is an initial attempt to handle schemaKey, but this requires a weird thing from a python perspective.

Could you please elaborate on what you mean by "handle" in this context?

@satra
Copy link
Member Author

satra commented Sep 16, 2021

schemaKey was not being stored in object, resulting in no clear way of assigning some types. and apparently since it wasn't required validation wasn't catching it.

this PR (which has changed since the original attempt) now:

  • adds schemaKey as a required property for every object
  • detects and raises a migration error for some known places where conflict could arise
  • adds schemaKey for places where there should not be a conflict
  • adds jsonschema based validation to the validate function.

so handle means to always track schemaKey

@satra satra changed the title handling schemaKey make schemaKey required and improve validation and migration functions Sep 17, 2021
@satra satra marked this pull request as ready for review September 17, 2021 01:46
@satra
Copy link
Member Author

satra commented Sep 17, 2021

@yarikoptic - i think this is good for a review/merge. i would still like to get seb and dorota's PRs in before release.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments/notes, nothing critical really

continue
error_list.append([error.message, tuple(error.absolute_path)])
if error_list:
raise jsonschema.ValidationError(str(error_list))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would collapse all errors into one string of a list, making it impossible to print it nicely or "analyze" above.
I guess should be ok for now, but I think we might need to come up with some consistent mechanism between dandischema and dandi-cli to "encapsulate" multiple errors as a singular exception which would

  • contain .errors list of all individual errors/exceptions (not sure if wrapped into some adapter here to provide consistent interface across errors)
  • have nice __str__ and/or __repr__

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to have this. let me look into if python itself has a way of doing this.

@@ -168,7 +195,7 @@ def validate(obj, schema_version=None, schema_key=None, missing_ok=False):
reraise = True
messages.append(el["msg"])
if reraise:
ValueError(messages)
raise ValueError(messages)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that would be another location to use the aforementioned encapsulation of errors.

@satra
Copy link
Member Author

satra commented Sep 20, 2021

@yarikoptic - any further thoughts on this?

continue
error_list.append([error])
if error_list:
raise ValueError(error_list)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing prior discussion, Let's at least define our own class ValidationError(ValueError) may be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to do it now? or later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no objections, will push that change in 5 minutes. Better not to delay ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later we would need to reactor anyways though

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok then go for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any other changes you plan, or should i merge?

Otherwise we were reraising original (non-documented) pydantic exception in case of
not missing_ok

PS also added forgotten in previous commit exceptions.py
@satra satra merged commit e91ccfb into master Sep 21, 2021
@satra satra deleted the enh/migrate branch September 21, 2021 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants