Copy dandietag code from dandi-cli #58

Merged
2 commits merged into master from gh-57
Jul 20, 2021
jwodder
Member

@jwodder jwodder commented Jul 9, 2021

Closes #57.

@jwodder jwodder added the "minor" label (Increment the minor version when merged) Jul 9, 2021
@codecov

codecov bot commented Jul 9, 2021

Codecov Report

Merging #58 (336cb40) into master (3f1e4b7) will increase coverage by 0.15%.
The diff coverage is 96.83%.

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
+ Coverage   95.91%   96.07%   +0.15%     
==========================================
  Files          11       13       +2     
  Lines        1053     1274     +221     
==========================================
+ Hits         1010     1224     +214     
- Misses         43       50       +7     
Flag         Coverage Δ
unittests    96.07% <96.83%> (+0.15%) ⬆️

Impacted Files                                 Coverage Δ
dandischema/digests/dandietag.py               94.77% <94.77%> (ø)
dandischema/digests/tests/test_dandietag.py    100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3f1e4b7...336cb40.


part_size = mb(64)

if file_size > tb(5):
Member

Just an FYI: on Amazon this limit is in terabytes, not tebibytes.

Member

Having said that, it's possible that Amazon's information is not consistent across services, but they are generally clear about TB/TiB nomenclature.

Comment on lines +12 to +21
def mb(bytes_size: int) -> int:
    return bytes_size * 2 ** 20


def gb(bytes_size: int) -> int:
    return bytes_size * 2 ** 30


def tb(bytes_size: int) -> int:
    return bytes_size * 2 ** 40
Member

These are mebibytes, gibibytes, and tebibytes. Perhaps it's a good time to consolidate these options in relation to AWS.
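To put numbers on the naming point above, here is a minimal sketch comparing the binary multipliers used by the helpers in the diff with their decimal counterparts. The `BINARY` and `DECIMAL` tables and the `unit_gap` function are hypothetical names introduced for illustration only; they are not part of dandi-cli or dandischema.

```python
# Binary (IEC) multipliers: the same arithmetic as mb(), gb(), tb() in the diff.
BINARY = {"MiB": 2 ** 20, "GiB": 2 ** 30, "TiB": 2 ** 40}
# Decimal (SI) multipliers, listed for comparison only.
DECIMAL = {"MB": 10 ** 6, "GB": 10 ** 9, "TB": 10 ** 12}


def unit_gap(count: int, binary_unit: str, decimal_unit: str) -> int:
    """Return how many bytes larger the binary reading is (hypothetical helper)."""
    return count * BINARY[binary_unit] - count * DECIMAL[decimal_unit]


if __name__ == "__main__":
    for b, d in (("MiB", "MB"), ("GiB", "GB"), ("TiB", "TB")):
        print(f"5 {b} - 5 {d} = {unit_gap(5, b, d)} bytes")
```

The gap grows with the unit: roughly 4.9% at the mega scale, 7.4% at the giga scale, and about 10% at the tera scale, which is why the distinction matters most for the largest limit.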

# 5MB is the minimum part size allowed by S3
MIN_PART_SIZE = mb(5)
# 5GB is the maximum part size allowed by S3
MAX_PART_SIZE = gb(5)
Member

This is the one piece of info that gives me pause about what Amazon accepts. We should check the size of each part on our large files: 5 GiB > 5 GB, so if 5 GB is the limit, I would assume a 5 GiB part would not be accepted.

Member

It would indeed be nice to clear this up, but I would keep it as is for the purpose of this PR, to provide as smooth (no side effects) a transition from dandi-cli to dandischema as possible.

Member

I'll file an issue. This can be merged without these changes.
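As a possible starting point for the check suggested in this thread, the sketch below classifies an observed part size against both readings of the 5 GB maximum. It does not settle which reading S3 actually enforces; `GIB_LIMIT`, `GB_LIMIT`, and `classify_part_size` are hypothetical names used only for this illustration.

```python
GIB_LIMIT = 5 * 2 ** 30  # 5 GiB: what gb(5) evaluates to in the diff
GB_LIMIT = 5 * 10 ** 9   # 5 GB under the decimal reading


def classify_part_size(part_size: int) -> str:
    """Report which reading(s) of the maximum part size a part satisfies."""
    if part_size <= GB_LIMIT:
        return "within both limits"
    if part_size <= GIB_LIMIT:
        return "only within the GiB limit"
    return "exceeds both limits"


if __name__ == "__main__":
    # A 64 MiB part (the initial part size in the diff) and a full 5 GiB part.
    for size in (64 * 2 ** 20, 5 * 2 ** 30):
        print(size, classify_part_size(size))
```

Any uploaded part that lands in the "only within the GiB limit" bucket would be evidence that the binary reading is accepted in practice.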

Member

@yarikoptic yarikoptic left a comment

Needs a few more tests.

raise ValueError("Partial update extended past end of file")


class ETagHashlike:
Member

This class (and the .partial_update it uses) seems to lack test coverage. Did we never have coverage for this portion, or was it exercised indirectly by other dandi-cli functionality? It should be easy to add a dedicated unit test to ensure more complete coverage.

Member Author

Test for ETagHashlike added.

@yarikoptic
Member

OK, let's get this merged: test coverage is improved, although not 100% yet, and the dandi-cli errors are due to a newer metadata schema version that is not yet supported by dandi-api.

@yarikoptic yarikoptic merged commit 3f30ea5 into master Jul 20, 2021
@yarikoptic yarikoptic deleted the gh-57 branch July 20, 2021 19:22
Development

Successfully merging this pull request may close these issues.

Move dandi-etag related logic from dand-cli to dandischema
3 participants