How to unzip the kaggle datasets? #1

Open
qAp opened this issue Aug 12, 2020 · 1 comment

qAp commented Aug 12, 2020

Hi
Are the zip files downloaded from Kaggle meant to be extracted with a simple unzip command?

Following the instructions in the README, three zip files are downloaded with:

kaggle d download kbrodt/oc-t1-1024-z19-p1
kaggle d download kbrodt/oc-t1-1024-z19-p2
kaggle d download kbrodt/oc-t1-1024-z19-p3

I then ran unzip '*.zip', but this only produced three files, and I don't know what to do with them:

train_tier_1_tiles_1024_z19.zip.partaa
train_tier_1_tiles_1024_z19.zip.partab
train_tier_1_tiles_1024_z19.zip.partac

file train_tier_1_tiles_1024_z19.zip.partaa gives:

train_tier_1_tiles_1024_z19.zip.partaa: Zip archive data, at least v1.0 to extract

I've also tried zip -FF in case the data is corrupted, as well as extracting with jar, but have not been able to go beyond these *.parta* files.

Any ideas? Thanks

Owner

kbrodt commented Mar 22, 2021

Hi,

I split the zip file into multiple parts with split to fit Kaggle's 20 GB limit. You can use cat to merge them back together, e.g. cat train_tier_1_tiles_1024_z19.zip.parta* > train_tier_1_tiles_1024_z19.zip. Try either order: merge the downloaded files into one and then unzip, or unzip first and then merge.
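
For anyone without cat and unzip at hand (e.g. on Windows), the same merge-then-extract step can be sketched in Python using only the standard library. This is a minimal sketch, not the repository's own tooling: the function names merge_parts and extract_zip are hypothetical, and it assumes the parts came from split, so that sorting the file names (partaa, partab, partac) restores the original byte order.

```python
import glob
import shutil
import zipfile
from typing import List


def merge_parts(pattern: str, merged_path: str) -> List[str]:
    """Concatenate split chunks into one file, like `cat pattern > merged_path`.

    Assumption: the chunks were written by `split`, so sorting their names
    lexicographically gives the original order.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(merged_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                # Stream in blocks so multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(chunk, merged)
    return parts


def extract_zip(zip_path: str, dest_dir: str) -> None:
    """Extract every member of the merged archive into dest_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
```

With the file names from this thread, that would be merge_parts("train_tier_1_tiles_1024_z19.zip.parta*", "train_tier_1_tiles_1024_z19.zip") followed by extract_zip("train_tier_1_tiles_1024_z19.zip", "train_tier_1_tiles_1024_z19"), where the destination directory name is only an example. The merged file needs as much free disk space as all the parts combined.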
