-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chunk objects when uploading via PutObject #30
Conversation
@dharmab I'm not sure if you have an environment where you're comfortable testing out a patch, but I've pushed a dev image containing this change (#30 (comment)). It's seemed to work well in my testing but would be great to get additional eyes. If you do have an environment where you can test this, you can either velero plugin remove velero/velero-plugin-for-microsoft-azure:v1.0.0 # change the version tag if you've already upgraded to v1.0.1
velero plugin add steveheptio/velero-plugin-for-microsoft-azure:chunk-objects |
Signed-off-by: Steve Kriss <[email protected]>
Signed-off-by: Steve Kriss <[email protected]>
@skriss I'll be able to test this within |
I've upgraded my impacted cluster to Velero 1.2 and will test this PR there. |
I deployed the plugin and ran a test backup, which failed without logging any errors:
The log content is somewhat sensitive so I cannot post it publicly, but it just contained the standard info logs about the items it was backing up. |
Hmm. Are there any errors in the server log (e.g. |
I checked the server logs we shipped to our log collector and I only see "level=info" logs from the velero server. |
@dharmab can you find in the server logs from around where the backup finished running? Any info from that would be helpful for tracking this down. I'll try some more tests and see if I can figure out what's happening too |
I wonder if you're running into pod resource issues - do you have CPU/mem requests/limits set for the velero pod? It's possible the chunks need to be smaller, since this is reading 100MB at a time which could be pushing past memory limits. |
Pod resource limits aren't the issue- we had to raise our limits very high due to vmware-tanzu/velero#2069, which was only fixed in 1.3.0 a few days ago. I'll see if I can create a reproducible case with a script to create a few thousand configmaps or something. |
OK; I did further testing and was unable to reproduce so any additional info you can provide from the logs, etc. will be very helpful for figuring this out. |
I was able to reproduce and test this successfully with Velero 1.3. I think the mentioned memory leak in Velero 1.2 tainted the result. Steps to reproduce: dd if=/dev/urandom bs=1024 count=256 of=256k-padding.txt
kubectl create namespace velero-azure-issue-26
for i in {0..2048}; do kubectl -n velero-azure-issue-26 create configmap padding-$i --from-file 256k-padding.txt; done This creates ~512MB of configmaps on a cluster (takes a while to run if the apiserver isn't on localhost). Note that excess random data is needed to prevent Velero from shrinking the upload below 256MiB through compression. With the above configmaps present, plugin 1.0.0 fails to create backups while the patch works. Unfortunately there aren't any reproducible error logs in Velero when the backup fails- not sure why. |
alright, this is ready for review from other maintainers. For anyone following, I pushed a new image, |
@carlisia @nrb @ashish-amarnath gentle reminder that this needs some review, 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question, otherwise lgtm.
backupstoragelocation.md
Outdated
# for more information on block blobs. | ||
# | ||
# Optional (defaults to 104857600, i.e. 100MB). | ||
blockSizeInBytes: "10485760" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the value here be "104857600"? I think 10485760 corresponds to 10MB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I put a non-default value in here since the field is not required if you just want the default. I could put something totally different (e.g. 1024) so it's not confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhhhhhh. Could it be left blank?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eh, I'll just put the default value in. I'd like to have a sample value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Signed-off-by: Steve Kriss <[email protected]>
Signed-off-by: Steve Kriss <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
closes #26
I've pushed a dev image containing this change to
steveheptio/velero-plugin-for-microsoft-azure:chunk-objects
.