
Support for Ceph's namespaces (not kubernetes namespaces) #798

Closed
simonpie opened this issue Jan 17, 2020 · 34 comments · Fixed by #1035

@simonpie

Describe the feature you'd like to have

Support for Ceph's namespaces as described here.

What is the value to the end user? (why is it a priority?)

This would allow multiple kubernetes clusters to share a single ceph cluster without creating a pool per kubernetes cluster. Pools in ceph can be computationally expensive, as stated in the documentation, and namespaces are the recommended way to segregate tenants/users.

How will we know we have a good solution? (acceptance criteria)

The kubernetes admin can specify a ceph namespace in the helm chart, and all ceph operations performed by the backend automatically use --namespace=$mynamespace. Hence, all rbd operations go to the specified namespace.

A different kubernetes cluster, using a different ceph client, could be assigned to a different ceph namespace and would not be able to see the images created by the first cluster.
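
(For illustration only — a rough sketch with the rbd CLI of the isolation being asked for, assuming a Ceph release with RBD namespace support, e.g. Nautilus or later; the pool, namespace, and image names below are made up:)

```sh
# one pool shared by two kubernetes clusters, one RBD namespace each
rbd namespace create --pool kube --namespace cluster-a
rbd namespace create --pool kube --namespace cluster-b

# images created on behalf of cluster A land in its namespace ...
rbd create --pool kube --namespace cluster-a --size 1G pvc-example

# ... and are only visible when listing that namespace
rbd ls --pool kube --namespace cluster-a   # shows pvc-example
rbd ls --pool kube --namespace cluster-b   # empty
```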

@simonpie
Author

So, are there any plans to implement this?

@mehdy
Contributor

mehdy commented Feb 22, 2020

If no one's going to implement it, I'd be glad to do it.

@Madhu-1 Should I start working on it?

@simonpie
Author

We can help in testing the solution.

@Madhu-1
Collaborator

Madhu-1 commented Feb 23, 2020

@mehdy yes please go ahead

@Madhu-1
Collaborator

Madhu-1 commented Feb 25, 2020

@mehdy assigned this issue to you.

@mehdy
Contributor

mehdy commented Feb 25, 2020

@Madhu-1 Good. Thanks.

@mehdy
Contributor

mehdy commented Mar 25, 2020

@Madhu-1 I've been working on this lately and found out that it's necessary to detect the namespace inside genVolFromID. So it makes sense to add the namespace to CSIIdentifier, but ceph allows namespace names to be up to 64 chars long.
I'm not sure it's a good idea to do it this way because of maxVolIDLen, and it is only possible if we store the namespace name as a plain string rather than its hex encoding. I'm also not sure whether changing the CSIIdentifier is a breaking change or not.

Is there a way we could determine the storageclass of a volume from its ID? Or maybe use csiConfigFile to somehow detect the namespace?

@simonpie
Author

@mehdy Are you planning to allow the namespace to be different for each volume? I would have thought that it should be part of the csi plugin configuration, fixed for all volumes in a storage class.

@mehdy
Contributor

mehdy commented Mar 25, 2020

@simonpie No, I want it to be a storageclass configuration, as you said. The problem is this:
A request to delete a volume has only the volume_id attribute, so I cannot determine the namespace from its storageclass.

@simonpie
Author

Understood, thank you for the clarification. Can it be added in labels? And used from there, of course.

@ShyamsundarR
Contributor

ShyamsundarR commented Apr 2, 2020

There are 3 ways to provide this configuration:

1. Via the StorageClass, as discussed above.

Issue: DeleteVolume and DeleteSnapshot need this encoded in some form in the volume_id. The volume_id also has length restrictions, so some form of numeric namespace-to-ID encoding would be preferred (as discussed in the comments above).

I would suggest we not go this way, as this is not a StorageClass requirement but rather a per-kubernetes (or ContainerOrchestrator (CO)) cluster requirement.

Considering this to be a per-CO-cluster requirement opens up 2 other options to configure the namespace:

2. Pass this in as a CLI option to the CSI plugins, say --namespace=$namespace, as suggested.

This restricts all Ceph clusters used by the CSI instance on the CO cluster to the same namespace value. That may be desirable, but it need not be made mandatory.

IOW, if this is a CLI argument and the CO cluster uses 2 external Ceph clusters to provide storage, then the namespace that this CSI instance uses across these 2 ceph clusters MUST be the same.

3. Pass this in by extending csi-config.yaml to add a per-clusterID namespace parameter.

The namespace can hence take different values across different clusterIDs, which in turn distinguish different Ceph clusters, or even different uses of the same Ceph cluster under distinct IDs.

The clusterID to use is already passed in via the StorageClass, so the StorageClass can selectively direct requests, via the ID, to the right namespace.

The clusterID is also encoded in the volume_id for use in DeleteVolume/DeleteSnapshot requests, allowing the plugin to look up and use the namespace for that clusterID.
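
To make option (3) concrete — purely as a hedged sketch, not the final design — the csi-config could carry one entry per clusterID with its own namespace, and a StorageClass would pick the entry via clusterID. The "namespace" key, clusterIDs, monitors, and StorageClass parameters below are illustrative assumptions; the actual field name was only settled in the implementing PR:

```yaml
# Sketch of a ceph-csi ConfigMap with a per-clusterID namespace
# (the "namespace" key here is an assumption for illustration).
apiVersion: v1
kind: ConfigMap
metadata:
  name: ceph-csi-config
data:
  config.json: |-
    [
      {
        "clusterID": "cluster-a",
        "namespace": "k8s-cluster-a",
        "monitors": ["10.0.0.1:6789", "10.0.0.2:6789"]
      },
      {
        "clusterID": "cluster-b",
        "namespace": "k8s-cluster-b",
        "monitors": ["10.0.0.1:6789", "10.0.0.2:6789"]
      }
    ]
---
# A StorageClass then selects the entry (and hence the namespace) by clusterID.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-cluster-a
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: cluster-a
  pool: kube
```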

I would suggest we proceed with (3) above to enable the feature and its use-cases.

NOTE: We possibly also need to think about instanceID as well, for the case when the same namespace is shared across CO clusters.

Thoughts? @dillaman @Madhu-1 @mmgaggle

@dillaman

dillaman commented Apr 2, 2020

For the use-case provided, I'd vote for option (3). However, I think there is an alternate use-case where k8s tenants map to RBD namespaces. That would help support isolation between tenants; different defaults could be configured, potentially different Ceph user caps, etc.

@mehdy
Contributor

mehdy commented Apr 4, 2020

I personally think that if it were somehow possible to do this via storageclasses, that would be the best implementation; it's a configuration very similar to pool, which is provided via the storageclass. But the 3rd option is also good enough, so I'd be happy with it.
Nudge me to proceed once you have decided on the solution.

@Madhu-1
Collaborator

Madhu-1 commented Apr 5, 2020

We need to understand what the exact requirement is here. Is it sharing the same ceph cluster within a single kubernetes cluster, or sharing it across different kubernetes clusters?

If we want to use different ceph namespaces within the same kubernetes cluster (one per kubernetes namespace), option 3 above will work, but you have to create different clusterIDs with the same monitor info.

We can keep a default ceph namespace name in the cluster info in the configmap, but provide an option to override it via the storageclass.

IMO, if we implement it via the storageclass, it would minimize the duplication of monitor information and also allow a single cluster configuration in the configmap.

What are the challenges if we want to implement this via the storageclass?

I think we only need to think about where to store this namespace name.

@ShyamsundarR are there any other challenges?

@ShyamsundarR
Contributor

> We need to understand what the exact requirement is here. Is it sharing the same ceph cluster within a single kubernetes cluster, or sharing it across different kubernetes clusters?

Would like to hear from folks following this issue. From the issue description I would state it is the latter, IOW sharing a ceph cluster/rbd-pool across different kubernetes clusters.

> If we want to use different ceph namespaces within the same kubernetes cluster (one per kubernetes namespace), option 3 above will work, but you have to create different clusterIDs with the same monitor info.

Agree, my thoughts as well.

> We can keep a default ceph namespace name in the cluster info in the configmap, but provide an option to override it via the storageclass.

StorageClass is a cluster-wide resource, so I am not sure its access/use can be restricted to certain namespaces; per-tenant StorageClasses, as of now, seem to be a no-go.

As long as the intention of mentioning this in the StorageClass is to make it evident or easier to configure, and not to restrict its use to certain tenants, thinking in this direction has its merits.

> IMO, if we implement it via the storageclass, it would minimize the duplication of monitor information and also allow a single cluster configuration in the configmap.

> What are the challenges if we want to implement this via the storageclass?

> I think we only need to think about where to store this namespace name.

> @ShyamsundarR are there any other challenges?

The one challenge discussed above is the need to encode/represent the namespace in the volume_id; other than this, I cannot think of other issues in this regard.

@madddi
Contributor

madddi commented Apr 11, 2020

Joining the discussion because I was working on #931 and ran into the same problem 🙂

Our requirement is using one Ceph cluster with multiple Kubernetes clusters.

I'd vote for option 3) for clarity. Yes, possibly duplicating monitor configurations is not optimal. But there are two reasons why this makes sense to me:

1. Regarding multitenancy:

> StorageClass is a cluster-wide resource, so I am not sure its access/use can be restricted to certain namespaces; per-tenant StorageClasses, as of now, seem to be a no-go.

Because of this, it would not really help to specify the namespace in the StorageClass. Tenant separation is not possible with multiple StorageClasses on Kubernetes.

2. Configuring the namespace in config.json is a clear way to tell the user that the whole provisioner will work in this namespace. Encoding such values in IDs or similar makes it harder, in my experience, for users to work directly with these resources. For example, if I wanted to migrate volumes from one pool to another or restore them from backup, I might be forced to do this by hand and create PersistentVolumes with a script. Encoding and decoding the namespace in the volume_id would then be another thing to figure out and implement to make the provisioner work afterwards.

@madddi
Contributor

madddi commented Apr 23, 2020

Hi, is there any way to find a decision here? This blocks #931.

We'd really like to deploy the provisioner on our clusters but need this feature first...

Cc @ShyamsundarR

@ShyamsundarR
Contributor

> Hi, is there any way to find a decision here? This blocks #931.

> We'd really like to deploy the provisioner on our clusters but need this feature first...

> Cc @ShyamsundarR

I think there is interest in the feature, and option 3 seems the best way forward to address the use-cases. I would say we have waited long enough for alternatives or preference voting, and hence should proceed with it.

@madddi do you want to pick this up as part of #931? I am tied up with other priorities this week (and a little into the next).

@simonpie
Author

Just adding my two cents.

My original requirement was to have multiple kubernetes clusters use one ceph cluster.

But I could definitely use having multiple users/tenants/namespaces in a single kubernetes cluster use ceph independently, that is, each kubernetes namespace having volumes in a different ceph namespace. I did not realize at the beginning that storage class usage could not be limited to a specific kubernetes namespace.

@madddi
Contributor

madddi commented Apr 24, 2020

@ShyamsundarR Thanks! I'll start working on #931 then.

Regarding the namespace topic: I wouldn't like to do this as part of #931, to keep the PR small. Additionally, I don't have much time to spare right now... Maybe someone else who already offered to start working on this could pick it up?

madddi added a commit to dg-i/ceph-csi that referenced this issue Apr 29, 2020
The name of the CephFS SubvolumeGroup for the CSI volumes was hardcoded to "csi". To make permission management in multi tenancy environments easier, this commit makes it possible to configure the CSI SubvolumeGroup.

related to ceph#798 and ceph#931
@mehdy
Contributor

mehdy commented May 2, 2020

Just to be explicit, I'll continue working on this issue using the third option (csi-config.yaml).

mergify bot pushed a commit that referenced this issue May 4, 2020
The name of the CephFS SubvolumeGroup for the CSI volumes was hardcoded to "csi". To make permission management in multi tenancy environments easier, this commit makes it possible to configure the CSI SubvolumeGroup.

related to #798 and #931
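
(Side note on the CephFS change referenced above: a hedged sketch of how a configurable SubvolumeGroup can sit in the same per-cluster config entry. The cephFS/subvolumeGroup key follows the ceph-csi documentation, but treat the exact spelling and values as assumptions here:)

```yaml
# Sketch: per-cluster CephFS subvolume group inside the ceph-csi config.json
# (key name assumed from ceph-csi documentation; IDs and monitors are examples).
data:
  config.json: |-
    [
      {
        "clusterID": "cluster-a",
        "monitors": ["10.0.0.1:6789"],
        "cephFS": {
          "subvolumeGroup": "tenant-a-group"
        }
      }
    ]
```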
mehdy added a commit to mehdy/ceph-csi that referenced this issue May 11, 2020
Make sure to operate within the namespace if any given when dealing
with rbd images and snapshots and volume and snapshot journals.

Re-run the entire e2e tests one more time using a namespace.

Closes: ceph#798

Signed-off-by: Mehdy Khoshnoody <[email protected]>
@clwluvw
Member

clwluvw commented Jun 23, 2020

@Madhu-1 Do you have any release date for this feature? It's really needed to prevent creating many pools!! :(

@clwluvw
Member

clwluvw commented Jun 24, 2020

ping @Madhu-1 @nixpanic

@Madhu-1
Collaborator

Madhu-1 commented Jun 24, 2020

If everything goes well, this will be part of the 3.0.0 release. There is already an issue created to track the release status.

@clwluvw
Member

clwluvw commented Jun 25, 2020

@Madhu-1 Can you please share more details about the things (issues) that need to be done before this one, and a due date for the 3.0.0 release?

@Madhu-1
Collaborator

Madhu-1 commented Jun 25, 2020

cc @humblec

@humblec
Collaborator

humblec commented Jun 29, 2020

> @Madhu-1 Can you please share more details about the things (issues) that need to be done before this one, and a due date for the 3.0.0 release?

@clwluvw this RFE or enhancement was not tracked for the 3.0.0 release, so the ideal thing here is to target it for the next minor release of 3.x (i.e. 3.1.0). We did an initial triage on getting this into the 3.0.0 release, as @Madhu-1 mentioned; however, the first impression was that this could shake the other work going on to add snapshot, restore, and clone features for cephfs and rbd. That's why the release call decided to consider this as the last item for 3.0.0, and if that didn't work out, to consider it for 3.1.0. However, we are still hopeful to get this in for 3.0.0.

@clwluvw
Member

clwluvw commented Jun 29, 2020

@humblec I thought it could be merged without impacting anything, like other cleanup PRs! :)

@simonpie
Author

What are the ETAs for versions 3.0.0 and 3.1? Or where could I find that information?

@humblec
Collaborator

humblec commented Jul 17, 2020

> What are the ETAs for versions 3.0.0 and 3.1? Or where could I find that information?

The 3.0 release issue is here: #865

@simonpie
Author

Hello,

Am I to understand from this:
https://github.com/ceph/ceph-csi/milestone/6

that ceph namespace support will not be in v3.1.0?

@pawanthegemini

The feature is supposed to provide multi-tenancy, which means multiple k8s clusters will be able to use a single ceph cluster. What about usage limits on a per-user basis? A user can create an rbd image of any size within its namespace, as there are no quotas or restrictions on either a per-user or per-namespace basis to keep each individual user in check.
Are there any plans to add quotas along with the namespace-based segregation that has already been available for the last few ceph-csi versions (for client kernels 4.18.x)?

@Madhu-1
Collaborator

Madhu-1 commented Aug 29, 2022

> The feature is supposed to provide multi-tenancy, which means multiple k8s clusters will be able to use a single ceph cluster. What about usage limits on a per-user basis? A user can create an rbd image of any size within its namespace, as there are no quotas or restrictions on either a per-user or per-namespace basis to keep each individual user in check. Are there any plans to add quotas along with the namespace-based segregation that has already been available for the last few ceph-csi versions (for client kernels 4.18.x)?

@pawanthegemini ceph needs to provide quota support at the rbd namespace level (which is outside of cephcsi). Please open an issue with ceph to request it.

@pawanthegemini

Sure, thanks for the update. Created a request with ceph.
