doc: add proposal for CephFS fscrypt integration #2912
Cc @jtlayton @kotreshhr: I would appreciate your views on this proposal to integrate CephFS encryption with CSI.
management systems
- `fscrypt` handles key derivation, storage of wrapped keys and metadata

The current CephFS subvolume root will remain untouched with the exception that
Are we referring to the SubVolumeGroup here as the subvolume root, or is it the filesystem?
Subvolume root refers to the root directory of the subvolume, i.e. what gets mapped into a pod once mounted.
On my test setup this would be:
$ bin/ceph fs subvolume info a csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89 csi
{
...
"path": "/volumes/csi/csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89/32a6c9c4-d83a-482b-8abf-e6b0ea676f3a",
...
}
$ sudo mount -t ceph -o name=admin,secret=.. 192.168.122.1:40687:/volumes/csi/csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89/32a6c9c4-d83a-482b-8abf-e6b0ea676f3a /subvolume_root
subdirectory. The root will contain a `/.fscrypt` directory managed by `fscrypt`.

`fscrypt` requires access to a mounted filesystem and therefore the encryption setup
must take place in the `NodeStageVolume` request handler instead of `CreateVolume`.
Don't we need to create a protector key at `CreateVolume` time and store it?
The protector keys are stored wrapped in the `/.fscrypt` directory on the volume. We therefore need filesystem access, which we don't have in `CreateVolume`, hence `NodeStageVolume`.

## Dependencies

The proposed change is tailored to CephFS and requires CephFS support
Looks like one of the "CephFS" strings (from "...and requires...") has to be replaced.
A lot of CephFS in that sentence :). It tries to express that this change is agnostic to CephFS, except for the fact that it is part of Ceph CSI's CephFS integration.
@irq0 first of all, this is a well formatted and well written design doc. Thanks for that. 👍 PS: the diagram is not rendering properly for me, so I may have missed the
82b1bc7
to
d7a9040
Added runtime and build dependencies to the Dependencies section
Odd. Maybe the mermaid live editor works better:
JFYI: there is a feature request on CephFS for a metadata capability similar to RBD:
https://tracker.ceph.com/issues/54472
Waiting for CephFS team input on this l.. hopefully we will have it soon! Thanks!
This all looks reasonable to me. I'm afraid I don't have enough intimate knowledge of Ceph CSI to know what the best method to use is.
- No `/.fscrypt` on the subvolume root

Drawbacks
- `fscryptctl` is a C tool and does not lend itself to be integrated into Ceph CSI
FWIW: `fscryptctl` is very simple and just calls a bunch of ioctls. You could probably drive those ioctls from Go as well.
Yes, it is not a big drawback. Using the policy functions from fscrypt would also be an option. (https://github.com/google/fscrypt/blob/master/metadata/policy.go)
@irq0 can you please fix the CI linter failure?
subvolumes.

Due to the way `fscrypt` stores metadata, subvolumes have a regular root
directory containing a `/.fscrypt` directory and a
FWIW, the `.fscrypt` directory is one of the main reasons I wanted the CSI team to think about this. This directory is not generally an issue with local filesystems like ext4, as admins almost always mount the root of the filesystem, and you can generally ensure that this directory is available to userland applications.
Contrast that with something like NFS or Ceph, where mounting a subdirectory of an export is very common. You could end up in a situation where you've run `fscrypt setup` on an upper-level directory but then someone mounts a subdirectory of it. The `.fscrypt` directory won't be available on the client at that point.
The placement of the `.fscrypt` directory is crucial if you intend to use the fscrypt binary under the hood. I wonder if it may even be better to do something like store the info that's currently in `.fscrypt` directly in RADOS objects instead, which would sort of sidestep the whole issue. That would mean a major overhaul for fscrypt, however (or a rewrite).
I'll split the answer into the `.fscrypt` directory and the RADOS object alternative.
I think we should limit the discussion to the Ceph CSI context.
We basically have two options:
1. Integrate Ceph CSI with the `fscrypt` tooling as is (this design doc)
2. Create another key management option that is specific to CephFS (Alternative [ceph-csi-ksm]; Fscrypt Metadata on RADOS below)

- The `.fscrypt` directory is necessary for option 1.
- Options 1 and 2 can coexist on the same filesystem, since they are user land key management systems that ultimately set a fscrypt policy in the kernel.

I'm in favor of option 1. Why? It would enable users to manually get a secret from K8s secrets or Vault, mount a CephFS somewhere and use the unmodified `fscrypt` tool with that secret to unlock the encrypted data. No Ceph specific tools required.
The .fscrypt metadata directory and Ceph CSI
To make things a bit more concrete, here is what a freshly created
PV/PVC looks like on an otherwise empty CephFS with an implementation
based on the doc:
# mount -t ceph -o name=admin,secret=$(bin/ceph auth get-key client.admin) \
$(bin/ceph mon dump --format json | jq -r '.mons[] | .addr' | sed -e 's/\/0//'):/ \
/mnt
# find /mnt
/mnt
/mnt/volumes
/mnt/volumes/_csi:csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd.meta
/mnt/volumes/csi
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/protectors
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/protectors/46ef26343bfaa137
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/policies
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/policies/451ca267026dd6704b85efdccef305f8
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/ceph-csi-encrypted
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/.meta
This would add a `.fscrypt` directory on each subvolume / PVC. To manually use `fscrypt` a user would have to mount
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad
Since Ceph CSI owns the subvolumes below CSI anyway, I don't think there is a drawback to this.
We could move the `.fscrypt` directory as high as the filesystem or the csi subvolume group. Basically having it next to the "meta" files.
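The per-subvolume layout from the listing above can be summarized in a few lines of Go. This is only a sketch: the type and function names are invented for illustration, and treating `ceph-csi-encrypted` as the encrypted subdirectory is an assumption based on the listing, not a confirmed detail of the implementation.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// volumeLayout lists the paths found below one subvolume root in the listing.
// Hypothetical helper type for illustration only.
type volumeLayout struct {
	Protectors string // wrapped protector metadata, one file per protector ID
	Policies   string // policy metadata, one file per policy ID
	Encrypted  string // assumed to be the directory carrying the fscrypt policy
}

// layoutFor derives the paths from the subvolume root, i.e. the "path"
// reported by `ceph fs subvolume info`.
func layoutFor(subvolumeRoot string) volumeLayout {
	meta := filepath.Join(subvolumeRoot, ".fscrypt")
	return volumeLayout{
		Protectors: filepath.Join(meta, "protectors"),
		Policies:   filepath.Join(meta, "policies"),
		Encrypted:  filepath.Join(subvolumeRoot, "ceph-csi-encrypted"),
	}
}

func main() {
	l := layoutFor("/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad")
	fmt.Println(l.Protectors)
	fmt.Println(l.Policies)
	fmt.Println(l.Encrypted)
}
```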
Fscrypt Metadata on RADOS
An implementation could be as straightforward as using two objects, one for protectors and one for policies, each with an object map mapping keyid to a fscrypt metadata blob. The entries would directly correspond to the files in /.fscrypt/{protectors,policies}/$keyid.
The problem would be access control in the spirit of `fscrypt`. `/.fscrypt` is basically set up like `/tmp` (sticky): created by root, but non-root users are allowed to add/remove their own protectors.
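As a rough sketch of that idea, the two objects and the sticky-directory rule could be modelled like this. It is an in-memory stand-in only: plain Go maps take the place of RADOS object maps, no Ceph API is involved, and all names as well as the "owner or root" rule are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is one fscrypt metadata blob plus the uid that created it.
type entry struct {
	ownerUID uint32
	blob     []byte
}

// objectMap stands in for the key/value map of a single RADOS object.
type objectMap map[string]entry

// metadataStore mirrors /.fscrypt/{protectors,policies}: one map per object.
type metadataStore struct {
	protectors objectMap
	policies   objectMap
}

var (
	errNotOwner = errors.New("only the owner or root may modify this entry")
	errNotFound = errors.New("no such key id")
)

func newMetadataStore() *metadataStore {
	return &metadataStore{protectors: objectMap{}, policies: objectMap{}}
}

// put adds a blob under keyID. Any user may add new entries; existing
// entries can only be replaced by their owner or by root (uid 0).
func put(m objectMap, uid uint32, keyID string, blob []byte) error {
	owner := uid
	if old, ok := m[keyID]; ok {
		if old.ownerUID != uid && uid != 0 {
			return errNotOwner
		}
		owner = old.ownerUID // keep the original owner on overwrite
	}
	m[keyID] = entry{ownerUID: owner, blob: append([]byte(nil), blob...)}
	return nil
}

// remove deletes keyID, enforcing the same sticky-style ownership rule.
func remove(m objectMap, uid uint32, keyID string) error {
	old, ok := m[keyID]
	if !ok {
		return errNotFound
	}
	if old.ownerUID != uid && uid != 0 {
		return errNotOwner
	}
	delete(m, keyID)
	return nil
}

func main() {
	s := newMetadataStore()
	_ = put(s.protectors, 1000, "46ef26343bfaa137", []byte("wrapped key blob"))
	fmt.Println(remove(s.protectors, 1001, "46ef26343bfaa137")) // another user: refused
	fmt.Println(remove(s.protectors, 1000, "46ef26343bfaa137")) // owner: allowed
}
```

The ownership check is the part that has no direct counterpart in RADOS, so a real implementation would need some layer that enforces it.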
The fscrypt design doc lists the following requirements for a metadata store:
Metadata Requirements
There are a few properties that we want our metadata storage to have.
- For a filesystem that is set up for encryption, a user can create an encrypted directory (and its associated metadata) without being root
- Any user with access to an encrypted directory (via standard UNIX permissions) and the correct credentials can unlock the directory, regardless of who set up the directory.
- A non-root user cannot delete another user’s metadata, which would make the files corresponding to that metadata unreadable (see above Threat Model).
- An encrypted directory can be protected with a Protector whose data is on another filesystem. This is necessary to protect a folder on a USB drive with a user’s login password, for example.
- We do NOT require that any user be able to set up encryption on a filesystem, as this involves making privileged changes to the system.
I'm not familiar enough with the Ceph FS POSIX user translation to
RADOS auth, but I suspect user access to the fscrypt metadata
objects won't be easy.
I agree, this is a major overhaul or a new Ceph FS specific key management solution.
The problem would be access control in the spirit of fscrypt. /.fscrypt is basically set up like /tmp (sticky). Created by root, but non-root users are allowed to add/remove their own protectors.
Good point. That would be tricky to replicate in bare RADOS with cephx creds. You'd probably have to layer some enforcement on top. Blech.
That said...one thing that the local fscrypt filesystem folks don't need to worry about is idmapping. I imagine consistent uid/gid mapping is a must for k8s nodes so I'll assume that's not a problem here.
We could move the .fscrypt directory as high as the filesystem or the csi subvolume group. Basically having it next to the "meta" files.
The key point here is that the `.fscrypt` dir has to be consistently reachable, so you need to consistently mount the same cephfs directory. If you're dealing with a pretty flat hierarchy where every tenant is mounting his own subvolume, you should be ok. If, on the other hand, you have a situation where some clients mount at a higher point in the directory tree then things get more iffy.
I imagine Ceph CSI keeps things fairly flat though, so you should be OK.
Having multiple fscrypt directories (one for each subvolume) seems like the best approach, IMO. Allowing access to the keys in there should be safe (since they are just part of the overall KDF), but it'd be better to deny access to it when tenants don't need it.
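The reachability point can be made concrete with a small check. The sketch below assumes that the `fscrypt` tool only looks for `.fscrypt` at the root of the mount it operates on; the function name and the result strings are invented for this example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const (
	usable       = "usable: .fscrypt sits at the mount root"
	mountedBelow = "mounted below the setup directory: .fscrypt is not visible on the client"
	mountedAbove = "mounted above or beside the setup directory: .fscrypt is not at the mount root"
	unrelated    = "paths cannot be compared"
)

// classifyMount compares the CephFS directory on which `fscrypt setup` was
// run with the CephFS directory a client actually mounts.
func classifyMount(setupDir, mountedDir string) string {
	rel, err := filepath.Rel(filepath.Clean(setupDir), filepath.Clean(mountedDir))
	switch {
	case err != nil:
		return unrelated
	case rel == ".":
		return usable
	case rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)):
		return mountedAbove
	default:
		return mountedBelow
	}
}

func main() {
	subvol := "/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad"
	fmt.Println(classifyMount(subvol, subvol))         // every tenant mounts its own subvolume
	fmt.Println(classifyMount("/volumes/csi", subvol)) // setup on the subvolume group, subvolume mounted
	fmt.Println(classifyMount(subvol, "/"))            // client mounts the filesystem root
}
```

With one `.fscrypt` directory per subvolume and Ceph CSI always mounting the subvolume root, only the first case should occur.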
@irq0 can you please revisit the comments from Jeff and the linter failures?
7e30023
to
16ae0c6
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
Thanks for joining today's standup call @irq0!
As discussed during the call, it would be a great start to have this function on RBD volumes with ext4. Once CephFS in the kernel supports this, it can be enabled for CephFS too. In the meantime, including and stabilizing fscrypt within Ceph-CSI can already get started.
Note that our e2e suite uses minikube, and hence the minikube-iso should have a kernel with fscrypt enabled (minikube kernel config).
Add proposal document covering key management integration of Ceph CSI and https://github.com/google/fscrypt Updates: ceph#1563 Signed-off-by: Marcel Lauhoff <[email protected]>
16ae0c6
to
7512279
Add proposal document covering key management integration
of Ceph CSI and https://github.com/google/fscrypt
Updates: #1563
Signed-off-by: Marcel Lauhoff [email protected]