[RFC 0185] Redistribute redistributable software #185

Open
wants to merge 10 commits into
base: master

Ekleog
Member

@Ekleog Ekleog commented Dec 15, 2024

Co-authored-by: Matt McHenry <[email protected]>
@7c6f434c
Member

I think a drawback for the plan as-is would be that some of the stuff is actually pretty large so space usage at Hydra/cache goes up.

A missing related piece of work here is to look at the reverse dependencies of unfree packages and to check whether building them is a good idea. For normal packages we have some «it's large but downloading is longer than the build» packages marked as hydraPlatforms=[]. For stuff like TPTP («minimally» unfree license, a build compiling a few executables, some gigabytes of passive data) the position on whether the storage-versus-build-time trade-off is worth it is currently «who cares, it's technically unfree».

@Ekleog Ekleog force-pushed the redistribute-redistributable branch from 87cb264 to 27241c3 Compare December 19, 2024 11:42
@Ekleog
Member Author

Ekleog commented Dec 19, 2024

Good points! I just added them to the RFC. Also I'll incidentally say that TPTP seems to already be marked as hydraPlatforms = [], but it's still worth reviewing the changes.

@7c6f434c
Member

I think that given the skew of unfree stuff towards large binaries, significantly increasing the evaluation time and somewhat increasing the storage growth are separate points to mention. (Probably we need to ask for feedback from the infrastructure team on all that at some point)

@ShamrockLee

ShamrockLee commented Dec 19, 2024

significantly increasing the evaluation time

@7c6f434c Why would binary-based packages significantly increase the evaluation time? Nixpkgs requires packages to pass strict evaluation, which means that downloading would never occur during evaluation.

I haven't experimented with it, but I guess that packages that require a long evaluation time typically fall into the categories below:

  • have a large number of requisites (direct and indirect dependencies)
  • have custom overriding applied to dependent package sets
  • read a lock file to produce a set of vendored packages
  • are made up of large auto-generated Nix expressions (usually produced by *2nix command-line tools)

None of the above is specific to unfree or binary-based packages.

Update: Some binary-based packages might be built with legacy versions of libraries, which would require custom overriding if such a version is uncommon in Nixpkgs. Still, this situation also occurs with large packages like TensorFlow, and small projects with few dependencies wouldn't take too long to evaluate.

@7c6f434c
Member

Evaluation time will increase because life is hard. Basically, even though glibc does not have non-free dependencies, its evaluations «as if it is allowed to have some non-free deps» and «as if it must be transitively free» have the same result, but are not the exact same evaluation. The same for every package in the closure of the large graphical ISOs (which are also evaluated in the non-free-redistributable allowed jobset)

@Ekleog
Member Author

Ekleog commented Dec 19, 2024

I think I already listed the two points you're mentioning? The evaluation time issue was listed in the settings from the start; and I just added the built size issue as an unresolved question, as it's currently unclear whether it's negligible or not

@Ekleog
Member Author

Ekleog commented Dec 19, 2024

Also you seem to be hinting at a doubling of the eval time but I don't think that'd be the case. Hydra would eval a pkgset that'd consist of essentially unfreeAllowedNixpkgs // filterAttrs drvsThatNeedToBeFoss noUnfreeNixpkgs. So we'd be adding eval only by the partial evaluation time of the closure of required-foss derivations; which should be very far from the full eval time, considering relatively few derivations would require no-unfree
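The composition described in this comment can be illustrated outside Nix. The following is a minimal Python sketch, assuming plain dictionaries stand in for the two package sets and that the right-hand side of `//` wins on conflicts, as it does in Nix. The package names and the `needs_foss` set are hypothetical and only serve the illustration.

```python
def compose_jobset(unfree_allowed, no_unfree, needs_foss):
    """Mimic `unfreeAllowed // filterAttrs needsFoss noUnfree` with dicts."""
    # Keep only the attributes that must be evaluated without unfree packages.
    foss_only = {name: drv for name, drv in no_unfree.items() if name in needs_foss}
    # Later entries override earlier ones, like the right operand of `//`.
    return {**unfree_allowed, **foss_only}


# Hypothetical package sets: values are just labels for which evaluation was used.
unfree_allowed = {
    "hello": "hello-unfree-eval",
    "iso": "iso-unfree-eval",
    "mongodb": "mongodb",
}
no_unfree = {"hello": "hello-free-eval", "iso": "iso-free-eval"}

jobset = compose_jobset(unfree_allowed, no_unfree, needs_foss={"iso"})
```

Only the attributes selected by the filter are taken from the no-unfree evaluation, which is why the extra evaluation cost is bounded by the closure of those attributes rather than by the whole package set.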

@vcunat
Member

vcunat commented Dec 20, 2024

We could do Hydra's eval simply as now but with unfree allowed – and do this check separately in a channel-blocking job (executed on a builder instead of centralized with the eval). We have similar checks already in the tarball job (pkgs/top-level/make-tarball.nix), even though we've reduced them recently.

CI might check this as well, but such regressions seem quite unlikely to me.

@7c6f434c
Member

Oh right, evaluating all the ISOs is not negligible, but indeed can be pushed to a build

@djahandarie

Thank you so much for working on this. Since MongoDB is the biggest offender which causes many people serious trouble day-to-day to build, perhaps we could also consider a phased rollout plan where that is the first thing to be included 😄

@7c6f434c
Member

And MongoDB indeed has a license which avoids most general concerns, in the sense that the source is available, and both arbitrary patches (as they are derivative, they are presumed same-license-as-MongoDB in Nixpkgs anyway) and running inside a network-isolated sandbox are permitted without restriction.

This is not true for all unfree-redistributable things…

@Ekleog
Member Author

Ekleog commented Dec 29, 2024

The way I understand (and mean) the current RFC text, all currently unfree redistributable packages would stay out of hydra until marked buildableOnHydra. We could then start with just marking SSPL as buildableOnHydra, but it will be a license/package-specific discussion.

Are there any remaining concerns on the current RFC, that I could address? :)

@7c6f434c
Member

@NixOS/infra-build just so that all of you see it…

@Mic92
Member

Mic92 commented Dec 29, 2024

@NixOS/infra-build just so that all of you see it…

No objections to this RFC

According to [this discussion](https://github.com/NixOS/nixpkgs/issues/83433), the current status quo dates back to the 20.03 release meeting.
More than four years have passed, and it is likely worth rekindling this discussion, especially now that we actually have a Steering Committee.

Recent exchanges have been happening in [this issue](https://github.com/NixOS/nixpkgs/issues/83884).
Member

For context, we also started building all the redistributable+unfree packages in the nix-community sister project.

See all the unfree-redis* jobsets here: https://hydra.nix-community.org/project/nixpkgs
It's only ~400 packages. The builds are available at https://nix-community.cachix.org/

The jobset is defined in nixpkgs to make upstreaming easier:
https://github.com/NixOS/nixpkgs/blob/master/pkgs/top-level/release-unfree-redistributable.nix

If this RFC passes it will be even better as users don't necessarily know about or want to trust a secondary cache.

Member Author

That's great to know, thank you! Though we may need to do a bit more to properly handle the "cannot be run on hydra" point that was raised above.

I can already see on the hydra link you sent that eval takes <1min, so should be a negligible addition to hydra's current eval times. Build times seem to take ~half a day. AFAIU there's a single machine running the jobs. If I read correctly, hydra currently has ~5 builders, and one trunk-combined build takes ~1 day. So it means that the build times would increase by at most ~10%, and probably less considering that there is probably duplication between what the nix-community hydra builds and what nixos' hydra is already building. I'm also not taking into account machine performance, which is probably stronger on nixos' hydra than nix-community's hydra.
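The upper bound in this estimate can be written out as a short calculation. This is a minimal sketch that uses only the approximate figures quoted in the comment (half a day on one machine, about 5 builders, about 1 day per trunk-combined build) and assumes all builders are equally fast and fully used; the function name is made up for the illustration.

```python
def added_build_fraction(extra_machine_days, builders, days_per_full_build):
    """Extra work as a fraction of the machine-days one full build already uses."""
    capacity_machine_days = builders * days_per_full_build
    return extra_machine_days / capacity_machine_days


# ~0.5 machine-days of unfree builds against ~5 builders busy for ~1 day.
fraction = added_build_fraction(
    extra_machine_days=0.5, builders=5, days_per_full_build=1.0
)
print(f"{fraction:.0%}")
```

Any overlap between the two jobsets, or faster builders, would only lower this fraction, which is why the comment treats it as an upper bound.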

I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always rollback.

There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra? I don't know how to get it on nixos' hydra either but it'd still help confirm there's zero risk.

Member

I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always rollback.

Yes, especially since the way the unfree-redis jobset is put together is by evaluating and filtering through all the nixpkgs derivations. So most likely the combined eval time is much smaller than the sum of both.

There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra?

The best I can think of is to build a script that takes all the successful store paths, pulls them from the cache, runs nix path-info -s on them and then sums up the value.
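The summing step of such a script could look like the sketch below. It assumes the output of `nix path-info -s` has one line per store path with the size in bytes as the last whitespace-separated field; the exact output format can differ between Nix versions, so this is an assumption to verify. The store paths in the sample are made up.

```python
def sum_nar_sizes(path_info_output):
    """Sum the trailing size column of `nix path-info -s`-style text output."""
    total = 0
    for line in path_info_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        total += int(fields[-1])  # assumed: size in bytes is the last field
    return total


# Hypothetical sample output; real store paths and sizes will differ.
sample = """\
/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-example-1.0\t1024
/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-example-lib-2.3\t2048
"""

total_bytes = sum_nar_sizes(sample)
```

Fetching the paths from the cache and invoking the command is left out on purpose, since that part depends on the local Nix setup.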

Member Author

Thank you for your answer! I actually more or less found the answer from Hydra's UI. Here is my script:

curl https://hydra.nix-community.org/jobset/nixpkgs/cuda/channel/latest > hydra-jobs
cat hydra-jobs | grep '<td><a href="https://hydra.nix-community.org/build/' | cut -d '"' -f 2 > job-urls
for u in $(cat job-urls); curl "$u" 2>/dev/null | grep -A 1 'Output size' | tail -n 1 | cut -d '>' -f 2 >> job-sizes; wc -l < job-sizes | head -c -1; echo -n " / "; wc -l < job-urls; end
awk '{sum += $1} END {print sum}' job-sizes
# NVidia kernel packages take ~1.3GiB each and there are 334-164 = 170
# Total: 215G, so 45G without NVidia kernel packages

I got the following results:

  • For unfree-redist-full, a total of 215G, including 200G for NVidia kernel packages and 15G for the rest of the software
  • For cuda, a total of 482G

Unfortunately I cannot run the same test on NixOS' hydra, considering that it disabled the channels API.

I just updated the RFC with these numbers, it might make sense to not build all of cuda on hydra at first, considering the literally hundreds of duplicated above-1G derivations :)

Member

So with the current Hydra workflows I'd estimate that very roughly as uploading 2 TB per month to S3. (we rebuild stuff) Except that we upload compressed NARs, so it would be less.
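One way to read this estimate is sketched below. It assumes the 2 TB figure is built from the uncompressed 215G and 482G totals measured earlier in this thread and takes 1 TB as 1000 G. The rebuild count is inferred from those numbers here; it is not stated in the comment.

```python
# Uncompressed output sizes reported earlier in the thread.
UNFREE_REDIST_FULL_G = 215
CUDA_G = 482

per_full_build_g = UNFREE_REDIST_FULL_G + CUDA_G

# Assumption: 2 TB per month, with 1 TB taken as 1000 G.
monthly_upload_g = 2000

# How many complete rebuilds per month the estimate would correspond to.
implied_full_rebuilds_per_month = monthly_upload_g / per_full_build_g
print(per_full_build_g, round(implied_full_rebuilds_per_month, 1))
```

Because the cache stores compressed NARs, the actual upload volume would be below this figure, as the comment notes.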

Member Author

Do I understand correctly, that it'd be reasonable to do the following?

  1. Just push everything, and
  2. if compression is not good enough rollback CUDA & NVidia kernels; and
  3. even if we need to rollback, the added <1T would not be an issue to keep "forever"

Member

I don't know. To me it doesn't even feel like a technical question. (3. is WIP so far, I think. There's no removal from cache.nixos.org yet.)

@ryantrinkle

@Lassulus IMO this looks fine. We could potentially define a process for someone who owns IP and wants their IP excluded from Hydra to submit a request to have it removed; could be as simple as adding documentation somewhere that says "email us at ".

@vcunat
Member

vcunat commented Mar 20, 2025

I think we even had a case where a package with a free license was requested to be removed from nixpkgs by upstream.

@7c6f434c
Member

I think we even had a case where a package with a free license was requested to be removed from nixpkgs by upstream.

Correct, and some of such requests were denied.

@infinisil
Member

Alright looks like we'll have a full shepherd team for the next RFCSC iteration. To speed things up, I'm hereby proposing to accept the RFC as-is without any further discussion and to immediately initiate FCP. To the other (confirmed and unconfirmed) shepherds @roberth @Mic92 and @Lassulus: Please 👍 this comment if you agree with this.


# Examples and Interactions
[examples-and-interactions]: #examples-and-interactions

Member

Just want to make sure some particular expectations are managed: While we're working on staging-next, we're looking at Hydra to identify regressions and fix them before staging-next is merged to master. With this change, there will be new Hydra jobs for non-free packages. The license terms of those packages could make it difficult or outright prevent us doing things to fix them, or even to try to reproduce locally, so it's not going to be possible in the general case to give these packages the same level of protection from regressions as we try to give free packages. So it should be understood that even though these packages are now built by Hydra and available in the binary cache, they shouldn't be expected to be any less likely to be broken by the staging process (or other PRs) than they currently are.


Member

Hopefully (and by intention) runnableOnHydra would exclude licenses that put obligations on Hydra just from running the tests, and thus mere reproduction would not bring onerous obligations; but indeed actually identifying the issue often boils down to reverse engineering a part of the program.

@emilazy
Member

emilazy commented Mar 20, 2025

I meant to leave a detailed reply to this RFC a long time ago but unfortunately haven’t had the time. Here’s my best effort of the most important points.

For full disclosure, I’d probably vote against building or distributing any kind of non‐free, non‐firmware software if we were doing an open vote, even with an ideal implementation. I think that we shouldn’t spend our limited contributor time, or our limited compute and storage resources, on packages we will in many cases be unable to properly fix or guarantee will keep working on NixOS, and I want to avoid a slippery slope towards removing the default prohibition of evaluating non‐free packages.

However, that is essentially an ideological argument, and while I don’t know what the ultimate consensus would be if we spent years exhaustively involving everyone, there is clearly strong desire from many people for this. For the rest of this comment I’ll accept in principle the idea of building and distributing genuinely redistributable non‐free software and only address the challenges and limitations we’ll face in doing so.

I appreciate that care has gone into considering our current somewhat vague definition of unfreeRedistributable and ensuring that we carefully consider what Hydra can actually safely deal with. Without that change, I’d be strongly against this proposal. With that modification, this proposal seems much safer to me. However, it implies a lot of limitations that don’t seem to have been discussed much so far, and there are still plenty of remaining complications:

  • There is a tension between ensuring that the things we accept compiling/testing/distributing on our infrastructure are actually things we are fully allowed to do so, and having the most important non‐free packages that users tend to want.

    The most important example is CUDA – despite being listed in the RFC, we could not legally build and redistribute the CUDA SDK. The EULA very explicitly says “For clarity, you may not distribute or sublicense the SDK as a stand-alone product” and that to redistribute it “Your application must have material additional functionality, beyond the included portions of the SDK”, “The distributable portions of the SDK shall only be accessed by your application”, and “Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only”, among other strict conditions. Regardless of the amount of cache it’d take or how long Hydra would spend building it, we simply cannot distribute CUDA to users without ignoring the licence or negotiating with NVIDIA. This is probably the number one pain point that currently has users setting up their own build resources or using third‐party caches that accept greater legal risk, but they would still have to do so under this proposal.

    (It also requires us to inform NVIDIA of any non‐compliant use or distribution of the SDK we become aware of. It’s unclear if this means the Foundation would be legally obligated to tell NVIDIA if anyone representing it hears anything about a NixOS user doing something with CUDA that’s against the EULA, but it seems very unfortunate if so. Certainly it’d require that the Foundation immediately report itself for redistributing the CUDA SDK, though!)

    Another example in the RFC is TeamSpeak; the TeamSpeak licence is way too long so I only skimmed it, but it does not seem to permit redistribution of any kind.

    More borderline cases include things like UnRAR; per the license, we would have to audit downstream packages to ensure that they don’t use it to recreate the RAR compression algorithm, and there is a somewhat onerous licence termination clause that will apply if we fail to. I’m not sure that’s something we’d want to sign up for.

    The SSPL, as used by MongoDB, is also tricky. It says:

    If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Program or modified version.

    “Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.

    Consider NixOS tests. A user can modify the NixOS test for MongoDB and submit a pull request. ofborg would then automatically run that test (as would Hydra, after merge). Does that mean we’re “mak[ing] the functionality of the Program or a modified version available to third parties as a service”? If so, we would have to ensure that we offer a direct download of MongoDB including any patches we apply, and do not use any software on our infrastructure that we do not supply source of. I believe we use proprietary management software for some of our Mac infrastructure, for instance, which would be problematic under this clause. The SSPL is designed to be effectively impossible to comply with if you offer a SaaS.

    The Elastic License could be complicated for the same reason – “You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software”.

    Admittedly, ofborg and Hydra would be a very weird and terrible way to offer software as a service, but I’d still want to have a lawyer check that it definitely doesn’t count! If they wouldn’t sign off on that, then we’d be without both CUDA and MongoDB, probably the two non‐free packages users are most eager to see cached. At that point I really wonder whether it’s worth the effort and expense implementing this RFC and ensuring ongoing compliance will take.

    I believe we could, however, distribute and run DragonflyDB, SurrealDB, Nomad, and Terraform; we probably don’t do anything that could qualify as providing a competitive paid SaaS offering and so would be allowed our “production” use under the Business Source License.

  • The vast majority of our non‐free binary packages need to use patchelf to make those binaries work with our /nix/store paths. Most of this software is distributed under licences that only permit unmodified redistribution. We would be distributing patched binaries. I am not a lawyer – it may be that it is perfectly acceptable to modify the binaries for interoperability with our packaging system, and distribute the result; it may be that it does not count as true “modification” at all, or that our goal of interoperability exempts us from the restriction. However, I do not think that this is obviously the case, and it may depend on jurisdiction. Personally, I would want a lawyer with strong experience in software licensing law in the jurisdictions of our project infrastructure and the Foundation to sign off on this before we accept the risk.

  • We have a systemic licence compliance problem, even for FOSS. We currently fail to include licence text and copyright notices in almost all our built packages, despite being legally obligated to do so, as they are derivative works of the source code. We also don’t have downloads for our modified versions of packages that we package, despite that being required by copyleft licences.

    While I have wanted to start tackling this problem, it will be a big task to fix this and establish comprehensive monitoring for it, and I think is unlikely to be solved without funding directed at the problem.

    So far this hasn’t been a problem, because FOSS is built on goodwill and we do make an attempt to track licensing information in our package definitions; nobody has come after us because FOSS projects generally want to be redistributed and appreciate being packaged in Linux distributions.

    With non‐free software, that goodwill will be much less reliable. We’ll be dealing with software often from well-resourced corporations, who have explicitly chosen not to offer their software under lenient terms but instead impose substantial restrictions, and rather than dealing with a relatively small number of common, well‐understood FOSS licences with simple requirements, we’ll often be dealing with bespoke pages of legalese designed to be complicated to comply with.

    Given that we are currently terrible at even including licence texts in packages, I really doubt our unfreeRedistributable packages do any better at this on average. Since we’d have an explicit flag under this RFC, that won’t be an immediate problem, but we will really want to ensure that we have adequate review of licence compliance for any new package setting it. (In particular though, the flag should probably be on individual packages rather than licences for this reason. That still wouldn’t handle cases where we have to audit downstream users for compliance, though.)

  • Although ensuring that the installer ISOs do not depend on non‐free packages is a good start, it will not be sufficient to ensure that non‐free dependencies do not leak into the build closure of other free packages that ought not to require them. Since Hydra will be evaluating with non‐free packages enabled, and many contributors also allow them, this will only surface when a user tries to evaluate such packages with the default settings. That feels like somewhat painful UX to me and unfortunate for software freedom. We should probably have a system to flag up such “closure freeness regressions”.

  • Removing all versions of things from the cache is pretty painful, especially if there might be downstream packages including the code from static linking. That’s already an existing problem we have to deal with when licences are incorrectly marked, but I’m just noting that if we have to do it more regularly we’ll probably have to get better at it.
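The “closure freeness regressions” check suggested in the list above could be prototyped roughly as follows. This is a minimal sketch over a hypothetical dependency graph given as a plain dictionary; it is not how Nixpkgs or Hydra currently represent closures, and all package names are invented.

```python
def closure(pkg, deps):
    """Return every transitive dependency of `pkg` (excluding `pkg` itself)."""
    seen = set()
    stack = [pkg]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def freeness_regressions(deps, unfree):
    """Map each free package to the unfree packages found in its closure."""
    report = {}
    for pkg in deps:
        if pkg in unfree:
            continue  # unfree packages are allowed to depend on unfree ones
        leaked = closure(pkg, deps) & unfree
        if leaked:
            report[pkg] = sorted(leaked)
    return report


# Hypothetical dependency graph: package -> direct dependencies.
deps = {
    "free-app": ["free-lib"],
    "free-lib": [],
    "leaky-app": ["free-lib", "wrapper"],
    "wrapper": ["unfree-blob"],
    "unfree-blob": [],
}

report = freeness_regressions(deps, unfree={"unfree-blob"})
```

A report like this could be compared between two revisions so that only newly affected free packages are flagged, rather than every package that already had an unfree dependency.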

@7c6f434c
Member

Looking at the unofficial Nixpkgs CUDA cache threads, it is not clear to me what the current theory is to justify redistribution of modified (patchelf) CUDA binaries.

@ruro

ruro commented Mar 20, 2025

IANAL. This is not legal advice.

My understanding is that the sections quoted by @emilazy such as "you may not distribute or sublicense the SDK as a stand-alone product" refer specifically to the License Agreement for "derivative works" and for the case of redistributing CUDA "as incorporated in object code format into a software application" (in other words - for applications / libraries that link against CUDA).

Later on the same page, there is a "CUDA Toolkit Supplement to Software License Agreement" section (all emphasis mine):

2. CUDA Toolkit Supplement to Software License Agreement for NVIDIA Software Development Kits

The terms in this supplement govern your use of the NVIDIA CUDA Toolkit SDK under the terms of your license agreement (“Agreement”) as modified by this supplement. Capitalized terms used but not defined below have the meaning assigned to them in the Agreement.

This supplement is an exhibit to the Agreement and is incorporated as an integral part of the Agreement. In the event of conflict between the terms in this supplement and the terms in the Agreement, the terms in this supplement govern.

and then, a bit later

2.3. Operating Systems

Those portions of the SDK designed exclusively for use on the Linux or FreeBSD operating systems, or other operating systems derived from the source code to these operating systems, may be copied and redistributed for use in accordance with this Agreement, provided that the object code files are not modified in any way (except for unzipping of compressed files).

This seems to directly conflict with the earlier "you may not distribute or sublicense the SDK as a stand-alone product" clause. So my best guess is that Section 1 of the EULA is intended to describe the licensing for "end users" of the SDK and Section 2 is intended for "system integrators" (Linux distributions, package indices, etc).

It's possible that patchelfed binaries are "modified" and so this clause doesn't apply. But on the other hand, this clause says that you can't modify object code files specifically. You could argue that ELF itself is just an archive format, so patchelf is effectively just changing the RPATH metadata in this "archive" (without touching the actual "object code" that is stored inside sections like .text).

Either way, I think that NVIDIA's intention behind section 2.3 was to allow exactly what we want to do. So even if the current License doesn't allow patchelf, I think that it should be relatively easy to get NVIDIA to modify the wording of this clause a bit or to provide us with an exemption (the License itself even suggests to contact them at [email protected]). Also, IIRC, conda-forge and some other package indices are currently already doing similar "repackaging/redistribution" without getting sued.

@ruro

ruro commented Mar 20, 2025

Also, even if distributing patchelfed CUDA itself turns out not to be viable, it still might be possible to build and distribute applications that link against the patchelfed CUDA. That way users only have to download+patch CUDA itself locally and can then substitute all of the free CUDA-dependent libraries / applications from the cache. (To be clear, this would probably require implementing extra functionality in Hydra that would allow building specific derivations without pushing them to the cache).

@vcunat
Member

vcunat commented Mar 20, 2025

I think our current binary cache needs to have complete closures, i.e. you can leave out some (transitive) runtime dependencies. Oh and I think the build farm isn't yet able to build stuff without pushing it into the cache.

@emilazy
Member

emilazy commented Mar 20, 2025

Later on the same page, there is a "CUDA Toolkit Supplement to Software License Agreement" section (all emphasis mine):

Thanks! I apologize for missing that. (A good example of how difficult it can be to interpret licence text…)

However, I wonder if it is really sufficient, even if we ignore the problem of modification (which seems risky to me, potentially riskier than your average basic patchelf: we have a complex CUDA packaging that does a lot of extensive work to get things working properly under Nix). If you follow the link from “The portions of the SDK that are distributable under the Agreement are listed in Attachment A.”, it lists a bunch of libraries that can be distributed with applications (“with applications developed by you”). But it does not list, for instance, required developer tools like nvcc that are required to compile CUDA programs. It seems to me that that would still fall under “Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only”.

Perhaps, as you said, 2.3 is what grants the permission for tools like nvcc? In that case, we’d need to ensure that we filter out anything not designed for Linux or FreeBSD directly in the FOD (including anything cross‐platform, I suppose, given the use of “exclusively”?). Perhaps we can say that of course everything in the toolkit download for Linux is “designed exclusively for use on the Linux or FreeBSD operating systems” and so that’s fine. But at that point I really do worry about patchelf; the fact that “provided that the object code files are not modified in any way (except for unzipping of compressed files)” singles out unzipping as the sole allowed modification makes it really seem to me that the agreement does not intend to allow the degree of modifications our packaging necessarily must perform. We would be relying on an indication that nothing our CUDA packaging does counts as “modification” in any jurisdiction relevant to the project, and that therefore the agreement’s attempt to forbid it is void.

I find “ELF is not an object code format” a legal theory that is incredibly unlikely to hold up, but who knows, I’m not a lawyer :)

I agree that we could reach out to NVIDIA for a special exemption. I also agree that other projects have been relatively cowboy about this and have not yet been bitten by it. Still, if we’re going to deliberately violate EULAs because we think we can get away with it, I’d rather we do it with eyes open. (And I would personally be rather unhappy about it.)

Also, even if distributing patchelfed CUDA itself turns out not to be viable, it still might be possible to build and distribute applications that link against the patchelfed CUDA. That way users only have to download+patch CUDA itself locally and can then substitute all of the free CUDA-dependent libraries / applications from the cache. (To be clear, this would probably require implementing extra functionality in Hydra that would allow building specific derivations without pushing them to the cache.)

Even ignoring Hydra/cache limitations, I don’t think this is true. The EULA explicitly grants only limited rights to modify the SDK, regardless of its redistribution requirements (“Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK”). To build packages using CUDA, Hydra would have to modify the CUDA SDK in a manner that (by the premise of the suggestion) is prohibited. Even apart from that, we’d also either have to build CUDA separately on every builder that builds a CUDA‐using package, or establish that transferring that modified version around the project infrastructure doesn’t count as “redistribution” (this would be a pretty controversial theory, I think) and set up an entirely parallel secret CUDA cache.

I think our current binary cache needs to have complete closures, i.e. you can leave out some (transitive) runtime dependencies.

*cannot, right?

@ruro

ruro commented Mar 20, 2025

In that case, we’d need to ensure that we filter out anything not designed for Linux or FreeBSD directly in the FOD (including anything cross‐platform, I suppose, given the use of “exclusively”?). Perhaps we can say that of course everything in the toolkit download for Linux is “designed exclusively for use on the Linux or FreeBSD operating systems” and so that’s fine. But at that point I really do worry about patchelf; the fact that “provided that the object code files are not modified in any way (except for unzipping of compressed files)” singles out unzipping as the sole allowed modification makes it really seem to me that the agreement does not intend to allow the degree of modifications our packaging necessarily must perform. We would be relying on an indication that nothing our CUDA packaging does counts as “modification” in any jurisdiction relevant to the project, and that therefore the agreement’s attempt to forbid it is void.

I think I wasn't quite clear in my previous comment. I am not suggesting that we should just YOLO the NVIDIA license and hope that we don't get sued (although some other projects apparently have done so).

My argument wasn't based on a legal interpretation of the text of the license, but on the (inferred) intent behind this text. It seems to me that the second part of this EULA was intended to allow downstream Linux distributions / package indices / etc. to redistribute CUDA. So even if the Foundation lawyers decide that the current wording of the EULA doesn't allow for redistribution, it should probably be possible to get NVIDIA to adjust the wording.


I think our current binary cache needs to have complete closures, i.e. you can leave out some (transitive) runtime dependencies.

Going to assume you meant "can't leave out".

Aren't binary caches just glorified file shares? I didn't test this, but I would imagine that nix should be able to safely handle the case when somebody just deleted one of the paths in the remote binary cache without also deleting its referrers.

Oh and I think the build farm isn't yet able to build stuff without pushing it into the cache.

Yes, unfortunately that would probably require implementing extra functionality in Hydra.

@ruro
Copy link

ruro commented Mar 20, 2025

The EULA explicitly grants only limited rights to modify the SDK, regardless of its redistribution requirements (“Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK”). To build packages using CUDA, Hydra would have to modify the CUDA SDK in a manner that (by the premise of the suggestion) is prohibited.

Hmmm... I was under the impression that you can do whatever you want with CUDA code/binaries as long as you don't distribute the results. But now that you mention it, you're probably right.

But if modifying CUDA in such ways is illegal (even without redistribution), wouldn't that mean that Nixpkgs/NixOS users that set cudaSupport = true and build the derivations locally are also currently violating the EULA?

@emilazy
Member

emilazy commented Mar 20, 2025

Hmmm... I was under the impression that you can do whatever you want with any code/binaries as long as you don't distribute the results.

It may be fair use, and perhaps some jurisdictions even allow it in general? I think there are two complications in this case: one, that “copying” tends to be interpreted quite liberally in the context of computers, but two, that more importantly, in this case, the NixOS Foundation would be explicitly agreeing to an EULA contract that forbids them from doing so.

But if modifying CUDA in such ways is illegal (even without redistribution), wouldn't that mean that Nixpkgs/NixOS users that set cudaSupport = true and build the derivations locally are also currently violating the EULA?

Yeah, possibly. And I think that it’s not totally without legal risk to distribute scripts that let people do that. But it’s at least much less risky. The Foundation itself never agreed to the EULA and isn’t distributing anything directly covered by it.

@Ekleog
Member Author

Ekleog commented Mar 21, 2025

Just to be sure I didn't miss anything: outside of the discussion of whether specific licenses can actually be redistributable or not (which we should have again if the RFC lands, in one separate thread per such license/package), is there any change I should make to this RFC?

The only thing I did find against the RFC as-is is "it's too much effort if no package ends up benefitting from it"; but it's actually just adding one flag so we've probably already spent more effort here than implementing the RFC will ever be.

This being said, I did skip over the license-specific discussion (because I'm not focusing on any specific project yet and we will have to discuss this again anyway), so I may have missed another point within these parts.

@emilazy
Member

emilazy commented Mar 21, 2025

At a minimum, I think the RFC should not list examples of packages that would be offered if our ability to safely legally build and redistribute them is questionable. I would remove at least CUDA and TeamSpeak, and likely MongoDB and UnRAR as well.

I do not think that implementation is just a matter of adding one flag, because I think the compliance burden to mitigate increased legal risk will be meaningful and require resources from the Foundation. I think that the RFC listing almost no drawbacks and describing the change as “basically risk-free” is a misleading picture in light of the concerns I’ve raised and the fact that discussion of the legality of distributing even some of the listed examples has ended up with “maybe the Foundation could negotiate with NVIDIA for licence changes”. That compliance and review burden trades off against the software that would become more conveniently available to users; if the most desired non‐free software still wouldn’t be viable for us to distribute, then the cost‐benefit becomes a lot worse. So I do think that this is relevant to the question of whether the RFC should be accepted.

I do strongly think a per‐package rather than per‐licence flag would be better, given how common complicated use restrictions are for non‐free software; unlike with FOSS, two non‐free packages under the same licence are much less likely to be able to be treated interchangeably. Especially for licences like the Business Source License where the compliance picture looks significantly different based on the Additional Use Grant – though arguably we should be encoding all of those as separate variants to begin with.

Beyond that, I do personally disagree with the aim of the RFC, as described in the second paragraph of #185 (comment). But I accept that it’s unlikely that we will be able to come to a consensus on the ideological matters here in the RFC discussion format, so I have focused on the problems and limitations of implementing the core idea. However, I am still concerned about the second‐to‐last bullet point of #185 (comment). I believe that non‐free software is likely to unnecessarily leak into the closure of free packages even if the installers are treated specially.

@Ekleog
Member Author

Ekleog commented Mar 21, 2025

At a minimum, I think the RFC should not list examples of packages that would be offered if our ability to safely legally build and redistribute them is questionable.

The RFC explicitly mentions multiple times that the examples are illustrative only. Literally all closed-source packages are going to be questionable until we spend all the effort to validate what we can and should do; and there's no value in having this discussion until this RFC lands.

So I disagree about this: there could be no example otherwise, and I think the RFC is already explicit enough that the examples are in no way guaranteed to ever be approved.

I do not think that implementation is just a matter of adding one flag, because I think the compliance burden to mitigate increased legal risk will be meaningful and require resources from the Foundation

The risk will come when we change packages to actually start using this flag, hence my describing this RFC as risk-free. This being said, I'll add a paragraph in the drawbacks hopefully this evening, to say that each new approved use of this flag will carry risk and should be reviewed properly.

So the drawback will mention an increase in legal review requests to the foundation.

Especially for licences like the Business Source License where the compliance picture looks significantly different based on the Additional Use Grant – though arguably we should be encoding all of those as separate variants to begin with.

This is exactly how I'm seeing us moving forward when we start to think of using the new flags in licenses: either the unfree license is definitely fine and we can whitelist the license, or we'll need to define a new license type for each package we want to start building on hydra.

Which also semantically makes more sense to me than just dumping everything in the "unfree" dumpster, as there's even more differences between proprietary licenses than between FOSS licenses.

I believe that non‐free software is likely to unnecessarily leak into the closure of free packages even if the installers are treated specially.

That's a good point, thank you!

I'm trying to think, and it'd likely make sense to me, to run an eval job that verifies our licenses are semantically valid: eg. no MIT package depends on a GPL package without the codegen exception, etc.

This being said, such an effort is likely beyond the scope of this RFC. When I get to it this evening, I'll write down that as a drawback, and add the eval job as future work!

@Mic92
Member

Mic92 commented Mar 21, 2025

I meant to leave a detailed reply to this RFC a long time ago but unfortunately haven’t had the time. Here’s my best effort of the most important points.

For full disclosure, I’d probably vote against building or distributing any kind of non‐free, non‐firmware software if we were doing an open vote, even with an ideal implementation. I think that we shouldn’t spend our limited contributor time, or our limited compute and storage resources, on packages we will in many cases be unable to properly fix or guarantee will keep working on NixOS, and I want to avoid a slippery slope towards removing the default prohibition of evaluating non‐free packages.

I am not convinced by this argument. We are also supporting macOS, which requires a lot more proprietary components (macOS SDK) and is harder to test. It by the way also doesn't conveniently ship a meta.license attribute in nixpkgs because it would be "unfree". This just enables a few more packages that can be tested on otherwise free operating systems.

However, that is essentially an ideological argument, and while I don’t know what the ultimate consensus would be if we spent years exhaustively involving everyone, there is clearly strong desire from many people for this. For the rest of this comment I’ll accept in principle the idea of building and distributing genuinely redistributable non‐free software and only address the challenges and limitations we’ll face in doing so.

I appreciate that care has gone into considering our current somewhat vague definition of unfreeRedistributable and ensuring that we carefully consider what Hydra can actually safely deal with. Without that change, I’d be strongly against this proposal. With that modification, this proposal seems much safer to me. However, it implies a lot of limitations that don’t seem to have been discussed much so far, and there are still plenty of remaining complications:

* There is a tension between ensuring that the things we accept compiling/testing/distributing on our infrastructure are actually things we are fully allowed to do so, and having the most important non‐free packages that users tend to want.
  **The most important example is CUDA – despite being listed in the RFC, we could not legally build and redistribute the CUDA SDK.** [The EULA](https://docs.nvidia.com/cuda/eula/index.html) very explicitly says “For clarity, you may not distribute or sublicense the SDK as a stand-alone product” and that to redistribute it “Your application must have material additional functionality, beyond the included portions of the SDK”, “The distributable portions of the SDK shall only be accessed by your application”, and “Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only”, among other strict conditions. Regardless of the amount of cache it’d take or how long Hydra would spend building it, we simply cannot distribute CUDA to users without ignoring the licence or negotiating with NVIDIA. This is probably the number one pain point that currently has users setting up their own build resources or using third‐party caches that accept greater legal risk, but they would still have to do so under this proposal.
  (It also requires us to inform NVIDIA of any non‐compliant use or distribution of the SDK we become aware of. It’s unclear if this means the Foundation would be legally obligated to tell NVIDIA if anyone representing it hears anything about a NixOS user doing something with CUDA that’s against the EULA, but it seems very unfortunate if so. Certainly it’d require that the Foundation immediately report itself for redistributing the CUDA SDK, though!)

The "Modification" needs to be interpreted in the legal sense. This means: are we creating a derivative work of CUDA? Patchelf is not creating a derived work; it's an automated patching process that is not even specific to CUDA, applied to parts of the program that are not even the copyright-protectable part of CUDA. ELF headers existed way before CUDA, and CUDA did not invent anything new here. If you interpreted this in the technical sense, you wouldn't be allowed to load the binary into memory, because that creates a modified copy of the original ELF file (i.e. patching up references in memory).

  Another example in the RFC is TeamSpeak; the TeamSpeak licence is _way too long_ so I only skimmed it, but it does not seem to permit redistribution of any kind.
  More borderline cases include things like UnRAR; per [the license](https://fedoraproject.org/wiki/Licensing:Unrar), we would have to audit downstream packages to ensure that they don’t use it to recreate the RAR compression algorithm, and there is a somewhat onerous licence termination clause that will apply if we fail to. I’m not sure that’s something we’d want to sign up for.

The license you linked says you cannot use the UnRAR code to re-create the RAR compression algorithm. This is different from what you said.

  The SSPL, as used by MongoDB, is also tricky. It says:
  > If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Program or modified version.
  > “Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.
  
  
  Consider NixOS tests. A user can modify the NixOS test for MongoDB and submit a pull request. ofborg would then automatically run that test (as would Hydra, after merge). Does that mean we’re “mak[ing] the functionality of the Program or a modified version available to third parties as a service”? If so, we would have to ensure that we offer a direct download of MongoDB including any patches we apply, and do not use _any_ software on our infrastructure that we do not supply source of. I believe we use proprietary management software for some of our Mac infrastructure, for instance, which would be problematic under this clause. The SSPL is designed to be effectively impossible to comply with if you offer a SaaS.

We publish the NixOS tests over the internet; this is enough for a user to reproduce the source code. And no, a NixOS test is not a service for third parties: it runs in a Nix build and cannot be accessed by other parties this way.

  The Elastic License could be complicated for the same reason – “You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software”.

NixOS is not doing this, so why would this be a problem?

  Admittedly, ofborg and Hydra would be a very weird and terrible way to offer software as a service, but I’d still want to have a lawyer check that it definitely doesn’t count! If they wouldn’t sign off on that, then we’d be without both CUDA and MongoDB, probably the two non‐free packages users are most eager to see cached. At that point I really wonder whether it’s worth the effort and expense implementing this RFC and ensuring ongoing compliance will take.
  I believe we could, however, distribute and run DragonflyDB, SurrealDB, Nomad, and Terraform; we probably don’t do anything that could qualify as providing a competitive paid SaaS offering and so would be allowed our “production” use under the Business Source License.

* The vast majority of our non‐free binary packages need to use `patchelf` to make those binaries work with our `/nix/store` paths. Most of this software is distributed under licences that only permit _unmodified_ redistribution. We would be distributing patched binaries. I am not a lawyer – it may be that it is perfectly acceptable to modify the binaries for interoperability with our packaging system, and distribute the result; it may be that it does not count as true “modification” at all, or that our goal of interoperability exempts us from the restriction. However, I do not think that this is _obviously_ the case, and it may depend on jurisdiction. Personally, I would want a lawyer with strong experience in software licensing law in the jurisdictions of our project infrastructure and the Foundation to sign off on this before we accept the risk.

See the point about CUDA above.

* We have a systemic licence compliance problem, even for FOSS. We currently fail to include licence text and copyright notices in almost all our built packages, despite being legally obligated to do so, as they are derivative works of the source code. We also don’t have downloads for our modified versions of packages that we package, despite that being required by copyleft licences.
  While I have wanted to start tackling this problem, it will be a big task to fix this and establish comprehensive monitoring for it, and I think is unlikely to be solved without funding directed at the problem.
  So far this hasn’t been a problem, because FOSS is built on goodwill and we do make an attempt to track licensing information in our package definitions; nobody has come after us because FOSS projects generally want to be redistributed and appreciate being packaged in Linux distributions.
  With non‐free software, that goodwill will be much less reliable. We’ll be dealing with software often from well-resourced corporations, who have explicitly chosen _not_ to offer their software under lenient terms but instead impose substantial restrictions, and rather than dealing with a relatively small number of common, well‐understood FOSS licences with simple requirements, we’ll often be dealing with bespoke pages of legalese designed to be complicated to comply with.
  Given that we are currently terrible at even including licence texts in packages, I really doubt our `unfreeRedistributable` packages do any better at this on average. Since we’d have an explicit flag under this RFC, that won’t be an immediate problem, but we will really want to ensure that we have adequate review of licence compliance for any new package setting it. (In particular though, **the flag should probably be on individual packages rather than licences** for this reason. That still wouldn’t handle cases where we have to audit downstream users for compliance, though.)

Since we are often re-packaging proprietary packages, this is less of a problem compared to open-source licensed projects, since the license would then already be part of the original distribution. But even if we get this part wrong, it's unlikely to have severe consequences. Usually you will first be contacted by the other party to fix the packaging before legal action is taken.

* Although ensuring that the installer ISOs do not depend on non‐free packages is a good start, it will not be sufficient to ensure that non‐free dependencies do not leak into the build closure of other free packages that ought not to require them. Since Hydra will be evaluating with non‐free packages enabled, and many contributors also allow them, this will only surface when a user tries to evaluate such packages with the default settings. That feels like somewhat painful UX to me and unfortunate for software freedom. We should probably have a system to flag up such “closure freeness regressions”.

We have the unfree check that will still fail to evaluate. This wouldn't be changed by this RFC.

* Removing all versions of things from the cache is pretty painful, especially if there might be downstream packages including the code from static linking. That’s already an existing problem we have to deal with when licences are incorrectly marked, but I’m just noting that if we have to do it more regularly we’ll probably have to get better at it.

@emilazy
Member

emilazy commented Mar 21, 2025

Well, I do recommend you read the full discussion as the RFC author, since I tried to outline all the concerns I have, and even the discussions about specific software are illustrative of the general problems.

I realize that the listed packages are just examples, but I also think that a large part of the desire for this is driven by a relatively small number of non‐free packages that are expensive to build – after all, the drawback of the status quo is described as “very long builds for lots of software”, though in practice I almost exclusively hear people talk about CUDA and MongoDB, and the motivation section directly talks about MongoDB – and those packages happen to be ones that are questionably legal for us to distribute. And clearly the RFC is meant as a referendum on the idea of trying to actually distribute meaningful amounts of non‐free software in practice, so I think that “it’s just adding a flag, which is zero‐risk, because all the problems only come the first time someone tries to actually use the flag” is misleading.

MIT software depending on GPL is not a licence problem and is pretty common. Free software depending on proprietary software isn’t uncommon, either; it’s only a problem when it unnecessarily does so. The RFC certainly makes that more likely to happen.

@Mic92
Member

Mic92 commented Mar 21, 2025

Well, I do recommend you read the full discussion as the RFC author, since I tried to outline all the concerns I have, and even the discussions about specific software are illustrative of the general problems.

I read the full discussion and the RFC before I applied for shepherding. Thanks

I realize that the listed packages are just examples, but I also think that a large part of the desire for this is driven by a relatively small number of non‐free packages that are expensive to build – after all, the drawback of the status quo is described as “very long builds for lots of software”, though in practice I almost exclusively hear people talk about CUDA and MongoDB, and the motivation section directly talks about MongoDB – and those packages happen to be ones that are questionably legal for us to distribute. And clearly the RFC is meant as a referendum on the idea of trying to actually distribute meaningful amounts of non‐free software in practice, so I think that “it’s just adding a flag, which is zero‐risk, because all the problems only come the first time someone tries to actually use the flag” is misleading.

If this were true, which it isn't (the license mainly restricts SaaS as opposed to distribution), MongoDB would be marked as non-redistributable.

MIT software depending on GPL is not a licence problem and is pretty common. Free software depending on proprietary software isn’t uncommon, either; it’s only a problem when it unnecessarily does so. The RFC certainly makes that more likely to happen.

Not convinced this is more likely; companies often have better license checks in place to stop this from happening. Open-source projects often don't care much about this. And if a proprietary project depended on GPL code, the result would just be that the project is also under the GPL... This is not really our problem but the problem of the project.

@emilazy
Member

emilazy commented Mar 21, 2025

I am not convinced by this argument. We are also supporting macOS, which requires a lot more proprietary components (macOS SDK) and is harder to test. It by the way also doesn't conveniently ship a meta.license attribute in nixpkgs because it would be "unfree". This just enables a few more packages that can be tested on otherwise free operating systems.

What we use from the macOS SDK is headers and .tbd stub files that only list symbols. It’s interface definitions; they do not ship the .dylibs containing the code for the system libraries in the SDK. The stub files are clearly uncopyrightable, and I think the precedent of Google v. Oracle for use of the API headers being fair use on interoperability grounds is clear. (I believe the precedent for interoperability is as strong or stronger in the EU.)

I agree that before Apple adopted the .tbd stub scheme our use of the macOS SDK was highly questionable. (We do still have contributors who refuse to test things on macOS on ethical grounds, and I respect that.)

The "Modification" needs to be interpreted in the legal sense. This means: are we creating a derivative work of CUDA? Patchelf is not creating a derived work; it's an automated patching process that is not even specific to CUDA, applied to parts of the program that are not even the copyright-protectable part of CUDA. ELF headers existed way before CUDA, and CUDA did not invent anything new here. If you interpreted this in the technical sense, you wouldn't be allowed to load the binary into memory, because that creates a modified copy of the original ELF file (i.e. patching up references in memory).

Given that this clause says “provided that the object code files are not modified in any way (except for unzipping of compressed files)”, implying that unzipping compressed files counts as a modification, do you think NVIDIA’s lawyers would agree with this stance?

I agree that it would be sensible for copyright law to allow patchelf, and that it may in fact be something we can rely on, but I don’t trust the guesses of programmers to determine that – myself included – and that is why I think the Foundation should really ask a lawyer about it. (Especially since it’s unclear to me if nvcc is intended to be allowed to be redistributed even without modification.)

  Another example in the RFC is TeamSpeak; the TeamSpeak licence is _way too long_ so I only skimmed it, but it does not seem to permit redistribution of any kind.
  More borderline cases include things like UnRAR; per [the license](https://fedoraproject.org/wiki/Licensing:Unrar), we would have to audit downstream packages to ensure that they don’t use it to recreate the RAR compression algorithm, and there is a somewhat onerous licence termination clause that will apply if we fail to. I’m not sure that’s something we’d want to sign up for.

The license you linked says you cannot use the UnRAR code to re-create the RAR compression algorithm. This is different from what you said.

I’m not sure how that’s different from what I said? Any downstream user of the UnRAR code that is in violation of that clause would then subject us to the onerous termination clause were we to build and distribute it. Packaging one piece of software that uses the UnRAR code in a prohibited way would revoke our licence to distribute UnRAR at all.

We publish the NixOS tests over the internet; this is enough for a user to reproduce the source code. And no, a NixOS test is not a service for third parties: it runs in a Nix build and cannot be accessed by other parties this way.

An arbitrary third party could send pull requests to NixOS that modify the MongoDB test, which would result in us exercising MongoDB’s functionality as requested through ofborg and then Hydra. The question is whether this counts as “mak[ing] the functionality of the Program or a modified version available to third parties as a service”. I agree that this is far from certain. But I do not think it is certain to be false either. The SSPL is not designed to be unambiguous or easy to comply with if you do anything that could remotely count as that.

And this isn’t the AGPL, where you just have to publish the code for the actual thing you run. See the quoted portion of the SSPL, which clearly states “all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software”. It was explicitly designed so that Amazon could not offer MongoDB as SaaS without open sourcing the entirety of AWS. The scope is essentially unlimited, and if the clause applies, would in fact include all the software we use to manage our infrastructure.

We can’t apply FOSS norms where licences are intended to be easy to comply with and participants are generally sensible and acting in good faith with the desire to have their code redistributed. The SSPL was designed as a legal stick to stop organizations that were doing things that MongoDB Inc. didn’t like. Non‐free licences are far more often deliberately adversarial and backed by expensive lawyers, and we have to take this into account when discussing compliance and risk.

NixOS is not doing this, so why would this be a problem?

See above.

Since we are often re-packaging proprietary packages, this is less of a problem compared to open-source licensed projects, since the license would then already be part of the original distribution. But even if we get this part wrong, it's unlikely to have severe consequences. Usually you will first be contacted by the other party to fix the packaging before legal action is taken.

I agree that in practice the risk is not completely unlimited. However it is far higher than with FOSS.

We have the unfree check that will still fail to evaluate. This wouldn't be changed by this RFC.

It would; the proposal is explicitly to allow certain packages to disable the check on Hydra:

Hydra will build all packages with licenses for which redistributable && runnableOnHydra. It will still fail evaluation if the ISO image build or the Amazon AMIs were to contain any unfree software.

If a user with non‐free software enabled makes a PR and nobody with it disabled reviews it, then there will be no early warning for non‐free software in a free package’s closure. CI, based on the settings of the release jobset, is our current backstop against this happening. That’s why it explicitly carves out the release ISOs; I’m saying that the problem extends beyond those.

@emilazy
Member

emilazy commented Mar 21, 2025

I read the full discussion and the RFC before I applied for shepherding. Thanks

My comment was directed at @Ekleog as the RFC author, who said:

This being said, I did skip over the license-specific discussion (because I'm not focusing on any specific project yet and we will have to discuss this again anyway), so I may have missed another point within these parts.

I did not see your comment before posting #185 (comment) and have replied to it in #185 (comment).

@alyssais
Member

If you would interpret this in the technical sense, you wouldn't be allowed to load the binary into memory because it is creating a modified copy of the original ELF file (i.e. patching up references in memory).

IANAL, but IIRC the reason that you are allowed to do this is that unlike modifying ELF files with patchelf, this specific action is permitted by copyright law. I found https://digital-law-online.info/lpdi1.0/treatise20.html — not sure if it's a trustworthy resource but matches what I remember having read about this before.

@emilazy
Member

emilazy commented Mar 21, 2025

IANAL, but IIRC the reason that you are allowed to do this is that unlike modifying ELF files with patchelf, this specific action is permitted by copyright law. I found https://digital-law-online.info/lpdi1.0/treatise20.html — not sure if it's a trustworthy resource but matches what I remember having read about this before.

It seems like this states that loading into RAM is likely a copy under US law, right? If that is the case, I presume that it is permitted because EULAs grant the right to run programs (explicitly or implicitly), and the copying and in‐memory modification involved in doing so is an unavoidable part of running a program. Perhaps that extends to modifying the on‐disk ELF files such that they can be run, although it still seems to me that the picture gets very lawyer‐requiringly murky once we bring redistribution into the picture.

@Ekleog
Member Author

Ekleog commented Mar 21, 2025

Well, I do recommend you read the full discussion as the RFC author, since I tried to outline all the concerns I have, and even the discussions about specific software are illustrative of the general problems.

I will definitely go over it once again once I'm back to actually push this RFC forward this evening. This being said, I also regard the discussion about specific derivations as being off-topic for this thread, as is explicitly mentioned in the RFC text.

So please forgive me for not reading in depth the long and off-topic discussion, which I specifically tried to prevent by writing the RFC as defensively as possible, because I knew that otherwise the discussion would derail into each specific package, and legal discussions are not something technical people are good at having.

This being said, this discussion did raise a very interesting point in @Mic92's answer: we currently have some packages that should be marked as unfree and are not, because of technical limitations. That is probably more dangerous than an unfree package sneaking its way into a free package's closure and being detected only by people with unfree disabled.

So I'll add that to the RFC too, and will read the rest of the discussion without getting too involved myself, as I'm noticing these few messages from today are already too much.

Anyway, have a good afternoon and weekend!

@Ekleog
Member Author

Ekleog commented Mar 21, 2025

I just committed:

  • Some changes I was thinking I had already committed and pushed a looong time ago
  • The changes in reply to @alyssais's comments, to clarify that we won't support unfree software any more than currently
  • Some changes based on @emilazy's comments; please let me know if you see other points the RFC would need to mention

Labels
status: open for nominations Open for shepherding team nominations