Add Fusion support to HTCondor #5865
Open: JosephLalli wants to merge 37 commits into nextflow-io:master from JosephLalli:merging_w_nf
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
3705 condor docker support
Update build
Signed-off-by: Joseph Lalli <[email protected]>
…se of "BasicFileAttributes supported" error
…custom CondorTaskHandler. Now both are defined.
…hat I am using my version of nextflow
…sh "nextflow -C fusion_test.config run rnaseq-nf""
✅ Deploy Preview for nextflow-docs-staging ready!
Signed-off-by: Joseph Lalli <[email protected]>
Hi there,
Feel free to use this code or not, but I want to let you know that I have come back to this project (#3697) and now have Nextflow running successfully on UWisc CHTC's HTCondor implementation, using Fusion and a connection to local S3 storage. I have a separate Dockerfile and launch script [here] that allow Nextflow to run in an Apptainer container on a submit node, together with the ancillary files needed to establish a connection to the scheduler.
I suspect this code will need some changes to meet your house style. First, I do not know how Nextflow typically handles the scheduler-access issues I described above. I know you don't maintain a separate Dockerfile for each grid computing software package. I'm willing to work on implementing scheduler access in a Nextflow-y manner (even if it's just "make sure the user environment is set up correctly").
Second, I could not figure out how to submit both a server configuration request (called a submit file in Condor) and the script to run over stdin. Condor requires a submit file (which can be submitted via stdin) and a separate executable file, which is named in the submit file and automatically transferred to the compute server. For now, I have added a generateFusionBashWrapperCommand function to the CondorTaskHandler class that returns the output of FusionHelper.runWithContainer prefixed with a bash shebang. fusionStdinWrapper writes the output of generateFusionBashWrapperCommand to FileHelper.getLocalTempPath().resolve(".condor.${task.id}.${task.hash}.sh"), and that path is then referenced in the submit file passed to condor_submit. I copied write0 from GridTaskHandler, since write0 is private and I couldn't figure out how to call it from CondorTaskHandler. I imagine there is a better file-writing function to use, as well as a file location and file name that fit Nextflow style better.
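For readers unfamiliar with the two-file requirement, here is a minimal sketch of the flow described above. It is written in Python for illustration and is not the Groovy code in this PR. The function names, the `vanilla` universe, and the placeholder command are assumptions; the real command would come from FusionHelper.runWithContainer.

```python
import tempfile
from pathlib import Path


def write_wrapper_script(directory: Path, task_id: int, task_hash: str, command: str) -> Path:
    """Write a bash wrapper around `command` and return its path."""
    # The file name follows the pattern quoted in the description.
    script = directory / f".condor.{task_id}.{task_hash}.sh"
    script.write_text(f"#!/bin/bash\n{command}\n")
    script.chmod(0o755)  # the wrapper is used as the job's executable
    return script


def build_submit_description(script: Path, log_file: Path) -> str:
    """Build a submit description that points at the wrapper script."""
    lines = [
        "universe = vanilla",       # assumption: a plain vanilla-universe job
        f"executable = {script}",   # the separate file Condor transfers and runs
        f"log = {log_file}",        # job event log
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    # Placeholder command; stands in for the Fusion container launch line.
    wrapper = write_wrapper_script(workdir, 1, "ab12cd", "echo 'fusion command goes here'")
    submit_text = build_submit_description(wrapper, workdir / "run.log")
    # The text could then be piped to condor_submit over stdin, for example:
    # subprocess.run(["condor_submit"], input=submit_text, text=True, check=True)
    print(submit_text)
```

Only the submit description travels over stdin; the wrapper has to exist on disk first because the submit description refers to it by path.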
Finally, Condor's condor_history command can parse a local log file that is automatically updated by multiple jobs, and report the status (among other attributes) of each job. The advantage of this method over condor_q is that it does not send a request to the scheduler server, so it can be checked repeatedly and cheaply. I give every job the same local log file, saved in $PWD with the run UUID as its file name. I imagine there is a logical place for Nextflow to store this file, but I do not know where.
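The polling idea can be sketched roughly as follows, again in Python rather than the PR's Groovy. The status codes are the standard HTCondor JobStatus values. The exact invocation (`-userlog` combined with `-af`) and the assumption that JobStatus is populated when reading from a job event log should be checked against the installed HTCondor version; the function names are hypothetical.

```python
import subprocess
from typing import Dict, List, Tuple

# Standard HTCondor JobStatus codes.
JOB_STATUS = {
    1: "IDLE",
    2: "RUNNING",
    3: "REMOVED",
    4: "COMPLETED",
    5: "HELD",
    6: "TRANSFERRING_OUTPUT",
    7: "SUSPENDED",
}


def history_command(log_file: str) -> List[str]:
    """Command line that reads job records from a shared job event log."""
    # Assumption: this flag combination works on the local HTCondor install.
    return ["condor_history", "-userlog", log_file,
            "-af", "ClusterId", "ProcId", "JobStatus"]


def parse_status_lines(output: str) -> Dict[Tuple[int, int], str]:
    """Map (cluster, proc) to a status name; skip lines that do not parse."""
    statuses: Dict[Tuple[int, int], str] = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            cluster, proc, code = (int(f) for f in fields)
        except ValueError:
            continue  # e.g. an "undefined" attribute value
        statuses[(cluster, proc)] = JOB_STATUS.get(code, "UNKNOWN")
    return statuses


def poll(log_file: str) -> Dict[Tuple[int, int], str]:
    """Run the command once and return the status of every job in the log."""
    result = subprocess.run(history_command(log_file),
                            capture_output=True, text=True, check=True)
    return parse_status_lines(result.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to test the status mapping without a Condor installation.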
This has only been tested with Apptainer process containers, but both nextflow hello and nextflow rnaseq-nf run to completion. Root access is not available on the remote servers, so Docker cannot be used, but most other container runtimes should work.
Notably, Condor handles the transfer of small files, including container image files. I used a shared drive as my Apptainer cache location, but it would be straightforward to tell Condor to transfer a locally stored container image to the execution server before running Fusion's Apptainer command, or potentially to download it from Tower.
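As a rough illustration of the image-transfer option, the submit description could gain a few file-transfer lines. The helper below is a hypothetical Python sketch, not part of this PR; it uses the standard submit commands should_transfer_files, when_to_transfer_output, and transfer_input_files, and the image path is a made-up example.

```python
import os
from typing import List, Tuple


def image_transfer_lines(image_path: str) -> Tuple[List[str], str]:
    """Return extra submit lines plus the image name as seen by the job."""
    lines = [
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {image_path}",
    ]
    # Transferred inputs land in the job's scratch directory under their base name.
    return lines, os.path.basename(image_path)


if __name__ == "__main__":
    extra, remote_name = image_transfer_lines("/home/user/cache/rnaseq.sif")
    print("\n".join(extra))
    print(remote_name)
```

The wrapper script would then refer to the image by its base name instead of a shared-drive path.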
I'm happy to work with you all to integrate this into Nextflow as you see fit.
Signed-off-by: Joseph L Lalli [email protected]