Staging script not found when running on Google Batch #5888

Open
ejseqera opened this issue Mar 14, 2025 · 0 comments

Bug report

When running Nextflow on Google Batch with gcsfuse-mounted directories and attempting to stage in many input files for a task, the task fails with the following error:

Error executing process > 'COPY_FILES'

Caused by:
  No such file or directory: /mnt/disks/nf-tower-test-eu-1/scratch/1rnFXXfWvpW08n/f7/b2319081b9b7957733eebb846c0fd3/.command.stage

This appears to be related to the issue reported in #4279 and the fix implemented in #4282, which was intended to disable the separate staging script entirely for remote object storage. However, the fix does not work properly on Google Batch.

Steps to reproduce the problem

  1. Run a Nextflow pipeline on GCP with a task that stages in many input files (e.g., ~1000 or more files). For example:
# create random files
for ((n=0;n<6000;n++)); do touch dummy_file_${n}.txt; done

# sync to a GCS bucket
gsutil -m rsync -r ./ gs://nf-tower-test-eu-1/esha/many_files_test/

One process workflow:

process COPY_FILES {
    input:
        path files
    output:
        path("outdir", type: 'dir')
    script:
    """
    mkdir -p outdir
    for f in ${files}; do
        cp \$f outdir/
    done
    """
}

workflow {
    Channel.fromPath(params.input).collect()
    | COPY_FILES
}
  2. Instead of staging in the directory, stage in each individual file, which results in a large .command.run that exceeds 1MB.
  3. The task fails because it tries to access the .command.stage file, which is either not created properly or not accessible.
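
To confirm the condition described in the steps above, the task work directory can be inspected directly wherever it is visible as a local path (for example through the gcsfuse mount). The following is a minimal sketch, not part of Nextflow: `inspect_task_dir` is a hypothetical helper name, and the 1048576-byte threshold is an assumed reading of the "1MB" figure mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Nextflow command): report the size of
# .command.run and whether .command.stage exists in a task work directory.
inspect_task_dir() {
    local task_dir="$1"
    local threshold=1048576   # assumption: "1MB" taken as 1 MiB
    local run_file="$task_dir/.command.run"
    local stage_file="$task_dir/.command.stage"
    local size

    if [ -f "$run_file" ]; then
        # Byte count of the wrapper script; tr strips padding some wc builds add
        size=$(wc -c < "$run_file" | tr -d ' ')
        if [ "$size" -gt "$threshold" ]; then
            echo "run_size=$size over_threshold=yes"
        else
            echo "run_size=$size over_threshold=no"
        fi
    else
        echo "run_size=missing"
    fi

    if [ -f "$stage_file" ]; then
        echo "stage_file=present"
    else
        echo "stage_file=missing"
    fi
}
```

For the failing task above, this would be called with the directory from the error message, e.g. `inspect_task_dir /mnt/disks/nf-tower-test-eu-1/scratch/1rnFXXfWvpW08n/f7/b2319081b9b7957733eebb846c0fd3`. A `stage_file=missing` line would match the reported "No such file or directory" error.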

Environment

  • Nextflow version: 24.10.5
  • Seqera Platform Cloud Version 24.3.0-cycle4_803f393

