getreads [unsupported]

Note

I recommend using nf-core/fetchngs. This repository is no longer supported as of 2023.

A minimal pipeline to download FASTQ files from SRA given a list of accession IDs.

🪄 Usage

See installation for more details.

# Suggestion: replace main with a version from the releases
nextflow run telatin/getreads -r main -profile docker \
    --list list.txt --outdir downloaded-reads/

Where:

--list "list.txt" is a list of SRA accession IDs in simple text format
--outdir "name" is the name of the output directory
--wait INT is the number of seconds to wait after running ffq [default: 2]
-profile docker uses Docker for dependencies. An easy alternative is to create a conda environment from deps/env.yaml. Singularity is supported but untested (clusters that provide Singularity are usually offline anyway)
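As a small sketch of preparing the file passed to --list: the source only says the list is in "simple text format", so the layout of one accession per line is an assumption. The helper name `write_accession_list` and the accession IDs are hypothetical placeholders, not part of the pipeline.

```python
from pathlib import Path


def write_accession_list(accessions, path="list.txt"):
    """Write one accession per line, dropping blanks and duplicates.

    Assumption: the pipeline reads one accession ID per line.
    """
    seen = []
    for acc in accessions:
        acc = acc.strip()
        if acc and acc not in seen:
            seen.append(acc)
    Path(path).write_text("\n".join(seen) + "\n")
    return seen


if __name__ == "__main__":
    # Placeholder accession IDs, for illustration only
    kept = write_accession_list(["SRR0000001", " SRR0000002 ", "", "SRR0000001"])
    print(f"{len(kept)} accessions written to list.txt")
```

The resulting file can then be given to the command above with --list list.txt.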

📂 Output

The output directory contains:

📁 json (JSON files, one per accession)
📁 urls (text files with the download URIs)
📁 reads (FASTQ.gz files, a set per accession)
🗒️ stats.txt (reads statistics)
🗒️ check.txt (a report of the number of files downloaded per ID, checking that all files of a set contain the same number of reads)
🗒️ table.tsv (metadata table built from the JSON files, only for samples where ffq didn't fail; new in 2.0)
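The kind of check summarised in check.txt can be sketched as follows. This is an independent illustration, not the pipeline's own code: the function names are hypothetical, and it assumes that files are named with the accession as prefix (for example `ID_1.fastq.gz` and `ID_2.fastq.gz`) and that each FASTQ record spans exactly four lines.

```python
import gzip
from collections import defaultdict
from pathlib import Path


def count_reads(fastq_gz):
    """Count records in a gzipped FASTQ, assuming four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


def check_reads(reads_dir):
    """Report, per accession, the number of files and whether their read counts match.

    Assumption: the accession is the part of the file name before the first
    underscore (paired files) or the first dot (single file).
    """
    groups = defaultdict(dict)
    for path in sorted(Path(reads_dir).glob("*.fastq.gz")):
        accession = path.name.split("_")[0].split(".")[0]
        groups[accession][path.name] = count_reads(path)
    return {
        acc: {"files": len(counts), "equal": len(set(counts.values())) == 1}
        for acc, counts in groups.items()
    }


if __name__ == "__main__":
    # "downloaded-reads/reads" follows the --outdir used in the usage example
    for acc, report in check_reads("downloaded-reads/reads").items():
        print(acc, report["files"], "OK" if report["equal"] else "MISMATCH")
```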

Alternatives

nf-core/fetchngs ⭐ is a fully featured pipeline to download reads and associated metadata. It's a fantastic and regularly updated tool. Because it sometimes failed for me for reasons related to its complexity, I made this minimal pipeline as a backup plan.

Uses

ffq to fetch URLs given the accessions, wrapped in ffq-sake.py, which retries if NCBI responds with "too many requests" but fails gracefully on a 400 error.
wget to download the reads
seqfu to collect stats
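The retry behaviour described for ffq-sake.py can be illustrated with a generic sketch. This is not the actual script: `HttpError`, `fetch_with_retry`, the linear back-off, and the `max_tries` limit are all assumptions made for illustration, and `fetch` stands in for whatever call retrieves the metadata of one accession.

```python
import time


class HttpError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def fetch_with_retry(fetch, accession, wait=2, max_tries=5, sleep=time.sleep):
    """Call fetch(accession), retrying on 429 and giving up quietly on 400.

    Returns the fetched record, or None when the server answers 400.
    Any other error, or a 429 on the last attempt, is re-raised.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return fetch(accession)
        except HttpError as err:
            if err.status == 400:
                return None  # bad request: skip this accession instead of crashing
            if err.status == 429 and attempt < max_tries:
                sleep(wait * attempt)  # wait longer after each "too many requests"
                continue
            raise
```

Returning None on a 400 lets a caller skip the failed accession and carry on with the rest of the list, which matches the note that table.tsv only covers samples where ffq didn't fail.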

Screenshot

Cite

If you use this pipeline, please cite:

Gálvez-Merchán, Á., et al. (2023). Metadata retrieval from sequence databases with ffq. Bioinformatics
Telatin, A., et al. (2020). SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering
