Tile-X: A Vertex Reordering Approach for Scalable Long Read Assembly

Tile-X is a graph-based approach for optimizing long read genome assembly by reordering sequencing reads prior to assembly. By leveraging vertex reordering techniques, Tile-X enhances parallelism, reduces memory usage, and improves the contiguity of assembled genomes while maintaining high accuracy.

Features

  1. Graph-Theoretic Read Reordering: Computes an overlap graph and applies vertex reordering techniques to improve assembly efficiency.
  2. Multiple Reordering Strategies: Implements standard reordering heuristics like Reverse Cuthill-McKee (RCM) [Tile-RCM], Metis [Tile-Metis], and Grappolo [Tile-Grappolo], as well as a novel Farthest Neighbor [Tile-Far] heuristic for sparsified assembly.
  3. Scalability: Reduces computational overhead and enables efficient assembly of large genomes.

Step-by-Step Guide

  1. Clone the Tile-X Repository:

    git clone https://github.com/Oieswarya/Tile-X.git
    cd Tile-X
    
  2. Compile the source files and set up directories:

    make all
    
  3. Check if Tile-X is properly installed:

    ./tileX.sh -h
    
    

Usage

Run the tileX.sh script from the root directory:

./tileX.sh -lr path/to/longreads.fa [options]


-lr,--longreads    Path to the long reads input file
Options:
-o, --output       Output directory (default: $HOME/Tile-X/Output/)
-t, --threads      Number of threads to use (default: 16)
-n, --nodes        Number of nodes to use (default: 2)
-p, --processes    Number of processes per node (default: 2)
-tile, --module    Tile-X module to use (default: Tile-Far)
                     Options: Tile-Far (default), Tile-RCM, Tile-Metis, Tile-Grappolo
-h, --help         Show this help message
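
As a rough illustration of how the options above combine, the sketch below assembles a tileX.sh command line and prints it rather than running it. The helper name `build_tilex_cmd` is hypothetical and not part of Tile-X; only the flags (`-lr`, `-o`, `-t`, `-tile`) and the four module names come from the help text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Tile-X): builds a tileX.sh command line
# from the documented options and prints it instead of executing it.
build_tilex_cmd() {
    local reads="$1"      # path to the long reads file (-lr)
    local outdir="$2"     # output directory (-o)
    local threads="$3"    # number of threads (-t)
    local module="$4"     # reordering module (-tile)

    # Accept only the four module names listed in the help text.
    case "$module" in
        Tile-Far|Tile-RCM|Tile-Metis|Tile-Grappolo) ;;
        *) echo "unknown module: $module" >&2; return 1 ;;
    esac

    printf './tileX.sh -lr %s -o %s -t %s -tile %s\n' \
        "$reads" "$outdir" "$threads" "$module"
}

# Example: print the command for a 32-thread Tile-RCM run.
build_tilex_cmd path/to/longreads.fa path/to/output 32 Tile-RCM
```

Printing the command first makes it easy to review the chosen module and paths before submitting a long-running job. Paste the printed line into a shell to run it.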

Note: This code has been tested on high-performance computing (HPC) clusters with MPI and OpenMP support, under both the PBS and SLURM job schedulers.
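
For SLURM users, one possible way to line up the scheduler's resource request with the `-n`, `-p`, and `-t` options is sketched below. This is an assumption-laden example, not the project's own job script. The helper `make_slurm_script` is hypothetical. The mapping of `-p` to `--ntasks-per-node` and `-t` to `--cpus-per-task` is a guess at how processes and threads are used. The `cd "$HOME/Tile-X"` line assumes the repository was cloned into the home directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Tile-X): prints a SLURM batch script whose
# resource request mirrors the -n, -p and -t values passed to tileX.sh.
# Assumption: one MPI process per SLURM task, one OpenMP thread per CPU.
make_slurm_script() {
    local reads="$1"
    local nodes="${2:-2}"      # default taken from the help text (-n)
    local procs="${3:-2}"      # default taken from the help text (-p)
    local threads="${4:-16}"   # default taken from the help text (-t)
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=tilex
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=${procs}
#SBATCH --cpus-per-task=${threads}

cd "\$HOME/Tile-X"
./tileX.sh -lr ${reads} -n ${nodes} -p ${procs} -t ${threads}
EOF
}

# Example: print a batch script using the documented defaults.
make_slurm_script path/to/longreads.fa 2 2 16
```

Redirect the output to a file (for example `make_slurm_script reads.fa > tilex.sbatch`) and submit it with `sbatch tilex.sbatch`. Adjust the directives to your cluster's partitions and limits.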

For a quick test, run the tileX.sh script on the test input provided in the repository:

~/Tile-X/tileX.sh -lr ~/Tile-X/TestInput/CoxiellaBurnetii_longreads.fa

With the default output directory, the final scaffolds are written to ~/Tile-X/Output/Final/finalAssembly.fa.
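
A quick sanity check after the run is to confirm that the assembly file exists and to count its scaffolds. The sketch below does this by counting FASTA header lines. The function name `count_scaffolds` is hypothetical, and the default path is the one quoted above, so pass a different path if you used `-o`.

```shell
#!/usr/bin/env bash
# Count FASTA records (header lines starting with '>') in an assembly file.
count_scaffolds() {
    local fasta="$1"
    # Fail early if the file is missing or empty.
    if [ ! -s "$fasta" ]; then
        echo "missing or empty: $fasta" >&2
        return 1
    fi
    grep -c '^>' "$fasta"
}

# Default to the path quoted above; an alternative path may be given as $1.
assembly="${1:-$HOME/Tile-X/Output/Final/finalAssembly.fa}"
count_scaffolds "$assembly" || echo "no scaffolds found at $assembly" >&2
```

The count alone says nothing about assembly quality, but a missing file or a count of zero is a fast signal that the run did not finish.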

Tips:

  1. On some clusters, you may need to load specific modules before installing dependencies, and again before running Tile-X.
  2. Ensure that you have the appropriate permissions to execute the job script.

Tile-X utilizes the following tools:
