This repository was archived by the owner on Jul 18, 2023. It is now read-only.

HINGE for minimap #112

Open
ebioman opened this issue May 10, 2017 · 10 comments

@ebioman

ebioman commented May 10, 2017

Hello
I found that intriguing sentence in your paper

"Therefore, integrating HINGE with other overlapping tools such as MHAP or Minimap can be done if different levels of alignment sensitivity or memory usage are required"

Is this a theoretical possibility, something you already started to look into or even possible with current code?

Cheers

@fxia22
Collaborator

fxia22 commented May 10, 2017

Hi @ebioman ,

HINGE (tentatively) supports minimap. Instead of passing the --db and --las arguments to hinge filter, maximal and layout, you can pass --fasta and --paf.

@ebioman
Author

ebioman commented May 11, 2017

Hi @fxia22
Thanks for the quick reply, I will give it a (tentative) shot.
Just to be clear, this will mean that there is subsequently no scrubbing possible?

@fxia22
Collaborator

fxia22 commented May 11, 2017

@ebioman We have our own filtering, which includes scrubbing (mainly based on coverage). BTW, the minimap pipeline has no draft assembly or consensus module, so the assembly graph will be the final output.

@ebioman
Author

ebioman commented May 12, 2017

Hello @fxia22
I was aware of the latter but suspected that some scrubbing was done using DASCRUBBER as you include it in your repository. Thanks for the clarification!

ebioman closed this as completed May 12, 2017
@ebioman
Author

ebioman commented Jun 1, 2017

Hi
I ran a minimal test with minimap and HINGE, but it has not succeeded yet. With 200 random PacBio RSII sequences I always run into the following error:


minimap -Sw5 -L100 -m0 -t20 test.fasta test.fasta > test.paf
hinge filter --fasta test.fasta --paf test.paf -x test --config ../nominal.ini 
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 18446744073709551612) > this->size() (which is 0)
/software/UHTS/Assembler/HINGE/20170509/bin/hinge: line 8: 38579 Aborted                 Reads_filter "$@"

Update: the strange thing is that if I remove all options and just execute hinge filter, I get exactly the same error.
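
A note on the number in the error message: 18446744073709551612 is 2^64 − 4, which is what −4 looks like when stored in an unsigned 64-bit integer. Together with `this->size() (which is 0)`, this is consistent with a position being computed 4 characters before the end of an empty string, although nothing in this thread confirms where in HINGE that happens. The arithmetic itself can be checked in the shell; the sketch below assumes a 64-bit system, and `as_unsigned` is a name made up for illustration:

```shell
# Reinterpret a (possibly negative) integer as an unsigned integer.
# On a 64-bit system the result is taken modulo 2^64.
as_unsigned() {
    printf '%u\n' "$1"
}

as_unsigned -4
```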

If I do the same with DALIGNER it succeeds:


fasta2DB test test.fasta 
DBsplit test
HPC.daligner test | bash -v
DASqv -c100 test test.las
hinge filter --db test --las test.las -x test --config ../nominal.ini 

I used minimap 0.2-r124-dirty

ebioman reopened this Jun 1, 2017
fxia22 self-assigned this Jun 2, 2017
@ebioman
Author

ebioman commented Jul 18, 2017

Sorry to push this, but any ideas what might be wrong - is it working for you in the latest version?

@govinda-kamath
Collaborator

We have fixed the bug, and you can now run the pipeline up to (but not including) hinge draft with a PAF and a FASTA file. It will take us some more time to fix draft, as PAF does not give us the detailed alignments it needs.

There seems to be a nice workaround using minimap2.
It needs lastools and a script for recomputing CIGARs.

The pipeline for the lambda-virus would be the following:

  • Download reads
wget https://www.dropbox.com/s/lda4pwsttt168yj/lambda_filtered_subreads.fasta
  • Run minimap2
minimap2 -x ava-pb -a lambda_filtered_subreads.fasta lambda_filtered_subreads.fasta > lambda.raw.sam
  • Sort and index the SAM file
samtools view -bS lambda.raw.sam | samtools sort -o lambda.sorted.bam
samtools index lambda.sorted.bam
  • Recompute the CIGARs
python cigar_recomp.py lambda.sorted.bam lambda_filtered_subreads.fasta overlap_recompute <num-processes>
  • Convert to las using bamtolas:
bamtolas overlaps.las lambda_filtered_subreads.fasta <overlap_recompute.bam
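
The steps above can be gathered into one script. This is only a sketch: `READS`, `NPROC`, `DRY_RUN`, `run` and `pipeline` are illustrative names rather than HINGE options, `NPROC` stands in for `<num-processes>`, and the script assumes that `cigar_recomp.py` writes `overlap_recompute.bam` (as the bamtolas step implies) and that the reads have already been downloaded. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the minimap2 -> las workaround described above.
set -euo pipefail

READS="${READS:-lambda_filtered_subreads.fasta}"
NPROC="${NPROC:-4}"        # hypothetical default for <num-processes>
DRY_RUN="${DRY_RUN:-1}"    # 1 = print commands only, 0 = execute them

# Print a command line; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        bash -eo pipefail -c "$*"
    fi
}

pipeline() {
    run "minimap2 -x ava-pb -a $READS $READS > lambda.raw.sam"
    run "samtools view -bS lambda.raw.sam | samtools sort -o lambda.sorted.bam"
    run "samtools index lambda.sorted.bam"
    run "python cigar_recomp.py lambda.sorted.bam $READS overlap_recompute $NPROC"
    run "bamtolas overlaps.las $READS < overlap_recompute.bam"
}

pipeline
```

Because each step is run with `-e` and `pipefail`, a failure in any command (including the first half of the `samtools view | samtools sort` pipe) stops the script instead of passing a truncated file to the next step.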

@ebioman
Author

ebioman commented Aug 8, 2017

Thank you very much, I will give it a try today.

Update:

If I try the original --fasta and --paf combination, I still encounter a problem:

 hinge filter --fasta subreads.fasta  --paf overlaps.paf -x test --config nominal.ini
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 18446744073709551612) > this->size() (which is 0)
/software/UHTS/Assembler/HINGE/20170509/bin/hinge: line 8:  3658 Aborted                 Reads_filter "$@"

Is there anything in the config that has to be changed, or am I missing something obvious?

I looked at the workaround, but recalculating the CIGARs is unfortunately very time-consuming, which defeats the purpose of replacing DALIGNER with minimap2 to speed up the assembly.

@govinda-kamath
Collaborator

Sorry, I did not see the update to the comment (it looks like we don't get notified about edits).

The way we handle this is to convert the BAM to a las file and then run HINGE on that las (after creating a corresponding db as well). So you will not have to run hinge with --fasta and --paf.

@mictadlo

mictadlo commented Jan 4, 2018

Minimap2 also has a -c option, which outputs the CIGAR in PAF. Is your minimap2 workflow (#112 (comment)) still valid?
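
Whether a given PAF file actually carries CIGAR strings can be checked directly: with `-c`, minimap2 writes them as an optional `cg:Z:` tag after the 12 mandatory columns. The following is a small sketch; the function name `paf_cigar_count` and the two sample records are made up for illustration:

```shell
# Count PAF records that carry a cg:Z: (CIGAR) tag among the optional
# columns (column 13 onwards). Prints "<records_with_cigar> <total_records>".
paf_cigar_count() {
    awk -F'\t' '
        {
            total++
            for (i = 13; i <= NF; i++) {
                if ($i ~ /^cg:Z:/) { with_cigar++; break }
            }
        }
        END { printf "%d %d\n", with_cigar, total }
    ' "$1"
}

# Made-up sample: the first record has a CIGAR tag, the second does not.
sample="$(mktemp)"
printf 'r1\t100\t0\t50\t+\tr2\t120\t10\t60\t45\t50\t0\ttp:A:S\tcg:Z:50M\n' > "$sample"
printf 'r1\t100\t0\t50\t+\tr3\t110\t5\t55\t40\t50\t0\ttp:A:S\n' >> "$sample"
paf_cigar_count "$sample"   # prints: 1 2
rm -f "$sample"
```

This only shows whether CIGARs are present in the file; it says nothing about whether HINGE can consume them.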

If it is still valid, then after the bamtolas overlaps.las lambda_filtered_subreads.fasta <overlap_recompute.bam step, do I have to run:

fasta2DB DB -flambda_filtered_subreads.fasta
hinge filter --db DB --las overlaps.las -x asm --config <path-to-nominal.ini>
....

Thank you in advance.
