Bamsurgeon goes silent (does not finish) on high coverage small panel bam file #217

Open
ranjit58 opened this issue Jan 10, 2023 · 8 comments


@ranjit58

I am using bamsurgeon to insert mutations into a ctDNA BAM file captured with a panel of ~100 regions at a median coverage of ~8000X. In each region, I am adding mutations at low allele frequencies (0.01% - 1%).

Bamsurgeon works fine when inserting ~40 mutations or so. When inserting 60 or more mutations, the program goes silent and never finishes. In htop I see several python processes and one samtools mpileup process, but none of them is using any CPU or memory; they just sit idle.

The last message is something like:
INFO 2023-01-10 03:26:37,790 haplo_9_137709597_137709597 avgincover: 10233.000000, avgoutcover: 8937.500000
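
For anyone trying to reproduce this, a hang with idle processes can be turned into an explicit failure by running the job under a time limit. The following is a minimal sketch using only the Python standard library; the helper name `run_with_timeout` and the time limits are illustrative assumptions, not part of bamsurgeon.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode).

    finished is False when the command exceeded timeout_s seconds.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising. Worker
        # processes started by that child may still need separate cleanup.
        return False, None
    return True, proc.returncode


if __name__ == "__main__":
    # Stand-in for a stalled job: a child that sleeps longer than the limit.
    stalled = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(stalled, timeout_s=1))
```

In practice `cmd` would be the addsnv.py argument list and `timeout_s` something far larger than a normal run takes; this only detects the stall, it does not explain it.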

An example of the command I am using:
python3 -O bamsurgeon/bin/addsnv.py \
-v test.bed \
-f test.bam \
-r index/index.fa \
-o output.bam \
--mindepth 1000 \
--maxdepth 50000 \
--force \
-p 8 \
--seed 1234

One workaround: after trying various options, I found that when I add the flag --ignorepileup, the program does finish. I don't know what caused the problem or whether this fix is appropriate, so I am reporting this issue for any further input or suggestions.

thanks,
Ranjit
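
Since the run above completes with ~40 mutations but stalls at 60 or more, one thing a reader might try is splitting the variant file into smaller batches and chaining the runs. This is an untested sketch under that assumption: it only plans the batches and builds the argument lists, reusing the flags from the command in the post. The `batch_N.bed` and `batch_N.bam` file names are hypothetical, and each intermediate BAM would presumably need to be indexed before the next run. Whether batching actually avoids the hang is not confirmed anywhere in this thread.

```python
def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_addsnv_cmd(varfile, in_bam, out_bam, ref="index/index.fa"):
    """Assemble the addsnv.py call from the post with different file names."""
    return [
        "python3", "-O", "bamsurgeon/bin/addsnv.py",
        "-v", varfile, "-f", in_bam, "-r", ref, "-o", out_bam,
        "--mindepth", "1000", "--maxdepth", "50000",
        "--force", "-p", "8", "--seed", "1234",
    ]


def plan_batches(variant_lines, batch_size=40, first_bam="test.bam"):
    """Return one (varfile, lines, command) tuple per batch, chaining BAMs."""
    plan = []
    in_bam = first_bam
    for n, lines in enumerate(chunk(variant_lines, batch_size), start=1):
        varfile = f"batch_{n}.bed"  # hypothetical file name
        out_bam = f"batch_{n}.bam"  # hypothetical file name
        plan.append((varfile, lines, build_addsnv_cmd(varfile, in_bam, out_bam)))
        in_bam = out_bam  # the next batch spikes into the previous output
    return plan
```

Each tuple gives the lines to write to the batch variant file and the command to run on it; the final batch's output BAM carries all inserted mutations.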

@Lin-Yuying

Hi there!

Is there any other solution to this issue? I recently ran into the same problem: my command ended without any errors, but no BAM file was generated. Do you happen to know what happened?

Thanks in advance!

Best,
Y. Lin

@adamewing
Owner

Hi, sorry this is happening. Is your .bam file also very high coverage over the regions where you are trying to add variants? I suspect it's due to inefficiency around extremely high pileup depth but not sure of a solution at present.

@Lin-Yuying

Lin-Yuying commented Feb 23, 2023

Thanks for your prompt reply. I don't think this is due to high-coverage regions, since I restricted the insertion regions to those with coverage between 0.5x and 2x the individual's average coverage, which is around 12X to 48X.
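
The coverage filter described here can be written out as a small worked example. The mean depth of 24X is an inference from the stated 12X to 48X window, not a number given in the thread, and the function names and per-locus depth dictionary are hypothetical.

```python
def depth_window(mean_depth, low=0.5, high=2.0):
    """Return the (min, max) depth allowed for insertion sites."""
    return mean_depth * low, mean_depth * high


def loci_in_window(depths, mean_depth, low=0.5, high=2.0):
    """Keep loci whose depth lies inside the window (bounds inclusive).

    depths maps a locus label to its observed depth.
    """
    lo, hi = depth_window(mean_depth, low, high)
    return [locus for locus, depth in depths.items() if lo <= depth <= hi]


# With an assumed mean of 24X the window is 12X to 48X, so a locus at
# 47X passes the filter while loci at 10X or 60X are dropped.
window = depth_window(24)
kept = loci_in_window({"a": 10, "b": 12, "c": 47, "d": 60}, 24)
```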

@adamewing
Owner

Ok, probably not related to the same issue then. Are you able to run any of the tests successfully?

@Lin-Yuying

Lin-Yuying commented Feb 23, 2023

I ran the test data, and it yielded the expected BAM file when I specified picard. I'm not sure whether using picard affects the result; I am testing that and will keep you informed when I finish. Thanks!

@Lin-Yuying

Hi again!

I successfully ran the test sample, but when I use the same command line on my data, it stops without any error messages. The last lines in the .log file look like this:

INFO 2023-02-23 17:45:53,185 haplo_LG2_622827_622827 remove original bam: addsnv.tmp/haplo_LG2_622827_622827.tmpbam.cc2173b4-e719-4cb1-b78c-24844e29cc22.bam
INFO 2023-02-23 17:45:53,186 haplo_LG2_622827_622827 rename sorted bam: addsnv.tmp/haplo_LG2_622827_622827.tmpbam.cc2173b4-e719-4cb1-b78c-24844e29cc22.bam.realign.sorted.bam to original name: addsnv.tmp/haplo_LG2_622827_622827.tmpbam.cc2173b4-e719-4cb1-b78c-24844e29cc22.bam
INFO 2023-02-23 17:45:53,186 haplo_LG2_622827_622827 indexing: samtools index addsnv.tmp/haplo_LG2_622827_622827.tmpbam.cc2173b4-e719-4cb1-b78c-24844e29cc22.bam
INFO 2023-02-23 17:45:53,195 haplo_LG2_622827_622827 removing addsnv.tmp/haplo_LG2_622827_622827.tmpbam.cc2173b4-e719-4cb1-b78c-24844e29cc22.fastq
INFO 2023-02-23 17:45:53,198 haplo_LG2_622827_622827 avgincover: 27.500000, avgoutcover: 27.500000

Do you have any thoughts on what has happened?

Thanks!
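
One way to narrow down where a run stops is to list which variants reached the `avgincover` line in the log and compare them with the input sites. This sketch assumes the log format shown in the excerpts in this thread, and that the `haplo_<chrom>_<start>_<end>` label uses the same coordinates as the variant file; both are inferences from those excerpts rather than documented behaviour, and the function names are hypothetical.

```python
import re

# Matches lines such as:
# INFO 2023-02-23 17:45:53,198 haplo_LG2_622827_622827 avgincover: 27.500000, avgoutcover: 27.500000
COVER_RE = re.compile(
    r"^INFO \S+ \S+ (haplo_\S+) avgincover: ([\d.]+), avgoutcover: ([\d.]+)"
)


def finished_haplotypes(log_lines):
    """Map each haplo label that logged coverage to (avgincover, avgoutcover)."""
    done = {}
    for line in log_lines:
        match = COVER_RE.match(line)
        if match:
            done[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return done


def missing_sites(variant_sites, done):
    """Return the (chrom, start, end) sites with no coverage line in the log."""
    return [
        site for site in variant_sites
        if f"haplo_{site[0]}_{site[1]}_{site[2]}" not in done
    ]
```

Sites returned by `missing_sites` never logged coverage, which makes them candidates for the variant that was skipped or that stalled the run.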

@Lin-Yuying

Lin-Yuying commented Feb 24, 2023

New update: I successfully inserted 10 mutations, but with 50 or 1000 mutations the command never finishes for some reason. I also checked the coverage of the .bam file: the highest coverage among the insertion loci is 47X. I am still trying to figure out what is happening with my data.

@Lin-Yuying

Final update (maybe). Okay, I think I've figured out what happened with some of the insertions. I specified the "--requirepaired" parameter when running my data; addsnv.py skipped those mutations and then stopped running as well. It would be great if this case could be handled better in the future.

Thank you so much for your time.
