NGS data analysis scripts for HBV elimination research group

babinyurii/apobec

apobec

apobec contains in-house scripts used for NGS data analysis by the HBV CRISPR/Cas9 research group.

Installation

pip install apobec

Usage

apobec is intended to be used in Jupyter Notebook. Create a folder named input_data and put your FASTA files into it. Navigate to the directory that contains the input_data folder, then import the package:

import apobec

and run:

%run -m apobec.count_snp_duplex
%run -m apobec.create_bars
%run -m apobec.snp_rate
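
Before running the scripts, it can help to confirm that the working directory has the expected layout. The following is a minimal sketch, not part of the package: the helper name `find_fasta_inputs` and the list of accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path


def find_fasta_inputs(workdir="."):
    """Create input_data under workdir if it is missing and list FASTA-like files in it."""
    input_dir = Path(workdir) / "input_data"
    input_dir.mkdir(exist_ok=True)
    # Assumption: these extensions are a guess at common FASTA suffixes,
    # the package itself may accept a different set.
    extensions = {".fasta", ".fa", ".fas"}
    return sorted(p for p in input_dir.iterdir() if p.suffix.lower() in extensions)
```

Calling `find_fasta_inputs()` from the notebook's working directory returns an empty list if no input files have been copied into input_data yet.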

Description

The scripts take a FASTA alignment as input. The input file is the result of mapping deep sequencing reads onto the reference sequence and is exported from the Geneious software.
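
To illustrate the kind of input described above, here is a minimal sketch of reading a FASTA alignment with the standard library only. The function name `read_fasta_alignment` is hypothetical, and how the package itself parses its input is not shown here (Biopython is listed among the requirements, so it may well be used instead).

```python
def read_fasta_alignment(path):
    """Return a list of (name, sequence) tuples from a FASTA file."""
    records = []
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

In an alignment, every sequence has the same length, with gaps written as `-`. Which record holds the reference depends on how the alignment was exported, so treating the first record as the reference would be an assumption to verify against your own files.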

count_snp_duplex.py counts mutations in dinucleotide duplex context.
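
The idea of counting mutations by dinucleotide context can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the function name is hypothetical, and the context is taken here to be the mutated reference base plus its 3' neighbour, which may differ from the definition count_snp_duplex.py uses.

```python
from collections import Counter


def count_snp_in_dinucleotide_context(reference, reads):
    """Count substitutions keyed by (reference dinucleotide, 'ref>read' change)."""
    counts = Counter()
    for read in reads:
        # Stop one position short so every site has a 3' neighbour.
        for i in range(min(len(reference), len(read)) - 1):
            ref_base, read_base = reference[i], read[i]
            context = reference[i:i + 2]
            # Skip gaps and ambiguous bases (assumption: only A, C, G, T are counted).
            if read_base not in "ACGT" or any(b not in "ACGT" for b in context):
                continue
            if read_base != ref_base:
                counts[(context, f"{ref_base}>{read_base}")] += 1
    return counts
```

For example, with reference `TCGA` and reads `TTGA` and `TCAA`, the sketch records one C>T change in the `CG` context and one G>A change in the `GA` context.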

create_bars.py outputs summary bar charts.

count_snp_duplex.py also outputs Excel spreadsheets for further manipulation of the data.

snp_rate.py counts mutations in each read.

It outputs a distribution plot, along with raw counts and summary statistics in Excel spreadsheets.
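
Per-read mutation counting and its summary can be sketched as below. The function names and the choice of statistics are assumptions for illustration; the statistics that snp_rate.py actually reports are not listed here.

```python
import statistics


def mutations_per_read(reference, reads):
    """Return the number of substitutions in each read relative to the reference."""
    counts = []
    for read in reads:
        # Assumption: gaps and ambiguous bases are ignored, only A/C/G/T mismatches count.
        n = sum(
            1
            for ref_base, read_base in zip(reference, read)
            if ref_base in "ACGT" and read_base in "ACGT" and ref_base != read_base
        )
        counts.append(n)
    return counts


def summarize(counts):
    """Return a few summary statistics for a non-empty list of per-read counts."""
    return {
        "reads": len(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }
```

The list returned by `mutations_per_read` corresponds to the raw counts, and its histogram would give a distribution plot of mutations per read.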

Requirements

  • Python 3
  • biopython
  • matplotlib
  • numpy
  • pandas
  • seaborn