bismark_deduplicate¶
This function is used for removing duplicates from bismark output.
Note
This function is calling bismark.
Krueger, Felix, and Simon R. Andrews. “Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications.” bioinformatics 27.11 (2011): 1571-1572.
Parameters¶
bismark_deduplicate(bamInput=None, outputdir=None,
threads=1, paired=True,
other_params={}, stepNum=None,
upstream=None, verbose=True)
bamInput: list, input bam files.
outputdir: str, output result folder, None means the same folder as input files.
threads: int, how many thread to use.
paired: True for paired data, False for single end data.
- other_params: dict, other parameters passing to Bismark.
“-parameter”: True means “-parameter” in command line. “-parameter”: 1 means “-parameter 1” in command line.
stepNum: int or str, step flag for folder name.
upstream: upstream output results, used for pipeline. This parameter can be True, which means a new pipeline start.
verbose: bool, True means print all stdout, but will be slow; False means black stdout verbose, much faster.
Example usage:
# bam file must from bismark output and not be sorted!
bams = ["test1.bam", "test2.bam"]
bismark_deduplicate(bamInput=bams, paired=True)