GCCorrect¶
This function is used for processing GC correction on read count data in wig or csv/txt files.
Parameters¶
GCcorrect(readInput=None, gcwigInput=None,
readtype=None, corrkey=None,
outputdir=None, threads=1,
stepNum=None, readupstream=None,
gcupstream=None, verbose=True,)
readInput: list, paths of input files of read counts.
gcwigInput: list, paths of wig files of gc contents.
readtype: int, file type of readInput, 1 for .wig, 2 for .txt/.csv.; 1 is set by default.
corrkey: char, type of GC correction, “-” for minus, “/” for divide, “0” for process without GC correction; “/” is set by default
outputdir: str, output result folder, None means the same folder as input files.
threads: int, how many thread to use.
stepNum: Step number for folder name.
readupstream: upstream output results, used for pipeline.
gcupstream: upstream output results, used for pipeline.
verbose: bool, True means print all stdout, but will be slow; False means black stdout verbose, much faster.
Warning
We recommend using this function in arm-level CNV detection.
Example usage:
# an example for compute arm-level CNV
from cfDNApipe import *
import glob
pipeConfigure2(
threads=20,
genome="hg19",
refdir=r"reference_genome/hg19",
outdir=r"output/pcs_armCNV",
data="WGS",
type="paired",
JavaMem="8G",
case="cancer",
ctrl="normal",
build=True,
)
verbose = False
case_bam = glob.glob("path_to_data/HCC/*.bam")
ctrl_bam = glob.glob("path_to_data/CTR/*.bam")
# case
switchConfigure("cancer")
case_bamCounter = bamCounter(
bamInput=case_bam, upstream=True, verbose=verbose, stepNum="case01"
)
case_gcCounter = runCounter(
filetype=0, upstream=True, verbose=verbose, stepNum="case02"
)
case_GCCorrect = GCCorrect(
readupstream=case_bamCounter,
gcupstream=case_gcCounter,
verbose=verbose,
stepNum="case03",
)
# ctrl
switchConfigure("normal")
ctrl_bamCounter = bamCounter(
bamInput=ctrl_bam, upstream=True, verbose=verbose, stepNum="ctrl01"
)
ctrl_gcCounter = runCounter(
filetype=0, upstream=True, verbose=verbose, stepNum="ctrl02"
)
ctrl_GCCorrect = GCCorrect(
readupstream=ctrl_bamCounter,
gcupstream=ctrl_gcCounter,
verbose=verbose,
stepNum="ctrl03",
)
switchConfigure("cancer")
res_computeCNV = computeCNV(
caseupstream=case_GCCorrect,
ctrlupstream=ctrl_GCCorrect,
stepNum="ARMCNV",
verbose=verbose,
)