fpCounter

This function is used for counting total reads or short-and-long-reads of the input bedgz files.

Parameters

fpCounter(bedgzInput=None, chromsizeInput=None,
        blacklistInput=None, gapInput=None, domains=None,
        binlen=None, outputdir=None, threads=1,
        processtype=None, stepNum=None,
        upstream=None, verbose=True,)
  • bedgzInput: list, paths of input bedgz files waiting to be processed.

  • chromsizeInput: str, path of chromsize file.

  • blacklistInput: str, used in fragmentation profile, path of blacklist file.

  • gapInput: str, used in fragmentation profile, path of gap file.

  • domains: list, used in fragmentation profile, [minimum_length_of_short_fragments, maximum_length_of_short_fragments,

    minimum_length_of_long_fragments, maximum_length_of_long_fragments]; default is [100, 150, 151, 220].

  • binlen: int, length of each bin; default is 5000000(5Mb) for fragmentation profile, or 100000(100kb) for CNV.

  • outputdir: str, output result folder, None means the same folder as input files.

  • threads: int, how many thread to use.

  • processtype: int, 1 for fragmentation profile, 2 for CNV.

  • stepNum: Step number for folder name.

  • upstream: Not used parameter, do not set this parameter.

  • verbose: bool, True means print all stdout, but will be slow; False means black stdout verbose, much faster.

Warning

We recommend using this function in short long fragmentation analysis.

Example usage:

# an example for short long fragmentation analysis
from cfDNApipe import *
import glob

pipeConfigure2(
    threads=20,
    genome="hg19",
    refdir=r"path_to_genome/hg19",
    outdir=r"path_to_output/pcs_fp",
    data="WGS",
    type="paired",
    JavaMem="8G",
    case="cancer",
    ctrl="normal",
    build=True,
)

verbose = False

# these bed.gz output can be get from bam2bed function in cfDNApipe
case_bedgz = glob.glob("/data/wzhang/pcs_final/HCC/*.bed.gz")
ctrl_bedgz = glob.glob("/data/wzhang/pcs_final/Healthy/*.bed.gz")

# case
switchConfigure("cancer")
case_fragCounter = fpCounter(
    bedgzInput=case_bedgz, upstream=True, verbose=verbose, stepNum="case01", processtype=1
)
case_gcCounter = runCounter(
    filetype=0, binlen=5000000, upstream=True, verbose=verbose, stepNum="case02"
)
case_GCCorrect = GCCorrect(
    readupstream=case_fragCounter,
    gcupstream=case_gcCounter,
    readtype=2,
    corrkey="-",
    verbose=verbose,
    stepNum="case03",
)

# ctrl
switchConfigure("normal")
ctrl_fragCounter = fpCounter(
    bedgzInput=ctrl_bedgz, upstream=True, verbose=verbose, stepNum="ctrl01", processtype=1
)
ctrl_gcCounter = runCounter(
    filetype=0, binlen=5000000, upstream=True, verbose=verbose, stepNum="ctrl02"
)
ctrl_GCCorrect = GCCorrect(
    readupstream=ctrl_fragCounter,
    gcupstream=ctrl_gcCounter,
    readtype=2,
    corrkey="-",
    verbose=verbose,
    stepNum="ctrl03",
)

switchConfigure("cancer")
res_fragprofplot = fragprofplot(
    caseupstream=case_GCCorrect,
    ctrlupstream=ctrl_GCCorrect,
    stepNum="FP",
)