Question: The meaning of results produced by the bamQC function implemented in the ATACseqQC
3 months ago by Gary

Hi,

I used bamQC in ATACseqQC to assess the quality of our ATAC-seq data, and I have some questions about the output, listed below. Could you help me? Many thanks.

Best,

Gary

My questions:

(1) The meaning of $totalQNAMEs

(2) The meaning of $nonRedundantFraction

(3) The meaning of $MAPQ. Is it mapping quality?

(4) How does ATACseqQC define the low mapping quality in $notPassingQualityControlsRate

(5) The difference between $duplicateRate, $PCRbottleneckCoefficient1, and $PCRbottleneckCoefficient2

bamQC results

> bamfile <- "Bulbul.bam"
> bamfile.label <- sub(".bam","",basename(bamfile))
> bamQC(bamfile, doubleCheckDup = TRUE, mitochondria = "chrM", outPath=NULL)
$totalQNAMEs
[1] 21916261

$duplicateRate
[1] 0.3664853

$mitochondriaRate
[1] 0.1325265

$properPairRate
[1] 0.8650588

$unmappedRate
[1] 0

$hasUnmappedMateRate
[1] 0.009750733

$notPassingQualityControlsRate
[1] 0

$nonRedundantFraction
[1] 0.4259382

$PCRbottleneckCoefficient_1
[1] 0.6878547

$PCRbottleneckCoefficient_2
[1] 3.223372

$MAPQ
Var1     Freq
0     0  1084726
1     1  2871341
11   11  3289972
12   12   104817
14   14   816882
16   16    93679
17   17   501750
18   18   393854
19   19    54317
2     2   885517
21   21   276105
22   22  2311572
24   24  1988876
25   25   157419
28   28  1968735
31   31   157705
32   32   120801
33   33   119436
34   34   123993
35   35    44634
36   36  2199923
37   37   158525
38   38   179100
39   39   203899
40   40   359444
41   41  1728179
42   42  1664509
44   44 19352277
9     9   197263

$idxstats
   seqnames seqlength  mapped unmapped
1      chr1 148872119 4128800        0
2      chr2 196923045 5417894        0
3      chr3 131294613 3780349        0
4      chr4  91954985 2632716        0
5      chr5  70255457 2138911        0
6      chr6  47769531 1717115        0
7      chr7  48434409 1320034        0
8      chr8  38169314 1097923        0
9      chr9  32183676  909768        0
10    chr10  26112686  759545        0
11    chr11  29734599  926547        0
12    chr12  26924181  784131        0
13    chr13  20766281  587163        0
14    chr14  23539961  695308        0
15    chr15  18414836  502454        0
16    chr16  86149285 2480089        0
17    chr17  18671425  534943        0
18    chr18  16791751  470999        0
19    chr19  13212593  386971        0
20    chr20  20610846  610399        0
21    chr21  10574128  301684        0
22    chr22   7051356  305518        0
23    chr23   8322055  267565        0
24    chr24   9986730  340468        0
25    chr25   2816729  118446        0
26    chr26   8520532  272092        0
27    chr27   8282467  251498        0
28    chr28   7839907  262123        0
29    chr29   1612305   44851        0
30    chr30  23086460  897442        0
31    chr31   1266911   40603        0
32    chr32  81982468 2672024        0
33     chrM     17011 5752877        0

modified 3 months ago by Julie Zhu • written 3 months ago by Gary

Answer: The meaning of results produced by the bamQC function implemented in the ATACseq
3 months ago by Ou, Jianhong (United States)

Hi Gary,

(1) The meaning of $totalQNAMEs

Total number of reads (single, paired)

(3) The meaning of $MAPQ. Is it mapping quality?

Yes. It is the number of reads at each mapping quality value.

(4) How does ATACseqQC define the low mapping quality in $notPassingQualityControlsRate

ATACseqQC does not define low mapping quality. That is determined by what is recorded in your BAM file.
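As an illustration of reading the $MAPQ table, here is a minimal sketch in R. It assumes $MAPQ is a data frame with a factor column Var1 and a count column Freq, which is what the printed output above suggests but is not confirmed in this thread. The rows are a subset of the values in the question, and the cutoff of 30 (min_mapq) is a hypothetical threshold chosen only for the example, not something ATACseqQC defines.

```r
# Subset of the $MAPQ rows from the question. Var1 is assumed to be a factor,
# which would explain why the printed rows are in text order (0, 1, 11, ..., 2, ..., 9).
mapq <- data.frame(
  Var1 = factor(c("0", "1", "11", "2", "44", "9")),
  Freq = c(1084726, 2871341, 3289972, 885517, 19352277, 197263)
)

# Convert the factor labels to integers, then sort numerically.
mapq$Var1 <- as.integer(as.character(mapq$Var1))
mapq <- mapq[order(mapq$Var1), ]

# Hypothetical cutoff: fraction of (these) reads with MAPQ at or above it.
min_mapq <- 30
frac_high <- sum(mapq$Freq[mapq$Var1 >= min_mapq]) / sum(mapq$Freq)

print(mapq)
print(frac_high)
```

Going through as.character() first matters: calling as.integer() directly on a factor returns the internal level codes rather than the MAPQ values.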

(2) The meaning of $nonRedundantFraction

(5) The difference between $duplicateRate, $PCRbottleneckCoefficient1, and $PCRbottleneckCoefficient2

For both of these, please refer to the ENCODE terms and definitions: https://www.encodeproject.org/data-standards/terms/

Jianhong.

written 3 months ago by Ou, Jianhong

Hi Jianhong,

Thanks a lot. May I ask an additional question? Using ENCODE's terms and definitions for ATAC-seq library complexity, I don't understand why (1) my bottlenecking level is "Severe" based on my PBC1 value (0.6878547 < 0.7), but (2) my bottlenecking level is "None" based on my PBC2 value (3.223372 > 3). Could you help me? Many thanks.

Best,

Gary

written 3 months ago by Gary
Answer: The meaning of results produced by the bamQC function implemented in the ATACseq
3 months ago by Julie Zhu (United States)

Gary,

PBC1 = number of genomic locations with exactly one uniquely mapped read / number of genomic locations with at least one uniquely mapped read.

PBC2 = number of genomic locations with exactly one uniquely mapped read / number of genomic locations with exactly two uniquely mapped reads.

If either of the PCR Bottlenecking Coefficients indicates a problem with the library complexity, there is a problem regardless of the value of the other coefficient. In your situation, the PBC2 value means there is no concern about the number of genomic locations with exactly two uniquely mapped reads. However, the PBC1 value means there are too many genomic locations with more than one uniquely mapped read.
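To make the two ratios concrete, here is a minimal sketch in R on made-up data. The vector reads_per_location and all variable names are hypothetical, and this is not how bamQC computes the values internally. The sketch only shows how a library can fall below the PBC1 cutoff of 0.7 while staying above the PBC2 cutoff of 3, the two cutoffs mentioned in this thread. The nonRedundantFraction line follows the ENCODE definition (distinct uniquely mapped reads / total reads) as I understand it.

```r
# Hypothetical toy data: number of uniquely mapped reads at each distinct
# genomic location (65 locations with 1 read, 20 with 2 reads, 15 with 5 reads).
reads_per_location <- c(rep(1, 65), rep(2, 20), rep(5, 15))

m_distinct <- length(reads_per_location)   # locations with at least one read
m1 <- sum(reads_per_location == 1)         # locations with exactly one read
m2 <- sum(reads_per_location == 2)         # locations with exactly two reads

pbc1 <- m1 / m_distinct                    # 65 / 100
pbc2 <- m1 / m2                            # 65 / 20
nrf  <- m_distinct / sum(reads_per_location)  # distinct positions / total reads

# The 15 locations with 5 reads each lower PBC1 but do not enter PBC2 at all.
cat("PBC1 =", pbc1, " below 0.7:", pbc1 < 0.7, "\n")
cat("PBC2 =", pbc2, " above 3:", pbc2 > 3, "\n")
cat("NRF  =", nrf, "\n")
```

In this toy library the heavily duplicated locations pull PBC1 down to 0.65, while PBC2 stays at 3.25 because it only compares the one-read and two-read locations.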

Hope this answers your question.

Best regards,

Julie

written 3 months ago by Julie Zhu

Dear Julie,

Your explanation is very helpful. Thank you so much.

Best,

Gary

written 3 months ago by Gary

Dear Gary,

You are very welcome! Thanks for letting me know! It is a good question! This thread will very likely help others to evaluate their ATAC-seq data.

Best regards,

Julie

written 3 months ago by Julie Zhu
