Question: The meaning of results produced by the bamQC function implemented in ATACseqQC
Gary wrote (4 weeks ago):

Hi,

I use bamQC in ATACseqQC for quality analysis of our ATAC-seq data, and I have some questions about its output, listed below. Could you help me? Many thanks.

Best,

Gary

My questions:

(1) The meaning of $totalQNAMEs

(2) The meaning of $nonRedundantFraction

(3) The meaning of $MAPQ. Is it mapping quality?

(4) How does ATACseqQC define low mapping quality in $notPassingQualityControlsRate?

(5) The difference between $duplicateRate, $PCRbottleneckCoefficient_1, and $PCRbottleneckCoefficient_2

bamQC results

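> library(ATACseqQC)
> bamfile <- "Bulbul.bam"
> # Escape the dot and anchor the pattern so only a trailing ".bam" is removed
> bamfile.label <- sub("\\.bam$", "", basename(bamfile))
> bamQC(bamfile, doubleCheckDup = TRUE, mitochondria = "chrM", outPath = NULL)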
$totalQNAMEs
[1] 21916261

$duplicateRate
[1] 0.3664853

$mitochondriaRate
[1] 0.1325265

$properPairRate
[1] 0.8650588

$unmappedRate
[1] 0

$hasUnmappedMateRate
[1] 0.009750733

$notPassingQualityControlsRate
[1] 0

$nonRedundantFraction
[1] 0.4259382

$PCRbottleneckCoefficient_1
[1] 0.6878547

$PCRbottleneckCoefficient_2
[1] 3.223372

$MAPQ
   Var1     Freq
0     0  1084726
1     1  2871341
11   11  3289972
12   12   104817
14   14   816882
16   16    93679
17   17   501750
18   18   393854
19   19    54317
2     2   885517
21   21   276105
22   22  2311572
24   24  1988876
25   25   157419
28   28  1968735
31   31   157705
32   32   120801
33   33   119436
34   34   123993
35   35    44634
36   36  2199923
37   37   158525
38   38   179100
39   39   203899
40   40   359444
41   41  1728179
42   42  1664509
44   44 19352277
9     9   197263

$idxstats
   seqnames seqlength  mapped unmapped
1      chr1 148872119 4128800        0
2      chr2 196923045 5417894        0
3      chr3 131294613 3780349        0
4      chr4  91954985 2632716        0
5      chr5  70255457 2138911        0
6      chr6  47769531 1717115        0
7      chr7  48434409 1320034        0
8      chr8  38169314 1097923        0
9      chr9  32183676  909768        0
10    chr10  26112686  759545        0
11    chr11  29734599  926547        0
12    chr12  26924181  784131        0
13    chr13  20766281  587163        0
14    chr14  23539961  695308        0
15    chr15  18414836  502454        0
16    chr16  86149285 2480089        0
17    chr17  18671425  534943        0
18    chr18  16791751  470999        0
19    chr19  13212593  386971        0
20    chr20  20610846  610399        0
21    chr21  10574128  301684        0
22    chr22   7051356  305518        0
23    chr23   8322055  267565        0
24    chr24   9986730  340468        0
25    chr25   2816729  118446        0
26    chr26   8520532  272092        0
27    chr27   8282467  251498        0
28    chr28   7839907  262123        0
29    chr29   1612305   44851        0
30    chr30  23086460  897442        0
31    chr31   1266911   40603        0
32    chr32  81982468 2672024        0
33     chrM     17011 5752877        0
Answer: The meaning of results produced by the bamQC function implemented in the ATACseq
Ou, Jianhong (United States) wrote (4 weeks ago):

Hi Gary,

(1) The meaning of $totalQNAMEs: the total number of reads (single-end or paired-end), counted by read name (QNAME).

(3) The meaning of $MAPQ. Is it mapping quality? Yes. It is the count of alignments at each mapping quality value.

(4) How does ATACseqQC define low mapping quality in $notPassingQualityControlsRate? ATACseqQC does not define low mapping quality. It should already be defined in your BAM file.

(2) The meaning of $nonRedundantFraction, and (5) the difference between $duplicateRate, $PCRbottleneckCoefficient_1, and $PCRbottleneckCoefficient_2:

Please refer to https://www.encodeproject.org/data-standards/terms/

Jianhong.
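
As a rough illustration of reading the $MAPQ element, the sketch below rebuilds the table from the counts posted in the question, sorts it numerically (the printed table is in character order, so 11 comes before 2), and reports the fraction of alignments at or above a cutoff. The cutoff of 30 is an arbitrary assumption for illustration, not a threshold defined by ATACseqQC or ENCODE, and the variable names are hypothetical.

```r
# MAPQ counts copied from the $MAPQ element of the bamQC output in the question.
mapq <- data.frame(
  Var1 = c("0", "1", "11", "12", "14", "16", "17", "18", "19", "2",
           "21", "22", "24", "25", "28", "31", "32", "33", "34", "35",
           "36", "37", "38", "39", "40", "41", "42", "44", "9"),
  Freq = c(1084726, 2871341, 3289972, 104817, 816882, 93679, 501750,
           393854, 54317, 885517, 276105, 2311572, 1988876, 157419,
           1968735, 157705, 120801, 119436, 123993, 44634, 2199923,
           158525, 179100, 203899, 359444, 1728179, 1664509, 19352277,
           197263),
  stringsAsFactors = FALSE
)

# Go through as.character() first so the conversion is also safe
# when Var1 is a factor (as.integer() on a factor returns level codes).
mapq$MAPQ <- as.integer(as.character(mapq$Var1))
mapq <- mapq[order(mapq$MAPQ), c("MAPQ", "Freq")]

# Assumed cutoff, for illustration only.
min_mapq <- 30L
frac_high <- sum(mapq$Freq[mapq$MAPQ >= min_mapq]) / sum(mapq$Freq)

print(mapq, row.names = FALSE)
cat("Fraction of alignments with MAPQ >=", min_mapq, ":", round(frac_high, 4), "\n")
```

With a real bamQC result stored in a variable, the same steps would apply to its MAPQ element instead of the hand-typed data frame.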


Hi Jianhong,

Thanks a lot. May I ask an additional question? Using ENCODE's terms and definitions for ATAC-seq library complexity, I don't understand why (1) my bottlenecking level is "Severe" based on my PBC1 value (0.6878547 < 0.7), but (2) my bottlenecking level is "None" based on my PBC2 value (3.223372 > 3). Could you help me? Many thanks.

Best,

Gary
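
The two ENCODE bottlenecking levels quoted in this comment can be written as a small lookup. This is only a sketch: the function names are hypothetical, and the intermediate cut-offs (0.9 for PBC1 and 1 for PBC2) are assumed from the ENCODE terms page linked in the answer, so they should be checked there. Only the "Severe" limit for PBC1 (< 0.7) and the "None" limit for PBC2 (> 3) are stated in this thread.

```r
# Hypothetical helpers. Cut-offs other than 0.7 (PBC1) and 3 (PBC2)
# are assumptions taken from the ENCODE terms page, not from this thread.
classify_pbc1 <- function(pbc1) {
  if (pbc1 < 0.7) "Severe" else if (pbc1 <= 0.9) "Moderate" else "None"
}

classify_pbc2 <- function(pbc2) {
  if (pbc2 < 1) "Severe" else if (pbc2 <= 3) "Moderate" else "None"
}

# Values reported by bamQC in the question.
pbc1_level <- classify_pbc1(0.6878547)
pbc2_level <- classify_pbc2(3.223372)

cat("PBC1 level:", pbc1_level, "\n")
cat("PBC2 level:", pbc2_level, "\n")
```

The two coefficients are classified independently, which is why one sample can fall into different levels for each.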

(Reply written 4 weeks ago by Gary)
Answer: The meaning of results produced by the bamQC function implemented in the ATACseq
Julie Zhu (United States) wrote (28 days ago):

Gary,

PBC1 = (number of genomic locations with exactly one uniquely mapped read) / (number of genomic locations with at least one uniquely mapped read).

PBC2 = (number of genomic locations with exactly one uniquely mapped read) / (number of genomic locations with exactly two uniquely mapped reads).

If either PCR bottlenecking coefficient indicates a problem with library complexity, there is a problem regardless of the value of the other coefficient. In your case, PBC2 shows that the number of genomic locations with exactly two uniquely mapped reads is not a concern. However, PBC1 shows that there are too many genomic locations with more than one uniquely mapped read.

Hope this answers your question.

Best regards,

Julie
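
To make the two ratios concrete, here is a minimal sketch that computes them from a toy vector of read locations, then applies the same arithmetic to the values posted in the question. The toy data and all variable names are hypothetical, and the code illustrates the definitions only; it is not a description of how bamQC computes them internally. The NRF line assumes the ENCODE definition (distinct uniquely mapped reads divided by total reads) and treats every toy read as uniquely mapped.

```r
# Hypothetical toy data: the genomic location of each uniquely mapped read.
read_locations <- c("chr1:100", "chr1:100", "chr1:250",
                    "chr2:40", "chr2:40", "chr2:40", "chr2:90",
                    "chr3:7", "chr3:7", "chr3:12")

reads_per_location <- table(read_locations)
m_distinct <- length(reads_per_location)   # locations with at least one read
m1 <- sum(reads_per_location == 1)         # locations with exactly one read
m2 <- sum(reads_per_location == 2)         # locations with exactly two reads

pbc1 <- m1 / m_distinct
pbc2 <- m1 / m2
nrf  <- m_distinct / length(read_locations)  # assumed ENCODE-style NRF

cat("PBC1 =", pbc1, " PBC2 =", pbc2, " NRF =", nrf, "\n")

# Same definitions applied to the values reported in the question.
pbc1_obs <- 0.6878547
pbc2_obs <- 3.223372
share_one        <- pbc1_obs                  # M1 / M_DISTINCT
share_two        <- pbc1_obs / pbc2_obs       # M2 / M_DISTINCT
share_three_plus <- 1 - share_one - share_two

cat("Share of locations with 1, 2, 3+ reads:",
    round(c(share_one, share_two, share_three_plus), 3), "\n")
```

Dividing PBC1 by PBC2 recovers the share of locations with exactly two reads, so the remainder after subtracting both shares from 1 is the share with three or more.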


Dear Julie,

Your explanation is very helpful. Thank you so much.

Best,

Gary

(Reply written 27 days ago by Gary)

Dear Gary,

You are very welcome! Thanks for letting me know! It is a good question! This thread will very likely help others to evaluate their ATAC-seq data.

Best regards,

Julie

(Reply written 27 days ago by Julie Zhu)