I am using ATACseqQC to estimate the library complexity of an ATAC-seq dataset. When I don't filter out any reads, the curve reaches a saturation plateau, which suggests that our library complexity (cell number) is insufficient (Fig. 1). However, if I remove chrM reads, duplicates, low-mapping-quality reads, and improperly paired reads, the curve does not saturate, which suggests that the library complexity (cell number) is sufficient (Fig. 2). I don't know which result is correct. Could you help me? Also, how does ATACseqQC estimate the numbers of putative sequenced fragments and distinct fragments beyond the depth we actually sequenced (about 27 million reads)? Many thanks.
Fig. 1. Library complexity estimated from unfiltered reads
Fig. 2. Library complexity estimated from filtered reads