Is it possible to use RNA-seq tools to get P-values for gene duplication detection based on gene coverage? I have a list of genes vs read counts. I thought about:
I don't think differential expression tools would be best for this purpose. Most DE tools assume that the biological variation has a continuous distribution (e.g. normal or gamma), but variation due to CNV would be discrete at integer multiples of the haploid coverage depth. You could try it, but I'm guessing that there are better tools out there that model the CNV using a more appropriate distribution.
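To make the "integer multiples of the haploid coverage depth" point concrete, here is a minimal sketch, not a substitute for a dedicated CNV caller. It assumes you also have gene lengths, that the median gene sits at the baseline ploidy, and that coverage is otherwise uniform; the function name, gene names and counts are made up for illustration.

```python
from statistics import median

def estimate_copy_numbers(counts, lengths, ploidy=2):
    """Return {gene: (relative copy number, nearest integer copy number)}.

    Assumption: the median gene is present at the baseline `ploidy` copies.
    """
    # Reads per base, so long genes are not mistaken for duplicated ones.
    depth = {gene: counts[gene] / lengths[gene] for gene in counts}
    # Depth expected from a single copy, under the assumption above.
    haploid_depth = median(depth.values()) / ploidy
    result = {}
    for gene, d in depth.items():
        raw = d / haploid_depth
        result[gene] = (raw, round(raw))
    return result

# Hypothetical data: five genes of equal length.
counts = {"geneA": 1000, "geneB": 2050, "geneC": 1480, "geneD": 990, "geneE": 510}
lengths = {gene: 1000 for gene in counts}

for gene, (raw, copies) in estimate_copy_numbers(counts, lengths).items():
    print(f"{gene}: relative={raw:.2f} nearest_integer={copies}")
```

In a pure sample the relative values should cluster near integers, which is the discreteness that a continuous DE model does not exploit.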
One possible use would be if you don't have pure populations of mutant genomes, so that the observed copy numbers aren't integer multiples of the haploid depth; in such cases, log-fold changes that vary continuously along the real line would be meaningful. That being said, I think some CNV detection tools account for tumor purity, so as Ryan suggests, you might be better off using those.
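As a rough illustration of why a mixed population gives non-integer values, the expected coverage ratio can be written as a purity-weighted average of the mutant and baseline copy numbers. This is a simplified sketch that assumes only two populations (mutant and baseline); the function name and the `purity` parameter are my own naming, not taken from any particular tool.

```python
import math

def expected_log2_ratio(copy_number, purity, ploidy=2):
    """Expected log2 coverage ratio when a fraction `purity` of genomes
    carries `copy_number` copies and the rest carry `ploidy` copies.

    Assumes the averaged copy number is positive.
    """
    mean_copies = purity * copy_number + (1 - purity) * ploidy
    return math.log2(mean_copies / ploidy)

# A single-copy gain (3 copies in a diploid background) at decreasing purity.
for purity in (1.0, 0.5, 0.2):
    print(purity, expected_log2_ratio(3, purity))
```

The lower the purity, the closer the log2 ratio moves to zero, so the signal no longer sits at the values expected for whole-copy changes.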