I am using edgeR to analyse allele-specific expression in a mouse RNA-Seq dataset. I use featureCounts to count the reads that map to the maternal or paternal allele, as well as those that map equally well to both (i.e. reads that don't overlap a SNP). I then use edgeR to test for differential expression (maternal over paternal allele) using these counts. However, most of these counts are low, since I count only allele-specific reads and discard reads with no SNP information.
To work around this problem, someone suggested that I add a proportion of "background reads" (i.e. reads with no allelic information) to the allele-specific read counts for both alleles. This increased the number of differentially expressed genes detected. In fact, adding 50% background reads also makes the expression status of my "control genes" (mouse imprinted genes) comparable to a previously published dataset from the same cell line (which was sequenced at twice our depth).
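To make the adjustment concrete, here is a minimal sketch of what I mean, written in Python purely for the arithmetic (the actual analysis is in R). The function name `add_background` and the counts are hypothetical, and it assumes that "50% background" means half of a gene's non-allelic read count is added to each allele:

```python
def add_background(maternal, paternal, background, fraction=0.5):
    """Add the same share of non-allelic reads to both allele counts.

    Assumption: `fraction` of the gene's background count is added
    to each allele, so both sides receive an identical increment.
    """
    extra = round(fraction * background)
    return maternal + extra, paternal + extra


# Hypothetical gene: 4 maternal, 2 paternal, 20 reads without a SNP
adjusted = add_background(4, 2, 20)
print(adjusted)
```

Because both alleles receive the same increment, the absolute difference between them is unchanged while the total count grows.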
However, I am unsure whether this strategy is correct. How is the testing in edgeR affected if you are comparing, for example, 14 vs 12 reads instead of 4 vs 2 reads? What is the best strategy for computing differential expression in this situation?
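To illustrate the concern with the 4 vs 2 and 14 vs 12 example, here is a small sketch in Python (again only for the arithmetic). It compares the log2 ratio before and after adding the same constant to both alleles, and runs a simple exact binomial test against a 50:50 split. The binomial test is only a stand-in chosen for illustration: edgeR fits negative binomial models, so this is not the test edgeR performs, and the helper names are hypothetical.

```python
import math


def log2_ratio(maternal, paternal):
    """log2 of the maternal over paternal count ratio."""
    return math.log2(maternal / paternal)


def binom_two_sided_p(k, n):
    """Two-sided exact binomial p-value for k successes in n trials, p = 0.5.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one.
    """
    probs = [math.comb(n, i) * 0.5 ** n for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so that ties with the observed outcome are included
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


for maternal, paternal in [(4, 2), (14, 12)]:
    lfc = log2_ratio(maternal, paternal)
    p = binom_two_sided_p(maternal, maternal + paternal)
    print(f"{maternal} vs {paternal}: log2 ratio = {lfc:.3f}, p = {p:.3f}")
```

In this toy case the log2 ratio drops from 1 to about 0.22 once the constant is added, which is part of why I am unsure how to interpret the increase in detected genes.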