DESeq2 With Ratios from Nascent RNA-seq (Conversion Rate from slamdunk)
1
0
Entering edit mode
vanbelj ▴ 30
@vanbelj-21216
Last seen 10 weeks ago
United States

This is a follow-up to my post about DESeq2 with 2-factor design here.

Nascent RNA-seq experiments analyzed by slamdunk report a Conversion Rate for each gene, which is the ratio of Nascent Reads to Total Reads for a given gene (simplified description). DESeq2 requires unnormalized counts (integers), so a direct comparison of Conversion Rate is not possible. Instead, a 2-factor design should be used. However, 2-factor design comes with the caveat of needing to define a threshold value for LFC. I don't have a way to determine the appropriate threshold biologically, and different values yield quite different results (see code below). Further, the appropriate threshold may change between experiments.

For this reason, I'm wondering if there is a way to compare Conversion Rates directly? For instance, if I excluded low-count genes, and then multiplied the ratios by 1E6 and then rounded to the nearest integer? I'm sure a statistician is screaming somewhere, heh. Just looking for alternatives that could potentially be used.

The following shows genes from a 2-factor design with padj <0.05 when log2FC threshold is set to 0.2, 0.3, 0.585, 0.848 (1.15x, 1.2x, 1.5x, and 1.8x respectively)

results2 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.2, altHypothesis="lessAbs")
length(results2@rownames[results2$padj <= 0.05])
[1] 0

results3 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.3, altHypothesis="lessAbs")
length(results3@rownames[results3$padj <= 0.05])
[1] 2046

results4 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.585, altHypothesis="lessAbs")
length(results4@rownames[results4$padj <= 0.05])
[1] 2690

results5 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.848, altHypothesis="lessAbs")
length(results5@rownames[results5$padj <= 0.05])
[1] 3735
DESeq2 • 619 views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 2 hours ago
United States

I don't think there's any way around defining what you mean by "no change" in order to specify the genes you want to find. From the earlier thread, you said that you didn't want genes where there were changes between treatment groups: "this data set still appears to contain genes where there are large changes in TotalReads between Treatment groups". You just have to say, large changes (LFC > ~1) and then the previous plan will work.

And you should not provide DESeq2 with modified counts or it will give nonsense results.

ADD COMMENT

Login before adding your answer.

Traffic: 694 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6