Hi,
I'm trying to measure RNA stability changes after knocking out a repressor RNA binding protein. I'm using 4sU to label nascent RNA and also collecting total RNA as per this paper https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114636/. I can then calculate RNA stability by the ratio of nascent RNA/total RNA. I'm using the interaction in limma to look at changes in RNA stability between WT and KO cells. The issue that I have run into is that it looks like some genes are stabilized and some genes are destabilized. However, I think this is just due to the relative nature of RNA-seq where once some genes are stabilized, other genes now take up less of the sequencing pool so look destabilized. I have validated with an alternative low throughput method based on qPCR that only the stabilized gene changes are real. I was wondering if people had suggestions for alternative ways to normalize the data so that the sequencing would only show the stabilized genes? Unfortunately external ERCC spikes have been noisy and not worked well for normalization.
Thanks
d <- DGEList(count_mat_filtered, genes = rownames(count_mat_filtered)) d <- calcNormFactors(d) v <- voom(d, design, plot = T) # Differential expression fit <- lmFit(v, design) cont.matrix <- makeContrasts( mRNA = KO.total - wt.total, su = KO.su - wt.su, interaction = (KO.total - KO.su) - (wt.total - wt.su), stability_wt = wt.total - wt.su, stability_KO = KO.total - KO.su, levels = design) fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) all_fc <- topTable(fit2, number = nrow(count_mat_filtered)) mRNA_results <- topTable(fit2, coef = 1, number = nrow(count_mat_filtered), confint = T) su_results <- topTable(fit2, coef = 2, number = nrow(count_mat_filtered), confint = T) interaction_results <- topTable(fit2, coef = 3, number = nrow(count_mat_filtered), confint = T) stability_wt <- topTable(fit2, coef = 4, number = nrow(count_mat_filtered), confint = T) stability_KO <- topTable(fit2, coef = 5, number = nrow(count_mat_filtered), confint = T)
I would say the first thing to do is to make an MA plot of your data. If it shows a bimodal distribution along the vertical (logFC) axis, then you might have some issues with the normalization, especially if neither mode is centered on zero.