I performed polysome profiling to measure translation rate in wildtype and knockout cells. For the polysome samples I collected a monosome fraction, a low polysome fraction, and a high polysome fraction. I also collected mRNA for both wildtype and knockout. I'm interested in comparing translation changes between knockout and wildtype (and most interested in the change in high polysome relative to changes in mRNA).
I'm using limma to find differences in the ratio of (ko high / ko mRNA) / (wt high / wt mRNA). I initially input all of my samples (mRNA, monosome, low, high) into limma and only pulled out the ratio that I was interested in. However, I noticed that the samples I put in changes the number of significant genes a lot even when I only pull out the ko vs wt high vs mRNA ratio:
|input samples:||# significant changes (ko high / ko mRNA) / (wt high / wt mRNA):|
I know that voom takes the variation across samples into account is that the reason that I'm losing significance the more sample types that I put in? Looking at the samples I don't think there are any outliers that are increasing overall variance.
This is the code that I'm using:
layout <- data.frame(row.names = colnames(count_mat), type = colnames(count_mat), condition = colnames(count_mat), stringsAsFactors = F) layout[grep('high', colnames(count_mat)), ]$type <- 'high' layout[grep('low', colnames(count_mat)), ]$type <- 'low' layout[grep('mono', colnames(count_mat)), ]$type <- 'mono' layout[grep('mRNA', colnames(count_mat)), ]$type <- 'mRNA' layout[grep('ko', colnames(count_mat)), ]$condition <- 'ko' layout[grep('wt', colnames(count_mat)), ]$condition <- 'wt' limmaModel <- paste(layout$condition, layout$type, sep = '.') limmaModel <- factor(limmaModel) print(layout) design <- model.matrix(~0 + limmaModel) colnames(design) <- levels(limmaModel) rownames(design) <- rownames(layout) # Filter low read counts count_mat_filtered <- count_mat[rowSums(cpm(count_mat) > 5) >= 3, ] # Normalize to log2(CPM + 0.5) with voom d <- DGEList(count_mat_filtered, genes = rownames(count_mat_filtered)) d <- calcNormFactors(d) v <- voom(d, design, plot = T) # Differential expression fit <- lmFit(v, design) cont.matrix <- makeContrasts( interaction_high_mrna = (ko.high - ko.mRNA) - (wt.high - wt.mRNA), levels = design) fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) interaction_high_mrna <- topTable(fit2, coef = 'interaction_high_mrna', number = nrow(count_mat_filtered)) sum(interaction_high_mrna$adj.P.Val < 0.05)
R version 3.4 edgeR_3.18.1 limma_3.32.9