So I was doing a differential expression analysis with RNA-Seq data using limma-voom pipeline. However, the fit for mean-variance is not like a flat line at 1 but a curve (top: plot after voom; bottom: plot after fitting to linear model). Is it normal? Can I still use the results?
Here are my codes:
library(edgeR) library(limma) #Import data txi <- tximport(files = read.hd5s, type = "kallisto", tx2gene = tx2gene, countsFromAbundance = "lengthScaledTPM", ignoreTxVersion = TRUE) # create DGEList object raw_data = DGEList(txi$counts) #Filtering low count reads (TPM < 2) library(genefilter) abundance = txi$abundance abs_crit = pOverA(0.2, 2) abs_filter <- filterfun(abs_crit) keep <- genefilter(abundance, abs_filter) raw_data <- raw_data[keep,,keep.lib.sizes=FALSE] nrow(raw_data) # calculate normalization factors to scale the raw library sizes raw_data = calcNormFactors(raw_data) # experiment design (cluster from previous clustering analysis) design <- model.matrix(~condition) # Batch effect correction dat0 = as.matrix(raw_data$counts) mod1 = design mod0 = cbind(mod1[,1]) set.seed(12345) svseq = svaseq(dat0,mod1,mod0) # using limma to find differentially expressed genes #model fit with batch new_design <- model.matrix(~0+condition) new_design = cbind(new_design,svseq$sv) #voom transformation voom_diff = voom(raw_data, new_design, normalize="none", plot = TRUE) #contrast matrix contr.matrix <- makeContrasts(conditionvscontrol = conditionTRUE - conditionFALSE, levels = colnames(new_design)) #model fitting vfit <- lmFit(voom_diff_ph, new_design) vfit <- contrasts.fit(vfit, contrasts=contr.matrix) efit <- eBayes(vfit,trend=TRUE) plotSA(efit, main="Final model: Mean-variance trend")
First two columns of design matrix are like this:
> head(new_design[,1:2]) conditionFALSE conditionTRUE 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 1 0
There are some 1 in TRUE and some 0 in FALSE but it's too long so I didn't include them.