Hi there edgeR users,
I am new to edgeR so I need to get some help from the experts.
We have these paired-end RNA-seq datasets between mock (control) and fungus-inoculated wheat samples collected at 0, 6, 12 and 24 hai (hours after infection), each with three reps. That is, 2 treatments (control and treatment) x 4 time-points x 3 reps.
Unfortunately, samples at 0 hai which is supposed to be pre-inoculated samples were collected after inoculation so there is already infection activities at this baseline time-point. Doing pairwise differential expression to each time-point gives a thousands of differentially expressed genes at 0 hai which is unrealistic.
What we did, using edgeR, is to subtract out the expression difference of the baseline time-point from the latter time-point expression differences. That is, example (not complete contrast matrix):
my.contrasts <- makeContrasts (a = (infected.6hai – control.6hai) – (infected.0hai–control.0hai))
Is this a right approach to correct an "erroneous" baseline/reference time-point?
Additionally, to find genes that have responded differently due to treatment (control vs infected) between two time-points, additional comparisons were appended to the contrast matrix. Example, to find genes differentially expressed between 12 and 6 hai time-points:
my.contrasts <- makeContrasts (b = (infected.12hai – infected.6hai) – (control.12hai – control.6hai))
Again, is this right?
The complete code would be:
library("edgeR") #loading a data frame and combining experimental factors targets<-read.delim("targets.txt") Group <- factor(paste(targets$Treat,targets$Time,sep=".")) cbind(targets,Group=Group) #Load the data matrix x <- read.csv("sample.csv",row.names="Name",header=T) y<-DGEList(counts=x,group=Group) #filtering out lowly expressed isoforms and normalization keep <- rowSums(cpm(y)>1) >= 3 y<-y[keep,,keep.lib.sizes=FALSE] y<-calcNormFactors(y) #design matrix design<-model.matrix(~0+Group) #estimating dispersion y<-estimateDisp(y, design, robust=TRUE) colnames(design)<-levels(Group) fit<-glmQLFit(y,design, robust=TRUE) #incomplete or sample contrast matrix my.contrasts <- makeContrasts(a=(i.6-m.6)-(i.0-m.0),b=(i.12-i.6)-(m.12-m.6), ,levels=design) qlf <- glmQLFTest(fit, contrast=my.contrasts[,"a"]) table = topTags(qlf, n=Inf)