Hello to everyone,
I am trying to find gene fold change for transcription data in R via limma package.As I am bit new to this, just want to make sure that what I have been doing is correct or no. I am bit more concern on design matrix that i have created correct or not and also with the contrast matrix. My target is to get relative gene fold change from control vs treatment at 3 time points and the transcription data (microarray data) are like this; I have two control replicates at each time point of 2hr, 8hr and 24hr. The treatment groups includes 3 doses lower, middle and higher with each two replicates at time point of 2hr, 8hr and 24hr. So total I have 24 samples including both control, treatment. I want to compare the control vs treatment at each time point like
Here below I have attached the code, please check it.
samples <- paste(targets$Target, targets$time ,sep="/") sample<- factor(samples) design <- model.matrix(~0+sample) colnames(design) <- levels(sample) Control2/2 hr Control24/24 hr Control8/8 hr High2/2 hr High24/24 hr High8/8 hr 1 1 0 0 0 0 0 2 1 0 0 0 0 0 3 0 0 0 0 0 0 4 0 0 0 0 0 0 5 0 0 0 0 0 0 6 0 0 0 0 0 0 7 0 0 0 1 0 0 8 0 0 0 1 0 0 9 0 0 1 0 0 0 10 0 0 1 0 0 0 11 0 0 0 0 0 0 12 0 0 0 0 0 0 13 0 0 0 0 0 0 14 0 0 0 0 0 0 15 0 0 0 0 0 1 16 0 0 0 0 0 1 17 0 1 0 0 0 0 18 0 1 0 0 0 0 19 0 0 0 0 0 0 20 0 0 0 0 0 0 21 0 0 0 0 0 0 22 0 0 0 0 0 0 23 0 0 0 0 1 0 24 0 0 0 0 1 0 Low2/2 hr Low24/24 hr Low8/8 hr Middl8/8 hr Middle2/2 hr Middle24/24 hr 1 0 0 0 0 0 0 2 0 0 0 0 0 0 3 1 0 0 0 0 0 4 1 0 0 0 0 0 5 0 0 0 0 1 0 6 0 0 0 0 1 0 7 0 0 0 0 0 0 8 0 0 0 0 0 0 9 0 0 0 0 0 0 10 0 0 0 0 0 0 11 0 0 1 0 0 0 12 0 0 1 0 0 0 13 0 0 0 1 0 0 14 0 0 0 1 0 0 15 0 0 0 0 0 0 16 0 0 0 0 0 0 17 0 0 0 0 0 0 18 0 0 0 0 0 0 19 0 1 0 0 0 0 20 0 1 0 0 0 0 21 0 0 0 0 0 1 22 0 0 0 0 0 1 23 0 0 0 0 0 0 24 0 0 0 0 0 0 attr(,"assign") [1] 1 1 1 1 1 1 1 1 1 1 1 1 attr(,"contrasts") attr(,"contrasts")$f [1] "contr.treatment" contrast.matrix <- makeContrasts(control_lowdose.2 = Low2-Control2, control_lowdose.8 = Low8 - Control8, control_lowdose.24 = Low24-Control24,control_Mediumdose.2 = Middle2 -Control2, control_Mediumdose.8 = Middl8 -Control8, control_Mediumdose.24 = Middle24-Control24 , control_Highdose.2 = High2 -Control2,control_Highdose.8 = High8 -Control8, control_Highdose.24 = High24-Control24,levels=design)
Thanks Aaron for your reply.
Yes that was mistake in my post. I have analysed the data, I am bit confused as I am getting different log fold 2 change with different tools such as FARMS and GCRMA. Now I am bit worried which method should I used. Along with that I am getting multiple identical gene symbol having different log2 fold change.
Here I have posted the code that i am using for gene annotation.
fit <- lmFit(exprs(e1FARMSfiltered), design)
contrast.matrix <- makeContrasts(control_lowdose.2 = Low2-Control2, control_lowdose.8 = Low8 -Control8,
control_lowdose.24 = Low24-Control24,control_Mediumdose.2 = Middle2 -Control2,
control_Mediumdose.8 = Middl8 -Control8, control_Mediumdose.24 = Middle24-Control24 ,
control_Highdose.2 = High2 -Control2,control_Highdose.8 = High8 -Control8,
control_Highdose.24 = High24-Control24,levels=design)
dose_fits <- contrasts.fit(fit, contrast.matrix)
dose_ebFit <- eBayes(dose_fits)
#repeat from coef 1 to 9 as we have 9 coefficient
#topTable(dose_ebFit, number=10, coef=1, sort.by="M")
probeset.list1 <- topTable(dose_ebFit, coef= 1, sort.by="M",number=54675)
#Annotating the results with associated gene symbols
affyIDs1 <- rownames(probeset.list1)
#need to change number for all problist
gns1 <- select(hgu133plus2.db, keys=rownames(probeset.list1),
columns=c("SYMBOL","GENENAME"), keytype="PROBEID") #warning would come but it does not affect results
gns1 <- gns1[!duplicated(gns1[,1]),]
results1 <- cbind(probeset.list1, gns1)
write.table(results1,file="control vs low.2.csv",sep=",",row.names=F)
Thanks,
Raju
I can't speak for what the other packages are doing; you'll have to ask their authors.