Hi everybody.
I'm trying to study differential gene expression in an affymetrix array. My data is composed by 1 array with 4 cells, each one with different conditions. I'm interested on compare all cells in pairs, i.e if I have cels a b c d i want to compare ab, ac, ad, bc, bd and cd.
looking at bioconductor documentation i thought that the best approach was to do the following script for each pair (LV1LV2 is a pair in this case).
library(affy) library(affycoretools) library(limma) LV1LV2 <- c("20050622_Fibro_LV1_U133A_ov.CEL","20050622_Fibro_LV2_U133A_ov.CEL") SLV1LV2=c("LV1","LV2") DataLV1LV2 <-ReadAffy(filenames=LV1LV2,sampleNames = SLV1LV2) pData(DataLV1LV2)<-read.table("pLV1LV2.txt", header=T,row.names=1, sep="\t") hist(DataLV1LV2) legend("topright", gsub("\\.[Cc][Ee][Ll]", "",sampleNames(DataLV1LV2)) , col = 1:ncol(DataLV1LV2), lty= 1:ncol(DataLV1LV2)) savePlot(filename="histoLV1LV2.png",type="png") rLV1LV2=rma(DataLV1LV2) #Diferencial expression for LV1LV2 GroupLV1LV2 <- factor(pData(DataLV1LV2)[,1 ] , levels = levels(pData(DataLV1LV2)[,1])) designLV1LV2<- model.matrix(~GroupLV1LV2) fitLV1LV2 <-lmFit(rLV1LV2,designLV1LV2) fLV1LV2 <-eBayes(fitLV1LV2 ) tabLV1LV2 <- topTable(fitLV1LV2, coef = 2, adjust = "fdr", n = 50) write.table(tabLV1LV2 ,file="DEGLV1LV2.xls",row.names=F, sep="\t") write.csv(tabLV1LV2, file = "tabLV1LV250.csv",row.names=TRUE) selectedLV1LV2<- p.adjust(fitLV1LV2$p.value[, 1]) <0.05 esetSelLV1LV2 <- eset [selectedLV1LV2, ] write.csv(esetSelLV1LV2, file = "SIGNIFICANTLV1LV250.csv",row.names=TRUE) heatmap(exprs(esetSelLV1LV2)) savePlot(filename="heatmapDataLV1LV2.png",type="png")
After fLV1LV2 <-eBayes(fitLV1LV2 ),
I get the following error :
Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, : No residual degrees of freedom in linear model fits
which seems to be caused by :
- not passing the data properly
- having only two samples.
when i look at fitLV1LV2 summary i get
summary(fitLV1LV2$coef) (Intercept) GroupLV1LV2LV2 Min. : 3.727 Min. :-2.78770 1st Qu.: 5.161 1st Qu.:-0.20358 Median : 6.379 Median :-0.07446 Mean : 6.730 Mean :-0.03792 3rd Qu.: 7.901 3rd Qu.: 0.07528 Max. :14.714 Max. : 2.10817 > summary(fitLV1LV2$df.residual) Min. 1st Qu. Median Mean 3rd Qu. Max. 0 0 0 0 0 0 > summary(fitLV1LV2$sigma) Mode NA's logical 22283
I think that the approach I'm doing to performe the analysis is wrong. There is some way to, upladin all 4 cells in an object, do the comparations by pairs (pe data<-readAffy() and then do comparations) or a better approach to do it?
thanks
It's too much of a pain to parse your code to decipher this answer for myself, but do you have any replicates per condition? If your experiment has no replicates, I'm pretty sure that this (or a similar) error is what you'll get.