I'm trying to study differential gene expression in an affymetrix array. My data is composed by 1 array with 4 cells, each one with different conditions. I'm interested on compare all cells in pairs, i.e if I have cels a b c d i want to compare ab, ac, ad, bc, bd and cd.
looking at bioconductor documentation i thought that the best approach was to do the following script for each pair (LV1LV2 is a pair in this case).
library(affy) library(affycoretools) library(limma) LV1LV2 <- c("20050622_Fibro_LV1_U133A_ov.CEL","20050622_Fibro_LV2_U133A_ov.CEL") SLV1LV2=c("LV1","LV2") DataLV1LV2 <-ReadAffy(filenames=LV1LV2,sampleNames = SLV1LV2) pData(DataLV1LV2)<-read.table("pLV1LV2.txt", header=T,row.names=1, sep="\t") hist(DataLV1LV2) legend("topright", gsub("\\.[Cc][Ee][Ll]", "",sampleNames(DataLV1LV2)) , col = 1:ncol(DataLV1LV2), lty= 1:ncol(DataLV1LV2)) savePlot(filename="histoLV1LV2.png",type="png") rLV1LV2=rma(DataLV1LV2) #Diferencial expression for LV1LV2 GroupLV1LV2 <- factor(pData(DataLV1LV2)[,1 ] , levels = levels(pData(DataLV1LV2)[,1])) designLV1LV2<- model.matrix(~GroupLV1LV2) fitLV1LV2 <-lmFit(rLV1LV2,designLV1LV2) fLV1LV2 <-eBayes(fitLV1LV2 ) tabLV1LV2 <- topTable(fitLV1LV2, coef = 2, adjust = "fdr", n = 50) write.table(tabLV1LV2 ,file="DEGLV1LV2.xls",row.names=F, sep="\t") write.csv(tabLV1LV2, file = "tabLV1LV250.csv",row.names=TRUE) selectedLV1LV2<- p.adjust(fitLV1LV2$p.value[, 1]) <0.05 esetSelLV1LV2 <- eset [selectedLV1LV2, ] write.csv(esetSelLV1LV2, file = "SIGNIFICANTLV1LV250.csv",row.names=TRUE) heatmap(exprs(esetSelLV1LV2)) savePlot(filename="heatmapDataLV1LV2.png",type="png")
fLV1LV2 <-eBayes(fitLV1LV2 ), I get the following error :
Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, : No residual degrees of freedom in linear model fits
which seems to be caused by :
- not passing the data properly
- having only two samples.
when i look at fitLV1LV2 summary i get
summary(fitLV1LV2$coef) (Intercept) GroupLV1LV2LV2 Min. : 3.727 Min. :-2.78770 1st Qu.: 5.161 1st Qu.:-0.20358 Median : 6.379 Median :-0.07446 Mean : 6.730 Mean :-0.03792 3rd Qu.: 7.901 3rd Qu.: 0.07528 Max. :14.714 Max. : 2.10817 > summary(fitLV1LV2$df.residual) Min. 1st Qu. Median Mean 3rd Qu. Max. 0 0 0 0 0 0 > summary(fitLV1LV2$sigma) Mode NA's logical 22283
I think that the approach I'm doing to performe the analysis is wrong. There is some way to, upladin all 4 cells in an object, do the comparations by pairs (pe data<-readAffy() and then do comparations) or a better approach to do it?