Question

differential gene expression

0

Entering edit mode

Pau Marc Muñoz Torres ▴ 20

@pau-marc-munoz-torres-6731

Last seen 6.8 years ago

Barcelona

Hi everybody.

I'm trying to study differential gene expression in an affymetrix array. My data is composed by 1 array with 4 cells, each one with different conditions. I'm interested on compare all cells in pairs, i.e if I have cels a b c d i want to compare ab, ac, ad, bc, bd and cd.

looking at bioconductor documentation i thought that the best approach was to do the following script for each pair (LV1LV2 is a pair in this case).

library(affy)
library(affycoretools)
library(limma)
LV1LV2 <- c("20050622_Fibro_LV1_U133A_ov.CEL","20050622_Fibro_LV2_U133A_ov.CEL")
SLV1LV2=c("LV1","LV2")
DataLV1LV2 <-ReadAffy(filenames=LV1LV2,sampleNames =  SLV1LV2)
pData(DataLV1LV2)<-read.table("pLV1LV2.txt", header=T,row.names=1, sep="\t")
hist(DataLV1LV2)
legend("topright", gsub("\\.[Cc][Ee][Ll]", "",sampleNames(DataLV1LV2)) , col = 1:ncol(DataLV1LV2), lty= 1:ncol(DataLV1LV2))
savePlot(filename="histoLV1LV2.png",type="png")

rLV1LV2=rma(DataLV1LV2)

#Diferencial expression for LV1LV2

GroupLV1LV2 <- factor(pData(DataLV1LV2)[,1 ] , levels = levels(pData(DataLV1LV2)[,1]))
designLV1LV2<- model.matrix(~GroupLV1LV2)
fitLV1LV2 <-lmFit(rLV1LV2,designLV1LV2)
fLV1LV2 <-eBayes(fitLV1LV2 )
tabLV1LV2 <- topTable(fitLV1LV2, coef = 2, adjust = "fdr", n = 50)
write.table(tabLV1LV2 ,file="DEGLV1LV2.xls",row.names=F, sep="\t")
write.csv(tabLV1LV2, file = "tabLV1LV250.csv",row.names=TRUE)
selectedLV1LV2<- p.adjust(fitLV1LV2$p.value[, 1]) <0.05
esetSelLV1LV2 <- eset [selectedLV1LV2, ]
write.csv(esetSelLV1LV2, file = "SIGNIFICANTLV1LV250.csv",row.names=TRUE)
heatmap(exprs(esetSelLV1LV2))
savePlot(filename="heatmapDataLV1LV2.png",type="png")

After fLV1LV2 <-eBayes(fitLV1LV2 ), I get the following error :

Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, :

No residual degrees of freedom in linear model fits

which seems to be caused by :

not passing the data properly
having only two samples.

when i look at fitLV1LV2 summary i get

summary(fitLV1LV2$coef)

(Intercept) GroupLV1LV2LV2

Min. : 3.727 Min. :-2.78770

1st Qu.: 5.161 1st Qu.:-0.20358

Median : 6.379 Median :-0.07446

Mean : 6.730 Mean :-0.03792

3rd Qu.: 7.901 3rd Qu.: 0.07528

Max. :14.714 Max. : 2.10817

> summary(fitLV1LV2$df.residual)

Min. 1st Qu. Median Mean 3rd Qu. Max.

0 0 0 0 0 0

> summary(fitLV1LV2$sigma)

Mode NA's

logical 22283

I think that the approach I'm doing to performe the analysis is wrong. There is some way to, upladin all 4 cells in an object, do the comparations by pairs (pe data<-readAffy() and then do comparations) or a better approach to do it?

thanks

limma differentialexpression • 1.6k views

ADD COMMENT • link updated 9.6 years ago by Diego Diez ▴ 760 • written 9.6 years ago by Pau Marc Muñoz Torres ▴ 20

1

Entering edit mode

It's too much of a pain to parse your code to decipher this answer for myself, but do you have any replicates per condition? If your experiment has no replicates, I'm pretty sure that this (or a similar) error is what you'll get.

ADD REPLY • link 9.6 years ago Steve Lianoglou ★ 13k

score 5 · Answer 1 · 2014-09-24

5

Entering edit mode

Sean Davis 21k

@sean-davis-490

Last seen 3 months ago

United States

Since you have no replication, you essentially have no way of doing a hypothesis test. In statistical terms, this is referred to as "No residual degrees of freedom". To answer your last question, probably the easiest approach is to rank your probesets by pairwise fold change (no p-values or other statistical testing).

ADD COMMENT • link 9.6 years ago Sean Davis 21k

score 0 · Answer 2 · 2014-09-24

I would add to Sean's answer that it is also important here to consider the expression level (e.g. the average log2 expression, A) in order to avoid selecting too many false positives. Non expressed genes show large stochastic variation and tend to show high fold changes. An MA plot can be very helpful in this situation to know where to put the cutoff.