Entering edit mode
Hello everybody,
I'm have a set of CEL files of six time points and I would like to use
LIMMA
to analyse to differences in time between a treatment and the control
arrays.
I'm reading the data from an excel sheet and create an object for each
time
point and treatment level.
> data_1.0 <- read.delim(file="stress_1.0.txt", row.names=1)
> dim(data_1.0)
22810 51
I than combined the various time points and treatments into one data
frame
with
> xxx <- cbind(log_control_1.0_1h, log_control_1.0_3h,
log_control_1.0_6h,
log_control_1.0_9h,
log_control_1.0_24h, log_control_1.0_48h, log_xxx_1.0_1h,
log_xxx_1.0_3h,
log_xxx_1.0_6h,
log_xxx_1.0_9h, log_xxx_1.0_24h, log_xxx_1.0_48h)
> row.names(xxx) <- row.names(data_1.0)
I have a different number of array for the various time points.
1. Can that cause a problem in the analysis?
I'm having difficulties in creating the correct design and contrast
matrices.
I tried :
>d <-
factor(c("c1","c1","c1","c1","c3","c3","c3","c3","c3","c3","c6","c6","
c6","c6","c6","c6",
"c9","c9","c9","c24","c24","c24","c24","c48","c48","t1","t1","t1","t1"
,
"t3","t3","t3","t3","t3","t3","t3","t6","t6","t6","t6","t6","t9","t9",
"t9","t9",
"t24","t24","t24","t24","t48","t48"),
levels=c("c1","c3","c6","c9","c24","c48","t1","t3","t6","t9","t24","t4
8"))
#Design matrix for stress
> design <- model.matrix(~0+d)
# line in the table with NaN values (NA) were taken out - 4 lines of
house
keeoping genes
> fit_limma<- lmFit(xxx, design)
#which genes responds either at 1h, 3h,6h,9h,24h or 48h to treatment?
> contrastoftreatment <-
makeContrasts("dt1-dc1","dt3-dc3","dt6-dc6","dt9-dc9","dt24-dc24","dt4
8-dc48"
,levels=design)
But after a lot of thoughts I found out that I'm basically only
searching
for here for differentially regulated genes between treated and
control
experiments in a specific time point.
so I created a terget files:
Name fileName Target Time
xxx_0_1h_1 xxx_0_1h_1.CEL wt.1h1 1
xxx_0_1h_2 xxx_0_1h_2.CEL wt.1h2 1
xxx_0_1h_3 xxx_0_1h_3.CEL wt.1h3 1
xxx_0_1h_4 xxx_0_1h_4.CEL wt.1h4 1
xxx_0_24h_1 xxx_0_24h_1.CEL wt.24h1 24
xxx_0_24h_2 xxx_0_24h_2.CEL wt.24h2 24
xxx_0_24h_3 xxx_0_24h_3.CEL wt.24h3 24
xxx_0_24h_4 xxx_0_24h_4.CEL wt.24h4 24
xxx_0_3h_1 xxx_0_3h_1.CEL wt.3h1 3
xxx_0_3h_2 xxx_0_3h_2.CEL wt.3h2 3
xxx_0_3h_3 xxx_0_3h_3.CEL wt.3h3 3
xxx_0_3h_5 xxx_0_3h_5.CEL wt.3h4 3
xxx_0_3h_6 xxx_0_3h_6.CEL wt.3h5 3
xxx_0_3h_7 xxx_0_3h_7.CEL wt.3h6 3
xxx_0_48h_1 xxx_0_48h_1.CEL wt.48h1 48
xxx_0_48h_2 xxx_0_48h_2.CEL wt.48h2 48
xxx_0_6h_1 xxx_0_6h_1.CEL wt.6h1 6
xxx_0_6h_2 xxx_0_6h_2.CEL wt.6h2 6
xxx_0_6h_3 xxx_0_6h_3.CEL wt.6h3 6
xxx_0_6h_4 xxx_0_6h_4.CEL wt.6h4 6
xxx ... (you get the idea)
and used it to analyse the data according to the differences between
the
treated and the control experiments.
> targets <-readTargets("targets.txt",row.names="Name",quote="")
> lev <- c("wt.1h", "wt.24h", "wt.3h", "wt.48h", "wt.6h", "wt.9h",
"mu.1h", "mu.24h", "mu.3h", "mu.48h", "mu.6h", "mu.9h")
> f<-factor(targets$Target, levels=lev)
> design_1 <- model.matrix(~0+f)
> colnames(design_1) <- lev
> fit_1 <- lmFit(xxx, design_1)
# searching for the genes with time effects.
> cont.wt <- makeContrasts("wt.3h-wt.1h","wt.6h-wt.3h","wt.9h-wt.6h",
"wt.24h-wt.9h","wt.48h-wt.24h", "mu.3h-mu.1h", "mu.6h-mu.3h", "mu.9h-
mu.6h",
"mu.24h-mu.9h","mu.48h-mu.24h", levels=design_1)
> fit_2 <-contrasts.fit(fit_1, cont.wt)
> fit_3 <- eBayes(fit_2)
> topTable(fit_3, adjust="BH")
> sel.wt <- p.adjust(fit_3$F.p.value, method="fdr")<0.01
> cont.diff <- makeContrasts(diff3h=(mu.3h-mu.1h) - (wt.3h-wt.1h),
diff6h=(mu.6h-mu.3h) - (wt.6h-wt.3h),
diff9h=(mu.9h-mu.6h) - (wt.9h-wt.6h),
diff24h=(mu.24h-mu.9h) - (wt.24h-wt.9h),
diff48h=(mu.48h-mu.24h) - (wt.48h-wt.24h),
levels=design_1)
> fit_diff <-contrasts.fit(fit_1, cont.diff)
> fit_diff1 <- eBayes(fit_diff)
with this command I'm extracting the genes which are differentially
regulated between the 3h and the 1h time points with thresholds p<0.01
and a
2 fold change.
> topGenes <- topTable(fit_diff1, coef=1, number=3000, adjust="BH",
p.value=0.01, lfc=2, sort.by="logFC")
I'd have a look at the resulting genes, but somehow I didn't get the
feeling
that it now does what I wanted it to do, so my main question is - Is
the
experimental design I made here on the second part correct for finding
significantly differentially regulated genes in a time course analysis
between different time points?
Is there a way of extrapolating this kind of contrast design so that
I'm
able to find only the genes which are differentially regulated in all
time
points with this experimetal design?
Does such an analysis make any sense?
THX for any kind of help.
Assa Yeroslaviz
--
[[alternative HTML version deleted]]