Entering edit mode
Colleen Burge
▴
30
@colleen-burge-5872
Last seen 10.6 years ago
Dear all,
I'm trying to set up my analysis in EdgeR to look at differential
expression at 4 time points (J, N, M, N) in three samples (A,B,C).
Our
hypothesis is that environmental conditions at "N" should have an
affect on
gene expression, and make gene expression significantly different at
this
time point than the pre: J or post M &N time points. When I plot the
MDS,
I can see that my experiment is having a "batch" affect, where samples
are
grouping by genotype. I would like to test for differences,
particularly
between N and the other 3 time points accounting for the affect of the
samples. I believe an additive linear model with genotype as the
blocking
factor is appropriate (but PLEASE correct me if I'm wrong), is the
correct
analysis. I have not been able to get to contrasts, due to error
messages,
and I may also need help with this :)
I have tried the analysis a couple of different ways:
model1: ~genotype +group
or
model ~genotype:genotype + group
*Script for model1*
#Consideration that treatment is the same for all genotypes
setwd ("~/Desktop/CURRENT FILES/R Scripts/Bleaching_data")
library(edgeR)
# making a matrix of factors called "targets",Time=Time , SF =
genotype
targets <-readTargets("TargetsSF.txt")
targets
# importing raw data
rawdata <- read.delim("SF_BLEACH.txt",row.names="BLContig")
head(rawdata)
# setting groups equal to time differences
group <- targets$Time
group <- as.factor(group)
group
# making my DGE list
y <- DGEList(counts=rawdata,group=group)
head(y$counts)
# filtering out lowly expressed genes
keep <- rowSums(cpm(y)>1) >= 4
y <- y[keep,]
table(keep)
dim(y)
# retained 67416 genes
#recompute the library sizes:
y$samples$lib.size <- colSums(y$counts)
#calculating normalization factors
y <- calcNormFactors(y)
y$samples
# looks good
n <- y$samples$lib.size
# examining sample for outliers
plotMDS(y)
## OK adding indiv SF
targets
genotype <- factor(targets$SF)
genotype
design <- model.matrix(~ genotype + group)
design
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
# Disp = 0.19995 , BCV = 0.4472
y <- estimateGLMTrendedDisp(y,design)
# Loading required package: splines
y <- estimateGLMTagwiseDisp(y,design)
plotBCV(y)
fit <- glmFit(y, design)
lrt <- glmLRT(y,fit, coef=5:6)
#error here: Error in beta[k, ] <- betaj[decr, ] : NAs are not
allowed in
subscripted assignments
*Script for model2: consideration that treatment affects are different
between genotypes*
setwd ("~/Desktop/CURRENT FILES/R Scripts/Bleaching_data")
library(edgeR)
# making a matrix of factors called "targets",Time=Time , SF = subject
targets <-readTargets("TargetsSF.txt")
targets
# importing raw data
rawdata <- read.delim("SF_BLEACH.txt",row.names="BLContig")
head(rawdata)
# setting groups equal to time differences
group <- targets$Time
group <- as.factor(group)
group
# making my DGE list
y <- DGEList(counts=rawdata,group=group)
head(y$counts)
# filtering out lowly expressed genes
keep <- rowSums(cpm(y)>1) >= 4
y <- y[keep,]
dim(y)
# retained 67416 genes
#recompute the library sizes:
y$samples$lib.size <- colSums(y$counts)
#calculating normalization factors
y <- calcNormFactors(y)
y$samples
# looks good
n <- y$samples$lib.size
# examining sample for outliers
plotMDS(y)
## OK adding indiv SF
targets
genotype <- factor(targets$SF)
genotype
design <- model.matrix(~ genotype *group)
design
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
#ERROR HERE: Warning message: In estimateGLMCommonDisp.default(y =
y$counts, design = design, :No residual df: setting dispersion to NA
#Can clearly see that the model is parameterizing each sample
separately
#Tried this iteration of the model
design2 <- model.matrix (~ genotype + genotype:group)
design2
y <- estimateGLMCommonDisp(y, design2, verbose=TRUE)
#Same error as above!!
Here is the version of R/edgeR
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/en_US.UTF-8/C/C/C/C
attached base packages:
[1] splines stats graphics grDevices utils datasets
methods
base
other attached packages:
[1] edgeR_2.6.12 limma_3.12.3
loaded via a namespace (and not attached):
[1] tools_2.15.1
Thanks you!!!!
Colleen Burge
Postdoctoral Research Associate
Dept. Ecol & Evol. Biology
Cornell University
[[alternative HTML version deleted]]