Entering edit mode
Aditi Rambani
▴
50
@aditi-rambani-5366
Last seen 9.6 years ago
Thanks for your input. We defined a single experiment with 8 levels
and fitted model as a one way layout using attached script. All our
contrasts gave us output except one : lrt.F1 <- glmLRT(dge, fit,
contrast=c(1,-1,0,0,0,0,0,0))
We get an error message :
Error in beta[k, ] <- betaj[decr, ] :
NAs are not allowed in subscripted assignments
Calls: glmLRT ... glmFit.DGEList -> glmFit -> glmFit.default -> mglmLS
Can you please look into this.
Thanks
THE SCRIPT :
library("edgeR")
counts <- read.table(INFILE, header=T, row.names=1)
groups <- factor(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8))
design <- model.matrix(~groups)
dge <- DGEList(counts=counts, group=groups)
dge <- calcNormFactors(dge)
dge <- estimateGLMCommonDisp(dge, design)
dge <- estimateGLMTrendedDisp(dge, design)
dge <- estimateGLMTagwiseDisp(dge, design)
fit <- glmFit(dge, design)
lrt.acc1 <- glmLRT(dge, fit, contrast=c(0,0,1,1,-1,-1,0,0))
lrt.acc2 <- glmLRT(dge, fit, contrast=c(0,0,1,1,1,1,-2,-2))
lrt.acc3 <- glmLRT(dge, fit, contrast=c(-3,-3,1,1,1,1,1,1))
lrt.F1 <- glmLRT(dge, fit, contrast=c(1,-1,0,0,0,0,0,0))
lrt.Mx <- glmLRT(dge, fit, contrast=c(0,0,1,-1,0,0,0,0))
lrt.Tx <- glmLRT(dge, fit, contrast=c(0,0,0,0,1,-1,0,0))
lrt.Tom <- glmLRT(dge, fit, contrast=c(0,0,0,0,0,0,1,-1))
write.table(topTags(lrt.acc1, n=10000), file="acc1.results", sep="\t",
quote=F)
write.table(topTags(lrt.acc2, n=10000), file="acc2.results", sep="\t",
quote=F)
write.table(topTags(lrt.acc3, n=10000), file="acc3.results", sep="\t",
quote=F)
write.table(topTags(lrt.F1, n=10000), file="F1.results", sep="\t",
quote=F)
write.tabletopTagslrt.Mx, n=10000), file="Mx.results", sep="\t",
quote=F)
write.table(topTags(lrt.Tx, n=10000), file="Tx.results", sep="\t",
quote=F)
write.table(topTags(lrt.Tom, n=10000), file="Tom.results", sep="\t",
quote=F)
________________________________
From: Gordon K Smyth <smyth@wehi.edu.au>
Cc: Bioconductor mailing list <bioconductor@r-project.org>
Sent: Friday, June 29, 2012 6:19 PM
Subject: Re: Contrast Problem
Dear Aditi,
The reason you are getting unexpected results is that your contrasts
are incorrect. The meaning of a contrast is that it makes a
comparison between the coefficients of the linear model that you have
fitted. Here are the column names of your design matrix:
> colnames(design2)
"(Intercept)" "accessions2" "accessions3" "accessions4" "genomes2"
"accessions2:genomes2" "accessions3:genomes2" "accessions4:genomes2"
So when you take the contrast c(1,-1,0,0,0,0,0,0) you are comparing
the first coefficient, which is the intercept term, with the second
coefficient, which is a difference between accession2 and accession1
for the first genome. It is not a comparison of any interest.
The interaction model that you have fitted is inappropriate for the
questions that you want to answer. I suggest that you instead
redefine a single experiment factor with 8 levels, corresponding to
all combinations of accession and genome, so that you can fit your
model as a one way layout. I think that will produce coefficients
that you will find easier to understand and to take contrasts of.
I will not be able to provide more help for some days.
Best wishes
Gordon
On Fri, 29 Jun 2012, Aditi Rambani wrote:
Hello,
Thanks for your reply, sorry for not being clear enough.
> When you say different expression (DE) between genomes for all the
accessions, do you mean DE between genomes for each of the accessions
separately? That is DE genes for A.F1 vs D.F1, for A.Tom vs D.Tom,
for A.Tx vs D.Tx and for A.Mx vs D.Mx?
Yes, that is what i meant. We do a contrast for A.F1 vs D.F1 like this
: [contrast=c(1,-1,0,0,0,0,0,0)].
Our problem is that even when contrast is between A.F1(0 rpkm) vs D.
F1 (0 rpkm) it has significant FDR, I dont understand how it can
contrast two columns with zero values and show a significant
differential expression. Also, it sometimes ignores genes with
significant bias but that number is not very high.
> When you say DE between accessions, is that for each genome
separately? In other words, do you expect the differences between the
accessions to be relatively the same for each genome, or will the
differences between accessions be genome-specific?
We are not absolutely certain about what differences to expect from
genomes between accessions. We do know that some accessions are more
closely related than others and we can assume they will have a similar
differential expression pattern. For accessions we did our contrasts
like this :
acc1= Mx vs Tx [lrt.acc1 <- glmLRT(dge2, fit2,
contrast=c(0,0,1,1,-1,-1,0,0))]
acc2 = Tom vs Mx/Tx [lrt.acc2 <- glmLRT(dge2, fit2,
contrast=c(0,0,1,1,1,1,-2,-2))]
acc3= F1 vs Mx/Tx/Tom [lrt.acc3 <- glmLRT(dge2, fit2,
contrast=c(-3,-3,1,1,1,1,1,1))]
I have no way to test accuracy of these results because i dont know
what is expected. I was hoping if we implemented the package correctly
we could trust the results.
> Your comments about zero expression and not detecting significantly
DE genes don't make sense to me. I won't try to respond to those
comments because I think sorting out the above questions will probably
solve other perceived problems as well.
- I will appreciate your input on this matter and thanks for looking
into this.
Aditi
> Date: Thu, 28 Jun 2012 12:01:50 -0700
> To: "bioconductor@stat.math.ethz.ch"
<bioconductor@stat.math.ethz.ch>
> Subject: [BioC] Contrast Problem
>
> Hello,
>
> I am a graduate student at Brigham Young University working on
polyploid cotton RNA seq data. Our study design has two explanatory
variables, one is 'accession' with four levels (F1,Tom,Tx,Mx) and
other one is 'genome' with two levels (A genome or D genome). We want
to detect differential expression of genes between 'genomes' from all
the accessions and also find genes that are differentially expressed
between accessions. We built a (Accession*Genome) model and did a
contrast for two levels of 'genomes'. In contrast results we see that
many genes with zero expression (0 RPKM) have significant FDRs and
some significantly differentially expressed genes are not detected. We
are not sure why this is happening, any help will be greatly
appreciated.
>
> Thanks
>
> Aditi
>
> We are using following script to do our analysis but
>
>
> library("edgeR")
>
> counts <- read.table(INFILE, header=T, row.names=1)
> groups <- factor(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8))
> accessions <-
factor(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4))
> genomes <-
factor(c(1,1,1,2,2,2,1,1,1,2,2,2,1,1,1,2,2,2,1,1,1,2,2,2))
>
>
> design2 <- model.matrix(~accessions*genomes)
>
> dge <- DGEList(counts=counts, group=groups)
> dge <- calcNormFactors(dge)
> dge2 <- estimateGLMCommonDisp(dge, design2)
> dge2 <- estimateGLMTrendedDisp(dge2, design2)
> dge2 <- estimateGLMTagwiseDisp(dge2, design2)
>
> fit2 <- glmFit(dge2, design2)
>
> lrt.acc1 <- glmLRT(dge2, fit2, contrast=c(0,0,1,1,-1,-1,0,0))
> lrt.acc2 <- glmLRT(dge2, fit2, contrast=c(0,0,1,1,1,1,-2,-2))
> lrt.acc3 <- glmLRT(dge2, fit2, contrast=c(-3,-3,1,1,1,1,1,1))
> lrt.F1 <- glmLRT(dge2, fit2, contrast=c(1,-1,0,0,0,0,0,0))
> lrt.Mx <- glmLRT(dge2, fit2, contrast=c(0,0,1,-1,0,0,0,0))
> lrt.Tx <- glmLRT(dge2, fit2, contrast=c(0,0,0,0,1,-1,0,0))
> lrt.Tom <- glmLRT(dge2, fit2, contrast=c(0,0,0,0,0,0,1,-1))
>
>
> write.table(topTags(lrt.acc1, n=10000), file="acc1.results",
sep="\t", quote=F)
> write.table(topTags(lrt.acc2, n=10000), file="acc2.results",
sep="\t", quote=F)
> write.table(topTags(lrt.acc3, n=10000), file="acc3.results",
sep="\t", quote=F)
> write.table(topTags(lrt.F1, n=10000), file="F1.results", sep="\t",
quote=F)
> write.tabletopTagslrt.Mx, n=10000), file="Mx.results", sep="\t",
quote=F)
> write.table(topTags(lrt.Tx, n=10000), file="Tx.results", sep="\t",
quote=F)
> write.table(topTags(lrt.Tom, n=10000), file="Tom.results", sep="\t",
quote=F)
______________________________________________________________________
The information in this email is confidential and
intend...{{dropped:8}}