Entering edit mode
Lizhe Xu
▴
210
@lizhe-xu-666
Last seen 10.3 years ago
I just read the SNOMAD from the book "The Analysis of Gene Expression
Data". It makes the three steps of normalizations after global scale
much meaningful to me:
local mean normalization and local variance correction across the
intensities and local mean across the spatial surface.
Since now I try to use Bioconductor to analyze my Affy data (rma and
gcrma), I wonder which of the above steps the quantile normalization
in RMA corresponds to .
Is it possible to combine RMA(gcRMA) with SNOMAD? If yes, how?
Thanks.
Lizhe
-----Original Message-----
From: bioconductor-request@stat.math.ethz.ch [mailto:bioconductor-
request@stat.math.ethz.ch]
Sent: Friday, March 19, 2004 5:04 AM
To: bioconductor@stat.math.ethz.ch
Subject: Bioconductor Digest, Vol 13, Issue 45
Send Bioconductor mailing list submissions to
bioconductor@stat.math.ethz.ch
To subscribe or unsubscribe via the World Wide Web, visit
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
or, via email, send a message with subject or body 'help' to
bioconductor-request@stat.math.ethz.ch
You can reach the person managing the list at
bioconductor-owner@stat.math.ethz.ch
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Bioconductor digest..."
Today's Topics:
1. Re: limma: topTable (Julia Engelmann)
2. Subsetting Affybatch Objects by Gene list. (Horswell, Stuart)
----------------------------------------------------------------------
Message: 1
Date: Fri, 19 Mar 2004 08:50:19 +0100
From: Julia Engelmann <julia.engelmann@biozentrum.uni-wuerzburg.de>
Subject: Re: [BioC] limma: topTable
To: bioconductor@stat.math.ethz.ch
Message-ID:
<200403190850.19084.julia.engelmann@biozentrum.uni-
wuerzburg.de>
Content-Type: text/plain; charset="iso-8859-1"
Sorry, here comes my code:
library(affy)
library(limma)
library(vsn)
data <- ReadAffy()
normalize.AffyBatch.methods <- c(normalize.AffyBatch.methods, "vsn")
Set <- expresso(data, normalize.method="vsn",
bgcorrect.method="none",pmcorrect.method="pmonly",
summary.method="medianpolish")
# R and T stand for different treatment, R1 and R2 as well as T1 and
T2 are
biological reps.
design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3,4,4)))
colnames(design) <- c("R1","R2","T1","T2")
fit <- lmFit(Set, design)
contrast.matrix <- makeContrasts(((R1+R2)-(T1
+T2))/2,R1-T1,R2-T2,levels=design);
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
data <- read.table("ATH1-121501_annot.csv", header=TRUE,
sep=",")
res <- topTable(fit2, number=10,coef=1,
adjust="fdr",genelist=data,sort.by="P",resort.by="M")
Thanks a lot again,
Julia
> >Hi Bioconductor folks,
> >
> >I have really been enjoying using the limma package, but I have
just come
> >across a problem.
> >When I use the topTable-command, the slot "result$Probe.Set.ID"
does not
> > seem to match the other entries I am interested in, namely M, P,
t.
> >I am using R 1.8.0, limma 1.5.5 on Affymetrix ATH1-121501 chips.
> >
> >An example would be:
> >Output of the topTable-result:
> >
> > Probe.Set.ID M t
> > P.Value B
> >5598 259302_at 4.593339 38.9793 1.126260e-05 11.88238
> >
> >If I check on gene number 5598 it says
> >geneNames(myExprSet)[5598]
> >[1] "250498_at"
> >
> >Am I misinterpreting 5598 as the index of my ExpressionSet?
>
> I think you've done something non-standard, anyway you can't have
simply
> used lmFit() and eBayes() and topTable() on our exprSet object.
Please show
> us enough of your code so that we can see how topTable is getting
the probe
> set ID's.
>
> Gordon
>
> >Thanks a lot for any suggestions!
> >Julia
------------------------------
Message: 2
Date: Fri, 19 Mar 2004 10:33:31 -0000
From: "Horswell, Stuart" <stuart.horswell@csc.mrc.ac.uk>
Subject: [BioC] Subsetting Affybatch Objects by Gene list.
To: <bioconductor@stat.math.ethz.ch>
Message-ID:
<a09f613372b70c4cbce35a9095f434f009ffee@icex34.cc.ic.ac.uk>
Content-Type: text/plain; charset="iso-8859-1"
Hi again,
First, I'd like to say thanks to everyone who's sent comments
or suggestions on this. I've now managed to sort the problems out but
just wanted to reply to a few of the questions raised.
>From: "Matthew Hannah" <hannah@mpimp-golm.mpg.de>
>I'm abit confused as to why you want to go to so much trouble to
subset your >>data.
>RMA or any of the expresso functions can be called on the entire
affybatch and >then
>written to file.
>
Yes that's true but I want to examine what sort of a difference it
makes if I normalize at the probe level rather than at the
"expression" level (i.e. the value obtained by combining all of the
probe pairs in a probe set). I can normalize the probes, but computing
the expression values, well, I'd rather use the R functions already
written if I can!
>I don't see how this would be any different to the MAS5 analysis
>you presumably want consistency with as MAS5 scales/normalises on a
whole chip >basis anyway.
I've performed all of my previous analyses in Excel starting with only
the MAS5 P/A calls (filtered almost as you suggest below!) and the raw
expression levels (most definitely not already quantile normalized).
>If you >want to
>use BioC/R for more analysis you could save the result to a txt file
and then >just read it back into R.
Again, very true but normalizing in Excel is quite laborious
(particularly over 24x5 individual chips) and I wanted to use R to
automate the process.
>A better thing to look into may be whether you really want to filter
based on >the P/A
>calls as with RMA you might find that including A genes only has a
very small >effect on
>your final list of genes (if based on fold change).
I'm using SAM (amongst other things) since it's less biased towards
selecting low expressors than fold change and I want to compare the
results with previous results based on P/A filtering. Ironically, I
want to use RMA since it's supposed to be less biased towards *high*
expressors than the MAS5 expressions!
> And thats before considering if
>the P/A call is useful due to the 1/3 of MM>PM.
That's something we plan to look at post hoc (and probably post haste
too) once we've got an idea of what results we get from the absolute
filter.
Professor Gentleman;
>Well, first, R and Bioconductor are *open source* and that means you
> really can get at the source code for all functions and methods.
I'm sorry, I didn't mean to imply that the code was hidden, just that
I couldn't figure out how to get at it! (although I still can't
convince getMethods to recognise "normalize" as an argument but that's
a question for a different mailing list!)
>I can only suggest that if you want to do reasonably sophisticated
> things in any language that spending some time learning how to
> program in it will be rewarded.
I hadn't realised that affybatch was an S4 object. I thought, naively,
that it was a structure specific to affybatch (or at least
bioconductor), and I had also hoped that there might be some pre-
programed way of getting expresso to do what I wanted which I had
missed. That's why I queried it here - I'm sorry if I wandered off
topic for this list :)
Once again I'd like to thank everyone for their comments and advice.
If anyone else is interested in comparing the effects of filtering
before normalizing I'd be happy to send you a copy of my (rather ugly
and inefficient) code.
best wishes
Stu
------------------------------
_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
End of Bioconductor Digest, Vol 13, Issue 45