Dear list:
A batch effect was observed in 16 of 105 hgU133_plus2 arrays, based on the NUSE plot. I know that SAM, like limma, can handle this kind of thing by setting a block. But I am not sure whether that is an appropriate approach for this unintentional situation.
I would like to try the lmFit function of limma with batch as a factor to remove the batch effect, and then use the resulting data values in other software applications, for example SAM. Here are my questions:
1. Should I run lmFit on the raw data, data@exprs, instead of eset@exprs?
2. If that is the right way, how can I extract the lmFit-normalized data for the subsequent RMA or GCRMA step?
Please point it out if I am confused about something! I would appreciate your help!
Jianping
##################################
Jianping Jin Ph.D.
Bioinformatics scientist
Center for Bioinformatics
Room 3133 Bioinformatics building
CB# 7104
University of Chapel Hill
Chapel Hill, NC 27599
Phone: (919)843-6105
FAX: (919)843-3103
E-Mail: jjin at email.unc.edu
Dear list,
This is a follow-up to my first email (see below). I tried to use lmFit on my raw data, data@exprs, taken from an object created by read.affybatch. I realized that there was no way for me to retrieve the fitted values of the linear model for further data processing, because the results are stored in a compact form, as stated in the online help document.
It looks like the easier way to handle the chip batch effect in my data is to include "batch" as a factor in the fitted model and let limma do the further processing.
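To make the idea of including batch as a factor concrete, here is a minimal sketch in plain Python (not limma code) of what such a fit does for a single probe set. The eight expression values, the treatment and batch labels, and the helper names `solve` and `fit_ols` are all made up for illustration. In limma the analogous step would be a design matrix that contains both the treatment term and the batch term, passed to lmFit.

```python
# Illustrative only: one probe set measured on 8 arrays (hypothetical values).
# The data are built as 5 + 2*treatment + 3*batch with no noise, so the
# true treatment effect is 2 and the true batch effect is 3.
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
batch = [0, 0, 0, 1, 0, 1, 1, 1]  # batch is unbalanced across treatments
y = [5.0 + 2.0 * t + 3.0 * b for t, b in zip(treatment, batch)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_ols(design, response):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, response))
           for i in range(p)]
    return solve(xtx, xty)


# Naive comparison that ignores batch: difference of group means.
treated = [yi for yi, t in zip(y, treatment) if t == 1]
control = [yi for yi, t in zip(y, treatment) if t == 0]
naive_effect = sum(treated) / len(treated) - sum(control) / len(control)

# Design matrix with columns: intercept, treatment, batch.
design = [[1.0, float(t), float(b)] for t, b in zip(treatment, batch)]
intercept, treatment_effect, batch_effect = fit_ols(design, y)

# Batch-adjusted values: subtract the estimated batch term from each array.
adjusted = [yi - batch_effect * b for yi, b in zip(y, batch)]

print("naive treatment effect:   ", naive_effect)
print("adjusted treatment effect:", treatment_effect)
print("estimated batch effect:   ", batch_effect)
```

In this toy data more treated arrays fall in the second batch, so the naive difference of group means comes out at 3.5, while the fit with a batch column returns the treatment effect of 2 that the data were built with. The `adjusted` list shows the kind of batch-corrected values one might want to export to other software. Recent limma releases document a removeBatchEffect function for producing such values, though whether they are suitable input for SAM is a separate question.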
So the usual process is: read the .CEL files in, preprocess the data with RMA/GCRMA, fit the data with a linear model, and then use eBayes. I have two questions here (please forgive me, as a non-statistician). Would the quantile normalization in RMA do more harm than good to the data when a batch effect exists? And what else does lmFit do if RMA has already removed the batch effect?
Please help me out with this confusion.
Thanks!
Jianping
--On Wednesday, November 15, 2006 11:50 AM -0500 Jianping Jin <jjin at email.unc.edu> wrote:
> [snip]
Hi Jianping,
>So the usual process is: read the .CEL files in, preprocess the data with
>RMA/GCRMA, fit the data with a linear model, and then use eBayes. I have
>two questions here (please forgive me, as a non-statistician). Would the
>quantile normalization in RMA do more harm than good to the data when a
>batch effect exists? And what else does lmFit do if RMA has already
>removed the batch effect?
In my experience, quantile normalization does not always remove a batch effect. You can test this by using some sort of clustering on the arrays, first with the raw data and then with the RMA values. I like the 'overview' function in the made4 package, but I have no idea if it will handle 105 arrays. If you're going to fit a batch in limma, you should be using 'duplicateCorrelation', and you can see if the consensus correlation is positive; if it's near zero or negative, then your batch effect isn't large enough overall to worry about.
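One way to see why quantile normalization need not remove a batch effect is sketched below in plain Python with made-up numbers. This is not the RMA implementation, and the function names and the 6 x 4 data matrix are hypothetical. Quantile normalization forces every array to share the same distribution of values, but it preserves the rank order of probes within each array, so a probe-specific shift in one batch that changes the ranks survives the normalization. The array-to-array correlations at the end are a crude stand-in for the clustering check suggested above.

```python
# Illustrative only: 6 probe sets (rows) on 4 arrays (columns).
# Arrays 0-1 are batch A, arrays 2-3 are batch B. Batch B carries a global
# shift of about +1 plus probe-specific shifts on rows 1 and 4.
raw = [
    [4.0, 4.1, 5.0, 5.1],
    [6.1, 6.0, 10.1, 10.0],   # shifted up in batch B
    [8.0, 8.2, 9.0, 9.2],
    [10.2, 10.0, 11.2, 11.0],
    [12.0, 12.1, 10.0, 10.3],  # shifted down in batch B
    [14.1, 14.0, 15.1, 15.0],
]


def quantile_normalize(matrix):
    """Give every column the same distribution: the mean of the sorted columns.

    Ties are not handled specially; the toy data have none.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(sc[r] for sc in sorted_cols) / n_cols
                  for r in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


normalized = quantile_normalize(raw)
cols = [[row[j] for row in normalized] for j in range(4)]

# Mean correlation of array pairs within a batch versus across batches.
within_pairs = [(0, 1), (2, 3)]
between_pairs = [(0, 2), (0, 3), (1, 2), (1, 3)]
within = sum(pearson(cols[i], cols[j]) for i, j in within_pairs) / 2
between = sum(pearson(cols[i], cols[j]) for i, j in between_pairs) / 4

# Remaining batch difference for the probe set in row 1 after normalization.
row1_shift = (sum(normalized[1][2:]) - sum(normalized[1][:2])) / 2

print("mean within-batch correlation: ", within)
print("mean between-batch correlation:", between)
print("row 1 batch difference after normalization:", row1_shift)
```

After normalization all four arrays contain exactly the same set of values, yet arrays still correlate better within a batch than across batches, and the probe set in row 1 still differs between batches. With real data the same comparison would be made on the full expression matrix, for example by clustering the arrays before and after RMA.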
Cheers,
Jenny
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu