Batch effect
@santana-sarma-3163
Last seen 10.2 years ago
Hi,

How can I judge whether there is a batch effect between two groups of Affymetrix .CEL files? I currently have a single AffyBatch object, created by reading in all the .CEL files.

I am new to Affymetrix analysis, so any advice or elaboration would be very helpful.

Cheers,
Santana
@james-w-macdonald-5106
Last seen 21 minutes ago
United States
Hi Santana,

There are several things you can look at. I find PCA plots very helpful for spotting batch effects. You might also look at density plots (the hist() function in affy) as well as boxplots. But IMO PCA is the most useful.

Best,
Jim

James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
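The checks Jim lists can be sketched roughly as follows. This is only an illustration under stated assumptions: the expression matrix is simulated, the batch labels are invented, and `abatch` in the comments is a placeholder name for your own AffyBatch.

```r
# Simulated log2 expression matrix standing in for real data (hypothetical):
# 1000 probesets x 12 arrays, with batch B shifted upward on 300 probesets.
set.seed(1)
batch <- factor(rep(c("A", "B"), each = 6))
expr <- matrix(rnorm(1000 * 12, mean = 8), nrow = 1000,
               dimnames = list(NULL, paste0("array", 1:12)))
expr[1:300, batch == "B"] <- expr[1:300, batch == "B"] + 2

# With a real AffyBatch (here called abatch, an assumed name) you might use:
#   library(affy)
#   hist(abatch)                 # per-array density plots of raw intensities
#   boxplot(abatch)              # per-array boxplots of raw intensities
#   expr <- exprs(rma(abatch))   # normalised log2 expression matrix

# PCA on the arrays: prcomp() expects samples in rows, hence t().
pca <- prcomp(t(expr))
pc1 <- pca$x[, 1]

# Colour by batch; a clean split along PC1 or PC2 suggests a batch effect.
plot(pca$x[, 1:2], col = as.integer(batch), pch = 19,
     main = "PCA of arrays, coloured by batch")
boxplot(expr, col = as.integer(batch) + 1, las = 2,
        main = "Per-array intensity distributions")
```

In real data the split is rarely this clean, so it helps to colour the same plot by the biological group as well and compare the two pictures.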
Dear Santana,

You could try the arrayQualityMetrics function in the eponymous package, which produces PCA plots and other diagnostics and is helpful for detecting batch effects.

The function runs on either the AffyBatch object or the normalised ExpressionSet. The former is more useful for understanding how well the experiment worked; the latter, for how well subsequent analyses might work.

Best wishes,
Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
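A possible way to run the report on both kinds of object, as Wolfgang suggests, is sketched below. The wrapper `run_batch_qc` is a hypothetical helper, not part of the package, and the `"batch"` column in the sample annotation is an assumption about your data.

```r
# Hypothetical wrapper (not part of arrayQualityMetrics): builds one QC report
# for a raw AffyBatch or a normalised ExpressionSet.
run_batch_qc <- function(x, outdir, raw = FALSE, intgroup = "batch") {
  if (!requireNamespace("arrayQualityMetrics", quietly = TRUE)) {
    stop("Install arrayQualityMetrics from Bioconductor first.")
  }
  arrayQualityMetrics::arrayQualityMetrics(
    expressionset   = x,
    outdir          = outdir,
    force           = TRUE,     # overwrite an existing report directory
    do.logtransform = raw,      # raw intensities are not yet on the log scale
    intgroup        = intgroup  # pData column used to colour the plots
  )
}

# Assumed usage, where abatch is your AffyBatch and pData(abatch) has a
# "batch" column:
#   run_batch_qc(abatch, "qc_raw", raw = TRUE)   # how well the experiment worked
#   run_batch_qc(affy::rma(abatch), "qc_rma")    # what downstream analyses will see
```

Each call writes an HTML report into the given directory; the PCA and heatmap sections, coloured by batch, are the ones to look at first.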
Hi Santana,

You might also try the sva function in the sva package. This function is specifically designed to identify batch effects and other sources of variation. PCA typically confounds any signal of interest with potential batch effects, so it may be somewhat deceiving, particularly if the batches are not balanced across the groups of interest.

Best,
Jeff
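Jeff's two points, checking whether batches are balanced across groups and then estimating surrogate variables, might look something like this. The data, batch labels and group labels are simulated for illustration, and the sva step only runs if the package is installed.

```r
# Simulated data (hypothetical): 12 arrays, two batches, two groups.
set.seed(1)
batch <- factor(rep(c("A", "B"), each = 6))
group <- factor(rep(c("ctl", "trt"), times = 6))
expr <- matrix(rnorm(1000 * 12, mean = 8), nrow = 1000)
expr[1:300, batch == "B"] <- expr[1:300, batch == "B"] + 2           # batch shift
expr[301:400, group == "trt"] <- expr[301:400, group == "trt"] + 1   # biology

# First check the design: are batches balanced across the groups of interest?
design_table <- table(batch, group)
print(design_table)

# Then estimate surrogate variables, if sva is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  mod  <- model.matrix(~ group)                          # full model
  mod0 <- model.matrix(~ 1, data = data.frame(group))    # intercept-only null
  svobj <- sva::sva(expr, mod, mod0)
  # Compare the first surrogate variable with the known batch labels.
  if (svobj$n.sv > 0) {
    boxplot(as.matrix(svobj$sv)[, 1] ~ batch, ylab = "Surrogate variable 1")
  }
}
```

If the table shows a batch that contains only one group, batch and biology cannot be separated by any of these methods, which is the situation in which a PCA plot is most likely to mislead.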
Reema Singh ▴ 570
@reema-singh-4373
Last seen 10.2 years ago
Batch effects generally come into consideration when you have data (.CEL files) from two different laboratories and you want to merge them.

The limma package has a function for batch effect correction; you can look into that. Here is a link that may help you:
https://stat.ethz.ch/pipermail/bioconductor/2007-August/018690.html

Regards,
Reema Singh
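Reema does not name the limma function; presumably it is removeBatchEffect, which the sketch below assumes. The data are simulated, the labels are invented, and `batch_gap` is a hypothetical helper used only to show the effect of the correction.

```r
# Simulated data (hypothetical): 12 arrays, two batches, two groups.
set.seed(1)
batch <- factor(rep(c("A", "B"), each = 6))
group <- factor(rep(c("ctl", "trt"), times = 6))
expr <- matrix(rnorm(1000 * 12, mean = 8), nrow = 1000)
expr[1:300, batch == "B"] <- expr[1:300, batch == "B"] + 2

# Hypothetical helper: per-probeset difference in mean between the batches.
batch_gap <- function(m) {
  rowMeans(m[, batch == "B"]) - rowMeans(m[, batch == "A"])
}
gap_before <- mean(batch_gap(expr)[1:300])

if (requireNamespace("limma", quietly = TRUE)) {
  # Passing the design protects the group effect while the batch term is removed.
  design <- model.matrix(~ group)
  corrected <- limma::removeBatchEffect(expr, batch = batch, design = design)
  gap_after <- mean(batch_gap(corrected)[1:300])
}
```

The limma documentation describes removeBatchEffect as a tool for plots such as PCA or clustering. For differential expression it recommends putting batch into the design matrix (for example `model.matrix(~ batch + group)`) rather than testing on corrected values.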
@w-evan-johnson-5447
Last seen 5 months ago
United States
Santana,

The first thing I always do is hierarchical clustering. Often, batch effects are easily spotted with this simple approach. Then try something like PCA.

Also, just to point out, we have recently published a single-sample normalization approach, SCAN, that does a better job of normalizing the arrays. Often, artifacts that look like 'batch effects' drop out in the normalization step with this approach. We've shown in several cases that this approach does a better job at combining data than anything else out there, so it will give you a cleaner starting point. After SCAN normalization, if you still have batch effects, try ComBat or sva (both in the sva package). This will likely be all you need for your batch effects.

Here is a link to our SCAN paper: http://www.sciencedirect.com/science/article/pii/S0888754312001632
Here is a link to our SCAN software: http://jlab.bu.edu/software/scan-upc/

SCAN is available in both R and Python at the site.

Hope this helps!

Evan
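The clustering check and the ComBat step Evan mentions could be sketched as below. The SCAN normalization itself is not shown. The data are simulated, the batch and group labels are assumptions, and the ComBat step only runs if the sva package is installed.

```r
# Simulated data (hypothetical): 12 arrays, two batches, two groups.
set.seed(1)
batch <- factor(rep(c("A", "B"), each = 6))
group <- factor(rep(c("ctl", "trt"), times = 6))
expr <- matrix(rnorm(1000 * 12, mean = 8), nrow = 1000,
               dimnames = list(NULL, paste0(batch, 1:12)))
expr[1:300, batch == "B"] <- expr[1:300, batch == "B"] + 2

# Hierarchical clustering of the arrays on Euclidean distance.
hc <- hclust(dist(t(expr)), method = "average")
plot(hc, main = "Arrays clustered before correction")
clusters <- cutree(hc, k = 2)   # do the two top-level branches match the batches?

# ComBat adjustment with known batch labels, if sva is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  adjusted <- sva::ComBat(dat = expr, batch = batch,
                          mod = model.matrix(~ group))
  plot(hclust(dist(t(adjusted)), method = "average"),
       main = "Arrays clustered after ComBat")
}
```

ComBat needs the batch labels up front, whereas sva estimates unknown sources of variation, so the choice between them depends on whether you know which array belongs to which batch.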
The SCAN-UPC-R download link appears to be broken: http://jlab.bu.edu/files/2012/08/SCAN-UPC-R.tar.gz

thanks!

--t
Hi Tim,

Thanks for the heads up. Sorry about that. I have fixed the link. There is also a short vignette that explains how to install and use the package. Please let me know if you have any problems or questions.

We are working on adding this package to Bioconductor, but for now you can just access it as an R package from our Web site.

Regards,
-Steve