Question

Appropriate merge of batches of the same array to create a final total ExpressionSet

svlachavas ▴ 830

Dear ALL,

My experimental design tests 4 different substances ("treatments"), and the dataset grows incrementally: roughly once a month I receive a small batch of 5 CEL files (one control and 4 "treatments"). I now have the second batch, which consists of "technical replicates" of the first batch. The platform is Affymetrix PrimeView (PM-only). So my main and very important question is:

Since there is a "batch effect" related to when each batch was generated and received (although all batches use the same array type), should I preprocess each batch with the fRMA algorithm like this:

i.e. batch1 <- ReadAffy(...)

norm1 <- frma(batch1, background="rma", normalize="quantile", summarize="robust_weighted_average", target="probeset")

and then merge all the collected batches with a package such as inSilicoMerging, which has a batch effect removal option?

Or are there other options in the frma() command that can also account for this?

Please excuse my naive questions; I am a beginner in R, and I believe this question is important, because in the end I want to combine the small batches I collect into a final ExpressionSet.

frma bioconductor batcheffect Primeview insilicomerging • 1.9k views

updated 9.0 years ago by Matthew McCall ▴ 830 • written 9.0 years ago by svlachavas ▴ 830

Hello,

I have a couple of questions that will help me answer yours:

1. Do you really mean "technical replicates" -- same biological sample analyzed multiple times? From your description of how the data arise, it sounds like you probably have "biological replicates".

2. Does each batch also represent a different biological unit (e.g. different patient)?

Best, Matt

written 9.0 years ago by Matthew McCall ▴ 830

Dear Matthew,

Please excuse my naive description. I checked my notes, and these are indeed biological replicates: we use a specific cancer cell line and 4 different substances, plus a control, to evaluate possible apoptotic and other effects. So each batch represents the same procedure with the same cell line and the same substances.

written 9.0 years ago by svlachavas ▴ 830

score 1 · Answer 1 · 2015-04-20

Matthew McCall ▴ 830

You can preprocess your data using fRMA, in the manner you suggest, on each batch of samples and then combine the data for analysis. This will allow you to avoid repeatedly preprocessing your data and still be able to make comparisons across batches. However, fRMA only addresses one specific type of batch effect (one affecting individual probes), so you may need to perform additional batch effect correction or modeling. There are several ways to do this (searching the BioC help archives will turn up multiple options).
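
One common way to do the additional batch modeling mentioned above is to include batch as a blocking factor in a per-gene linear model. The sketch below is only illustrative: it uses simulated log2 values in base R rather than real fRMA output, and every name in it (`expr`, `batch`, `treatment`, the simulated effect sizes) is an assumption made for the example. With real data, `expr` would be the column-bound expression matrices of the separately preprocessed batches.

```r
set.seed(1)
n_genes <- 200
treat_levels <- c("control", "A", "B", "C", "D")

# Two batches of 5 arrays each: one control and four treatments per batch.
treatment <- factor(rep(treat_levels, times = 2), levels = treat_levels)
batch <- factor(rep(c("batch1", "batch2"), each = 5))

# Simulated log2 expression: noise, a shift of 1 for batch2, and a
# treatment effect of 2 for substance "A" in the first 20 genes.
expr <- matrix(rnorm(n_genes * 10, mean = 8, sd = 0.2), nrow = n_genes)
expr[, batch == "batch2"] <- expr[, batch == "batch2"] + 1
expr[1:20, treatment == "A"] <- expr[1:20, treatment == "A"] + 2

# Additive design: batch is a blocking factor, treatment is the effect
# of interest (each treatment level is compared against the control).
design <- model.matrix(~ batch + treatment)

# lm.fit() accepts a matrix response (samples in rows), so all genes
# are fitted in one call; coefficients are named after the design columns.
fit <- lm.fit(x = design, y = t(expr))
effect_A <- fit$coefficients["treatmentA", ]
batch_shift <- fit$coefficients["batchbatch2", ]

# Average estimated effect of "A" in the 20 affected genes, and the
# average estimated batch shift across all genes.
print(round(c(effect_A = mean(effect_A[1:20]),
              batch_shift = mean(batch_shift)), 2))
```

Because every batch contains the same control and treatments, the batch and treatment terms can be estimated separately. If limma is installed, the same `design` matrix can be passed to `limma::lmFit(expr, design)` to get moderated statistics instead of this plain least-squares fit.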

written 9.0 years ago by Matthew McCall ▴ 830

Thank you for your answer. After reading the fRMA vignette, I understand that it is appropriate to perform an additional batch effect correction when I merge/combine the different batches into the complete dataset. That is why I looked at the inSilicoMerging package and its merge() function: merge(esets, method='NONE'). The method argument is described as:
"Merging method aimed at removing inter-study bias. Possible options are: BMC, COMBAT, GENENORM and XPN."
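
To make the BMC option concrete: batch mean-centering subtracts, for every gene, the mean of that gene within each batch. The following is a minimal base R sketch of that idea on simulated data, not the inSilicoMerging implementation; the function name `bmc` and the objects `expr` and `batch` are hypothetical.

```r
set.seed(42)
n_genes <- 100
batch <- factor(rep(c("batch1", "batch2"), each = 5))

# Simulated log2 expression with a constant shift of 1.5 in batch2.
expr <- matrix(rnorm(n_genes * 10, mean = 8, sd = 0.3), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("array", 1:10)))
expr[, batch == "batch2"] <- expr[, batch == "batch2"] + 1.5

# Batch mean-centering: within each batch, subtract each gene's mean.
bmc <- function(x, batch) {
  out <- x
  for (b in levels(batch)) {
    idx <- batch == b
    # rowMeans() has one value per gene; it is recycled down each column.
    out[, idx] <- x[, idx, drop = FALSE] - rowMeans(x[, idx, drop = FALSE])
  }
  out
}

expr_bmc <- bmc(expr, batch)
```

Centering like this is only safe when every batch has the same composition (here, one control and four treatments each); otherwise real biological differences between batches are removed along with the batch shift. The centered values are also relative to the batch mean rather than absolute log2 intensities.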

I will also look at other packages that offer a similar approach. Moreover, I would like to ask one more specific question:

Regarding the summarization method of fRMA, should I use the "robust_weighted_average" option, or is another option more appropriate?

Best Regards

written 9.0 years ago by svlachavas ▴ 830

The default summarization for fRMA, "robust_weighted_average", works well in most cases. If the probe effects differ across your batches, you might want to try "random_effect" summarization. However, in most cases the two will provide very similar results.
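
One way to check how similar the two summarizations are on a given batch is to run both and compare the resulting expression matrices. The helper below is a sketch under several assumptions: the frma and Biobase packages are installed, frozen parameter vectors are available for the array platform, and `batch` is an AffyBatch such as the one returned by `affy::ReadAffy()`. The function name `compare_frma_summaries` is hypothetical.

```r
# Run fRMA with both summarization methods on the same AffyBatch and
# report how far apart the resulting probeset-level values are.
compare_frma_summaries <- function(batch) {
  rwa <- frma::frma(batch, summarize = "robust_weighted_average")
  re  <- frma::frma(batch, summarize = "random_effect")

  # Difference in log2 expression, probeset by probeset and array by array.
  d <- Biobase::exprs(rwa) - Biobase::exprs(re)
  c(max_abs_diff = max(abs(d)), mean_abs_diff = mean(abs(d)))
}

# Usage (not run here, since it needs CEL files on disk):
# batch1 <- affy::ReadAffy(celfile.path = "path/to/batch1")
# compare_frma_summaries(batch1)
```

Small differences would suggest that the default is sufficient for that batch; larger ones may be a reason to look more closely at batch-specific probe effects.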

written 9.0 years ago by Matthew McCall ▴ 830