filtering


Lev Soinov ▴ 470

@lev-soinov-2119

Last seen 9.6 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070712/a13c9c77/attachment.pl


ADD COMMENT • link updated 16.8 years ago by Jenny Drnevich ★ 2.2k • written 16.8 years ago by Lev Soinov ▴ 470


Jenny Drnevich ★ 2.2k

@jenny-drnevich-382

Last seen 9.6 years ago

Hi Lev,

There have been several discussions about when to filter out data on this list previously, and the consensus has been to NOT filter until after all pre-processing steps (e.g., normalization) have been done. One reason is that one array may have had a higher background than others, and so more data values would be removed in your scheme, which can be problematic for many normalization routines.

I also would caution you against removing "badly measured signals" from your data set even after pre-processing. While these numbers may not be as accurate as larger numbers, they represent very low expression or no expression. Would you remove all the zeros from any set of data? My rationale is that had there been distinct expression, you would have measured it, therefore the low values near background are valid, if not as completely accurate. In the worst case scenario, you would miss genes that weren't expressed in one treatment but were expressed in another treatment because you were throwing out all the data from the non-expressed treatment. If the signals were "badly measured" in ALL samples, then I would remove that entire probe from the analysis (after pre-processing), but not if they were badly measured in only a few samples.

That's my two cents,
Jenny

At 08:59 AM 7/12/2007, Lev Soinov wrote:

> Dear List,
> I have posted a similar question before, but would like to ask you again about filtering strategies. I have some AB1700 data and filter on signal to noise ratios before normalization. The rationale is to get rid of badly measured signals before actual processing of the data. Two jpg histograms of log2 signal distributions, before (raw.jpg) and after (filtered.jpg) filtering, can be seen in this location:
> http://tmgarden.cloud.prohosting.com/images/
> Could you please have a look at the distributions and comment on whether this is correct to filter before normalization as this changes the distribution of signals a lot?
> Thank you very much for your help.
> Lev.

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu

ADD COMMENT • link 16.8 years ago Jenny Drnevich ★ 2.2k
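As an illustration of the "ALL samples" rule in the reply above, here is a minimal base-R sketch. The matrix `sn`, its values, and the cutoff of 3 are made up for the example (the cutoff merely echoes the S/N >= 3 threshold discussed elsewhere in this thread); it is not code from any of the posters and assumes the values come from already pre-processed data.

```r
# Hypothetical signal-to-noise matrix: rows are probes, columns are arrays.
sn <- matrix(c(10, 12,  9,
                1,  2,  1,
                8,  1,  7,
                2,  1,  6),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))

cutoff <- 3            # assumed threshold for a "well measured" signal
bad <- sn < cutoff     # TRUE where a value fails the cutoff

# For each probe, count the arrays on which it fails the cutoff.
n_bad <- rowSums(bad)

# Drop a probe only if it fails on ALL arrays; keep everything else.
keep <- n_bad < ncol(sn)
filtered <- sn[keep, , drop = FALSE]
```

Tabulating `n_bad` with `table(n_bad)` on real data is one way to see how many probes fail on every array versus only on some of them.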


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070712/050282ca/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070713/f6c604ae/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


Quoting Lev Soinov <lev_embl1 at yahoo.co.uk>:

> [...]
> Also, it is often assumed that log transformed raw signal is roughly Normal.

Where do you get that idea?

My raw signals are anything but normally distributed! In fact, they look a lot like yours. The *log ratios*, however...

And the distribution depends a lot on the actual array and the actual experiment. I was recently looking at some yeast tiling arrays probed with the product of doing chromatin immunoprecipitation with an antibody against a protein that is present along the body of genes. Almost everywhere, in fact, given the gene density of yeast. In this particular case, the raw data looks entirely different to yours (and to the arrays I usually deal with). The log ratios, however...

Jose

--
Dr. Jose I. de las Heras
Email: J.delasHeras at ed.ac.uk
Phone: +44 (0)131 6513374
Fax: +44 (0)131 6507360
The Wellcome Trust Centre for Cell Biology
Institute for Cell & Molecular Biology
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK

ADD REPLY • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070713/94d8e8f2/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


Hi Lev,

I am not sure I follow you. I know the raw data you were talking about was log-transformed. I'm not sure what you're trying to say now.

Jose

Quoting Lev Soinov <lev_embl1 at yahoo.co.uk>:

> Dear Jose,
>
> I meant log transformed raw data of course. Sorry for any misunderstanding. The plots that I posted were on the log2 scale.
>
> Thank you,
> Lev.

ADD REPLY • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k


Jenny Drnevich ★ 2.2k

@jenny-drnevich-382

Last seen 9.6 years ago

Hi Lev,

> I would like to make some further points regarding filtering. Firstly, the bimodal behaviour of log transformed signals shown in the plots that I have posted (raw and filtered raw, http://tmgarden.cloud.prohosting.com/images/) is probably something specific to AB1700 and some other platforms, not Affymetrix though. Therefore, filtering of Affy data may not be a good idea. Secondly, it just happens that by filtering on signal-to-noise >=3 (threshold specified by ABI to distinguish badly measured signals) I remove the first peak of the distribution. I have observed this phenomenon for many AB1700 datasets and thus think that this first peak corresponding to low signal-to-noise probes is artificial and does not reflect real signal (I may be wrong here).

Actually, a bimodal distribution is exactly what I would expect to see if a goodly percentage of probes on the array were not expressed in your particular sample. This is very common for whole genome arrays, and I often see this on Affymetrix arrays when the total percent present can be as low as 30-40%. Thus your two distributions are the unexpressed probes (effectively "zero" but measured with error) and the expressed probes, which might or might not have a normal distribution. I don't think this is particular to AB1700 datasets, and I don't think the peak is "artificial", but instead represents probes that are not expressed.

> Thirdly, as I pointed before, low signal-to-noise does not always indicate low raw signal for a probe. My plots clearly show this. Therefore, this is not the case of discarding low expressed probes from the analysis. I understand that filtering might lead to losing some interesting probes, but this is a trade off between false positive and false negative results. So, it may be better for you to save some money and effort during validation stages.

Again, I would argue that you are throwing out "zeros", not low-expressed probes. If you were to count for each probe how many arrays it was below your filter criteria, what you would probably find is an extreme bi-modal distribution, where most probes are either above background on all arrays or below background on all arrays. I think it's fine to filter (after normalization) out those that are below background on ALL arrays, which can cut out a substantial chunk of probes and save on the FDR correction. Usually there is only a small percentage of probes that are above background on some arrays and below on other arrays. To be conservative, I leave these in because they will not affect the FDR calculations all that much and I don't want to lose probes that may be off in one treatment and on in another treatment. Sorry I don't have a graph of a typical bi-modal distribution of "present" calls to show you, but I'm at home today.

> Also, it is often assumed that log transformed raw signal is roughly Normal. Is this assumption required for normalization stage? If yes, then removing the peak corresponding to low signal-to-noise should be advantageous.

The log-transformation does help to compress the range of expression values and decreases the mean-variance problem, but I can't remember anywhere it's been said that it should be normal after transformation. Furthermore, normality is not an assumption for normalization, only that the distributions for each array should be the SAME, whatever the shape of the distribution. Unless there is something special about AB1700 arrays (I confess I don't have any experience with them), I think the bimodality represents real measured signal for all arrays, and it's better to use all available data for the pre-processing steps, but after normalization it's fine to remove probes that fail to pass a conservative filter on ALL arrays. Even if you want to use your filter of removing "probes that have >50% of "bad" signals within a treatment", use it only if the probe has >50% "bad" signals for ALL treatments.

Cheers,
Jenny

ADD COMMENT • link 16.8 years ago Jenny Drnevich ★ 2.2k
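The point that normalization needs the arrays to share one distribution, not a Normal one, can be seen in a small base-R sketch of quantile normalization. The function `quantile_normalize` and the toy matrix `x` are hypothetical and written only for this illustration; it is a simplified stand-in (no handling of missing values) and not the implementation used by limma's `normalizeBetweenArrays`.

```r
# Simplified quantile normalization: force every array (column) to share
# the same set of values. Assumes a numeric matrix with no missing values.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank of each value within its array
  sorted <- apply(x, 2, sort)                         # each array sorted ascending
  target <- rowMeans(sorted)                          # shared reference distribution
  out <- apply(ranks, 2, function(r) target[r])       # map ranks back onto the reference
  dimnames(out) <- dimnames(x)
  out
}

# Hypothetical log2 intensities: rows are probes, columns are arrays.
# array2 has higher values overall, e.g. because of a higher background.
x <- cbind(array1 = c(2, 4, 6, 8),
           array2 = c(6, 10, 4, 8))

normed <- quantile_normalize(x)

# After normalization both arrays contain exactly the same values,
# whatever the shape of that common distribution.
same_distribution <- identical(sort(normed[, "array1"]), sort(normed[, "array2"]))
```

Nothing in the procedure refers to normality. It does rely on every array contributing the same probes, which is one reason a per-array filter applied beforehand can cause trouble.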


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070713/7f6735fd/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


J.delasHeras@ed.ac.uk ★ 1.9k

@jdelasherasedacuk-1189

Last seen 8.7 years ago

United Kingdom

Hi Lev,

Not sure what's the problem here. Your data has a large number of very low intensity points, and you removed them. The filtered histogram looks then exactly like the raw one, minus the low intensity stuff (the large peak on the left). That looks normal.

I often do something similar, but I only remove the spots/probes that have low intensity in BOTH channels (if 2-colour hybs, if 1-colour then check on both samples that you are comparing), on ALL the relevant arrays. You seem to have filtered here only on one channel?

Jose

ADD COMMENT • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k
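The "BOTH channels, ALL arrays" rule from the reply above can be written down in a few lines of base R. This is only a sketch under assumed inputs: the matrices `red` and `green`, their values, and the cutoff `low_cut` are invented for the example.

```r
# Hypothetical background-corrected intensities for a 2-colour experiment:
# rows are spots, columns are arrays.
spots <- paste0("spot", 1:4)
red <- matrix(c(500, 520,
                 20,  25,
                 30, 600,
                 15,  18),
              nrow = 4, byrow = TRUE, dimnames = list(spots, c("array1", "array2")))
green <- matrix(c(480, 510,
                   22,  19,
                  450,  28,
                  700, 650),
                nrow = 4, byrow = TRUE, dimnames = list(spots, c("array1", "array2")))

low_cut <- 50   # assumed cutoff for "too close to background"

# A spot is uninformative on an array only if BOTH channels are low there.
low_both <- (red < low_cut) & (green < low_cut)

# Remove a spot only if that is the case on ALL arrays.
remove <- rowSums(low_both) == ncol(low_both)
```

In this toy data `spot4` is low in the red channel on every array but strong in green, so it is kept: it is exactly the kind of extreme difference the filter should not discard.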


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070713/ed0cb9f8/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


I would only remove probes from the analysis if the signal is negligible in BOTH treatments, for ALL (or the majority) of the arrays. Otherwise you're likely to remove things that are expressed in one treatment but not the other and I can't imagine why you'd want to lose those, as it's an extreme case of up/downregulation.

With regards to removing probes before or after normalisation... I personally prefer to remove them before. When I do filter (I don't do this always), I identify probes/spots with signal too close to background on each array, for each channel (mostly dealing with 2-colour arrays myself). Then, as indicated earlier, I only filter out those that meet the criterion (for negligible intensity) on BOTH channels (both comparisons, whatever), on every array (or 80% of the arrays or whatever I deem appropriate in that particular case). Then, those probes/spots are eliminated entirely (in limma, I just give them weight of zero), and I work with the rest.

I don't think that removing them after normalisation makes much difference, in most cases, as they're going to be a bunch of spots on the far left of the MA plot, converging towards zero if the background correction was good... and even if the background correction doesn't make the low intensity spots converge towards M=0, in most cases they would be evenly distributed around M=0 (unless you have a very bad background problem) and the end result would be the same.

Jose

Quoting Lev Soinov <lev_embl1 at yahoo.co.uk>:

> Hi Jose,
>
> I filtered those probes that had less than 50% of good signals in at least one of the treatments. That was because I had four treatments, making several contrasts in LIMMA. I did it in order to make contrasts comparable. So, each contrast (pair of treatments) contained the same set of probes. In your terminology I filtered on one channel, keeping only signals detectable on ALL arrays. The data shown in the plots are for just one of the arrays.
>
> My question was about whether such filtering would affect differential expression analysis if done before normalisation. Or it really does not matter when you filter before or after normalization step. Does this lead to anything undesirable or wrong, apart from losing some genes that were "undetectable" in one treatment but "detectable" in some other treatments?
>
> Thank you,
> Lev.

ADD REPLY • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k
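The "weight of zero" approach mentioned above might look like the following base-R sketch. The flag matrix `negligible`, its values, and the 0.8 fraction are assumptions made for the example. The resulting matrix `w` has the shape that limma's `lmFit` accepts through its `weights` argument, but no limma call is made here.

```r
# Hypothetical flags: TRUE where a spot is negligible in BOTH channels on that array.
negligible <- matrix(c(TRUE,  TRUE,  TRUE,  TRUE,  TRUE,  TRUE,
                       TRUE,  TRUE,  TRUE,  TRUE,  TRUE,  FALSE,
                       TRUE,  FALSE, TRUE,  FALSE, FALSE, FALSE,
                       FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("spot", 1:4), paste0("array", 1:6)))

min_fraction <- 0.8   # assumed: drop spots negligible on at least 80% of arrays

# Fraction of arrays on which each spot is negligible.
frac_negligible <- rowMeans(negligible)
drop_spot <- frac_negligible >= min_fraction

# Weight matrix with the same shape as the data: 0 for dropped spots, 1 otherwise.
w <- matrix(1, nrow = nrow(negligible), ncol = ncol(negligible),
            dimnames = dimnames(negligible))
w[drop_spot, ] <- 0
```

Using weights rather than deleting rows keeps the data matrix the same size, so spot annotation and array layout stay aligned.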


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070713/fba27875/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


Quoting Lev Soinov <lev_embl1 at yahoo.co.uk>:

> Hi Jose,
>
> Yes, I totally understand the point about losing some probes that are "not expressed" in one treatment but "expressed" in some other treatments. However, let's say we have 4 treatments, comparing 1vs2, 1vs3 and 1vs4 in LIMMA. If I am not mistaken, it is recommended to process all treatments together to get more power. Suppose a probe has "negligible" signal in 1, 2 and 3, but very strong signal in 4. You would obviously keep it, according to your procedure, and you would be absolutely right if you were interested in 1vs4 only. However, in this particular situation 1vs2 and 1vs3 doesn't make much sense and could produce false positive results. Also, if 1, 2 and 3 do not contain "true" signals but only some near-background noise, how would it help to estimate the variance for this probe? I may be wrong here, but it seems to me that information from 1, 2 and 3 would just add more error in lmFit, thus obscuring inferences for 1vs4 as well.
>
> Thank you,
> Lev.

Hi Lev,

I must admit I don't quite follow the way you pick comparisons and probes. If I want to analyse comparisons between all 4 treatments, and a probe is only present in one of them, say number 4, I'd probably keep it. If you want to remove it because it's only expressed in one of them, that's your call, but I'd keep it: I am looking for differences between treatments, and right there there's a clear one between 4 and all the others... why lose it? Just to improve the FDR a little? I don't understand that rationale.

It's not how I would go about things, but it probably depends on the actual experiment and what your goal is. If I don't care about things that are different between treatment 4 and the other three... then I'd just leave treatment 4 out altogether.

I'm sorry if I am not fully understanding you.

Jose

ADD REPLY • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k
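The two filtering rules being debated can be compared side by side in base R. The detection calls in `detected`, the probe names, and the 50% cutoff applied to three replicates are all hypothetical. `keep_any` follows the advice to drop a probe only when it fails in every treatment, while `keep_all` is one reading of the stricter rule of requiring good signal in every treatment.

```r
# Hypothetical detection calls: TRUE where a probe passes the S/N cutoff.
# 4 treatments x 3 replicates, columns ordered by treatment.
treatment <- factor(rep(paste0("T", 1:4), each = 3))
detected <- rbind(
  on_everywhere = rep(TRUE, 12),
  only_in_T4    = c(rep(FALSE, 9), rep(TRUE, 3)),
  mostly_off    = c(TRUE, rep(FALSE, 11)),
  off_all       = rep(FALSE, 12)
)

# Fraction of replicates with a detectable signal, per probe and treatment.
frac_detected <- t(apply(detected, 1, function(d) tapply(d, treatment, mean)))

# Rule A: keep a probe if more than half of the replicates are detectable
# in AT LEAST ONE treatment (drop only if it fails in every treatment).
keep_any <- rowSums(frac_detected > 0.5) >= 1

# Rule B: keep a probe only if at least half of the replicates are detectable
# in EVERY treatment.
keep_all <- rowSums(frac_detected >= 0.5) == nlevels(treatment)
```

The probe `only_in_T4` survives rule A and is removed by rule B, which is the class of probes the reply above argues for keeping.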


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070714/188e80bb/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470


Hi Lev,

thanks for clarifying. It turns out that you're doing what I thought you were doing :-)

But I still don't understand why you would want to remove that probe that only has a strong signal in one treatment. I mean, that very probe will probably be one of the hits you're looking for in the first place. You seem concerned that having no signal in the other treatments is somehow worsening your stats for comparisons between the other 3 treatments, but I wouldn't be concerned about that. By removing probes that are only present in one treatment you're removing one class of potentially very interesting probes... I wouldn't do that. You'd be removing a probe that has strong signal in 4, and negligible in 1, yet you'd leave something that has the same signal in 4, and just above background in 1... but both probes belong to the same class and you removed the one showing a more striking difference of the two.

If the biology of your treatments is such that you expect treatment 4 to be quite different to the others, such that a very large proportion of probes will be expressed in 4 but absent in 1&2&3, then the removal of such probes may be useful in some way... I don't know, because in that case I'd be more concerned about how I normalise all 4 treatments together, considering that one of them should have a completely different distribution.

Please, don't take my word for it, I don't know everything and I am still learning, everyday, as I go along. I find that it's hard to make categorical statements about what is best in microarray analysis, because what is best depends on the actual experiment, and you need to understand the biology behind the data: what the experiments test for, what questions you're aiming to answer, and what the actual biological system is.

In principle I feel that if I am analysing together X number of experiments, making contrasts etc, it is because they're somehow related and can be compared. If I do that, then as a principle, I don't want to remove a probe that only shows expression in one of the treatments because it seems to me that it is one class of probes that I really want to know about.

I think I can't say much more on this matter, Lev. You have to make your own decisions based on what it is that you are after.

best,
Jose

Quoting Lev Soinov <lev_embl1 at yahoo.co.uk>:

> Hi Jose,
>
> Let's say we have 4 treatments, 3 replicates each. I am interested in comparing 1vs2, 1vs3 and 1vs4. I assume that instead of making pairwise comparisons it would be better to use something like the following script.
>
> temp <- normalizeBetweenArrays(log2(signals), method='quantile')
> design <- model.matrix(~0 + factor(c(1,1,1,2,2,2,3,3,3,4,4,4)))
> colnames(design) <- c("T1","T2","T3","T4")
> contrast.matrix <- makeContrasts(T2-T1, T3-T1, T4-T1, levels=design)
> fit <- lmFit(temp, design)
> fit2 <- contrasts.fit(fit, contrast.matrix)
> fit2 <- eBayes(fit2)
>
> So, all treatments are included in lmFit to get more power. Let's suppose a probe has "negligible" signal in 1, 2 and 3, but very strong signal in 4. It would mean that for this probe T2-T1 and T3-T1 could produce erroneous results and also would negatively influence T4-T1 as lmFit would estimate parameters gathering information across all 4 treatments.
>
> Is it reasonable?
> Thank you,
> Lev.

ADD REPLY • link 16.8 years ago J.delasHeras@ed.ac.uk ★ 1.9k
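What the design and contrasts in the quoted script do for a single probe can be reproduced with base R alone. The values in `y` are invented (near background in T1 to T3, strong in T4), and `lm.fit` from the stats package stands in for the per-probe least-squares fit. The empirical Bayes moderation done by limma's `eBayes` is not reproduced, so this is only a rough sketch of the behaviour.

```r
# Hypothetical log2 values for one probe, 3 replicates per treatment.
y <- c(5.1, 4.9, 5.0,     # T1: near background
       4.9, 5.0, 5.1,     # T2: near background
       5.0, 5.1, 4.9,     # T3: near background
       9.9, 10.0, 10.1)   # T4: strong signal
group <- factor(rep(paste0("T", 1:4), each = 3))

# Group-means design, the same parameterisation as ~0 + factor(...) in the script.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

fit <- lm.fit(design, y)       # ordinary least squares for this single probe
means <- fit$coefficients      # one estimated mean per treatment

# The three contrasts of interest, each against T1.
contr <- c("T2-T1" = means[["T2"]] - means[["T1"]],
           "T3-T1" = means[["T3"]] - means[["T1"]],
           "T4-T1" = means[["T4"]] - means[["T1"]])

# Residual standard deviation pooled over all four treatments.
resid_sd <- sqrt(sum(fit$residuals^2) / fit$df.residual)
```

In this toy case the near-background treatments give contrasts close to zero and contribute residual degrees of freedom to the pooled variance. Whether near-background values are noisier than expressed ones in a real data set is an empirical question this sketch does not answer.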


An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070714/0ca67eb0/attachment.pl

ADD REPLY • link 16.8 years ago Lev Soinov ▴ 470
