Question

VSN: narrowing down probe sets for parameter estimation


Stefan Thomsen ▴ 50

@stefan-thomsen-2425


Dear all,

I am working on an Affymetrix time series data set with a high percentage (30-40%) of differentially expressed genes, most of them downregulated.

In a previous discussion about a suitable normalization strategy for such data sets, Wolfgang Huber strongly recommended to "narrow down the probes from which you fit the parameters from all genes (incl. the differential ones) to a subset which are enriched for non-changing."

In this context I have two questions:

1) What is the minimum number of genes/probes that should be used for VSN parameter estimation? I could extract a list of a few hundred 'stable' or 'low variability' genes from previous microarray studies. Would this number be sufficient, or do I need bigger probe subsets (thousands of probes, 1/2 of all probes, etc.)?

2) Is there a straightforward way to implement this with the standard R packages offering VSN? In other words, if I perform a VSN parameter estimation on my gene/probe subset, how (in R terms) would I subsequently apply this to the whole dataset? (My apologies if this is trivial, my programming skills are still rather a disgrace :) )

Any comment on these questions would be highly appreciated.

Kind regards,

Stefan

--
Dr. Stefan Thomsen
Research Associate
Department of Zoology
University of Cambridge
Downing Street
Cambridge CB2 3EJ
Tel.: +44 1223 336623
Fax: +44 1223 336679
stt26 at cam.ac.uk

Microarray probe vsn

updated 17.1 years ago by Wolfgang Huber ★ 13k • written 17.1 years ago by Stefan Thomsen ▴ 50

score 0 · Answer 1 · 2007-10-19

Hi Stefan,

0) vsn already has an algorithm that attempts to narrow down the probes that are used: the so-called "robustification" of the ML estimator by least trimmed sum of squares (LTS) minimisation. But of course this is automatic and sometimes not perfect, so if you have an external way of identifying non-changing probes, that can be very useful.

1) A few hundred should be OK in practice. More important than their number is that they cover the whole dynamic range about equally!

2) With x an ExpressionSet:

    fit = vsn2(x[yourSelectedProbes, ])
    nx = predict(fit, newdata=exprs(x))

See also the man page method?predict("vsn"). (Please use the latest release version.)

Hope this helps,

Wolfgang
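One way to check the point about dynamic range coverage before fitting is to bin all probes by average intensity and count how many of the selected probes fall into each bin. The following is only a sketch in base R on simulated data: the names `raw` and `selected` are hypothetical stand-ins for the real expression matrix and the list of stable probes, and the choice of ten equal-sized bins is an assumption for illustration, not something vsn prescribes.

```r
# Simulated stand-in for an expression matrix: 5000 probes x 6 arrays.
set.seed(1)
n_probes <- 5000
raw <- matrix(rexp(n_probes * 6, rate = 1 / 500) + 50,
              nrow = n_probes,
              dimnames = list(paste0("probe", seq_len(n_probes)), NULL))

# Hypothetical selection of 'stable' probes (here simply a random draw of 300).
selected <- sample(rownames(raw), 300)

# Rank probes by mean intensity and cut the ranks into 10 equal-sized bins,
# so that bin 1 holds the dimmest probes and bin 10 the brightest.
avg <- rowMeans(raw)
bin <- cut(rank(avg), breaks = 10, labels = FALSE)
names(bin) <- rownames(raw)

# Count the selected probes per bin; an even spread gives about 30 in each.
coverage <- table(factor(bin[selected], levels = 1:10))
print(coverage)
```

With real data, `raw` would be replaced by something like `exprs(x)` and `selected` by `yourSelectedProbes`. If some bins turn out nearly empty, the selected subset probably does not represent that part of the intensity range well, and it may be worth adding probes there before calling `vsn2`.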