VSN: narrowing down probe sets for parameter estimation
@stefan-thomsen-2425
Last seen 10.2 years ago
Dear all,

I am working on an Affymetrix time series data set with a high percentage (30-40%) of differentially expressed genes, most of them downregulated.

In a previous discussion regarding a suitable normalization strategy for such data sets, Wolfgang Huber highly recommended to "narrow down the probes from which you fit the parameters from all genes (incl. the differential ones) to a subset which are enriched for non-changing."

In this context I have two questions:

1) What is the minimum number of genes/probes that should be used for VSN parameter estimation? I could extract a list of some hundred 'stable' or 'low variability' genes from previous microarray studies. Would this number be sufficient, or do I need bigger probe subsets (thousands of probes, 1/2 of all probes, etc.)?

2) Is there a straightforward way to implement this with the standard R packages offering VSN? In other words, if I perform a VSN parameter estimation on my gene/probe subset, how (in R terms) would I subsequently apply this to the whole dataset? (My apologies if this is trivial, my programming skills are still rather a disgrace :) )

Any comment on these questions would be highly appreciated.

Kind regards,
Stefan

--
Dr. Stefan Thomsen
Research Associate
Department of Zoology
University of Cambridge
Downing Street
Cambridge CB2 3EJ
Tel.: +44 1223 336623
Fax: +44 1223 336679
stt26 at cam.ac.uk
Microarray probe vsn • 634 views
@wolfgang-huber-3550
Last seen 12 weeks ago
EMBL European Molecular Biology Laborat…
Hi Stefan,

0) vsn already has an algorithm that attempts to narrow down the probes that are used. This is the so-called "robustification" of the ML estimator by least trimmed sum of squares (LTS) minimisation. But of course this is automatic and sometimes not perfect, and if you have an external way of identifying non-changing probes, that can be very useful.

1) A few hundred should be OK in practice. What is more important than their number is that they cover the whole dynamic range about equally!

2) With x an ExpressionSet:

    fit = vsn2(x[yourSelectedProbes, ])
    nx = predict(fit, newdata=exprs(x))

See also the man page method?predict("vsn"). (Please use the latest release version.)

Hope this helps
Wolfgang
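A slightly longer sketch of points 1) and 2) together, for anyone who wants to try this on example data first. It is only an illustration under stated assumptions: the `kidney` ExpressionSet that ships with vsn stands in for the real data, and `yourSelectedProbes` is drawn at random here purely as a placeholder for an externally derived list of stable probes. The decile table is one possible way to check dynamic-range coverage, not something vsn does for you.

```r
library(vsn)      # Bioconductor package: vsn2(), predict(), 'kidney' example data
library(Biobase)  # exprs()

data("kidney")    # stand-in data only; replace with your own ExpressionSet
x <- kidney

# Placeholder selection (assumption): 500 random rows. In practice this would
# be the index or feature names of the probes believed to be non-changing.
set.seed(1)
yourSelectedProbes <- sample(nrow(x), 500)

# Point 1: check that the subset covers the whole dynamic range.
# Rank all probes by mean raw intensity, split the ranks into 10 equal bins,
# and count how many selected probes fall into each bin.
avg  <- rowMeans(exprs(x), na.rm = TRUE)
bins <- cut(rank(avg, ties.method = "first"), breaks = 10, labels = FALSE)
coverage <- table(factor(bins[yourSelectedProbes], levels = 1:10))
print(coverage)   # roughly even counts suggest reasonable coverage

# Point 2: estimate the vsn parameters on the subset only ...
fit <- vsn2(x[yourSelectedProbes, ])

# ... and apply the fitted transformation to all probes.
nx <- predict(fit, newdata = exprs(x))
dim(nx)           # same dimensions as exprs(x)
```

If the selected probes are trusted to be non-changing, one could also consider switching off the automatic trimming with `lts.quantile = 1` in the `vsn2()` call; whether that is appropriate depends on how reliable the external list is.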