probeset for a single gene
Weiwei Shi ★ 1.2k
@weiwei-shi-1407
Hi,

I searched the archive for a while and did not find a good answer to this, so apologies if it is a repost.

Suppose I have several probes for the same gene. What is the proper way to get a single expression statistic for that gene: the mean, median, max, or min of the probes? I suspect the answer depends on the research goal, but I am wondering whether there is a reference on it.

Also, is there a reference on data pre-processing (gene selection, multiple comparisons, ideally with a case study) for microarray analysis other than the Bioconductor book?

Thanks,

Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?" "No, I did not. But I believed..." (Matrix III)
@sean-davis-490
On Friday 10 November 2006 13:01, Weiwei Shi wrote:

> suppose i have some probes for the same gene, I am wondering which is
> the proper way to get a statistic for the expression for this gene?
> using mean, median or max or min? I think it might be affected by the
> research target but I wondering if there is some ref on it.

Hi, Weiwei. No, there is no standard. The answer probably does depend on the research question (as you suggest), the type of data, quality metrics, and other factors. Some of those other factors, such as cross-hybridization, are not readily available in every case without additional work.

> btw, is there some ref on the data pre-processing (gene selection,
> multiple comparison, better with case study) for microarray analysis
> other than bioconductor book?

The microarray literature is large, and there are many papers on each of the aspects you mention. If you have the Bioconductor book, a good place to start is the references therein.

Sean
Francois Pepin ★ 1.3k
@francois-pepin-1012
Hi Weiwei,

I'm pretty sure this was discussed not too long ago, but I can't recall the subject line off the top of my head.

The standard answer is that it depends on why there are different probes for the gene and what you expect from them. In many cases, several probes exist precisely because they give different results (otherwise the array designers wouldn't waste the space). The canonical example is a splice variant or an alternative poly-A site. Depending on your amplification protocol, you might also be more sensitive to the distance of the probe from the poly-A site.

If you have reason to believe that all probes should give the same result, then using the mean or median makes sense. This is the case when the exact same probe is placed at different locations on the array.

Otherwise, you might take the most interesting probe and let it represent the whole gene. How you define the most interesting probe can vary: it could be the one with the largest interquartile range across samples, or the one showing the most differential expression. The most interesting probe might change from one experiment to the next (with splice variants, for example).

Another option is to keep them all. I tend to prefer this unless I'm running statistical tests that depend on a single measurement per gene (GO and pathway analyses are the main examples that come to mind). That way, whichever probe works well will come up, and if several of them show up, you can trust the result more.

As Sean mentioned, there is an extensive literature on these subjects.

Francois
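As a rough illustration of the options above (mean or median across probes, or picking the probe with the largest interquartile range), here is a minimal Python sketch. The function name `collapse_probes`, the probe and gene IDs, and the expression values are all made up for the example; the values are assumed to be already normalized and on a log scale. It is one possible way to do the bookkeeping, not a recommended pipeline.

```python
from statistics import mean, median, quantiles


def iqr(values):
    """Interquartile range of one probe's values across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def collapse_probes(expr, probe_to_gene, how="median"):
    """Collapse probe-level values to one row per gene.

    expr          : dict probe_id -> list of per-sample values (hypothetical data)
    probe_to_gene : dict probe_id -> gene symbol (hypothetical annotation)
    how           : "mean", "median", or "max_iqr" (keep the most variable probe)
    """
    # Group the probes that map to each gene.
    probes_by_gene = {}
    for probe, gene in probe_to_gene.items():
        if probe in expr:
            probes_by_gene.setdefault(gene, []).append(probe)

    collapsed = {}
    for gene, probes in probes_by_gene.items():
        rows = [expr[p] for p in probes]
        if how == "mean":
            # Average the probes sample by sample.
            collapsed[gene] = [mean(col) for col in zip(*rows)]
        elif how == "median":
            collapsed[gene] = [median(col) for col in zip(*rows)]
        elif how == "max_iqr":
            # Let the probe that varies most across samples represent the gene.
            best = max(probes, key=lambda p: iqr(expr[p]))
            collapsed[gene] = list(expr[best])
        else:
            raise ValueError(f"unknown method: {how}")
    return collapsed


# Toy data: two probes for GENE_A, one probe for GENE_B, four samples.
expr = {
    "p1": [5.0, 5.2, 5.1, 5.3],
    "p2": [7.0, 9.0, 7.2, 9.4],
    "p3": [6.0, 6.1, 6.2, 6.0],
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}

by_median = collapse_probes(expr, probe_to_gene, how="median")
by_iqr = collapse_probes(expr, probe_to_gene, how="max_iqr")
```

In the toy data, `p2` varies far more across samples than `p1`, so `max_iqr` keeps `p2` for GENE_A, while `median` blends the two probes. Whether blending is sensible depends on whether the probes are expected to measure the same transcript, as discussed above.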
@henrik-bengtsson-4333
Hi.

On 11/11/06, Weiwei Shi <helprhelp at gmail.com> wrote:

> suppose i have some probes for the same gene, I am wondering which is
> the proper way to get a statistic for the expression for this gene?
> using mean, median or max or min? I think it might be affected by the
> research target but I wondering if there is some ref on it.

Are we talking about a function that summarizes the probe intensities in a probeset (as on Affymetrix arrays) into a single value? For 3' expression arrays there are plenty of algorithms and publications, e.g. the single-chip model MAS 5.0 and the multi-chip models MBEI (dChip) and RMA. The following article lists many more, with references:

Irizarry, R.A.; Wu, Z. & Jaffee, H.A. Comparison of Affymetrix GeneChip expression measures. Bioinformatics, 2006, 22, 789-794

Best

Henrik
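To give a feel for what a multi-chip probeset summary does, here is a small Python sketch of Tukey's median polish, the robust additive fit that RMA's summarization step is usually described as using. This is only the median polish itself on a made-up probes-by-arrays matrix of log-scale intensities; it does not include background correction or normalization and is not the RMA implementation from any package. The function name and the numbers are assumptions for the example.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-8):
    """Fit value[i][j] ~ overall + row_eff[i] + col_eff[j] by median polish.

    matrix: list of rows (probes), each a list of values per column (array).
    Returns (overall, row_eff, col_eff, residuals).
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Sweep row medians out of the residuals into the row (probe) effects.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
            row_eff[i] += row_med[i]
        # Re-center the column effects, moving their median into the overall term.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift

        # Sweep column medians into the column (array) effects.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
            col_eff[j] += col_med[j]
        # Re-center the row effects in the same way.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift

        # Stop once a full pass no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return overall, row_eff, col_eff, resid


# Hypothetical probeset: 3 probes (rows) measured on 3 arrays (columns).
probeset = [
    [6.5, 7.0, 7.5],
    [7.5, 8.0, 8.5],
    [8.5, 9.0, 9.5],
]
overall, probe_eff, array_eff, residuals = median_polish(probeset)

# One summary value per array: overall level plus that array's effect.
summary = [overall + a for a in array_eff]
```

Because the fit uses medians rather than means, a single aberrant probe value tends to end up in the residuals instead of shifting the per-array summary, which is the main appeal of this kind of summary over a plain average of the probes.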