Filtered Exon Arrays: Core vs Extended Dataset


Lana Schaffer ★ 1.3k

@lana-schaffer-1056

Last seen 11.5 years ago

Hi,

I have used Limma with both the core (~17,000) and extended (~120,000) Affymetrix datasets. Do you think that transcripts that are significant in the core dataset would also be found to be significant in the extended dataset?

I have found that ~88% of the significantly expressed transcripts from the core dataset are not among the significantly expressed transcripts from the extended dataset. Furthermore, 86% (1352/1575) of those significant core transcripts are present in the filtered extended dataset (the input to Limma), but are not found to be significant there.

                                     Core         Extended   Intersection
Limma: adj.pvalue=0.05               1575         1142       225
overlap extended filtered dataset    1352 (86%)
datasets                             17,939       112,213
filtered datasets                    17,939       61,717

Filtering was performed by standard deviation according to the following code.

    # rowSds() is provided by e.g. the genefilter or matrixStats package
    rs = rowSds(GL.un)        # per-row standard deviation across arrays
    lambda = 0.45             # drop the 45% least variable rows
    filtered = GL.un[ rs > quantile(rs, lambda, na.rm=TRUE), ]

What are your suggestions for this discrepancy?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
schaffer at scripps.edu
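For readers following the numbers, here is a minimal sketch in base R that recomputes the percentages from the counts in the table and shows how the three groups (shared, lost at the testing step, removed by the filter) could be separated with set operations. The transcript IDs and the vector names `core_sig`, `ext_sig` and `ext_filtered` are made up for illustration and are not from the original analysis.

```r
# Counts copied from the table in the post
n_core_sig        <- 1575   # significant in core
n_both            <- 225    # significant in core and extended
n_core_in_ext_flt <- 1352   # significant core transcripts present in filtered extended set

# Share of significant core transcripts NOT significant in extended
pct_not_shared  <- 100 * (n_core_sig - n_both) / n_core_sig
# Share of significant core transcripts that survived the extended filter
pct_in_filtered <- 100 * n_core_in_ext_flt / n_core_sig
# Rows expected to survive an SD filter at lambda = 0.45 on 112,213 rows
expected_kept   <- (1 - 0.45) * 112213

print(round(c(pct_not_shared, pct_in_filtered, expected_kept), 1))

# Toy ID vectors (hypothetical) to show the set logic
core_sig     <- c("tc1", "tc2", "tc3", "tc4")
ext_sig      <- c("tc3", "tc9")
ext_filtered <- c("tc1", "tc3", "tc4", "tc9", "tc10")

shared            <- intersect(core_sig, ext_sig)                           # significant in both
lost_at_testing   <- setdiff(intersect(core_sig, ext_filtered), ext_sig)    # tested, not significant
dropped_by_filter <- setdiff(core_sig, ext_filtered)                        # never reached Limma
```

From the counts as posted, the non-shared fraction works out to roughly 86% rather than ~88%, so the quoted figure may rest on counts not shown in the table. The expected number of rows kept by the filter is close to the 61,717 reported for the extended set, while the core set appears unfiltered (17,939 before and after).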

limma • 1.1k views

updated 16.8 years ago by Mark Robinson ★ 1.1k • written 16.8 years ago by Lana Schaffer ★ 1.3k


Mark Robinson ★ 1.1k

@mark-robinson-2171

Last seen 11.5 years ago

Hi Lana.

I can offer my view on what you are seeing.

The thing is, some of the 120,000 transcript clusters in the extended set are also represented in the core set, just with more probesets included in them. You might say the extended set is a superset of the core set ... I'm assuming that when you say extended, you really mean core+extended. Because the extended set includes probesets based on lower-confidence annotation (e.g. EST-only evidence), these extra probes will be measuring background at a higher rate.

So, would a differentially expressed (DE) core transcript be DE in the extended set? Some of the time. But a lot of the time the extra probes that make up the probeset will measure non-existent ESTs (i.e. background) and dilute the ability to detect DE.

Of course, I could be wrong. You might verify this for yourself by looking at the probe-level data for a transcript that is very DE in the core set and not DE in the extended data ...

Cheers,
Mark

On 07/05/2009, at 6:55 AM, Lana Schaffer wrote:
> [...]

------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
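The dilution argument in the reply can be illustrated with a small simulation in base R. This is only a toy sketch under stated assumptions: the probe counts, intensities, noise level and the use of a plain probe average as the cluster summary are all invented for illustration (real exon-array summaries such as RMA behave differently), and none of the names below come from the original analysis.

```r
set.seed(1)  # make the toy simulation reproducible

# Hypothetical design: 4 arrays per group
group <- factor(rep(c("A", "B"), each = 4))
is_b  <- group == "B"

n_core  <- 4   # well-annotated probes that respond to the treatment
n_extra <- 12  # speculative probes assumed to measure only background

# Core probes: log2 intensity around 8, shifted up by 1 in group B
core <- matrix(rnorm(n_core * 8, mean = 8, sd = 0.2), nrow = n_core)
core[, is_b] <- core[, is_b] + 1

# Extra probes: background level, no group effect
extra <- matrix(rnorm(n_extra * 8, mean = 4, sd = 0.2), nrow = n_extra)

probes <- rbind(core, extra)
rownames(probes) <- c(paste0("core", 1:n_core), paste0("extra", 1:n_extra))

# Log fold change (B minus A) for a vector of per-array values
lfc <- function(x) mean(x[is_b]) - mean(x[!is_b])

# Cluster-level summaries: plain probe averages (a stand-in, not RMA)
lfc_core     <- lfc(colMeans(core))
lfc_extended <- lfc(colMeans(probes))

# Probe-level view, as suggested in the reply
probe_lfc <- apply(probes, 1, lfc)

print(round(c(core = lfc_core, extended = lfc_extended), 2))
print(round(probe_lfc, 2))
```

In this setup the core-only summary recovers a fold change near the simulated shift, while the summary that also averages in the background probes is pulled toward zero. The probe-level table makes it visible which probes carry the signal, which is the kind of check suggested above for a transcript that is DE in core but not in extended.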

16.8 years ago by Mark Robinson ★ 1.1k
