On 4/14/2011 5:27 AM, Andreas Heider wrote:
> Dear Bioconductor mailing list,
> is there a sensible way to deal with redundant probesets on arrays
> like the HG-U219?
There are some things you can do, but each comes with its own drawbacks.
There is the findLargest() function in genefilter that will select the
probeset with the largest value of a test statistic. This assumes (among
other things) that all of the redundant probesets measure the same
thing. But note that the _x_ and _s_ in the probesets you list below
indicate that, when Affy designed that chip, the probesets
cross-hybridized with unrelated or related transcripts, respectively.
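
The selection idea itself (one probeset per gene, chosen by the largest
statistic) can be sketched in a few lines. This is a minimal
illustration in Python, not genefilter's code; the function name, the
probeset IDs, the gene labels and the statistics are all made up.

```python
def pick_largest(probe_to_gene, stat):
    """Return {gene: probeset}, keeping the probeset with the largest statistic."""
    best = {}
    for probe, gene in probe_to_gene.items():
        # Keep this probeset if the gene is new or its statistic beats the current best.
        if gene not in best or stat[probe] > stat[best[gene]]:
            best[gene] = probe
    return best

# Hypothetical probeset IDs and test statistics, for illustration only.
probe_to_gene = {"ps1_at": "geneA", "ps2_s_at": "geneA", "ps3_at": "geneB"}
stat = {"ps1_at": 2.1, "ps2_s_at": 3.4, "ps3_at": 0.7}

chosen = pick_largest(probe_to_gene, stat)
```

If the sign of the statistic should not matter, pass absolute values.
Note that the sketch happily picks an _s_ probeset here, which is
exactly the caveat above.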
You can use the MBNI re-mapped cdfs, which take current versions of the
genome and filter out probes that don't uniquely hybridize to the
genome, and then map probes to probesets based on, e.g., Entrez Gene.
This eliminates the problem of multiple probesets, but you then have to
contend with probesets that vary from ~3 probes up to 100 or more. As
you can imagine, the probesets with 3 probes will have much larger
standard errors than those with, say, 100 probes. This makes downstream
analyses more difficult unless you choose to simply ignore that fact.
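
To put a rough number on that, here is a back-of-the-envelope sketch.
It assumes the probeset summary behaves like a plain mean of
independent probes with a common standard deviation, which is a
simplification (real summarization methods differ); the value of sigma
is arbitrary.

```python
import math

def se_of_mean(sigma, n_probes):
    """Standard error of a mean of n independent probes with common SD sigma."""
    return sigma / math.sqrt(n_probes)

sigma = 1.0  # assumed per-probe standard deviation (arbitrary units)

se_small = se_of_mean(sigma, 3)    # probeset with 3 probes
se_large = se_of_mean(sigma, 100)  # probeset with 100 probes

# Under these assumptions the ratio is sqrt(100 / 3), i.e. a bit under 6.
ratio = se_small / se_large
```

So, under these assumptions, a 3-probe probeset is nearly six times
noisier than a 100-probe one.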
You could ignore the fact that you have multiple probesets that may or
may not be measuring the same thing, and assume independence (which, of
course, isn't even true when you have no redundant probesets).
No real satisfying alternatives, IMO, so you have to pick your poison.
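
On the median question specifically: collapsing probesets that share a
RefSeq ID to a per-sample median is simple to write down, with the
caveat that it treats all of those probesets as measuring the same
thing. A minimal sketch in Python follows; the function name, IDs and
expression values are invented for illustration. In R the same
grouping is commonly done with tapply() or aggregate().

```python
from collections import defaultdict
from statistics import median

def collapse_by_median(expr, probe_to_id):
    """expr maps probeset -> list of per-sample values.
    Returns transcript ID -> list of per-sample medians."""
    groups = defaultdict(list)
    for probe, values in expr.items():
        groups[probe_to_id[probe]].append(values)
    # zip(*rows) walks the samples; take the median across probesets in each.
    return {tid: [median(col) for col in zip(*rows)]
            for tid, rows in groups.items()}

# Hypothetical data: three probesets for one transcript, one for another.
expr = {
    "ps1_at":   [7.0, 8.0],
    "ps2_s_at": [9.0, 8.5],
    "ps3_x_at": [8.0, 10.0],
    "ps4_at":   [5.0, 5.5],
}
probe_to_id = {"ps1_at": "tx1", "ps2_s_at": "tx1", "ps3_x_at": "tx1", "ps4_at": "tx2"}

collapsed = collapse_by_median(expr, probe_to_id)
```

A transcript with a single probeset passes through unchanged, and
swapping median for mean is a one-word change.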
> For example:
> Probe Set ID      RefSeq Transcript ID
> 11715100_at       NM_003534
>                   NM_003534
> 11715102_x_at     NM_003534
> Should I get the median/mean of the expression intensities? Or select the
> highest? And what would be the procedure in R to do it? I mean, how do I
> tell R to return the median of expression values if there is more than one
> probeset for only one RefSeq ID?
> I hope you can help me, Andreas
James W. MacDonald, M.S.
University of Michigan
Department of Human Genetics
1241 E. Catherine St.
Ann Arbor MI 48109-5618