PLIER affinities redux
4
0
Entering edit mode
@jeremy-gollub-1462
Last seen 10.2 years ago
Hi, All - James MacDonald (I think) answered my previous posting, and I promptly lost the message. Thanks, James, and apologies. The issue at hand was strange and (I believe) incorrect reporting of probe affinities from justPlier (plier package). At James' suggestion I have update to R 2.2.0 and plier 1.2.0, but the affinities are still coming back in a sparse <# probe pairs> X <# arrays> matrix, rather than as a useful vector. The colnames of this matrix are the sampleNames from the eset provided to justPlier; the rownames are the probeNames. James, you said this doesn't happen to you. How do you retrieve the affinities? Maybe I'm just looking at the wrong slot (see below). Looking at the justPlier source code, though, I don't see any other way to get them. Also, does justPlier allow one to pass the affinities back to another invocation of the method, rather than computing them from the current data? Thanks, - Jeremy Gollub The session (edited for readability): # --------------------------------------------------------------------- > sessionInfo()) R version 2.2.0, 2005-10-06, i386-pc-mingw32 attached base packages: [1] "tools" "methods" "stats" "graphics" "grDevices" "utils" [7] "datasets" "base" other attached packages: rae230acdf plier affy Biobase qvalue "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0" > data <- ReadAffy() > data AffyBatch object size of arrays=602x602 features (50972 kb) cdf=RAE230A (15923 affyids) number of samples=18 number of genes=15923 annotation=rae230a > res <- justPlier(data, get.affinities = TRUE) > dim(res at description@preprocessing$affinity) [1] 175477 18 > sum(res at description@preprocessing$affinity != 0) [1] 175407 # ---------------------------------------------------------------------- -- Jeremy Gollub jeremy at gollub.net
cdf probe Biobase affy plier cdf probe Biobase affy plier • 1.6k views
ADD COMMENT
0
Entering edit mode
@jeremy-gollub-1462
Last seen 10.2 years ago
Hi, James - Thanks for clarifying that. Looking at the source code for justPlier, it seems there's a misapprehension about the probe affinity values - the justPlier code expects to get back a value for each probe (note, not probe set) in each sample, whereas the algorithm/C++ code actually returns a single value for each probe pair, which is applied to all samples. I think the padding with many extra zeros is done by justPlier, not by the C++ code...? Crispin, unless I'm missing something, this should probably be considered a bug. In the meantime, I should be able to deconvolute the matrix back into the expected vector, but I'm not sure which probe pair to associate with each value. - Jeremy James W. MacDonald wrote: > > Hi Jeremy, > > Seems I misunderstood your earlier email. I somehow thought you were > talking about expression values, not the affinities. I get the same sort > of result for the affinities that you are reporting. > > However it doesn't appear to be a sparse matrix per se, but a matrix > with a bunch of zeros padded on the end. > > > pset <- justPlier(dat, get.affinities = TRUE) > > pset > Expression Set (exprSet) with > 54675 genes > 10 samples > phenoData object with 1 variables and 10 cases > varLabels > sample: arbitrary numbering > > dim(pset at description@preprocessing$affinity) > [1] 604258 10 > > sum(rowSums(pset at description@preprocessing$affinity) != 0) > [1] 54676 > > a <- which(rowSums(pset at description@preprocessing$affinity) != 0) > > range(a) > [1] 1 54676 > > This is what I get with a HG-U133Plus_2 chip. It looks to me like there > are indeed affinities for each probeset (rather than each probe), but > the vector of affinities that is output by the C++ code is padded with a > bunch of zeros. Maybe the result is different for other chips? > > Anyway, this is probably a question for Crispin Miller, who maintains > the package. > > Best, > > Jim > > > > Jeremy Gollub wrote: > > Hi, All - > > > > James MacDonald (I think) answered my previous posting, and I promptly > > lost the message. Thanks, James, and apologies. > > > > The issue at hand was strange and (I believe) incorrect reporting of > > probe affinities from justPlier (plier package). At James' suggestion > > I have update to R 2.2.0 and plier 1.2.0, but the affinities are still > > coming back in a sparse <# probe pairs> X <# arrays> matrix, rather than > > as a useful vector. The colnames of this matrix are the sampleNames from > > the eset provided to justPlier; the rownames are the probeNames. > > > > James, you said this doesn't happen to you. How do you retrieve the > > affinities? Maybe I'm just looking at the wrong slot (see below). > > Looking at the justPlier source code, though, I don't see any other > > way to get them. > > > > Also, does justPlier allow one to pass the affinities back to another > > invocation of the method, rather than computing them from the current > > data? > > > > Thanks, > > > > - Jeremy Gollub > > > > > > The session (edited for readability): > > > > # --------------------------------------------------------------------- > > > > > >>sessionInfo()) > > > > R version 2.2.0, 2005-10-06, i386-pc-mingw32 > > > > attached base packages: > > [1] "tools" "methods" "stats" "graphics" "grDevices" "utils" > > [7] "datasets" "base" > > > > other attached packages: > > rae230acdf plier affy Biobase qvalue > > "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0" > > > > > >>data <- ReadAffy() > >>data > > > > AffyBatch object > > size of arrays=602x602 features (50972 kb) > > cdf=RAE230A (15923 affyids) > > number of samples=18 > > number of genes=15923 > > annotation=rae230a > > > > > >>res <- justPlier(data, get.affinities = TRUE) > >>dim(res at description@preprocessing$affinity) > > > > [1] 175477 18 > > > > > >>sum(res at description@preprocessing$affinity != 0) > > > > [1] 175407 > > > > # ---------------------------------------------------------------------- > > > > > -- > James W. MacDonald > Affymetrix and cDNA Microarray Core > University of Michigan Cancer Center > 1500 E. Medical Center Drive > 7410 CCGC > Ann Arbor MI 48109 > 734-647-5623 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > -- Jeremy Gollub jeremy at gollub.net
ADD COMMENT
0
Entering edit mode
Jeremy Gollub wrote: > Hi, James - > > Thanks for clarifying that. > > Looking at the source code for justPlier, it seems there's a misapprehension > about the probe affinity values - the justPlier code expects to get back > a value for each probe (note, not probe set) in each sample, whereas the > algorithm/C++ code actually returns a single value for each probe pair, > which is applied to all samples. I think the padding with many extra > zeros is done by justPlier, not by the C++ code...? No, there is nothing in the code of justPlier that would do that (anyway, I checked it out). The C++ code returns a vector of length <number of="" probes=""> x <number of="" chips=""> that has affinity values for the first <number of="" *probesets*=""> x <number of="" chips="">, followed by the zeros. The R code in justPlier simply puts this vector into a matrix and then transposes (note that the size of the matrix will be determined by the length of r$affinity, since no ncol argument is given). if (get.affinities) { a <- t(matrix(r$affinity, nrow = num_exp)) colnames(a) <- sampleNames(eset) rownames(a) <- probeNames(eset) res at description@preprocessing$affinity = a } Best, Jim > > Crispin, unless I'm missing something, this should probably be considered > a bug. In the meantime, I should be able to deconvolute the matrix back > into the expected vector, but I'm not sure which probe pair to associate > with each value. > > - Jeremy -- James W. MacDonald Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623
ADD REPLY
0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 hour ago
United States
Hi Jeremy, Seems I misunderstood your earlier email. I somehow thought you were talking about expression values, not the affinities. I get the same sort of result for the affinities that you are reporting. However it doesn't appear to be a sparse matrix per se, but a matrix with a bunch of zeros padded on the end. > pset <- justPlier(dat, get.affinities = TRUE) > pset Expression Set (exprSet) with 54675 genes 10 samples phenoData object with 1 variables and 10 cases varLabels sample: arbitrary numbering > dim(pset at description@preprocessing$affinity) [1] 604258 10 > sum(rowSums(pset at description@preprocessing$affinity) != 0) [1] 54676 > a <- which(rowSums(pset at description@preprocessing$affinity) != 0) > range(a) [1] 1 54676 This is what I get with a HG-U133Plus_2 chip. It looks to me like there are indeed affinities for each probeset (rather than each probe), but the vector of affinities that is output by the C++ code is padded with a bunch of zeros. Maybe the result is different for other chips? Anyway, this is probably a question for Crispin Miller, who maintains the package. Best, Jim Jeremy Gollub wrote: > Hi, All - > > James MacDonald (I think) answered my previous posting, and I promptly > lost the message. Thanks, James, and apologies. > > The issue at hand was strange and (I believe) incorrect reporting of > probe affinities from justPlier (plier package). At James' suggestion > I have update to R 2.2.0 and plier 1.2.0, but the affinities are still > coming back in a sparse <# probe pairs> X <# arrays> matrix, rather than > as a useful vector. The colnames of this matrix are the sampleNames from > the eset provided to justPlier; the rownames are the probeNames. > > James, you said this doesn't happen to you. How do you retrieve the > affinities? Maybe I'm just looking at the wrong slot (see below). > Looking at the justPlier source code, though, I don't see any other > way to get them. > > Also, does justPlier allow one to pass the affinities back to another > invocation of the method, rather than computing them from the current > data? > > Thanks, > > - Jeremy Gollub > > > The session (edited for readability): > > # --------------------------------------------------------------------- > > >>sessionInfo()) > > R version 2.2.0, 2005-10-06, i386-pc-mingw32 > > attached base packages: > [1] "tools" "methods" "stats" "graphics" "grDevices" "utils" > [7] "datasets" "base" > > other attached packages: > rae230acdf plier affy Biobase qvalue > "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0" > > >>data <- ReadAffy() >>data > > AffyBatch object > size of arrays=602x602 features (50972 kb) > cdf=RAE230A (15923 affyids) > number of samples=18 > number of genes=15923 > annotation=rae230a > > >>res <- justPlier(data, get.affinities = TRUE) >>dim(res at description@preprocessing$affinity) > > [1] 175477 18 > > >>sum(res at description@preprocessing$affinity != 0) > > [1] 175407 > > # ---------------------------------------------------------------------- > -- James W. MacDonald Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623
ADD COMMENT
0
Entering edit mode
@jeremy-gollub-1462
Last seen 10.2 years ago
Hi, Matt - This was indeed useful. Examining r$affinity for a varying number of samples on RAE230A arrays, I find the following behavior: - length(r$affinity) = <# of samples> * 175,477 - I believe 175,477 is the number of useful probe pairs on the array. - sum(r$affinity != 0) <= min(<# probe sets> * <# samples>, <# probe pairs>) That is, the number of non-zero values increases to a maximum of the number of probe pairs, but is always less than the number of probe sets * samples. This explains why your results suggest probe sets * samples - if you used more samples, you'd see the non-zero affinities hit a maximum count of the number of probe pairs. It seems pretty clear that this behavior arises from the C++ code, not from justPlier as I previously suggested. I still suspect it's incorrect, though, since my understanding of the PLIER algorithm is that there should always be one affinity value per probe pair (so I should always be seeing 175,477 non-zero values, minus the values that are legitimately zero). I'll see whether I can confirm that, and report back if so. Comments from those more knowledgeable than I would be greatly appreciated. Thanks, - Jeremy Matthew Hannah wrote: > > Jeremy, > > I got distracted whilst still in draft and in the meantime Jim has responded. Anyway to add - > > I pasted your code into my fresh BioC install and it doesn't work with the ATH1121501 array either, it's inherent in the justPlier code. > Looking at justPlier code shows that it returns a matrix of probeNames(eset) x sampleNames(eset) which is consistent with what we find. After investigating it seems the r$affinity (within justPlier) is 159683 values followed by zeros until length 1757546. This is ~100000 short of what is needed for 1 affinity per probepair but if you divide it by the number of arrays I had = 159683/7 = 22811.86 it is pretty close to the number of probesets on the array(22810). So this would imply a probesets x samples output matrix. > > Not sure if this is still helpful after Jim's response. > > Cheers, > Matt > > Dr. Matt Hannah > Max-Planck Institute of Molecular Plant Physiology > Am Mühlenburg 1 > 14476 Golm > Germany > > + 49 (331) 567 8255 (phone) > + 49 (331) 567 8250 (fax) > > > > > > > >>>>>>>>>>>>>>>>>> > Hi, All - > > James MacDonald (I think) answered my previous posting, and I promptly > lost the message. Thanks, James, and apologies. > > The issue at hand was strange and (I believe) incorrect reporting of > probe affinities from justPlier (plier package). At James' suggestion > I have update to R 2.2.0 and plier 1.2.0, but the affinities are still > coming back in a sparse <# probe pairs> X <# arrays> matrix, rather than > as a useful vector. The colnames of this matrix are the sampleNames from > the eset provided to justPlier; the rownames are the probeNames. > > James, you said this doesn't happen to you. How do you retrieve the > affinities? Maybe I'm just looking at the wrong slot (see below). > Looking at the justPlier source code, though, I don't see any other > way to get them. > > Also, does justPlier allow one to pass the affinities back to another > invocation of the method, rather than computing them from the current > data? > > Thanks, > > - Jeremy Gollub > > > The session (edited for readability): > > # --------------------------------------------------------------------- > > > sessionInfo()) > R version 2.2.0, 2005-10-06, i386-pc-mingw32 > > attached base packages: > [1] "tools" "methods" "stats" "graphics" "grDevices" "utils" > [7] "datasets" "base" > > other attached packages: > rae230acdf plier affy Biobase qvalue > "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0" > > > data <- ReadAffy() > > data > AffyBatch object > size of arrays=602x602 features (50972 kb) > cdf=RAE230A (15923 affyids) > number of samples=18 > number of genes=15923 > annotation=rae230a > > > res <- justPlier(data, get.affinities = TRUE) > > dim(res at description <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> @preprocessing$affinity) > [1] 175477 18 > > > sum(res at description <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> @preprocessing$affinity != 0) > [1] 175407 > > # ---------------------------------------------------------------------- > > -- > Jeremy Gollub > jeremy at gollub.net <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > -- Jeremy Gollub jeremy at gollub.net
ADD COMMENT
0
Entering edit mode
Crispin Miller ★ 1.1k
@crispin-miller-264
Last seen 10.2 years ago
Hi all, Sorry for not being part of the conversation earlier... Let me take a look at things and put a patch together... Thanks for pointing this out, Crispin -----Original Message----- From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor- bounces@stat.math.ethz.ch] On Behalf Of Jeremy Gollub Sent: 18 October 2005 19:12 To: Matthew Hannah Cc: bioconductor at stat.math.ethz.ch Subject: Re: [BioC] PLIER affinities redux Hi, Matt - This was indeed useful. Examining r$affinity for a varying number of samples on RAE230A arrays, I find the following behavior: - length(r$affinity) = <# of samples> * 175,477 - I believe 175,477 is the number of useful probe pairs on the array. - sum(r$affinity != 0) <= min(<# probe sets> * <# samples>, <# probe pairs>) That is, the number of non-zero values increases to a maximum of the number of probe pairs, but is always less than the number of probe sets * samples. This explains why your results suggest probe sets * samples - if you used more samples, you'd see the non-zero affinities hit a maximum count of the number of probe pairs. It seems pretty clear that this behavior arises from the C++ code, not from justPlier as I previously suggested. I still suspect it's incorrect, though, since my understanding of the PLIER algorithm is that there should always be one affinity value per probe pair (so I should always be seeing 175,477 non-zero values, minus the values that are legitimately zero). I'll see whether I can confirm that, and report back if so. Comments from those more knowledgeable than I would be greatly appreciated. Thanks, - Jeremy Matthew Hannah wrote: > > Jeremy, > > I got distracted whilst still in draft and in the meantime Jim has > responded. Anyway to add - > > I pasted your code into my fresh BioC install and it doesn't work with the ATH1121501 array either, it's inherent in the justPlier code. > Looking at justPlier code shows that it returns a matrix of probeNames(eset) x sampleNames(eset) which is consistent with what we find. After investigating it seems the r$affinity (within justPlier) is 159683 values followed by zeros until length 1757546. This is ~100000 short of what is needed for 1 affinity per probepair but if you divide it by the number of arrays I had = 159683/7 = 22811.86 it is pretty close to the number of probesets on the array(22810). So this would imply a probesets x samples output matrix. > > Not sure if this is still helpful after Jim's response. > > Cheers, > Matt > > Dr. Matt Hannah > Max-Planck Institute of Molecular Plant Physiology Am M?hlenburg 1 > 14476 Golm > Germany > > + 49 (331) 567 8255 (phone) > + 49 (331) 567 8250 (fax) > > > > > > > >>>>>>>>>>>>>>>>>> > Hi, All - > > James MacDonald (I think) answered my previous posting, and I promptly > lost the message. Thanks, James, and apologies. > > The issue at hand was strange and (I believe) incorrect reporting of > probe affinities from justPlier (plier package). At James' suggestion > I have update to R 2.2.0 and plier 1.2.0, but the affinities are still > coming back in a sparse <# probe pairs> X <# arrays> matrix, rather > than as a useful vector. The colnames of this matrix are the > sampleNames from the eset provided to justPlier; the rownames are the probeNames. > > James, you said this doesn't happen to you. How do you retrieve the > affinities? Maybe I'm just looking at the wrong slot (see below). > Looking at the justPlier source code, though, I don't see any other > way to get them. > > Also, does justPlier allow one to pass the affinities back to another > invocation of the method, rather than computing them from the current > data? > > Thanks, > > - Jeremy Gollub > > > The session (edited for readability): > > # > --------------------------------------------------------------------- > > > sessionInfo()) > R version 2.2.0, 2005-10-06, i386-pc-mingw32 > > attached base packages: > [1] "tools" "methods" "stats" "graphics" "grDevices" "utils" > [7] "datasets" "base" > > other attached packages: > rae230acdf plier affy Biobase qvalue > "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0" > > > data <- ReadAffy() > > data > AffyBatch object > size of arrays=602x602 features (50972 kb) cdf=RAE230A (15923 affyids) > number of samples=18 number of genes=15923 annotation=rae230a > > > res <- justPlier(data, get.affinities = TRUE) dim(res at description > > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> > > @preprocessing$affinity) > [1] 175477 18 > > > sum(res at description > > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> > > @preprocessing$affinity != 0) > [1] 175407 > > # > ---------------------------------------------------------------------- > > -- > Jeremy Gollub > jeremy at gollub.net > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > -- Jeremy Gollub jeremy at gollub.net -------------------------------------------------------- This email is confidential and intended solely for the use o...{{dropped}}
ADD COMMENT

Login before adding your answer.

Traffic: 871 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6