Bug in probeNames()
@groot-philip-de-1307
Last seen 9.7 years ago
Hello list,

I ran into the following particular problem using the Custom CDF-files from Bioconductor 2.0. Please consider the following code:

    library(simpleaffy)
    x <- ReadAffy(cdfname="mm430mmentrezg")
    require(sprintf("%sprobe", cleancdfname(cdfName(x), addcdf=F)), character.only=T)
    ProbeNames_CustomCDF <- as.character(as.data.frame(mm430mmentrezgprobe)[,4])

    grep("66736_at", ProbeNames_CustomCDF, fixed=T)
    173151 173152 173153 173154 173155 173156 173157

However, when I do it the following way different indices are returned:

    grep("66736_at", probeNames(x), fixed=T)
    172057 172058 172059 172060 172061 172062 172063

I do not understand the reason for this! To get the order of the basepairs (for this particular probe) I need to use the upper grep command (I extract the basepairs from the probes library), but to get the CORRECT pm() and mm() intensities I need to use the lower grep command. Why is this? For the Affymetrix annotation libraries everything works fine (as I would expect).

Regards,

Dr Philip de Groot
Wageningen University
Annotation cdf • 653 views
@lgautieralternorg-747
Last seen 9.7 years ago
It might just be working with the Affymetrix libraries by chance.

To my knowledge, corresponding indexes are not enforced in the code: "probeNames" (which might better have been named "probesetNames") returns an ordered list of the keys in the corresponding CDF environment (and users are free to redefine CDF environments). If I remember correctly, the data.frame in the default probe package is read from a file provided by Affymetrix (and the order of the probesets is the one found in that file).

The probe packages contain a data.frame and, although the function getProbeDataAffy in the package matchprobes by default makes some checks against the matching CDF, there are no explicit checks for matching indexes at build time, nor are there run-time consistency checks.

The problem you encounter can also be seen as a problem with the particular pair of CDF environment and probe package you are using.

Hoping this helps,

Laurent
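As a small illustration of the point about ordering, here is a toy sketch in base R. The two vectors are made up for the example and do not come from any real CDF environment or probe package; they only stand in for "the same probeset names, listed in two different orders".

```r
# Toy data only: four probes from two probesets, listed in two different orders.
# cdf_order stands in for names returned in CDF-key order,
# probe_order for names in the row order of an annotation file.
cdf_order   <- c("100_at", "100_at", "66736_at", "66736_at")
probe_order <- c("66736_at", "100_at", "66736_at", "100_at")

# The same search returns different positions in each vector.
idx_cdf   <- grep("66736_at", cdf_order, fixed = TRUE)
idx_probe <- grep("66736_at", probe_order, fixed = TRUE)

print(idx_cdf)
print(idx_probe)

# Positions are only interchangeable if both vectors share one ordering.
same_positions <- identical(idx_cdf, idx_probe)
print(same_positions)
```

Positions found in one vector therefore cannot safely be used to index data stored in the order of the other.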
Dear Philip,

this is not a bug. To reinforce what Laurent has said: there is never a 1:1 correspondence between the order of the features (rows) in the probe package and in the AffyBatch. This is simply because the probe package does not have sequences for all of the reporters. It is built straight from an annotation file from Affymetrix, which misses some reporters and omits the MMs, since their sequences can easily be obtained from the PMs.

Please map the features (rows) in the probe package and in the AffyBatch to each other by their (x,y) coordinates on the chip (which together form a unique feature ID), using the xy2indices and indices2xy functions.

Best wishes
Wolfgang

Wolfgang Huber
EBI/EMBL
Cambridge UK
http://www.ebi.ac.uk/huber
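The coordinate-based mapping described above could look roughly like the following base R sketch. Everything in it is a toy assumption: the probe table, the chip width, the list of PM indices, and the helper xy_to_index, which is a hypothetical stand-in for xy2indices (its formula is assumed for illustration and should be checked against the real function before use).

```r
# Hypothetical helper (stand-in for xy2indices): converts 0-based (x, y)
# chip coordinates into a single 1-based feature index.
xy_to_index <- function(x, y, ncol_chip) x + ncol_chip * y + 1L

# Toy probe table, standing in for the data.frame of a probe package.
probe_tab <- data.frame(
  sequence = c("ACGT", "TTGA", "GGCA"),
  x        = c(2L, 0L, 1L),
  y        = c(1L, 0L, 2L),
  probeset = c("66736_at", "66736_at", "100_at"),
  stringsAsFactors = FALSE
)

# Toy PM feature indices in the order an AffyBatch-like object lists them.
# Index 11 deliberately has no row in the probe table.
pm_index  <- c(8L, 1L, 6L, 11L)
ncol_chip <- 3L

# One feature index per probe-table row, derived from its coordinates.
probe_index <- xy_to_index(probe_tab$x, probe_tab$y, ncol_chip)

# For every PM feature, find the matching probe-table row (NA if none).
rows        <- match(pm_index, probe_index)
pm_sequence <- probe_tab$sequence[rows]

print(data.frame(pm_index, row_in_probe_table = rows, pm_sequence))
```

Matching on the feature index rather than on row position keeps sequences and intensities aligned, and features without a sequence show up as NA instead of silently shifting the rest.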