frmaTools: error with 'convertPlatform'
Guido Hooiveld ★ 4.0k
@guido-hooiveld-2020
Last seen 10 hours ago
Wageningen University, Wageningen, the …
Hi,

I would like to use the function 'convertPlatform' (from the frmaTools package) to convert an AffyBatch object from the MoGene ST v1.1 (GeneTitan array) format into the MoGene ST v1.0 (cartridge array) format, but I run into an error. I would like to convert the AffyBatch object so that I can combine two experiments performed on those two platforms and normalize them together.

In principle the content of the arrays is the same, that is, the probeSETS should be identical, but the design and number of probes are different: the v1.0 array (cartridge) is square (1050 cols x 1050 rows) whereas the v1.1 array is rectangular (990 cols x 1190 rows). I think this may be related to the error I experience. Note also that I would like to use a remapped CDF.

Any suggestions?

Thanks,
Guido

> affy.data <- ReadAffy(cdfname="mogene11stv1mmentrezg")
> affy.data
Loading required package: AnnotationDbi

AffyBatch object
size of arrays=1190x990 features (25 kb)
cdf=mogene11stv1mmentrezg (21225 affyids)
number of samples=23
number of genes=21225
annotation=mogene11stv1mmentrezg
notes=
> object.conv <- convertPlatform(affy.data, "mogene10stv1mmentrezg")
Loading required package: mogene10stv1mmentrezgprobe
Loading required package: mogene11stv1mmentrezgprobe

Attaching package: 'mogene10stv1mmentrezgcdf'

The following object(s) are masked from 'package:mogene11stv1mmentrezgcdf':

    i2xy, xy2i

Error in convertPlatform(affy.data, "mogene10stv1mmentrezg") :
  subscript out of bounds
>

Some possibly relevant array characteristics:

> library(affxparser)
> GeneSTv1.0 <- readCelHeader("MouseTP_Brain_01_mGENE.CEL")
> GeneSTv1.0
$filename
[1] "./MouseTP_Brain_01_mGENE.CEL"

$version
[1] 1

$cols
[1] 1050

$rows
[1] 1050

$total
[1] 1102500
<<snip>>

> GeneSTv1.1 <- readCelHeader("MouseBrain_1.CEL")
> GeneSTv1.1
$filename
[1] "./MouseBrain_1.CEL"

$version
[1] 1

$cols
[1] 990

$rows
[1] 1190

$total
[1] 1178100
<<snip>>

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] frmaTools_1.8.0    affy_1.34.0        Biobase_2.16.0     BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] affyio_1.24.0         BiocInstaller_1.4.4   DBI_0.2-5
[4] preprocessCore_1.18.0 zlibbioc_1.2.0

---------------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email: guido.hooiveld@wur.nl
internet: http://nutrigene.4t.com
http://scholar.google.com/citations?user=qFHaMnoAAAAJ
http://www.researcherid.com/rid/F-4912-2010
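The cell counts reported in the two CEL headers above follow directly from the array dimensions. The small base-R sketch below only redoes that arithmetic; the variable names are illustrative and not part of frmaTools or affy. It shows that the v1.1 layout has more cells than the v1.0 layout, so a linear cell index that is valid on one layout is not guaranteed to be valid on the other.

```r
# Cell counts implied by the CEL headers quoted in the post.
v10 <- c(cols = 1050, rows = 1050)  # MoGene ST v1.0 (cartridge)
v11 <- c(cols = 990,  rows = 1190)  # MoGene ST v1.1 (GeneTitan)

cells_v10 <- prod(v10)   # matches $total of the v1.0 CEL header
cells_v11 <- prod(v11)   # matches $total of the v1.1 CEL header

# Number of cells present on v1.1 beyond the v1.0 cell count.
extra_cells <- cells_v11 - cells_v10

cat(cells_v10, cells_v11, extra_cells, "\n")
```

Any row index used on an intensity matrix therefore has to be checked against the number of rows of that particular matrix, not against the other platform's size.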
cdf affy convert • 1.0k views
@matthew-mccall-4459
Last seen 5.0 years ago
United States
Guido,

The frma and frmaTools packages use oligo (rather than AffyBatch) objects for the ST arrays, so what you're trying to do is a bit outside the intended functionality. I would also caution you against combining data from different platforms as probe behavior can change quite a bit.

That said, we can see whether there's some simple modification that could let you try out what you'd like. Can you figure out at what point in the convertPlatform function the error pops up?

Best,
Matt

--
Matthew N McCall, PhD
112 Arvine Heights
Rochester, NY 14611
Cell: 202-222-5880
Hi Matt,

Thanks for coming back on this.

First of all, I am fully aware that I am not using the preferred analysis route for Gene ST arrays (which indeed should go through e.g. oligo or XPS). But the possibilities of your function convertPlatform are so nice that I gave it a try with these arrays using the remapped CDFs (which AFAIK are valid CDFs; that is, they conform to all standards).

I decided to look at the source code of convertPlatform and execute it manually, step by step (since the code is not so long), checking the output of each line. By doing so I indeed identified the line where things go wrong. It happens at the second-to-last line of convertPlatform, i.e. exprs2[index,] <- exprs(object)[pmIndex,]

# First rename the objects according to the 'nomenclature' used in the definition of convertPlatform
# convertPlatform <- function(object, new.platform){........

> object <- affy.data
> new.platform <- "mogene10stv1mmentrezg"
> cleancdfname(cdfName(object))
[1] "mogene11stv1mmentrezgcdf"
> cdfname <- cleancdfname(cdfName(object))
> old.platform <- gsub("cdf","",cdfname)
> old.platform
[1] "mogene11stv1mmentrezg"
> map <- makeMaps(new.platform, old.platform)
> head(map)
     mogene10stv1mmentrezg mogene11stv1mmentrezg
[1,]                831891                213206
[2,]                237305                 15731
[3,]                 14720                511115
[4,]                615715                549916
[5,]                362313               1064843
[6,]               1080675                271008
> tmp <- new("AffyBatch", cdfName=new.platform)
> tmp
AffyBatch object
size of arrays=0x0 features (15 kb)
cdf=mogene10stv1mmentrezg (21225 affyids)
number of samples=0
number of genes=21225
annotation=
> pns <- probeNames(tmp)
> head(pns)
[1] "100008567_at" "100008567_at" "100008567_at" "100008567_at" "100008567_at"
[6] "100008567_at"

# check whether this identical output also occurs when a 'real' AffyBatch object (i.e. affy.data) is used as input
> head(probeNames(affy.data))
[1] "100008567_at" "100008567_at" "100008567_at" "100008567_at" "100008567_at"
[6] "100008567_at"
# yes, same output

> index <- unlist(pmindex(tmp))
> head(index)
100008567_at1 100008567_at2 100008567_at3 100008567_at4 100008567_at5
       831891        237305         14720        615715        362313
100008567_at6
      1080675
> mIndex <- match(index,map[,1])
> head(mIndex)
[1] 1 2 3 4 5 6
> pmIndex <- map[mIndex,2]
> head(pmIndex)
[1]  213206   15731  511115  549916 1064843  271008
> paste(new.platform,"cdf",sep="")
[1] "mogene10stv1mmentrezgcdf"
> env <- get(paste(new.platform,"dim",sep=""))

# check which environment is defined
> paste(new.platform,"dim",sep="")
[1] "mogene10stv1mmentrezgdim"
#

> nc <- env$NCOL
> head(nc)
[1] 1050
> nr <- env$NROW
> head(nr)
[1] 1050
> exprs2 <- matrix(nrow=nc*nr, ncol=length(object))
> dim(exprs2)
[1] 1102500      23
# Note: nr and nc are indeed the dimensions of the v1.0 (cartridge) array, as is the number of probes. See my first email.

> exprs2[index,] <- exprs(object)[pmIndex,]
Error: subscript out of bounds
>

^^^ This is where it goes wrong. I *think* this is related to the fact that the v1.1 array (GeneTitan) is rectangular...

Compare the dimensions of the newly created v1.0 expression matrix:
> dim(exprs2)
[1] 1102500      23
with those of the input v1.1 expression matrix:
> dim(exprs(object))
[1] 1178100      23
>
The number of arrays matches, but the number of probes does not...

Naively, it looks to me as if the probes of the v1.1 array that have no match on (or rather, are not present on) the v1.0 array have to be deleted...??

Thanks again for looking into this,
Guido

BTW: if needed I can send you some CEL files from both platforms.
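One way to narrow down a "subscript out of bounds" error in an assignment of the form exprs2[index,] <- exprs(object)[pmIndex,] is to test each index vector against the number of rows of the matrix it is applied to. The following is a minimal base-R sketch on toy data, not the frmaTools code: the matrices src and dest, the index values, and the helper bad_rows are all hypothetical stand-ins. In base R, a row subscript larger than the number of rows of a matrix raises exactly this error, for extraction as well as for assignment.

```r
# Toy stand-ins (hypothetical sizes, not real array data):
# 'src' plays the role of exprs(object), 'dest' the role of exprs2.
src  <- matrix(seq_len(12 * 2), nrow = 12, ncol = 2)
dest <- matrix(NA_real_, nrow = 10, ncol = 2)

# Hypothetical row indices, playing the roles of 'index' and 'pmIndex'.
index   <- c(1, 4, 11)   # 11 exceeds nrow(dest)
pmIndex <- c(3, 12, 7)   # all within nrow(src)

# Hypothetical helper: positions of indices that are NA or outside 1..n.
bad_rows <- function(i, n) which(is.na(i) | i < 1 | i > n)

bad_dest <- bad_rows(index, nrow(dest))    # problems on the left-hand side
bad_src  <- bad_rows(pmIndex, nrow(src))   # problems on the right-hand side

# Copy only the rows for which both indices are usable.
ok <- setdiff(seq_along(index), union(bad_dest, bad_src))
dest[index[ok], ] <- src[pmIndex[ok], ]
```

Applied to the real objects, the same two checks (index against nrow(exprs2), pmIndex against nrow(exprs(object))) would show which side of the assignment is at fault. Whether silently skipping the offending rows is acceptable for a platform conversion is a separate question that this sketch does not answer.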
Guido,

Thanks for the line by line results. Can you send me the map object -- the result of: map <- makeMaps(new.platform, old.platform)?

Best,
Matt
@matthew-mccall-4459
Last seen 5.0 years ago
United States
Guido,

Well, I've found the problem, but I'm not sure exactly what the solution is. The issue is that multiple probes on 1.0 are mapping to the same probe on 1.1:

> sum(duplicated(map[,1]))
[1] 0
> sum(duplicated(map[,2]))
[1] 1749

I think this may be a feature of the alternative CDF, but I'm not positive (perhaps someone else can weigh in on whether this is the case). But that is what is "breaking" the platform conversion. Sorry I couldn't be of more help.

Best,
Matt

On Fri, Jun 1, 2012 at 6:04 PM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> Hi,
> I uploaded it here:
> https://sendit.wur.nl/Download.aspx?id=cb769829-a7e5-4f7f-9311-290df518ce5d
>
> Guido
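The duplicated() check in the reply above can be extended to list which target-platform cells share a source-platform cell. The sketch below uses a small made-up two-column matrix in place of the real output of makeMaps; the column names new and old and all values are assumptions chosen for illustration only.

```r
# Toy two-column map (hypothetical values): column "new" holds cell indices
# on the target platform, column "old" the matching source-platform indices.
map <- cbind(new = c(10, 11, 12, 13, 14),
             old = c(101, 102, 101, 103, 102))

# Same counts as in the reply: duplicates per column.
n_dup_new <- sum(duplicated(map[, "new"]))
n_dup_old <- sum(duplicated(map[, "old"]))

# Flag every row whose source index occurs more than once
# (duplicated() alone would miss the first occurrence of each).
shared <- map[, "old"] %in% map[duplicated(map[, "old"]), "old"]

# Group the target indices by the source index they share.
groups <- split(map[shared, "new"], map[shared, "old"])
```

On the real map, inspecting a few of these groups together with the probe sequences might help decide whether the many-to-one mapping is an expected property of the remapped CDF, as suspected above.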