Problems using text to subset array information from an expression set


Jeff Lande ▴ 110


I have an odd problem that I cannot seem to figure out.

I have a set of CEL files in a directory, which I read using the ReadAffy() command. Then I run the rma command to preprocess.

    > Data <- ReadAffy()
    > alldata <- rma(Data)

I've done this many times before without problems. However, when I try to use text instead of numbers for subscripting, I get an error.

For example, I am able to access data from the first row and column using numeric subscripts

    > alldata[1,1]
    Expression Set (exprSet) with
        1 genes
        1 samples
        phenoData object with 1 variables and 1 cases
        varLabels
            sample: arbitrary numbering

but using text for either subscript, I get an error.

    > alldata["1007_s_at",]
    Error in alldata["1007_s_at", ] : no 'dimnames' attribute for array
    > alldata[,"AA100.CEL"]
    Error in alldata[, "AA100.CEL"] : no 'dimnames' attribute for array

I actually went through what I think was the same process last week (and many times previously) and had no problems, so I'm stumped.

Here is my session information:

    > sessionInfo()
    R version 2.1.0, 2005-04-18, ia64-unknown-linux-gnu

    attached base packages:
    [1] "tools"     "methods"   "stats"     "graphics"  "grDevices" "utils"
    [7] "datasets"  "base"

    other attached packages:
    hgu133acdf       affy reposTools    Biobase
       "1.4.3"    "1.6.7"   "1.5.19"   "1.5.12"

I must be missing something obvious, but I just can't figure out what is going wrong. Does anyone have insight into this problem?

Jeff Lande
Post-Doctoral Associate
University of Minnesota

cdf reposTools affy PROcess • 1.5k views

updated 18.1 years ago by Benilton Carvalho ★ 4.3k • written 18.1 years ago by Jeff Lande ▴ 110


Benilton Carvalho ★ 4.3k


Brazil/Campinas/UNICAMP

Isn't

    exprs(alldata)["1007_s_at", ]
    exprs(alldata)[, "AA100.CEL"]

what you want?

b

On Tue, 4 Apr 2006, Jeff Lande wrote:
> I have an odd problem that I cannot seem to figure out.
> [...]
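The suggestion above amounts to ordinary name-based indexing on the expression matrix. Here is a minimal base R sketch of that idea, assuming a small made-up matrix in place of exprs(alldata). The values and the names "1053_at" and "AA101.CEL" are invented for illustration, and Biobase is not needed to run it.

```r
# Toy stand-in for exprs(alldata): rows are probe sets, columns are arrays.
# The values and the names "1053_at" and "AA101.CEL" are invented.
expr_mat <- matrix(
  c(7.1, 8.4, 6.9, 8.8),
  nrow = 2,
  dimnames = list(c("1007_s_at", "1053_at"), c("AA100.CEL", "AA101.CEL"))
)

# Character subscripts work because the matrix carries dimnames.
by_probe <- expr_mat["1007_s_at", ]   # one probe set across all arrays
by_array <- expr_mat[, "AA100.CEL"]   # one array across all probe sets

print(by_probe)
print(by_array)
```

Note that this returns plain numeric vectors rather than an exprSet, so the phenoData does not travel with the result.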

18.1 years ago • Benilton Carvalho ★ 4.3k


Jeff: you are using a very old version of Biobase (1.5.12). If I use a current version (1.8.0), I can subset exprSets in the way you want (tested by running the example for exprSet and then subsetting using

    eset["31738_at",]

).

It might also (instead of just being an old version) be because of the way the exprSet is constructed by rma. Could you do the following:

1) Run traceback() after the error.
2) Check what the rownames/colnames are of exprs(Data) and se.exprs(Data).

I assume that se.exprs(Data) is a <0 x 0 matrix>.

/Kasper

On Apr 4, 2006, at 11:31 AM, Benilton Carvalho wrote:
> isn't
>
> exprs(alldata)["1007_s_at",]
> exprs(alldata)[, "AA100.CEL"]
>
> what you want?
> [...]
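The second check above (dimensions and row/column names of each matrix) can be sketched in base R as follows. This is only an illustration on toy matrices: describe_matrix is a hypothetical helper, not a Biobase function, and the three matrices are invented stand-ins for what exprs() and se.exprs() might return.

```r
# Hypothetical helper: summarise the dimensions of a matrix and whether it
# has row and column names.
describe_matrix <- function(m) {
  list(
    dims = dim(m),
    has_rownames = !is.null(rownames(m)),
    has_colnames = !is.null(colnames(m))
  )
}

named   <- matrix(1:4, nrow = 2, dimnames = list(c("p1", "p2"), c("s1", "s2")))
empty   <- matrix(numeric(0), nrow = 0, ncol = 0)   # a <0 x 0 matrix>
unnamed <- matrix(NA_real_, nrow = 2, ncol = 2)     # has rows, but no names

str(describe_matrix(named))
str(describe_matrix(empty))
str(describe_matrix(unnamed))
```

A matrix that has rows but no names, like the third one, is the case worth looking for, since it cannot be indexed by probe or sample name.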

18.1 years ago • Kasper Daniel Hansen ★ 6.5k


Kasper,

On traceback(), I just get

    > traceback()
    1: newdata["1007_s_at", ]

One thing I noticed when comparing expression sets that I could subscript by text with those that I could not: se.exprs was a <0 x 0 matrix> for the former and NA for the latter.

Also, there is a UNIX version that I use for processing large data sets and a PC version that I use for less memory-intensive work (I have control of updating packages, etc. with the PC version, but I don't have administrative rights on the UNIX version). I used a workaround to subset by arrays. When I assigned a new phenoData object to the subset (within the PC version), I was able to use text subscripting on the resulting expression set (code below).

    > atsarrays <- c("AA1.CEL", ..., "AA132.CEL")
    > atsmatch <- sampleNames(alldata) %in% atsarrays
    > atsdata <- alldata[,atsmatch]
    > pd <- read.phenoData("ATS_phenodata.TXT")
    > phenoData(atsdata) <- pd
    > atsdata <- new('exprSet', exprs=exprs(atsdata), phenoData = pd)

I'm still confused about why I was having trouble using text to subscript, but I seem to be able to continue with the analysis now.

Thanks,

Jeff
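The first half of the workaround above sidesteps name lookup by turning the wanted sample names into a logical mask. A minimal base R sketch of that step, assuming a toy matrix with invented sample names in place of the real expression set:

```r
# Toy matrix standing in for the expression values; all names are invented.
expr_mat <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("p1", "p2"), c("AA1.CEL", "AA2.CEL", "BB1.CEL"))
)

wanted <- c("AA1.CEL", "AA2.CEL")

# %in% yields one TRUE/FALSE per column, so the subset is selected by a
# logical vector and needs no name lookup inside `[`.
keep <- colnames(expr_mat) %in% wanted
subset_mat <- expr_mat[, keep, drop = FALSE]

print(colnames(subset_mat))
```

Because the mask is built from the names themselves, a misspelled entry in the wanted vector is silently dropped rather than raising an error, so it is worth checking sum(keep) against length(wanted).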

18.1 years ago • Jeff Lande ▴ 110


On Apr 4, 2006, at 2:33 PM, Jeff Lande wrote:
> Kasper,
>
> On traceback(), I just get
>
>> traceback()
> 1: newdata["1007_s_at", ]
>
> One thing that I noticed when trying to compare expression sets that I was
> able to use text for subscripting and those that I was not was that the
> se.exprs was a <0 x 0 matrix> for the former and NA for the latter.

This last observation is the crucial part. I have looked into the issue and the culprit is indeed the rma function. The output object from that function (in your case alldata) has an se.exprs slot that is a matrix filled with NA, with the same dimensions as the exprs slot. The code for subsetting an exprSet checks whether nrow(se.exprs) > 0 (which is true for the alldata object but not for a <0 x 0 matrix>) and, if true, proceeds to subset on that as well. That subsetting fails, because only the exprs slot of the output from rma has the relevant rownames.

I would suggest either adding the relevant rownames to the giant se.exprs slot (which would take up some space) or simply setting the se.exprs slot to be a <0 x 0 matrix>, which is clearly allowable according to the documentation for exprSet. I have cc:ed the package maintainer for affy; he will hopefully make the necessary changes (short summary: the output from the rma function is not subsettable by probe ID).

Jeff: a cleaner way to solve your problem (now and until it is fixed in the codebase) is

    dimnames(se.exprs(alldata)) = dimnames(exprs(alldata))

/Kasper
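The mechanism described above can be imitated in base R without Biobase. What follows is a simplified sketch under stated assumptions: exprs_mat and se_mat are toy stand-ins for the two slots (values and the names "1053_at" and "AA101.CEL" are invented), and subset_both is a hypothetical function that mimics the described check. It is not the actual exprSet subsetting code.

```r
# Toy stand-ins for the exprs and se.exprs slots (values and names invented).
exprs_mat <- matrix(
  c(7.1, 8.4, 6.9, 8.8), nrow = 2,
  dimnames = list(c("1007_s_at", "1053_at"), c("AA100.CEL", "AA101.CEL"))
)
se_mat <- matrix(NA_real_, nrow = 2, ncol = 2)  # same shape, no dimnames

# Hypothetical imitation of the described logic: always subset exprs, and
# also subset se.exprs whenever it has at least one row.
subset_both <- function(exprs_mat, se_mat, i) {
  out <- list(exprs = exprs_mat[i, , drop = FALSE])
  if (nrow(se_mat) > 0) out$se <- se_mat[i, , drop = FALSE]
  out
}

# Before the fix: the name lookup on the unnamed NA matrix raises an error.
before <- tryCatch(subset_both(exprs_mat, se_mat, "1007_s_at"),
                   error = function(e) e)
failed_before <- inherits(before, "error")

# Suggested fix: copy the dimnames across, after which name lookup works.
dimnames(se_mat) <- dimnames(exprs_mat)
after <- subset_both(exprs_mat, se_mat, "1007_s_at")

# Alternative: an empty matrix skips the second subset entirely.
alt <- subset_both(exprs_mat, matrix(numeric(0), 0, 0), "1007_s_at")

print(failed_before)
print(rownames(after$se))
```

The sketch also shows why numeric subscripts kept working in the original report: positional indexing never consults dimnames, so both matrices can be subset that way.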

18.1 years ago • Kasper Daniel Hansen ★ 6.5k
