Question

geneName in GEOquery package

Sean Davis 21k

@sean-davis-490

United States

On 1/12/06 1:58 PM, "Ting-Yuan Liu" <tliu at fhcrc.org> wrote:

> Hi Sean,
>
> I notice that you do some modification in GEOquery to handle the geneNames
> in the transformed exprSets. I am really glad to see this improvement,
> but I think there is still a bug in the geneNames. For example,
>
>> library(GEOquery)
>>
>> gds82 <- getGEO("GDS82")
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/data/gds/soft_gz/GDS82.soft.gz'
> ftp data connection made, file length 98375 bytes
> opened URL
> ==================================================
> downloaded 96Kb
>
> File stored at:
> /tmp/RtmpY010FQ/GDS82.soft.gz
> parsing geodata
> parsing subsets
> ready to return
>> gds82eSet <- GDS2eSet(gds82, do.log2=FALSE)
>> head(geneNames(gds82eSet), 20)
> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
> [16] "16" "17" "18" "19" "20"
>
> This is not quite right. I think you used the first column in the data
> table to be the geneNames, but I think it is supposed to be the second
> column:

Ting-Yuan,

There is a problem with using the IDENTIFIER column: its values do not need to be unique, whereas the geneNames for an exprSet must be unique. ID_REF, on the other hand, is unique and, for the typical affy GDS, contains the Affymetrix probeset IDs; that is the reason for using it rather than the IDENTIFIER column. If you know that the IDENTIFIER column IS unique and would rather use that, it is pretty simple to do so:

    geneNames(gds82eSet) <- Table(gds82)$IDENTIFIER

I hope that solves your problem.

Sean
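
Since the reply hinges on whether IDENTIFIER is unique, it may help to check that before assigning. The following is only a minimal sketch in base R, not part of GEOquery: `pick_gene_names` is a hypothetical helper, the small data frame is an invented stand-in for `Table(gds82)`, and falling back to `make.unique()` is just one assumed way to disambiguate repeated identifiers.

```r
# Toy stand-in for Table(gds82); the values are invented for illustration only.
tbl <- data.frame(
  ID_REF     = c("1", "2", "3", "4"),
  IDENTIFIER = c("TP53", "BRCA1", "TP53", "EGFR"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a GEOquery function): return the IDENTIFIER values
# unchanged if they are already unique, otherwise append suffixes to repeats.
pick_gene_names <- function(tbl) {
  ids <- as.character(tbl$IDENTIFIER)
  if (anyDuplicated(ids) == 0) {
    return(ids)
  }
  make.unique(ids)  # e.g. a repeated "TP53" becomes "TP53.1"
}

gene_names <- pick_gene_names(tbl)
print(gene_names)
```

With a real data set, the result would presumably be assigned the same way as in the reply above, i.e. `geneNames(gds82eSet) <- pick_gene_names(Table(gds82))`, assuming the table rows are in the same order as the rows of the exprSet.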

affy GLAD GEOquery • 606 views

18.3 years ago Sean Davis 21k