On 1/12/06 1:58 PM, "Ting-Yuan Liu" <tliu at fhcrc.org> wrote:
>
> Hi Sean,
>
> I notice that you made some modifications in GEOquery to handle the geneNames
> in the transformed exprSets. I am really glad to see this improvement,
> but I think there is still a bug in the geneNames. For example,
>
>> library(GEOquery)
>>
>> gds82 <- getGEO("GDS82")
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/data/gds/soft_gz/GDS82.soft.gz'
> ftp data connection made, file length 98375 bytes
> opened URL
> ==================================================
> downloaded 96Kb
>
> File stored at:
> /tmp/RtmpY010FQ/GDS82.soft.gz
> parsing geodata
> parsing subsets
> ready to return
>> gds82eSet <- GDS2eSet(gds82, do.log2=FALSE)
>> head(geneNames(gds82eSet), 20)
> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
> [16] "16" "17" "18" "19" "20"
>
> This is not quite right. I think you used the first column in the data
> table to be the geneNames, but I think it is supposed to be the second
> column:
Ting-Yuan,

There is a problem with using the IDENTIFIER column: its values do not need to
be unique, whereas the geneNames for an exprSet do. ID_REF, on the other hand,
is unique and, for the typical Affy GDS, contains Affymetrix probeset IDs; that
is the reason for using it over the IDENTIFIER column. If you know that the
IDENTIFIER column IS unique and would rather use that, it is pretty simple to
do so:
geneNames(gds82eSet) <- Table(gds82)$IDENTIFIER
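If it is unclear whether the IDENTIFIER values are unique, a quick check in base R might look like the sketch below. The `identifiers` vector is made-up stand-in data (an assumption for illustration, not the actual contents of GDS82); in practice it would be something like `as.character(Table(gds82)$IDENTIFIER)`. Falling back to `make.unique()` is just one possible way to deal with repeats, not something GEOquery does for you.

```r
# Hypothetical stand-in for as.character(Table(gds82)$IDENTIFIER)
identifiers <- c("ACTB", "GAPDH", "ACTB", "TP53")

# anyDuplicated() returns 0 when all values are unique,
# otherwise the index of the first duplicated element
if (anyDuplicated(identifiers) > 0) {
  # make.unique() appends ".1", ".2", ... to repeated values
  identifiers <- make.unique(identifiers)
}

identifiers
```

The resulting vector could then be assigned with `geneNames(gds82eSet) <- identifiers`, at the cost of suffixed names for any repeated identifiers.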
I hope that solves your problem.
Sean