Compare Row.names and ID from GSE datasets
PyPer ▴ 20
@pyper-6819
Last seen 9.5 years ago
Australia

I am annotating data from a GSE dataset. I want to check that the row.names are equivalent to ID to ensure that there are no mistakes.

library(GEOquery)
gse10072 <- getGEO('gse10072', GSEMatrix=TRUE)
g72 <- gse10072[[1]]
total <- pData(featureData(g72))

t1 <- data.frame(row.names(total))
t2 <- data.frame(total$ID)


Why is it that when I perform the comparison
identical(t1, t2)

the output is FALSE?

I'm sure it seems trivial, but I would like to compare other columns in the future. Why do two seemingly identical data.frames appear to be different?
 

 

@sean-davis-490
Last seen 3 months ago
United States

The row names are actually derived from the IDs, so there really isn't a need to check, but in case you wanted to:

> library(GEOquery)
> g72 <- getGEO('gse10072', GSEMatrix=TRUE)[[1]]
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE10nnn/GSE10072/matrix/
Found 1 file(s)
GSE10072_series_matrix.txt.gz
Using locally cached version: /var/folders/21/8t47kwys6vqb8606kdfn71780000gn/T//RtmpEmvD0e/GSE10072_series_matrix.txt.gz
Using locally cached version of GPL96 found here:
/var/folders/21/8t47kwys6vqb8606kdfn71780000gn/T//RtmpEmvD0e/GPL96.soft 
> total <- fData(g72)
> t1 <- row.names(total)
# Convert to character vector from factor!
> t2 <- as.character(total$ID)
> identical(t1,t2)
TRUE

Comparing a character vector to a factor (total$ID is a factor) makes identical() return FALSE, because the two objects have different types even when they print the same. After converting the factor to a character vector, identical() returns TRUE.
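
For anyone who wants to see this without downloading anything from GEO, here is a minimal sketch using a toy data frame in place of fData(g72). The three probe IDs and the names d1 and d2 are only illustrative assumptions, not taken from the real annotation. It also shows a second point relevant to the original question: wrapping each vector in data.frame() produces differently named columns, so the two data frames would not be identical even if the types matched.

```r
# Toy stand-in for fData(g72): ID is stored as a factor, row names are character.
ids <- c("1007_s_at", "1053_at", "117_at")
total <- data.frame(ID = factor(ids), row.names = ids)

t1 <- row.names(total)           # character vector
t2 <- total$ID                   # factor

class(t1)                        # "character"
class(t2)                        # "factor"
identical(t1, t2)                # FALSE: different types
identical(t1, as.character(t2))  # TRUE: same values once both are character

# Wrapping each vector in data.frame() gives the columns different names,
# which on its own is enough to make identical() return FALSE.
d1 <- data.frame(row.names(total))
d2 <- data.frame(total$ID)
names(d1)                        # "row.names.total."
names(d2)                        # "total.ID"
identical(d1, d2)                # FALSE

# all.equal() describes what differs instead of returning a bare FALSE.
all.equal(t1, t2)
```

When comparing other columns later, checking class() on both sides first, or calling all.equal() to see what differs, is usually quicker than guessing why identical() said FALSE.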
