Question: Compare Row.names and ID from GSE datasets
5.2 years ago, PyPer wrote:

I am annotating data from a GSE dataset. I want to check that the row.names are equivalent to ID to ensure that there are no mistakes.

library(GEOquery)
gse10072 <- getGEO('gse10072', GSEMatrix=TRUE)
g72 <- gse10072[[1]]
total <- pData(featureData(g72))

t1 <- data.frame(row.names(total))
t2 <- data.frame(total$ID)

Why is it that when I perform the comparison
identical(t1, t2)

the output is FALSE?

I'm sure it seems trivial, but I would like to compare other columns in the future. Why do two seemingly identical data.frames appear to be different?


Answer: Compare Row.names and ID from GSE datasets
5.2 years ago, Sean Davis (United States) wrote:

The row names are actually derived from the IDs, so there really isn't a need to check, but in case you wanted to:

> library(GEOquery)
> g72 <- getGEO('gse10072', GSEMatrix=TRUE)[[1]]
Found 1 file(s)
Using locally cached version: /var/folders/21/8t47kwys6vqb8606kdfn71780000gn/T//RtmpEmvD0e/GSE10072_series_matrix.txt.gz
Using locally cached version of GPL96 found here:
> total <- fData(g72)
> t1 <- row.names(total)
# Convert to character vector from factor!
> t2 <- as.character(total$ID)
> identical(t1, t2)
[1] TRUE

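identical() returns FALSE when comparing a character vector to a factor (total$ID is a factor), because it checks type and attributes as well as values: a factor is stored as integer codes with a levels attribute. After converting the factor to a character vector, identical() returns TRUE.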
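
If you want to see the effect without downloading anything, here is a minimal sketch on a made-up data frame. The `toy` object and its three probe IDs are hypothetical stand-ins for `fData(g72)`, not real output. It also suggests a second reason the original `data.frame()` comparison would fail: each single-column data frame gets a different auto-generated column name, so they differ even when the values match.

```r
# Hypothetical stand-in for fData(g72): row names mirror a factor ID column.
ids <- c("1007_s_at", "1053_at", "117_at")
toy <- data.frame(ID = factor(ids), row.names = ids)

t1 <- row.names(toy)   # character vector
t2 <- toy$ID           # factor: integer codes plus a levels attribute

# identical() compares type and attributes, not just the printed values.
vec_raw   <- identical(t1, t2)                 # character vs factor
vec_fixed <- identical(t1, as.character(t2))   # character vs character

# Wrapping each vector in data.frame() adds another difference:
# the auto-generated column names ("row.names.toy." vs "toy.ID").
d1 <- data.frame(row.names(toy))
d2 <- data.frame(toy$ID)
df_names_differ <- !identical(names(d1), names(d2))

# An element-wise check that ignores the factor/character distinction.
all_equal_values <- all(t1 == as.character(t2))
```

For comparing other columns later, converting both sides to plain character vectors first (as in `vec_fixed` or `all_equal_values`) is probably the simplest route, since it sidesteps both the factor issue and the column-name issue.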
