Question: org.Hs.eg.db output has more rows than the input; how do I combine the input and output?
2.2 years ago, dp0618 wrote:

I'd like to combine data2 with the output of select() on org.Hs.eg.db, but the output has 3 more rows than data2 (13131 vs. 13128), so cbind() fails. Do you have any suggestions to fix it?

head(data2)

   genes shp400_FC     T58A_FC      omo_FC
1   AVIL  2.730086 -1.10721312 -1.21584380
2   BAI2  3.104302 -2.17959085  0.16769160
3    CA9 -3.208643  0.03214854 -1.05244810
5   CNN3  2.121578 -0.36076018  0.53284659
6  CPNE6  4.477493 -0.39318830 -1.10612350
7 DNAH17  4.196555 -0.43432671  0.02131942

genes2 <- as.character(data2$genes)

entrez <- select(org.Hs.eg.db, keys = genes2, columns=c("ENTREZID"), 
                 keytype="SYMBOL")

head(entrez)

  SYMBOL ENTREZID
1   AVIL    10677
2   BAI2     <NA>
3    CA9      768
4   CNN3     1266
5  CPNE6     9362
6 DNAH17     8632

data2 <- cbind(data2, entrez)

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 13128, 13131
2.2 years ago, Aaron Lun (Cambridge, United Kingdom) wrote:

The extra rows are probably due to duplicate mappings between SYMBOL and ENTREZID, i.e., some gene symbols map to multiple Entrez IDs. I'm not aware of any way to coerce a 1:1 mapping from select(), though I think the development version will at least tell you if there are 1-to-many mappings. For your current problem, you can resolve this by picking the first Entrez ID for each gene symbol:

pick.first <- entrez[match(genes2, entrez$SYMBOL),]
cbind(data2, pick.first)

Alternatively, you can subset on !duplicated(entrez$SYMBOL); this should give the same result, provided the symbols in genes2 are themselves unique.
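
To make the two approaches above concrete, here is a minimal base-R sketch on toy data. The symbols GENE_A/GENE_B/GENE_C and the IDs are made up for illustration and stand in for the real data2 and the select() output; only the match() and !duplicated() idioms come from the answer.

```r
# Toy stand-ins for data2 and the select() output (hypothetical symbols and IDs).
genes2 <- c("GENE_A", "GENE_B", "GENE_C")
data2 <- data.frame(genes = genes2, FC = c(1.5, -2.0, 0.3))
entrez <- data.frame(
  SYMBOL   = c("GENE_A", "GENE_B", "GENE_B", "GENE_C"),  # GENE_B maps twice
  ENTREZID = c("101", "201", "202", NA),
  stringsAsFactors = FALSE
)

# Which symbols are responsible for the extra rows?
dup_symbols <- unique(entrez$SYMBOL[duplicated(entrez$SYMBOL)])

# Approach 1: match() returns the first hit per symbol, in the order of genes2.
pick.first <- entrez[match(genes2, entrez$SYMBOL), ]

# Approach 2: drop every repeated occurrence of a symbol.
dedup <- entrez[!duplicated(entrez$SYMBOL), ]

# Both give one row per input gene here, so cbind() no longer fails.
stopifnot(nrow(pick.first) == nrow(data2))
stopifnot(identical(pick.first$ENTREZID, dedup$ENTREZID))

combined <- cbind(data2, pick.first)
```

Inspecting dup_symbols before combining is worthwhile, since keeping the first Entrez ID is an arbitrary choice for those genes.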


Thanks! It works now

(reply written 2.2 years ago by dp0618)

mapIds() implements this and other strategies.

(reply written 2.2 years ago by Martin Morgan)
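
As a rough sketch of the mapIds() suggestion, assuming AnnotationDbi and org.Hs.eg.db are installed: mapIds() returns one value per key, and its multiVals argument controls how 1-to-many mappings are resolved ("first" mirrors the pick-the-first strategy from the answer). The six symbols are the ones shown in head(data2); the annotated data frame is only an illustration.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# The six symbols shown in head(data2) in the question.
genes2 <- c("AVIL", "BAI2", "CA9", "CNN3", "CPNE6", "DNAH17")

# One Entrez ID per symbol; multiVals = "first" keeps the first match
# when a symbol maps to several IDs. Unmapped symbols come back as NA.
entrez_first <- mapIds(org.Hs.eg.db,
                       keys = genes2,
                       column = "ENTREZID",
                       keytype = "SYMBOL",
                       multiVals = "first")

# The result is a named vector with the same length and order as genes2,
# so it can be attached as a column directly.
annotated <- data.frame(genes = genes2, ENTREZID = unname(entrez_first))
```

Because the result lines up with the input keys, the same pattern should work on the full table, e.g. by assigning the vector to a new column of data2.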