@mehmet-ilyas-cosacak-9020
Germany/Dresden/ CRTD - DZNE
Hi,
I am trying to generate a data frame like the one below from toTable(org.Dr.egENSEMBL2EG).
As an example, I want to convert the following rows into a single row:
      gene_id   ensembl_id
16939 100000783 ENSDARG00000093071
16940 100000783 ENSDARG00000103015
16941 100000783 ENSDARG00000086233
16942 100000783 ENSDARG00000099123
16943 100000783 ENSDARG00000086304
16944 100000783 ENSDARG00000086591
16945 100000783 ENSDARG00000051736
as below:
  gene_id   ensembl_id
1 100000783 "ENSDARG00000093071,ENSDARG00000103015,ENSDARG00000086233,ENSDARG00000099123,ENSDARG00000086304,ENSDARG00000086591,ENSDARG00000051736"
My code is below, but it takes a long time to generate the data.frame that I want.
library(org.Dr.eg.db)
nDf <- toTable(org.Dr.egENSEMBL2EG)
d <- duplicated(nDf[,1])
nDb <- nDf[!d,]
tmp1 <- nDf[d,]
for(i in 1:length(nDb[,1])){
  idxs <- which(tmp1[,1] == nDb[i,1])
  nDb[i,2] <- paste(nDb[i,2], paste(tmp1[c(idxs),2], collapse = ","), sep = ",")
}
best,
ilyas.
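One possible faster route, offered as a minimal sketch rather than a definitive answer, is to let base R do the grouping in one call instead of looping over every gene. The example uses a toy copy of the seven rows shown in the question so that it runs without org.Dr.eg.db; the name `collapsed` is only an illustrative choice, and with the real data `nDf` would come from toTable(org.Dr.egENSEMBL2EG) as in the post.

```r
# Toy copy of the rows from the question. With the real data, use:
#   library(org.Dr.eg.db); nDf <- toTable(org.Dr.egENSEMBL2EG)
nDf <- data.frame(
  gene_id = rep("100000783", 7),
  ensembl_id = c("ENSDARG00000093071", "ENSDARG00000103015",
                 "ENSDARG00000086233", "ENSDARG00000099123",
                 "ENSDARG00000086304", "ENSDARG00000086591",
                 "ENSDARG00000051736"),
  stringsAsFactors = FALSE
)

# Group by gene_id and paste each group's ensembl_id values together.
# The extra argument collapse = "," is passed through to paste().
collapsed <- aggregate(ensembl_id ~ gene_id, data = nDf,
                       FUN = paste, collapse = ",")

print(collapsed)
```

Because each gene_id is handled as a whole group, a gene with only one Ensembl ID should keep that single ID without a trailing comma, which the loop in the question would append.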
Thank you very much James! Sometimes I need a data.frame or an input file like the one above, e.g., for topGO: an input file with the ensembl_id in the first column and all of its go_ids in the second column. That is one of the reasons I am trying to learn a quicker way to generate a data.frame that has multiple mappings collapsed into another column.
best,
ilyas.
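For the ensembl_id to go_id case described above, a sketch along the following lines might work. It is an assumption-laden illustration: the gene names and GO IDs are made-up placeholders, not real annotations, and `map`, `gene2go`, and `out` are hypothetical names. It uses split() to build a named list (one character vector of go_ids per gene) and then collapses each vector into a single comma-separated string.

```r
# Hypothetical toy mapping; all identifiers below are placeholders.
map <- data.frame(
  ensembl_id = c("geneA", "geneA", "geneB", "geneC", "geneC", "geneC"),
  go_id = c("GO:0000001", "GO:0000002", "GO:0000001",
            "GO:0000003", "GO:0000002", "GO:0000004"),
  stringsAsFactors = FALSE
)

# Named list: one character vector of go_ids per ensembl_id.
gene2go <- split(map$go_id, map$ensembl_id)

# Two-column data.frame with all go_ids of a gene in one string.
out <- data.frame(
  ensembl_id = names(gene2go),
  go_id = unname(vapply(gene2go, paste, character(1), collapse = ",")),
  stringsAsFactors = FALSE
)

print(out)
```

The data.frame could then be written to a tab-separated file with write.table(out, file, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE); whether that layout matches what topGO expects should be checked against the topGO documentation.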