Hi, I am doing the following to get the tximport count matrix with gene names in the first column:
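library(EnsDb.Mmusculus.v79)
library(tximport)
# Transcript table from the Ensembl annotation, with a gene symbol per transcript
txdf <- transcripts(EnsDb.Mmusculus.v79, return.type = "DataFrame")
txdf$symbol <- mapIds(EnsDb.Mmusculus.v79, txdf$gene_id, "GENENAME", "GENEID")
tx2gene <- as.data.frame(txdf[, c("tx_id", "symbol")])
# Summarize Salmon transcript-level estimates to the gene (symbol) level
txi <- tximport(files, type = "salmon", tx2gene = tx2gene, ignoreTxVersion = TRUE, dropInfReps = TRUE)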
However, when I run head(txi$abundance), I get:
                   0 wt      0 wt      0 wt      0 wt      6 wt      6 wt      6 wt
               71.50353 112.29713  73.64570  73.13216  60.17879  56.01880  57.25439
0610007P14Rik   0.00000  16.73136  69.46050  60.45882  86.66511  27.10330  48.84700
0610009B22Rik   0.00000  16.34480  29.00857  26.11050   0.00000  18.28440  25.29169
I am getting an extra row at the top. Can someone help me fix this, or let me know if I am doing something wrong?
Tanya
I believe the "extra row" she means is the one directly under the column names:
71.50353 112.29713 73.64570 73.13216 60.17879 56.01880 57.25439
I did the same analysis with the same annotation this week and also have a row with no row name. I was worried there might be an off-by-one error somewhere, but my results look similar to those from another tool, so that's probably not the case. I guess it is just one tx_id that doesn't have a corresponding symbol. Or could it be many tx_ids that are collapsed to the symbol "" during summarization?
Re “many tx_ids that are collapsed to the symbol "...”: if so, you can go looking in your tx2gene for tx_ids whose symbol is empty.
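For what it's worth, here is a minimal sketch of how one might do that check. It uses a small made-up tx2gene (the IDs and symbols below are hypothetical stand-ins, not values from EnsDb.Mmusculus.v79), and it assumes a missing symbol shows up as either NA or the empty string "", which I have not verified for this annotation.

```r
# Toy stand-in for the real tx2gene table (hypothetical IDs and symbols)
tx2gene <- data.frame(
  tx_id  = c("ENSMUST0001", "ENSMUST0002", "ENSMUST0003", "ENSMUST0004"),
  symbol = c("0610007P14Rik", "", "", NA),
  stringsAsFactors = FALSE
)

# Flag transcripts whose symbol is missing (NA) or an empty string
no_symbol <- is.na(tx2gene$symbol) | tx2gene$symbol == ""

# How many transcripts would end up under a blank row name?
n_no_symbol <- sum(no_symbol)
print(n_no_symbol)

# Inspect the affected transcripts
print(tx2gene[no_symbol, ])

# One possible option: keep only transcripts that have a symbol
tx2gene_clean <- tx2gene[!no_symbol, ]
```

If the count is large on the real table, that would suggest many transcripts are being summed into the blank row. Whether to filter those rows out of tx2gene, or instead summarize by gene_id and attach symbols afterwards, is a judgment call that depends on the analysis.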