Hello there,
I have a count file containing 50 human samples. The reads were mapped with STAR using the GENCODE GTF file, and featureCounts was used to generate the count data. So the rownames in my file are Ensembl gene IDs, which look like: "ENSG00000231251.1" "ENSG00000236335.1" "ENSG00000231949.1" "ENSG00000162510.5"
There are 61471 in total. How can I convert these IDs to RefSeq IDs for edgeR analysis? I have tried several ways but failed each time.
`gids=mapIds(org.Hs.eg.db,keys = rownames(y1),keytype = 'ENSEMBL',column = "SYMBOL")` fails with:
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.
But the rownames are the Ensembl IDs!
`idfound=y1$genes$genes %in% mappedkeys(org.Hs.egENSEMBL)` gives only 33 matches.
`gids=y1$genes$genes %in% mappedkeys(org.Hs.egREFSEQ)` also gives only 33 matches.
y1 is the DGEList object and y1$genes$genes holds the Ensembl IDs shown above. Can anyone help with this? Thanks in advance. Best Regards Zillur
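A possible cause, offered as an assumption rather than a confirmed diagnosis: GENCODE gene IDs carry a version suffix (the ".1" or ".5" after the dot), whereas the ENSEMBL keys in org.Hs.eg.db are unversioned, so the versioned rownames are rejected as invalid keys. Note also that the `mapIds` call above requests `column = "SYMBOL"` although the goal stated is RefSeq IDs. A minimal sketch that strips the suffix and then asks for the `REFSEQ` column; the variable names `ids`, `ids_clean` and `refseq` are illustrative only, and `multiVals = "first"` is just one of several ways to handle genes with more than one RefSeq accession:

```r
# Example IDs copied from the question (GENCODE style, with version suffix).
ids <- c("ENSG00000231251.1", "ENSG00000236335.1",
         "ENSG00000231949.1", "ENSG00000162510.5")

# Drop everything from the first dot onward to get unversioned Ensembl gene IDs.
ids_clean <- sub("\\..*$", "", ids)

# Look up RefSeq accessions only if the annotation packages are installed.
# Genes without a RefSeq entry come back as NA.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  refseq <- AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db,
                                  keys = ids_clean,
                                  keytype = "ENSEMBL",
                                  column = "REFSEQ",
                                  multiVals = "first")
}
```

With a DGEList, the same substitution would be applied to `rownames(y1)` or `y1$genes$genes` before calling `mapIds`. Many GENCODE genes (for example lncRNAs and pseudogenes) may have no RefSeq or Entrez entry at all, so a substantial number of NA values would not be surprising.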
Thank you very much for your quick response. Your suggestions helped me a lot, and I have managed to overcome this problem. I am now facing another one: I am getting totally opposite results when I use glmQLFit and glmQLFTest in place of glmFit and glmLRT for the same contrasts. Which method should I use to test for differential expression between two groups? I assume the latter, according to the user guide. But why am I getting opposite results? Best regards Zillur
If you want to ask something new, then start a new question rather than adding a comment to an old question. I can tell you, though, that glmQLFTest and glmLRT do not give opposite results, so if you do post a question, you will need to give much more detail about what is bothering you than you have here.