We have been using DESeq2 on our RNA-seq data to look for differential expression of genes and it works well.
One issue that keeps cropping up is the assignment of Ensembl IDs to rows in results(dds): we frequently get multiple IDs in a single row, for example:
ENSG00000001084+ENSG00000231683 6.325517e+02 -0.2914254001 0.10554200 7.586041e+00 5.882200e-03 0.0813763865
Obviously this interferes with annotation, so I have split these row names on "+" and annotated both IDs with gene names etc.
However, with many of them per dataset, I wondered how best to handle them quickly and easily. I know I could check each one manually, but with 100 or so like this per dataset that does not seem a great use of time.
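For concreteness, the split on "+" described above could be sketched in base R roughly as follows. This is only an illustration under assumptions: `res` is a toy stand-in for as.data.frame(results(dds)), its second row is made up, and the column names `feature_id`, `ensembl_id` and `n_merged` are hypothetical names chosen for this example.

```r
# Toy stand-in for as.data.frame(results(dds)).
# First row is taken from the example above; the second row is invented.
res <- data.frame(
  baseMean       = c(632.5517, 100),
  log2FoldChange = c(-0.2914254001, 1.5),
  row.names      = c("ENSG00000001084+ENSG00000231683", "ENSG00000000003"),
  stringsAsFactors = FALSE
)

# Split each row name on the literal "+" (fixed = TRUE avoids regex handling)
ids <- strsplit(rownames(res), "+", fixed = TRUE)

# Build one row per individual Ensembl ID, keeping the original merged label
# and the number of IDs that shared the row, so merged features stay traceable
long <- data.frame(
  feature_id = rep(rownames(res), lengths(ids)),
  ensembl_id = unlist(ids),
  n_merged   = rep(lengths(ids), lengths(ids)),
  stringsAsFactors = FALSE
)

# Attach the statistics of the original row to every ID derived from it
long <- cbind(long, res[long$feature_id, , drop = FALSE])
rownames(long) <- NULL

# Rows that came from a merged feature can then be pulled out for review
merged_only <- long[long$n_merged > 1, ]
```

Keeping `feature_id` and `n_merged` alongside each split ID means the merged rows can be annotated like any other row but still filtered or flagged later, rather than checked one by one.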
Searching around, I have seen little posted about it, although one post suggested just ignoring them, which seems strange.
How do people handle this? Thanks in advance for any advice.