Hi,
I've been following the Beginner's guide to using the DESeq2 package to analyze some RNA-seq data and to make heatmaps. However, I can't figure out how to label the rows in the heatmap with gene names as opposed to Ensembl IDs. I've tried to add gene names/HGNC symbols to the rlog-transformed values using biomaRt, but I was unsuccessful. Here is the code from the guide that I have been using to generate heatmaps.
library( "genefilter" )    # rowVars()
library( "gplots" )        # heatmap.2()
library( "RColorBrewer" )  # brewer.pal()
topVarGenes <- head( order( rowVars( assay(rld) ), decreasing=TRUE ), 35 )
heatmap.2( assay(rld)[ topVarGenes, ], scale="row",
           trace="none", dendrogram="column",
           col = colorRampPalette( rev(brewer.pal(9, "RdBu")) )(255))
Is there an easy way to do this? I would greatly appreciate any advice on how to get the gene names onto the heatmaps. Thanks.
Thank you. I really appreciate your help.
However, I'm getting the following error:
> row.names(mat)[match(gns[,1], row.names(mat))] <- gns[,2]
Error in row.names(mat)[match(gns[, 1], row.names(mat))] <- gns[, 2] :
NAs are not allowed in subscripted assignments
I'm not sure how to convert the data in gns to characters, as you mentioned. I'm fairly new to R.
Let this be a lesson to you about taking advice from random strangers on the intertubes ;-D But this is a really good way to learn R, by decomposing the code and seeing where it went wrong. You will have to learn how to do this, if you use R much at all, because it is not possible to always write perfect code that works the first time.
The goal was to map the Ensembl IDs to HUGO symbols, and then replace the Ensembl IDs with those symbols (at least for the IDs for which we got an Ensembl -> HUGO mapping). The last line was intended to do the replacement:
row.names(mat)[match(gns[,1], row.names(mat))] <- gns[,2]
But note that the call to getBM() was
And if we look at that output, we get this:
We ask getBM() to return the original Ensembl Gene ID as well as the HUGO symbols because the return object isn't necessarily in the same order as the data we sent to the Biomart server, and we want to ensure that we get the correct mapping between Ensembl Gene ID and symbol.
So let's deconstruct the code that didn't work. At a high level what we are doing is
where we add in this business
because we know that the gns object isn't in the same order as the original row.names of your matrix, so we use both columns of the gns object to do the Ensembl Gene ID -> HUGO mapping. However, I made a mistake: the second column of the gns data.frame contains the Ensembl IDs, and the first column contains the HUGO symbols. So when we try to match() gns[,1] with the row.names of the matrix, we get all NA values, and R refuses to assign into NA subscripts. We instead needed to match() gns[,2] with the row.names of the matrix, and assign gns[,1]:
row.names(mat)[match(gns[,2], row.names(mat))] <- gns[,1]
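If it helps to see the whole thing in one place, here is a small self-contained sketch of the same decomposition on toy data. The matrix and the gns data.frame below are made-up stand-ins (three genes, two samples), not your real rld values or a real getBM() result, and the column names hgnc_symbol and ensembl_gene_id are just my assumption about what the query returned. It only uses base R, so you can paste it into a fresh session.

```r
# Toy stand-in for assay(rld)[topVarGenes, ]: row names are Ensembl gene IDs.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("ENSG00000141510",
                                "ENSG00000012048",
                                "ENSG00000139618"),
                              c("sample1", "sample2")))

# Toy stand-in for the getBM() result (assumed layout): symbols in column 1,
# Ensembl IDs in column 2, rows in a different order from mat, and one ID
# (ENSG00000012048) with no mapping returned at all.
gns <- data.frame(
  hgnc_symbol     = c("BRCA2", "TP53"),
  ensembl_gene_id = c("ENSG00000139618", "ENSG00000141510"),
  stringsAsFactors = FALSE
)

# The mistake: matching symbols against Ensembl row names finds nothing,
# so every index is NA and the assignment would fail.
bad_idx <- match(gns[, 1], row.names(mat))
print(bad_idx)

# The fix: match on the Ensembl ID column, then assign the symbol column.
idx <- match(gns[, 2], row.names(mat))
row.names(mat)[idx] <- gns[, 1]
print(row.names(mat))
```

Printing bad_idx and idx before doing the assignment is exactly the kind of check that would have caught my column mix-up. Note that the gene without a mapping simply keeps its Ensembl ID as its row name.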
Does that make sense?
Yup, I went through a bunch of shenanigans trying to convert the gns to characters and in the end saw that the columns had been switched...
Thanks again stranger :)
By the way, the alternative org package worked. Thank you for your help!