Question: Adding gene names to rlog transformations for heatmaps?
2.7 years ago, alchemist4au (United States) wrote:

Hi,

I've been following the Beginner's guide to using the DESeq2 package to analyze some RNA-seq data and to make heatmaps. However, I can't figure out how to label the rows in the heatmap with gene names instead of Ensembl IDs. I tried to attach gene names (HGNC symbols) to the rlog-transformed values using biomaRt, but without success. Here is the code from the guide that I have been using to generate heatmaps.

library( "genefilter" )    # rowVars()
library( "gplots" )        # heatmap.2()
library( "RColorBrewer" )  # brewer.pal()
topVarGenes <- head( order( rowVars( assay(rld) ), decreasing=TRUE ), 35 )

heatmap.2( assay(rld)[ topVarGenes, ], scale="row",
trace="none", dendrogram="column",
col = colorRampPalette( rev(brewer.pal(9, "RdBu")) )(255))

Is there an easy way to do this? I would greatly appreciate any advice on how to get the gene names onto the heatmaps. Thanks.

2.7 years ago, James W. MacDonald (United States) wrote:

You should be able to use biomaRt. It would have been helpful if you had shown the code that was unsuccessful. Something like

library(biomaRt)
mat <- assay(rld)[ topVarGenes, ]
mart <- useMart("ensembl","hsapiens_gene_ensembl") ## assuming human, as you don't say
gns <- getBM(c("hgnc_symbol","ensembl_gene_id"), "ensembl_gene_id", row.names(mat), mart)
row.names(mat)[match(gns[,1], row.names(mat))] <- gns[,2]

Note that you might need to convert the data in 'gns' to character for match() to work correctly.
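
In case that conversion is needed, here is a minimal sketch of one way to do it. The two-row table is a made-up stand-in for real getBM() output, and stringsAsFactors = TRUE is only there to mimic a setup in which strings arrive as factors; neither is taken from the poster's data.

```r
# Toy stand-in for getBM() output (assumption: columns arrive as factors)
gns <- data.frame(hgnc_symbol = c("USP2", "PTGER3"),
                  ensembl_gene_id = c("ENSG00000036672", "ENSG00000050628"),
                  stringsAsFactors = TRUE)

# Find the factor columns and convert only those to character
is_fac <- vapply(gns, is.factor, logical(1))
gns[is_fac] <- lapply(gns[is_fac], as.character)

# Inspect the column types after conversion
str(gns)
```

Running str() before and after the conversion is a quick way to check whether the step was needed at all.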

Alternatively, you can just use an org package:

library(org.Hs.eg.db)
gns <- select(org.Hs.eg.db, row.names(mat), "SYMBOL", "ENSEMBL")

And match as above, keeping in mind that select() returns the key column (ENSEMBL) first and the SYMBOL column second.
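
As a rough illustration of the matching step with select()-style output, the sketch below uses toy data only. The matrix values, the sample names s1 and s2, and the ID ENSG00000999999 are invented for the example, and the NA symbol stands in for an ID that has no mapping. Looking up each row name in the mapping (rather than the other way round) also sidesteps trouble when one Ensembl ID maps to several symbols, since match() takes the first hit.

```r
# Toy matrix with Ensembl IDs as row names (all values are made up)
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("ENSG00000050628", "ENSG00000036672",
                                "ENSG00000999999"),
                              c("s1", "s2")))

# Toy stand-in for select() output: key column first, then SYMBOL.
# The last (invented) ID has no symbol, represented here as NA.
gns <- data.frame(ENSEMBL = c("ENSG00000036672", "ENSG00000050628",
                              "ENSG00000999999"),
                  SYMBOL = c("USP2", "PTGER3", NA),
                  stringsAsFactors = FALSE)

# For each row name, fetch the symbol of the first matching mapping row
sym <- gns$SYMBOL[match(row.names(mat), gns$ENSEMBL)]

# Keep the original Ensembl ID wherever no symbol was found
row.names(mat) <- ifelse(is.na(sym), row.names(mat), sym)

print(row.names(mat))
```

Falling back to the Ensembl ID is just one choice; dropping unmapped rows before plotting would be another.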


Thank you. I really appreciate your help.

However, I'm getting the following error:

> row.names(mat)[match(gns[,1], row.names(mat))] <- gns[,2]
Error in row.names(mat)[match(gns[, 1], row.names(mat))] <- gns[, 2] :
  NAs are not allowed in subscripted assignments

I'm not sure how to convert the data in gns to character, as you mentioned. I'm fairly new to R.

Reply written 2.7 years ago by alchemist4au

Let this be a lesson to you about taking advice from random strangers on the intertubes ;-D But this is a really good way to learn R, by decomposing the code and seeing where it went wrong. You will have to learn how to do this, if you use R much at all, because it is not possible to always write perfect code that works the first time.

The goal was to map the Ensembl IDs to HUGO symbols, and then replace the Ensembl IDs with those symbols (at least for the IDs where we got an Ensembl -> HUGO mapping). The last line was intended to do the replacement:

row.names(mat)[match(gns[,1], row.names(mat))] <- gns[,2]

But note that the call to getBM() was

gns <- getBM(c("hgnc_symbol","ensembl_gene_id"), "ensembl_gene_id", row.names(mat), mart)

And if we look at that output, we get this:

> head(gns)
  hgnc_symbol ensembl_gene_id
1        USP2 ENSG00000036672
2      PTGER3 ENSG00000050628
3        BCL3 ENSG00000069399
4       NEDD4 ENSG00000069869
5      RNF126 ENSG00000070423
6        PAK3 ENSG00000077264

We ask getBM() to return the original Ensembl Gene ID as well as the HUGO symbols because the return object isn't necessarily in the same order as the data we sent to the Biomart server, and we want to ensure that we get the correct mapping between Ensembl Gene ID and symbol.

So let's deconstruct the code that didn't work. At a high level what we are doing is

row.names(mat) <- gns[,2]

where we add in this business

[match(gns[,1], row.names(mat))]

because we know that the gns object isn't in the same order as the original row.names of your matrix, so we use both columns of the gns object to do the Ensembl Gene ID -> HUGO mapping. However, I made a mistake; the second column of the gns data.frame contains the Ensembl IDs, and the first column contains the HUGO symbols. So when we try to match() gns[,1] with the row.names of the matrix, we get all NA values. We instead needed to match() gns[,2] with the row.names of the matrix:

row.names(mat)[match(gns[,2], row.names(mat))] <- gns[,1]
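
To see the difference between the two match() calls without a trip to the Biomart server, here is a small sketch on toy data. The matrix values and the sample names s1 and s2 are made up; the two ID/symbol pairs are taken from the head(gns) output above, deliberately listed in a different order from the matrix rows.

```r
# Toy matrix: Ensembl IDs as row names, in a different order than gns
mat <- matrix(1:4, nrow = 2,
              dimnames = list(c("ENSG00000050628", "ENSG00000036672"),
                              c("s1", "s2")))

# Same column layout as the getBM() call: symbol first, Ensembl ID second
gns <- data.frame(hgnc_symbol = c("USP2", "PTGER3"),
                  ensembl_gene_id = c("ENSG00000036672", "ENSG00000050628"),
                  stringsAsFactors = FALSE)

# Symbols never occur among the Ensembl row names, so every lookup is NA;
# using this as an assignment index is what triggers the error
idx_wrong <- match(gns[, 1], row.names(mat))

# Ensembl IDs do occur, so this gives the row position of each mapping
idx_right <- match(gns[, 2], row.names(mat))

# Replace the matched row names with the corresponding symbols
row.names(mat)[idx_right] <- gns[, 1]

print(idx_wrong)
print(idx_right)
print(row.names(mat))
```

If you would rather leave the matrix row names alone, heatmap.2() also takes a labRow argument, so a vector of symbols in row order could be passed there instead.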

Does that make sense?

Reply written 2.7 years ago by James W. MacDonald

Yup, I went through a bunch of shenanigans trying to convert the gns to characters and in the end saw that the columns had been switched...

Thanks again stranger :)

Reply written 2.7 years ago by alchemist4au

By the way, the alternative org package worked. Thank you for your help!

 

Reply written 2.7 years ago by alchemist4au