Question: Heatmap of a subset of genes
giroudpaul wrote, 3.6 years ago:


I am trying to reproduce this kind of figure (chemokines differentially regulated in the five different cell types, on top):


I would like to do the same, first with just the M1 and M2 conditions, and eventually more. I would also like to be able to choose the genes shown in this heatmap (if you have any advice on how to do this, I'll take it!).

So for now I have:

  1. The AffyBatch from the CEL files
  2. The annotated expression set normalized with rma()
  3. The MArrayLM from lmFit(data.rma, design)
  4. The MArrayLM from eBayes(), with pairwise comparisons (contrasts)

From which one should I build the matrix for heatmap(), and how?

I want to focus on genes coding for plasma membrane proteins. Is there a way to do this?

Thanks for your help!

Answer: Heatmap of a subset of genes
chris86 (UCL, United Kingdom) wrote, 3.6 years ago:

So you want to work with the normalised expression matrix. I normally start with a data frame, since they are easier to sort and subset. You can convert between the two using as.data.frame() and as.matrix(), but it may be easier just to write the matrix to a file and read it back in; that is what I do after neqc() normalisation.
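
A minimal sketch of both routes, assuming a small simulated matrix rather than real array data (`expr_mat`, the sample names, and the temporary file are illustrative and not from the original post):

```r
# Simulated normalised expression matrix: 5 genes x 4 samples (illustrative data)
set.seed(1)
expr_mat <- matrix(rnorm(20, mean = 8), nrow = 5,
                   dimnames = list(paste0("gene", 1:5),
                                   c("M1_a", "M1_b", "M2_a", "M2_b")))

# Route 1: convert directly between matrix and data frame
expr_df  <- as.data.frame(expr_mat)
back_mat <- as.matrix(expr_df)

# Route 2: write the matrix to disk, then read it back as a data frame
tmp <- tempfile(fileext = ".txt")
write.table(expr_mat, tmp, sep = "\t", quote = FALSE, col.names = NA)
expr_df2 <- read.delim(tmp, row.names = 1, check.names = FALSE)
```

Either way the gene identifiers end up as row names, which is what makes the later subsetting by gene straightforward.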

I always have an annotation file (data frame), in the same order as the columns of the matrix or data frame. It can be used to add annotation to the heatmap, such as the M1 and M2 conditions you describe. I prefer the aheatmap function from NMF: it is easier to use than heatmap.2, and I think the output looks better. People normally select a subset of genes to plot, such as the most variable genes or those differentially expressed according to limma.
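
As a rough illustration of the two ideas in this paragraph (keeping the annotation aligned with the matrix columns, and picking the most variable genes), here is a sketch on simulated data. The object names (`expr_mat`, `annot`, `plot_mat`), the sample names, and the choice of 5 genes are assumptions made for the example:

```r
set.seed(2)
expr_mat <- matrix(rnorm(60, mean = 8), nrow = 10,
                   dimnames = list(paste0("gene", 1:10),
                                   c("s1", "s2", "s3", "s4", "s5", "s6")))

# Annotation data frame, deliberately in a different order from the matrix columns
annot <- data.frame(condition = c("M2", "M1", "M2", "M1", "M1", "M2"),
                    row.names = c("s4", "s1", "s5", "s2", "s3", "s6"))

# Reorder the annotation so that row i describes column i of the matrix
annot <- annot[colnames(expr_mat), , drop = FALSE]
stopifnot(identical(rownames(annot), colnames(expr_mat)))

# Keep the 5 most variable genes (row-wise variance)
gene_var <- apply(expr_mat, 1, var)
top_var  <- names(sort(gene_var, decreasing = TRUE))[1:5]
plot_mat <- expr_mat[top_var, ]

# With the NMF package installed, the annotation could then be passed as
# column annotation (left commented out so the sketch runs with base R only):
# NMF::aheatmap(plot_mat, annCol = annot)
```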

You can easily subset the genes, which should be the row names of your data frame. I would do subsetteddf <- subset(newdf, rownames(newdf) %in% listofgenes) to get a subsetted data frame. Then you can convert it back to a matrix for plotting.
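
A small self-contained sketch of that subsetting step, reusing the names `newdf`, `listofgenes`, and `subsetteddf` from the text. The gene symbols and values are made up for illustration, and bracket indexing is used here as an equivalent alternative to subset():

```r
set.seed(3)
# Illustrative data frame: genes as row names, samples as columns
newdf <- as.data.frame(matrix(rnorm(24, mean = 8), nrow = 6,
                              dimnames = list(c("CCL2", "CCL5", "CXCL10",
                                                "ACTB", "GAPDH", "TNF"),
                                              c("M1_a", "M1_b", "M2_a", "M2_b"))))

# Genes of interest; the last one is deliberately absent from the data
listofgenes <- c("CCL2", "CXCL10", "TNF", "NOT_ON_ARRAY")

# Keep only rows whose name is in the list (drop = FALSE keeps a data frame)
subsetteddf <- newdf[rownames(newdf) %in% listofgenes, , drop = FALSE]

# Convert back to a numeric matrix for plotting
subsettedmat <- as.matrix(subsetteddf)

# Requested genes that were not found, worth checking before plotting
missing_genes <- setdiff(listofgenes, rownames(newdf))
```

Using %in% keeps the rows in their original order and silently skips genes that are not present, so checking `missing_genes` is a cheap safeguard.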

modified 3.6 years ago • written 3.6 years ago by chris86

So correct me if I'm wrong:

data.rma <- rma(data) → lmFit(data.rma, design) → eBayes(…)

And using topTable, I extract expression values for M1 and M2, then do the subset, and that's it?

However, if I do this I get expression values. Is it OK to plot these, or do I have to transform them into something else (log? But it won't be logFC, since I don't compare to another condition)?

written 3.6 years ago by giroudpaul

You don't want to extract any expression values or fold changes from limma. Take the normalised intensities from whatever normalisation function you use. Then get the genes that are DE from limma separately. Then select those genes in the data frame and run aheatmap, heatmap.2, or whatever you have.
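
The separation described above could look roughly like this. It is a sketch on simulated data, assuming the limma package is installed; in the thread's setting `expr` would be the normalised matrix (for example exprs(data.rma)), and the group labels, cutoffs, and spiked-in genes are assumptions for the example:

```r
library(limma)

set.seed(4)
group <- factor(rep(c("M1", "M2"), each = 3))

# Simulated log2 expression: 200 genes x 6 samples (stand-in for normalised data)
expr <- matrix(rnorm(200 * 6, mean = 8, sd = 0.3), nrow = 200,
               dimnames = list(paste0("gene", 1:200),
                               paste0(group, "_", rep(1:3, 2))))
# Make the first 10 genes clearly higher in M1
expr[1:10, group == "M1"] <- expr[1:10, group == "M1"] + 3

# limma is used only to decide WHICH genes are differentially expressed
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
fit  <- lmFit(expr, design)
fit2 <- eBayes(contrasts.fit(fit, makeContrasts(M1 - M2, levels = design)))
tt   <- topTable(fit2, coef = 1, number = Inf, p.value = 0.05, lfc = 1)
de_genes <- rownames(tt)

# The values to plot come from the normalised matrix, not from limma's output
plot_mat <- expr[de_genes, , drop = FALSE]
```

`plot_mat` is then what would be handed to aheatmap or heatmap.2.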

written 3.6 years ago by chris86

Yeah, thanks. So I extracted the rows of exprs(data.rma) corresponding to the genes I identified with limma.

I tried heatmap, heatmap.2 and aheatmap, and I also find that aheatmap is the easiest and looks better than the other two.

Two questions, though. Should I scale the data if I only take upregulated genes? Using scale = "row" makes one condition (the upregulated one) all red, with Z-scores of about +1.5, and the other all white, with Z-scores of about -1.5. Doesn't that make it confusing? These genes are not downregulated in the second condition, they're just less expressed. Or is it the same thing? On the other hand, normalized expression values may not be the most informative thing to plot, right?

Also, how do you control the row label size? It seems that cexRow does nothing.


modified 3.6 years ago • written 3.6 years ago by giroudpaul

I don't think there is a single right way to do the scaling. I scale each row, but I think the results should be similar either way. I have never found the row label size to be a problem with aheatmap; there may be more options in the development version on GitHub. If not, you will have to edit the code yourself.
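
To make the scaling question concrete, here is a small sketch with two made-up genes (values and names are assumptions for illustration). It compares row z-scores with plain row centring:

```r
# Two illustrative genes, both higher in M1 than in M2 (log2 scale), 3 samples each
m <- rbind(geneA = c(10, 10, 10, 8, 8, 8),
           geneB = c(6.5, 6.5, 6.5, 6, 6, 6))
colnames(m) <- c("M1_1", "M1_2", "M1_3", "M2_1", "M2_2", "M2_3")

# Row z-scores: subtract each gene's mean, then divide by its standard deviation
z <- t(scale(t(m)))

# Row centring only: subtract each gene's mean, keep differences in log2 units
centred <- m - rowMeans(m)

print(round(z, 2))
print(centred)
```

After z-scoring, both genes show the same positive/negative pattern even though geneA differs by 2 log2 units between conditions and geneB by only 0.5. The colours therefore mean "above or below this gene's own average", not up- or downregulated in an absolute sense. Centring without dividing by the standard deviation keeps the size of the difference visible, at the cost of letting high-variance genes dominate the colour range.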

written 3.6 years ago by chris86

Yes, the devel version (0.22 or higher) is way better than the Bioconductor one (0.20).

However, I had a hard time installing it on Windows (you need to install Rtools first).

written 3.6 years ago by giroudpaul
Answer: Heatmap of a subset of genes
caroline wrote, 3.0 years ago:

The chemokine family of proteins has a broad, diverse functional repertoire. However, its structural variation is narrow. Chemokines are small (8-10 kDa), secreted single polypeptide chains 70-100 residues long. Across the family, the proteins have 20-95% amino acid sequence identity (including conserved cysteine residues), and new members are continuing to be identified at a rapid pace;

modified 3.0 years ago by Gordon Smyth • written 3.0 years ago by caroline