How to adjust the Heatmap.2 color key


Martin Bonke ▴ 40

@martin-bonke-2901

Last seen 11.4 years ago

Dear all,

I'm a postdoc at the University of Helsinki and I'm currently in the middle of analysing a large microarray data set. A couple of months ago I made the jump from GeneSpring to R and, although the learning curve has been somewhat steep, I'm quite happy that I did.

Right now I'm using heatmap.2 to make heatmaps of the gene lists I've generated. In general I'm quite happy with the results, but in several of them I'm having trouble with the color coding. My data has been normalized to control experiments to get a factor of up- or downregulation (experiment values are divided by control values), and in general genes are downregulated somewhat more strongly than they are upregulated. For example, the strongest downregulated gene could be at -8 fold while the strongest upregulated gene is at +5 fold. The distribution then runs from -8 to +5, which puts the middle of the color key that heatmap.2 assigns automatically at -1.5. As a result, genes that are not really affected by my experiments (and thus have a 0 fold difference from the control experiment) fall in a slightly green zone of the color key. This makes visual identification of interesting gene clusters a lot more difficult.

So my question to you all is: is there a way to tell heatmap.2 which colors should be assigned to which level of expression? I've thought about checking each matrix for the strongest up- and downregulated values and then forcing the data to max out at whichever of the two is lowest, but that would be a lot of work, and I would have to duplicate all the data in order to preserve the original values. If there is a better way, I'll gladly hear it.

My thanks in advance.

Best,
Martin Bonke

Microarray GeneSpring • 17k views

ADD COMMENT • link 17.6 years ago Martin Bonke ▴ 40


Benjamin Otto ▴ 830

@benjamin-otto-1519

Last seen 11.4 years ago

Hi Martin,

I would define my own color sequence. For example, if the maximum log ratio in your table is 5 and the minimum is -8, you first have to decide how many color steps you want. Let me assume you use RColorBrewer to choose a color palette. You can check the range of your data with range(mytable), where mytable stands for whatever your table is called. Then you could do:

> library(gplots)
> library(RColorBrewer)
> mycol <- c(brewer.pal(8,"Greens"),"black",brewer.pal(5,"Reds")[5:1])
> heatmap.2(mytable, col=mycol)

Regards,
Benjamin

(In reply to Martin Bonke's message "[BioC] How to adjust the Heatmap.2 color key", sent to bioconductor at stat.math.ethz.ch on Wednesday, July 09, 2008 12:21 PM.)

Bioconductor mailing list: Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Mandatory disclosures under the German Act on Electronic Commercial Registers, Registers of Cooperatives and the Company Register (EHUG): Universitätsklinikum Hamburg-Eppendorf, Körperschaft des öffentlichen Rechts (public-law corporation). Place of jurisdiction: Hamburg. Board members: Prof. Dr. Jörg F. Debatin (chairman), Dr. Alexander Kirstein, Ricarda Klein, Prof. Dr. Dr. Uwe Koch-Gromus
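The color-sequence idea above can be sketched with an explicit set of breaks, so that the black step is guaranteed to straddle zero. This is only a minimal sketch, not Benjamin's exact code: it assumes log ratios between -8 and +5, swaps brewer.pal for base R's colorRampPalette to avoid an extra dependency, and the names mytable, n_down, n_up and mybreaks are made up for illustration.

```r
# Hypothetical example data: log ratios between -8 and +5.
set.seed(1)
mytable <- matrix(runif(200, min = -8, max = 5), nrow = 20)

n_down <- 8   # number of color steps below zero (assumed)
n_up   <- 5   # number of color steps above zero (assumed)

# Bright green at the strongly downregulated end, darkening towards black,
# and the mirror image in red on the upregulated side.
greens <- colorRampPalette(c("green", "darkgreen"))(n_down)
reds   <- colorRampPalette(c("darkred", "red"))(n_up)
mycol  <- c(greens, "black", reds)

# One-unit bins; the black bin covers [-0.5, 0.5] and so contains zero.
mybreaks <- seq(-n_down - 0.5, n_up + 0.5, by = 1)

# heatmap.2 needs exactly one more break than colors.
stopifnot(length(mybreaks) == length(mycol) + 1)

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(mytable, col = mycol, breaks = mybreaks, trace = "none")
}
```

Changing n_down and n_up changes how many steps each side gets without moving the black bin away from zero.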

ADD COMMENT • link 17.6 years ago Benjamin Otto ▴ 830


Hi Martin,

One of the many(!) arguments to heatmap.2 is breaks; see ?heatmap.2 for the explanation. I also tried what Benjamin suggested, but breaks seems to make a smoother color scale. Here's how I use it:

> max(heatdata)
[1] 9.167324
> min(heatdata)
[1] -6.469931
> pairs.breaks <- c(seq(-7, 0, length.out=50),seq(0, 9.2, length.out=50))
> mycol <- colorpanel(n=99,low="green",mid="black",high="red")
> heatmap.2(heatdata, breaks=pairs.breaks, col=mycol)

Cheers,
Jenny

(In reply to Benjamin Otto's message of 06:33 AM 7/9/2008.)

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML, 1201 W. Gregory Dr., Urbana, IL 61801, USA
ph: 217-244-7355 fax: 217-265-5066 e-mail: drnevich at illinois.edu
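One detail of the breaks vector above is that 0 appears twice, which leaves a zero-width bin for the middle color. A possible variant, assuming the data span both sides of zero, builds the two halves separately and keeps a single break at zero. The sample matrix and the names n_side, zero_breaks and zero_col are assumptions made for this sketch, and base R's colorRampPalette stands in for colorpanel.

```r
# Hypothetical example data with values on both sides of zero.
set.seed(2)
heatdata <- matrix(rnorm(300, sd = 3), nrow = 30)

n_side <- 50                      # color bins on each side of zero (assumed)
lo <- floor(min(heatdata))        # lower end of the scale
hi <- ceiling(max(heatdata))      # upper end of the scale
stopifnot(lo < 0, hi > 0)

# head(..., -1) drops the trailing 0 of the first sequence,
# so zero occurs exactly once in the combined vector.
zero_breaks <- c(head(seq(lo, 0, length.out = n_side + 1), -1),
                 seq(0, hi, length.out = n_side + 1))

# Green fading to black below zero, black brightening to red above zero.
zero_col <- c(colorRampPalette(c("green", "black"))(n_side),
              colorRampPalette(c("black", "red"))(n_side))

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(heatdata, breaks = zero_breaks, col = zero_col,
                    trace = "none")
}
```

The bins on the two sides still differ in width whenever lo and hi are not symmetric, exactly as in the original call; only the duplicated break is removed.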

ADD REPLY • link 17.6 years ago Jenny Drnevich ★ 2.0k


Hi Jenny, hi Martin,

Yes, Jenny is definitely right. I forgot about the "breaks" option because I never used it for shifting the mean. Maybe one should add two points here:

1. Jenny's solution, as far as I can see from a quick glance at the code, gives you a different bin size on each side of zero as long as length.out is set to 50 for each side. If you want an equal bin size (e.g. each step about 0.5 signal), you have to adjust the number of color bins on each side according to the ratio of the min and max.

2. The reason I didn't have that argument in mind at that moment is that I associate "breaks" with another nice and powerful feature: unequal bin sizes used for a different purpose. Suppose a few outliers with very extreme values mask your real data distribution. They stretch the color scale at the edges and compress it in the middle, so you won't see the real differences in your normal value range because those values get very similar colors. In this case you can replace parts of Jenny's pairs.breaks sequence with bins derived from your data quantiles.

Regards,
Benjamin

(In reply to Jenny Drnevich's message "Re: [BioC] How to adjust the Heatmap.2 color key" of Wednesday, July 09, 2008 3:49 PM.)
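The quantile idea from point 2 could look roughly like the following. It is a sketch under assumptions rather than anyone's posted code: the data are simulated with one artificial outlier, quantiles are taken separately below and above zero so that zero stays a break, and n_side, q_breaks and q_col are illustrative names.

```r
# Hypothetical example data with one extreme outlier.
set.seed(3)
heatdata <- matrix(rnorm(400), nrow = 40)
heatdata[1, 1] <- 255

n_side <- 10                                  # bins per side of zero (assumed)
probs  <- seq(0, 1, length.out = n_side + 1)

# Quantiles of the negative and the positive values, taken separately.
neg_breaks <- quantile(heatdata[heatdata < 0], probs = probs)
pos_breaks <- quantile(heatdata[heatdata > 0], probs = probs)

# Close the innermost bins at zero so that zero separates green from red.
neg_breaks[n_side + 1] <- 0
pos_breaks[1] <- 0
q_breaks <- unname(c(neg_breaks, pos_breaks[-1]))

q_col <- c(colorRampPalette(c("green", "black"))(n_side),
           colorRampPalette(c("black", "red"))(n_side))

if (requireNamespace("gplots", quietly = TRUE)) {
  # key = FALSE: with very uneven bins the key may be dominated by the
  # wide outer bins, so it is switched off in this sketch.
  gplots::heatmap.2(heatdata, breaks = q_breaks, col = q_col,
                    trace = "none", key = FALSE)
}
```

Each bin on a given side then holds roughly the same number of values, so the outlier occupies only the outermost red bin instead of stretching the whole scale.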

ADD REPLY • link 17.6 years ago Benjamin Otto ▴ 830


Hi Benjamin,

Good points! Some adjustments to the code.

Regarding your point 1 (equal bin sizes on both sides of zero): make symmetrical bins of 0.5 signal ranging to +/- 10:

> pairs.breaks <- seq(-10, 10, by=0.5)
> length(pairs.breaks)
[1] 41

Then make a color panel that has 40 bins (one less than the number of breaks):

> mycol <- colorpanel(n=40,low="green",mid="black",high="red")
> length(mycol)
[1] 40

Then, when you call heatmap.2, subset both pairs.breaks and mycol to the range of your data (in this case, trim off the first 6 breaks/colors and the last one, so the scale runs from -7 to 9.5):

> heatmap.2(heatdata, breaks=pairs.breaks[7:40], col=mycol[7:39])

Regarding your point 2 (outliers compressing the color scale): here's a simple situation where the final bin is just a "+" group. Say you have an outlier:

> sort(heatdata,decreasing=TRUE)[1:5]
[1] 255.1548114 9.167324 6.802191 4.948795 4.789992

You could adjust the final break so that the final bin means 9.5+:

> pairs.breaks[40:41]
[1] 9.5 10.0
> pairs.breaks[41] <- 500
> pairs.breaks[40:41]
[1] 9.5 500.0

Then just cut off the first 6 breaks/colors:

> heatmap.2(heatdata, breaks=pairs.breaks[7:41], col=mycol[7:40])

Of course, how then do you change the labels on the color key to indicate 9.5+?

Jenny
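The hand-picked indices used for trimming can also be computed from the data. The following is a minimal sketch, assuming the symmetric 0.5-wide breaks from above and a made-up matrix spanning the same minimum and maximum as in the thread; first, last, use_breaks and use_col are illustrative names, and the palette is built with base R's colorRampPalette rather than colorpanel.

```r
# Hypothetical data spanning the range quoted in the thread.
heatdata <- matrix(seq(-6.469931, 9.167324, length.out = 200), nrow = 20)

pairs.breaks <- seq(-10, 10, by = 0.5)        # 41 symmetric breaks
mycol <- c(colorRampPalette(c("green", "black"))(20),
           colorRampPalette(c("black", "red"))(20))   # 40 colors

# Last break at or below the minimum, first break at or above the maximum.
first <- max(which(pairs.breaks <= min(heatdata)))
last  <- min(which(pairs.breaks >= max(heatdata)))

# Bin i lies between break i and break i + 1, so colors stop one index early.
use_breaks <- pairs.breaks[first:last]
use_col    <- mycol[first:(last - 1)]

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(heatdata, breaks = use_breaks, col = use_col,
                    trace = "none")
}
```

For these numbers the tightest fit starts at -6.5 rather than -7, so the computed indices are 8 and 40. Because breaks and colors are cut with matching indices, the black bins stay next to zero.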

ADD REPLY • link 17.6 years ago Jenny Drnevich ★ 2.0k


> Of course, how then do you change the labels on the color key to indicate 9.5+?

Well, actually the color key is labeled by the provided breaks, right? So the last color is labeled with 9.5 and 500: the ticks are not centered on the color cells; instead each cell has one tick on its left and one on its right. That is why you need one more break than colors. You wouldn't need a "9.5+" here. Or do I misunderstand?

The nastier problem with the labeling in heatmap.2 is the size and placement of the key. I usually had trouble reading the key labels because half of them were missing, the key being squeezed into the upper left corner. So Martin could have a second look at the Heatplus package, which places the color key (there it is called the legend) at the bottom. That looks better if you really use more than 7 or 9 color grades.

Benjamin

(In reply to Jenny Drnevich's message "Re: AW: [BioC] How to adjust the Heatmap.2 color key" of Wednesday, July 09, 2008 5:57 PM.)
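Another way to get a "9.5 or more" top color without stretching the last break to 500 is to clamp a plotting copy of the data to the ends of the scale. This is a sketch under assumptions, not something proposed in the thread: the matrix and its outlier are simulated, plotdata and keep are illustrative names, and the palette comes from base R's colorRampPalette.

```r
# Hypothetical data with one extreme outlier.
set.seed(5)
heatdata <- matrix(rnorm(200, sd = 2), nrow = 20)
heatdata[3, 4] <- 255.1548

pairs.breaks <- seq(-10, 10, by = 0.5)
mycol <- c(colorRampPalette(c("green", "black"))(20),
           colorRampPalette(c("black", "red"))(20))

keep <- 7:40                      # breaks from -7 to 9.5, as in the thread
lo <- min(pairs.breaks[keep])
hi <- max(pairs.breaks[keep])

# Clamp a copy used only for plotting; heatdata keeps its original values.
plotdata <- pmax(pmin(heatdata, hi), lo)

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(plotdata, breaks = pairs.breaks[keep],
                    col = mycol[keep[-length(keep)]], trace = "none")
}
```

The key then ends at 9.5 and its top color has to be read as "9.5 or more". Note that heatmap.2 clusters whatever matrix it is given, so the dendrograms here are computed from the clamped copy.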

ADD REPLY • link 17.6 years ago Benjamin Otto ▴ 830


Dear Martin,

some time ago I wrote the following function heatmapCol (package SLmisc) which could be useful for your purpose (see also the example below).

    heatmapCol <- function (data, col, lim) {
        nrcol <- length(col)
        data.range <- range(data)
        if (diff(data.range) == 0)
            stop("data has range 0")
        if (lim <= 0)
            stop("lim has to be positive")
        if (lim > min(abs(data.range))) {
            warning("specified bound 'lim' is out of data range\n\n hence 'min(abs(range(data)))' is used")
            lim <- min(abs(data.range))
        }
        nrcol <- length(col)
        reps1 <- ceiling(nrcol * (-lim - data.range[1])/(2 * lim))
        reps2 <- ceiling(nrcol * (data.range[2] - lim)/(2 * lim))
        col1 <- c(rep(col[1], reps1), col, rep(col[nrcol], reps2))
        return(col1)
    }

    ## Example
    data.plot <- matrix(rnorm(100*50, sd = 1), ncol = 50)
    colnames(data.plot) <- paste("patient", 1:50)
    rownames(data.plot) <- paste("gene", 1:100)
    data.plot[1:70, 1:30] <- data.plot[1:70, 1:30] + 3
    data.plot[71:100, 31:50] <- data.plot[71:100, 31:50] - 1.4
    data.plot[1:70, 31:50] <- rnorm(1400, sd = 1.2)
    data.plot[71:100, 1:30] <- rnorm(900, sd = 1.2)
    nrcol <- 128

    require(gplots)
    require(RColorBrewer)
    heatmap.2(data.plot,
              col = rev(colorRampPalette(brewer.pal(10, "RdBu"))(nrcol)),
              trace = "none", tracecol = "black")
    farbe <- heatmapCol(data = data.plot,
                        col = rev(colorRampPalette(brewer.pal(10, "RdBu"))(nrcol)),
                        lim = min(abs(range(data.plot)))-1)
    heatmap.2(data.plot, col = farbe, trace = "none", tracecol = "black")

Best,
Matthias

Benjamin Otto wrote:
>> Of course, how then do you change the labels on the color key to indicate 9.5+?
>
> Well actually the color key is labeled by the provided breaks, right? So the last color is labeled with 9.5 and 500 because the tick is not centered to the color cell but you have two ticks one on the left and one the right. That is why you need one break more than colors. You wouldn't need a "9.5+" here. Or do I misunderstand?
>
> Quite the nastier problem of the labeling of heatmap.2 is the size and placement of the key.
> I usually had problems reading the key labels because half of them were missing due to the condensed size in the upper left corner. So what Martin could do is have a second look at the Heatplus package, which places the color key (there it is called legend) at the bottom. Looks better if you really use more than 7 or 9 color grades.
>
> Benjamin
>
> -----Original Message-----
> From: Jenny Drnevich [mailto:drnevich at illinois.edu]
> Sent: Wednesday, July 09, 2008 5:57 PM
> To: Benjamin Otto; 'Martin Bonke'; bioconductor at stat.math.ethz.ch
> Subject: Re: AW: [BioC] How to adjust the Heatmap.2 color key
>
> Hi Benjamin,
>
> Good points! Some adjustments to the code:
>
>> 1. Jenny's solution, as far as I can see from a quick glance at the code, will give you an asymmetric bin size on each side of the zero value, as long as you set the length.out argument to 50 for each color. So if you want to use an equal bin size (e.g. each step about 0.5 signal) then you'll have to adjust the number of color bins according to the ratio of the min and max.
>
> Make symmetrical bins of 0.5 signal ranging to +/- 10:
>
>     pairs.breaks <- seq(-10, 10, by=0.5)
>     length(pairs.breaks)
>     ## [1] 41
>
> Then make a color panel that has 40 bins (one less than the number of breaks):
>
>     mycol <- colorpanel(n=40,low="green",mid="black",high="red")
>     length(mycol)
>     ## [1] 40
>
> Then when you call heatmap.2, subset both pairs.breaks and mycol to the range of your data (in this case, trim off the first 6 and the last one of the breaks/colors to range from -7 to 9.5):
>
>     heatmap.2(heatdata, breaks=pairs.breaks[7:40], col=mycol[7:39])
>
>> 2. The reason I didn't have that argument in mind at that moment was a nice and mighty feature of the "breaks" argument that I used to associate it with. That is the possibility of asymmetric bin sizes, seen from another point of view.
>> Suppose you have a few outliers with very extreme values masking your real data distribution; then they will tend to spread the color scale at the edges and condense it in the middle. You won't see the real differences in your normal value range because they will probably have very similar colors. In this case you can replace parts of Jenny's pairs.breaks sequence with bins derived from your data quantiles.
>
> Here's a simple situation of just having the final bin be a + group. Say you have an outlier:
>
>     sort(heatdata,decreasing=TRUE)[1:5]
>     ## [1] 255.1548114 9.167324 6.802191 4.948795 4.789992
>
> You could adjust the final break so that the final bin means 9.5+:
>
>     pairs.breaks[40:41]
>     ## [1] 9.5 10.0
>     pairs.breaks[41] <- 500
>     pairs.breaks[40:41]
>     ## [1] 9.5 500.0
>
> Then just cut off the first 6 breaks/colors:
>
>     heatmap.2(heatdata, breaks=pairs.breaks[7:41], col=mycol[7:40])
>
> Of course, how then do you change the labels on the color key to indicate 9.5+?
>
> Jenny
>
>> Regards,
>> Benjamin
>>
>> -----Original Message-----
>> From: Jenny Drnevich [mailto:drnevich at illinois.edu]
>> Sent: Wednesday, July 09, 2008 3:49 PM
>> To: Benjamin Otto; 'Martin Bonke'; bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] How to adjust the Heatmap.2 color key
>>
>> Hi Martin,
>>
>> One of the many(!) arguments to heatmap.2 is breaks; see ?heatmap.2 for the explanation. I also tried what Benjamin suggested, but breaks seems to make a smoother color scale. Here's how I use it:
>>
>>     max(heatdata)
>>     ## [1] 9.167324
>>     min(heatdata)
>>     ## [1] -6.469931
>>     pairs.breaks <- c(seq(-7, 0, length.out=50),seq(0, 9.2, length.out=50))
>>     mycol <- colorpanel(n=99,low="green",mid="black",high="red")
>>     heatmap.2(heatdata, breaks=pairs.breaks, col=mycol)
>>
>> Cheers,
>> Jenny
>>
>> At 06:33 AM 7/9/2008, Benjamin Otto wrote:
>> >Hi Martin, I would define my own color sequence.
>> >For example, if the maximum logratio in your table is 5 and the minimum is -8, then you will have to decide how many color steps you would like. Let me assume you use RColorBrewer for choosing a color palette. You can check the range of your data with range(#whatyourtableiscalled#). Then you could do:
>> >
>> >    mycol <- c(brewer.pal(8,"Greens"),"black",brewer.pal(5,"Reds")[5:1])
>> >    heatmap.2(mytable, col=mycol)
>> >
>> >Regards,
>> >Benjamin
>> >
>> >-----Original Message-----
>> >From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Martin Bonke
>> >Sent: Wednesday, July 09, 2008 12:21 PM
>> >To: bioconductor at stat.math.ethz.ch
>> >Subject: [BioC] How to adjust the Heatmap.2 color key
>> >
>> >Dear all, [...]

--
Dr. Matthias Kohl
www.stamats.de

ADD REPLY • link 17.6 years ago Matthias Kohl ▴ 160
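
The suggestions in this thread share one idea: pass heatmap.2 a set of breaks that is symmetric around zero, so that the middle color always means "no change" and the original data values stay untouched. The following is a minimal sketch of that idea, not code from any of the posters. The helper `symmetric_breaks` and the simulated `heatdata` matrix are hypothetical names introduced only for illustration, and the palette is built with base R's `colorRampPalette` instead of `colorpanel` so that the calculation runs without gplots. The plot itself is only drawn if gplots is installed.

```r
# Hypothetical helper (not part of gplots): equally spaced breaks that are
# symmetric around zero, spanning the largest absolute value in the data.
symmetric_breaks <- function(x, n_colors) {
  limit <- max(abs(range(x, na.rm = TRUE)))
  seq(-limit, limit, length.out = n_colors + 1)
}

set.seed(1)
# Simulated data for illustration only: values between -8 and +5,
# mimicking the skewed range described in the question.
heatdata <- matrix(runif(200, min = -8, max = 5), nrow = 20)

n_colors  <- 99  # odd, so that exactly one color bin straddles zero
my_breaks <- symmetric_breaks(heatdata, n_colors)
my_col    <- colorRampPalette(c("green", "black", "red"))(n_colors)

# heatmap.2 needs one more break than colors.
length(my_breaks) == length(my_col) + 1

# Index of the color bin that contains zero, and the color assigned to it.
mid_bin <- findInterval(0, my_breaks)
print(my_col[mid_bin])  # "#000000", i.e. black

# Draw the heatmap only if gplots is available.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(heatdata, breaks = my_breaks, col = my_col,
                    trace = "none")
}
```

With this approach part of the color range on the shorter side (here the red end beyond +5) is simply never used, which is the price for keeping zero in the middle. The documentation of heatmap.2 in current gplots releases also lists a `symbreaks` argument intended for the same purpose; whether it is available depends on the installed version, so check ?heatmap.2.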


Thank you all, I'm going to look into all the suggestions to see which one works best for me, but all your help has been very welcome. Best, Martin

ADD COMMENT • link 17.6 years ago Martin Bonke ▴ 40
