Question

Method for generating heatmap of overlapping DEGs?

axe880 • 0

Hi everyone,

My experiment involves identifying DEGs in potatoes at various time points during drought, compared to a control with no drought conditions. I have two varieties of the crop: tolerant and susceptible.

I have identified multiple DEGs across my experiment. Specifically, I have identified (1) DEGs unique to the Tolerant variety compared to the Susceptible at each time point, and (2) genes unique to the Tolerant time points compared back to the Tolerant control.

With this in mind, I hope to make a heatmap of these unique DEGs across my samples. However, I am unsure how to extract these DEGs from my vsd file successfully.

I assume it is using the filter command, where I would filter for the unique genes in each comparison?

If anybody could please suggest a basic script to achieve this I would be most appreciative.

Thanks.

updated 3.8 years ago by James W. MacDonald • written 3.8 years ago by axe880

score 1 · Answer 1 · 2022-04-18

You don't say what packages you are using to get DEGs, nor even what sort of data you are using (RNA-Seq, microarray, NanoString?). You do say vsd file, so perhaps you are using DESeq2?

Anyway, most analysis packages generate output whose row.names match the row.names of the input data. As an example, we can use the example from ?results in DESeq2:

> library(DESeq2)
> example("results")
<snip>
> rn <- row.names(subset(results(dds, contrast = c("group", "IIIB", "IIIA")), padj < 0.35))
> rn
[1] "gene10" "gene17" "gene26" "gene61" "gene69" "gene87"

We had to use a very large padj cutoff because it's just example data, but now we have the genes that are 'significant' in that comparison. We could do the same for all the other comparisons we care about, and then just extract the data. If you used vst or varianceStabilizingTransformation on your data, the result is still just a SummarizedExperiment (a DESeqTransform, to be exact), and the transformed values can be extracted using the assay accessor.

> vsd <- varianceStabilizingTransformation(dds)

> assay(vsd[rn,])
        sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8
gene10 5.737655 6.268506 5.280569 5.294676 5.246103 5.262955 5.705902 4.893763
gene17 3.198312 3.664058 4.148044 4.498651 3.975222 4.160359 4.006217 3.198312
gene26 4.184748 3.998229 3.679565 3.644796 4.091935 3.685972 3.860782 5.150434
gene61 5.607286 6.134023 6.501767 5.516013 5.625639 6.237912 5.493368 5.488325
gene69 5.368041 5.336134 5.835071 6.718193 5.480504 5.563168 5.705902 5.629472
gene87 4.989361 4.315954 4.519199 4.498651 3.198312 4.366362 3.860782 3.645202
        sample9 sample10 sample11 sample12 sample13 sample14 sample15 sample16
gene10 6.377084 5.080081 5.004280 5.482404 5.371975 5.976381 4.136924 4.758678
gene17 3.645463 4.011367 3.198312 4.350827 3.664058 4.133618 4.338359 3.975528
gene26 4.272973 5.329421 4.221377 4.495979 4.619383 4.133618 4.014651 3.835418
gene61 6.175925 5.719109 6.190394 6.725056 6.134023 6.088619 6.259544 7.125659
gene69 4.894660 5.570303 4.681564 5.452341 4.855867 5.691612 4.887306 6.127486
gene87 4.082303 4.239184 3.198312 4.681561 3.998229 4.011753 4.014651 5.418455
       sample17 sample18
gene10 4.890174 4.642547
gene17 4.079782 5.805123
gene26 3.198312 3.198312
gene61 6.076281 6.493409
gene69 6.702419 6.354500
gene87 4.180167 5.089673
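
Since the question asks for genes unique to one comparison, base R set operations on row-name vectors like rn are probably all that is needed. A minimal sketch follows; the three vectors and their names are made up for illustration and just stand in for the row.names pulled from each of your own comparisons.

```r
# Hypothetical significant-gene vectors, one per comparison
# (in practice: row.names(subset(results(...), padj < cutoff)))
tol_vs_sus_t1  <- c("gene10", "gene17", "gene26")
tol_t1_vs_ctrl <- c("gene17", "gene61", "gene69")
sus_t1_vs_ctrl <- c("gene17", "gene69", "gene87")

# Genes significant in Tolerant vs. its control but NOT in Susceptible vs. its control
tol_unique <- setdiff(tol_t1_vs_ctrl, sus_t1_vs_ctrl)

# Genes significant in both varieties
shared <- intersect(tol_t1_vs_ctrl, sus_t1_vs_ctrl)

# Everything you want on the heatmap, without duplicates
to_plot <- union(tol_vs_sus_t1, tol_unique)

print(tol_unique)
print(shared)
print(to_plot)
```

The resulting character vector can be used to index the transformed object in the same way as rn above, e.g. assay(vsd[to_plot,]).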

So you would just extract the row.names for the significant genes in each comparison, concatenate them (dropping any duplicates), and then extract the stabilized count data for those genes. At that point you can use ComplexHeatmap or whatever to generate the heatmap.
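
Putting those steps together, a rough sketch might look like the following. It is only an illustration under some assumptions: the data are simulated with makeExampleDESeqDataSet and grouped the way the ?results example does, the three contrasts and the 0.35 cutoff are arbitrary choices for example data, and base R heatmap() stands in for ComplexHeatmap so that nothing beyond DESeq2 is required.

```r
library(DESeq2)

# Simulated example data with a six-level grouping factor (assumption:
# mirrors the grouped design used in the ?results example)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 100, m = 18)
dds$genotype <- factor(rep(rep(c("I", "II", "III"), each = 3), 2))
dds$group <- factor(paste0(dds$genotype, dds$condition))
design(dds) <- ~ group
dds <- DESeq(dds)

# Illustrative comparisons; replace with the contrasts you care about
contrasts <- list(c("group", "IB", "IA"),
                  c("group", "IIB", "IIA"),
                  c("group", "IIIB", "IIIA"))

# Row names of the 'significant' genes in each comparison
# (which() drops genes whose padj is NA)
sig <- lapply(contrasts, function(ct) {
  res <- results(dds, contrast = ct)
  rownames(res)[which(res$padj < 0.35)]
})

# Concatenate and drop duplicates
genes <- unique(unlist(sig))

# Variance-stabilized values for just those genes
vsd <- varianceStabilizingTransformation(dds)
mat <- assay(vsd)[genes, , drop = FALSE]

# heatmap() needs at least two rows; centre and scale each gene first
if (nrow(mat) >= 2) {
  z <- t(scale(t(mat)))
  heatmap(z, scale = "none", margins = c(6, 6))
}
```

Scaling each row before plotting is a design choice rather than a requirement: it makes the colours show each gene's pattern across samples instead of its overall expression level. With ComplexHeatmap installed, passing the same matrix to its Heatmap() function should work in place of heatmap().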