Question: Using a truth table for ColumnData in DESeq2
jrlarsen10 wrote (11 months ago):

Hi,

I am doing differential expression analyses for the first time and using DESeq2. The tutorials are great, but I have a couple niche questions that I cannot find the precise answer to.

1) If I use a truth table for my ColumnData in DESeq2, where the rows are samples and the columns are events that either occur or don't occur, as follows:

      Event A   Event B   Event C
S1    1         1         0
S2    1         0         1
S3    0         0         0
S4    1         0         1

Here 1 means the column's event occurs and 0 means it does not. Can DESeq2 recognize these as binary indicators of categorical data, or will it treat them as a numeric measurement?

2) I would like to create a heatmap that is ordered from least difference to most difference, for the columns selected in ColumnData, running from left to right and bottom to top. How do I do this appropriately for a count matrix?

Thank you, any help is appreciated. I just want to make sure I am proceeding correctly.

Answer: Using a truth table for ColumnData in DESeq2
Michael Love wrote (11 months ago):

You should turn them into factors with levels 0 and 1. It won't make a big difference to the model, but it will be easier on some helper functions that break things into groups, e.g. plotCounts().
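A minimal sketch of that conversion in base R, for a data.frame that mixes indicator and quantitative columns. The table below is hypothetical: the event columns mirror the truth table in the question, and the `age` column and its values are made up purely to show that a numeric column can be left alone.

```r
# Hypothetical sample table: three 0/1 event columns from the question,
# plus one made-up quantitative column (age) for illustration only.
coldata <- data.frame(
  EventA = c(1, 1, 0, 1),
  EventB = c(1, 0, 0, 0),
  EventC = c(0, 1, 0, 1),
  age    = c(34, 51, 47, 29),
  row.names = c("S1", "S2", "S3", "S4")
)

# Convert only the indicator columns to factors with levels 0 and 1;
# the quantitative column stays numeric.
event_cols <- c("EventA", "EventB", "EventC")
coldata[event_cols] <- lapply(coldata[event_cols], factor, levels = c(0, 1))

str(coldata)
```

A table like this could then be supplied as the colData argument of DESeqDataSetFromMatrix(), with the factor columns used in the design formula.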

See the vignette, where we have examples of heatmaps. You would just subset the data to the top genes, and then tell the heatmap software not to reorder the rows of the heatmap.
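One way this could look, sketched with base R's heatmap() rather than the plotting code from the vignette. The count matrix and p-values here are simulated stand-ins (all names are hypothetical); in a real analysis they would come from the DESeq2 object and its results table, and the vignette's transformed counts would be a better input than the simple log2(x + 1) used below.

```r
set.seed(1)

# Simulated stand-ins: a 200-gene x 4-sample count matrix and per-gene p-values.
counts <- matrix(rpois(200 * 4, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("S", 1:4)))
pvalue <- runif(200)
names(pvalue) <- rownames(counts)

# Keep the 20 genes with the smallest p-values, in that order.
top <- names(sort(pvalue))[1:20]
mat <- log2(counts[top, ] + 1)

# Rowv = NA and Colv = NA switch off clustering, so neither rows nor
# columns are reordered by the heatmap function.
pdf(NULL)  # draw to a null device; drop this line to plot on screen
heatmap(mat, Rowv = NA, Colv = NA, scale = "none")
invisible(dev.off())
```

Other heatmap packages have their own switches for this, for example cluster_rows = FALSE in pheatmap.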

jrlarsen10 wrote:

1) Thank you so much, Mike! I have never used factors. I know I can make columns into arrays and use factor() on those, but I am not sure how to apply this to a data.frame(), let alone one that has columns representing both quantitative and categorical data.

2) I just want to make sure: is this ordering by p-value? The lower the p-value, the greater the difference?

Thank you so much for taking the time.

Michael Love wrote:

If you want to use DE methods in R/Bioconductor, you should get to know factors!

These are a workhorse class for linear models and making comparisons in R. I'd suggest following some of these links:

http://genomicsclass.github.io/book/pages/resources.html

Yes, you would order by p-value. And yes, lower p-values are stronger evidence against the null hypothesis, which is typically "no difference". You should also probably do some catch-up on the basics of inference, p-values, adjusted p-values, FDR, etc. See the Inference section here, and further down, the multiple testing section:

http://genomicsclass.github.io/book/
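As a small illustration of ordering by p-value and of adjusted p-values, here is a base R sketch. The gene names and p-values are made up for the example; p.adjust() with method "BH" applies the Benjamini-Hochberg adjustment, which is also the default adjustment behind the padj column that DESeq2's results() reports.

```r
# Hypothetical raw p-values for five genes (made-up numbers for illustration).
pvalue <- c(geneA = 0.0004, geneB = 0.03, geneC = 0.20,
            geneD = 0.008, geneE = 0.65)

# Benjamini-Hochberg adjustment, which controls the false discovery rate (FDR).
padj <- p.adjust(pvalue, method = "BH")

# Rank genes from strongest to weakest evidence against the null hypothesis.
ranked <- data.frame(pvalue, padj)[order(pvalue), ]
print(ranked)
```

Genes at the top of `ranked` are the ones you would pick first when subsetting for a heatmap.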