Question

Covariates in DESeq2 for Polysome profiling Data

0

Entering edit mode

c.oertlin • 0

@coertlin-9945

Last seen 6.5 years ago

Dear Micheal,

We are want to use DESeq2 using gene specific covariates (i.e. a separate set of covariates for each gene; these are also count data with the identical characteristics as compared to the input RNAseq data) but are struggling in what would be the best way of doing this.
We thought that you might have an idea?

Kind regards,
Christian

edit: to DESeq2... ;)

deseq2 • 2.0k views

ADD COMMENT • link updated 8.1 years ago by marco.trizzino83 ▴ 10 • written 8.1 years ago by c.oertlin • 0

0

Entering edit mode

marco.trizzino83 ▴ 10

@marcotrizzino83-9987

Last seen 5.7 years ago

Hi Mike,

So, I guess assay1 (IP) and assay2 (INPUT) need to be two different matrix. At this point I am not sure of how to code it.
This is how I was coding it without considering input:

dds <- DESeqDataSetFromMatrix(countData = IP_reads, colData=colData, design= ~condition)

Should I do something like that now?

design(dds) = ~ assay + condition + assay:condition
dds <- DESeqDataSetFromMatrix(countData = IP_reads, INPUT_reads, colData=colData, test = "LRT", reduced + ~ assay + condition)

Thanks again!

Marco

ADD COMMENT • link 8.1 years ago marco.trizzino83 ▴ 10

0

Entering edit mode

hi Marco,

You would just make a single count matrix using cbind and provide this as countData. And make certain that colnames(countData) aligns with the rows of colData. You don't need to do anything else special. The size factor normalization should work as normally, and then the assay term will account for remaining assay differences on a gene-by-gene basis.

ADD REPLY • link 8.1 years ago Michael Love 41k

0

Entering edit mode

marco.trizzino83 ▴ 10

@marcotrizzino83-9987

Last seen 5.7 years ago

Thanks Mike!

ADD COMMENT • link 8.1 years ago marco.trizzino83 ▴ 10

0

Entering edit mode

marco.trizzino83 ▴ 10

@marcotrizzino83-9987

Last seen 5.7 years ago

So, coding it like that (Nreads is the boud matrix):

colData=colData_HS_other

design(dds) = ~ assay + condition + assay:condition

dds <- DESeq( Nreads, test = "LRT", reduced = ~ assay + condition)

I have these two errors:

Error in design(dds) = ~assay + condition + assay:condition :
object 'dds' not found

Error in is(object, "DESeqDataSet") : object 'dds' not found

Also, where is should I include the colData?

ADD COMMENT • link 8.1 years ago marco.trizzino83 ▴ 10

0

Entering edit mode

Just fyi, you can use the comment/reply buttons to make threaded discussions.

So the errors you are encountering are because of the order of operations. "'dds' not found" means that you did not define 'dds' yet. You can't store information (a design formula) to a slot of an object that hasn't been created yet, same for running the function DESeq(). This is equivalent to > x+1, when x hasn't been defined in your session yet.

You should: 1) make a matrix which combines all the counts, 2) make a data.frame called colData with all the sample information, 3) use the DESeqDataSetFromMatrix() constructor function to build an object called 'dds' 4) then run DESeq on the DESeqDataSet object.

ADD REPLY • link 8.1 years ago Michael Love 41k

0

Entering edit mode

Thanks Mike!

Now it works if I only have two conditions. But when I have more than two (6) it doesn't. So, I grouped condition and assay as you recommended in another thread, coding like this:

dds <- DESeqDataSetFromMatrix(countData = Nreads, colData=colData_all, design= ~condition)
dds$group <- factor(paste0(dds$condition, dds$assay))
design(dds) = ~ group
dds = DESeq(dds, test = "LRT", reduced = ~ group)

But i get this error:

Error in nbinomLRT(object, full = full, reduced = reduced, betaPrior = betaPrior, :
less than one degree of freedom, perhaps full and reduced models are not in the correct order

ADD REPLY • link 8.1 years ago marco.trizzino83 ▴ 10

0

Entering edit mode

Why design=~condition when constructing the DESeqDataSet? And why ~group?

You want design=~assay + condition + assay:condition for the analysis we are talking about here. And reduced=~assay + condition. The only issue above was that you had the order of operations mixed up, and were calling objects that weren't defined, but you definitely don't want to change the design from the interaction formula. Simply:

dds <- DESeqDataSetFromMatrix(countData, colData, 
         design=~assay + condition + assay:condition)
dds <- DESeq(dds, test="LRT", reduced=~assay + condition)

ADD REPLY • link 8.1 years ago Michael Love 41k

0

Entering edit mode

So, I re-coded it as you suggested, but while when I have only two conditions the results look good and are comparable with the numbers that I got in previous analysis without the input, when instead I have more than two conditions (6) the results are weird: whatever is the pairwise comparison that I try to make, the number of differentially bound regions is always the same in all of the comparisons.

This is the code:

dds <- DESeqDataSetFromMatrix(countData = Nreads, colData=colData_all, design=~assay + condition + assay:condition)
dds = DESeq(dds, test = "LRT", reduced = ~ assay + condition)

#Now I want to look at 'HS' vs 'CH' no fold change
res_HS_CH<-results(dds, contrast=c("condition","HS","CH"))

#Remove those genes for which a multiple correction p-value is "NA"

res_HS_CH_p=res_HS_CH[!is.na(res_HS_CH$padj), ]

#Only keep genes with corrected p-values below 0.05

res_HS_CH_p=res_HS_CH_p[res_HS_CH_p$padj<0.05,]

And this is how colData looks like

	condition	assay
HS_B16_H3K27Ac	HS	IP
HS_B2_H3K27Ac	HS	IP
HS_B48_H3K27Ac	HS	IP
CH_40191_H3K27Ac	CH	IP
CH_4X0523_H3K27Ac	CH	IP
RH_27415_H3K27Ac	RH	IP
RH_27425_H3K27Ac	RH	IP
ML_7022f_Basil_H3K27Ac	ML	IP
ML_ALFA_ALFA_H3K27Ac	ML	IP
MS_17086_H3K27Ac	MS	IP
MS_31480_H3K27Ac	MS	IP
MS_32842_H3K27Ac	MS	IP
BB_1413m_Tuff_H3K27Ac	BB	IP
BB_8012_H3K27Ac	BB	IP
input_HS_B16_INPUT	HS	INPUT
input_HS_B2_INPUT	HS	INPUT
input_HS_B48_INPUT	HS	INPUT
input_CH_40191_INPUT	CH	INPUT
input_CH_4X0523_INPUT	CH	INPUT
input_RH_27415_INPUT	RH	INPUT
input_RH_27425_INPUT	RH	INPUT
input_ML_7022f_Basil_INPUT	ML	INPUT
input_ML_ALFA_ALFA_INPUT	ML	INPUT
input_MS_17086_INPUT	MS	INPUT
input_MS_31480_INPUT	MS	INPUT
input_MS_32842_INPUT	MS	INPUT
input_BB_1413m_Tuff_INPUT	BB	INPUT
input_BB_8012_INPUT	BB	INPUT

ADD REPLY • link 8.1 years ago marco.trizzino83 ▴ 10

0

Entering edit mode

Read in the manual page for ?results the section "For analyses using the likelihood ratio test..."

What you want to do is test each interaction term separately, and you can do this by setting name="..." and test="Wald".

ADD REPLY • link 8.1 years ago Michael Love 41k

0

Entering edit mode

Will do, thanks so much Mike for being so helpful!

ADD REPLY • link 8.1 years ago marco.trizzino83 ▴ 10

score 3 · Accepted Answer · 2016-03-18

3

Entering edit mode

Michael Love 41k

@mikelove

Last seen 2 hours ago

United States

DESeq2 doesn't support gene-specific covariates, but maybe you can tell me more about what you want to test, and there could be a way to reformulate the problem.

See this post for a suggestion on testing ratios of ratios using DESeq2:

DESeq2 testing ratio of ratios (RIP-Seq, CLIP-Seq, ribosomal profiling)

ADD COMMENT • link 8.1 years ago Michael Love 41k

0

Entering edit mode

We want to test differential translation in polysome profiling data. To assess this Polysome associated mRNA counts have to be corrected for cytosolic mRNA counts.
An idea was to run a per gene DESeq analysis with different model parameters each time. So for a dataset containing X genes, run X DESeq analyses, where for each gene a unique model is used where the unique term consists of the corresponding cytosolic mRNA value. But I do not know if this would lead to problems for parameter estimations etc.

Would DESeq2 still be reliable when being used on a per gene basis?
What would be the best way to put the cytosolic mRNA counts into the model? (raw counts, normalised counts, transformed/untransformed ...)

ADD REPLY • link 8.1 years ago c.oertlin • 0

0

Entering edit mode

Please take a look at the link I posted. This is the approach you want to take, which will test for differences in the polysome associated counts controlling for the cytosolic counts. You just need to put all the raw counts (polysome associated and cytosolic) into the counts matrix, and annotate the different assay type in colData(dds).

No, you can't run DESeq2 on a per gene basis.

ADD REPLY • link 8.1 years ago Michael Love 41k