Covariates in DESeq2 for Polysome profiling Data
4
0
Entering edit mode
c.oertlin • 0
@coertlin-9945
Last seen 6.5 years ago

Dear Micheal,

We are want to use DESeq2 using gene specific covariates (i.e. a separate set of covariates for each gene; these are also count data with the identical characteristics as compared to the input RNAseq data) but are struggling in what would be the best way of doing this.
We thought that you might have an idea?


Kind regards,
Christian

edit: to DESeq2... ;)

deseq2 • 2.0k views
ADD COMMENT
3
Entering edit mode
@mikelove
Last seen 2 hours ago
United States

DESeq2 doesn't support gene-specific covariates, but maybe you can tell me more about what you want to test, and there could be a way to reformulate the problem.

See this post for a suggestion on testing ratios of ratios using DESeq2:

DESeq2 testing ratio of ratios (RIP-Seq, CLIP-Seq, ribosomal profiling)

ADD COMMENT
0
Entering edit mode

We want to test differential translation in polysome profiling data. To assess this Polysome associated mRNA counts have to be corrected for cytosolic mRNA counts.
An idea was to run a per gene DESeq analysis with different model parameters each time. So for a dataset containing X genes, run X DESeq analyses, where for each gene a unique model is used where the unique term consists of the corresponding cytosolic mRNA value. But I do not know if this would lead to problems for parameter estimations etc.


Would DESeq2 still be reliable when being used on a per gene basis?
What would be the best way to put the cytosolic mRNA counts  into the model? (raw counts, normalised counts, transformed/untransformed ...)  

 

ADD REPLY
0
Entering edit mode

Please take a look at the link I posted. This is the approach you want to take, which will test for differences in the polysome associated counts controlling for the cytosolic counts. You just need to put all the raw counts (polysome associated and cytosolic) into the counts matrix, and annotate the different assay type in colData(dds).

No, you can't run DESeq2 on a per gene basis.

ADD REPLY
0
Entering edit mode
@marcotrizzino83-9987
Last seen 5.7 years ago

Hi Mike,

So, I guess assay1 (IP) and assay2 (INPUT) need to be two different matrix. At this point I am not sure of how to code it.
This is how I was coding it without considering input:

dds <- DESeqDataSetFromMatrix(countData = IP_reads, colData=colData, design= ~condition)

Should I do something like that now?

design(dds) = ~ assay + condition + assay:condition
dds <- DESeqDataSetFromMatrix(countData = IP_reads, INPUT_reads, colData=colData, test = "LRT", reduced + ~ assay + condition)
 

Thanks again!

Marco

ADD COMMENT
0
Entering edit mode

hi Marco,

You would just make a single count matrix using cbind and provide this as countData. And make certain that colnames(countData) aligns with the rows of colData. You don't need to do anything else special. The size factor normalization should work as normally, and then the assay term will account for remaining assay differences on a gene-by-gene basis.

ADD REPLY
0
Entering edit mode
@marcotrizzino83-9987
Last seen 5.7 years ago

Thanks Mike!

 

 

ADD COMMENT
0
Entering edit mode
@marcotrizzino83-9987
Last seen 5.7 years ago

So, coding it like that (Nreads is the boud matrix):

colData=colData_HS_other

design(dds) = ~ assay + condition + assay:condition

dds <- DESeq( Nreads, test = "LRT", reduced = ~ assay + condition)

 

I have these two errors:

Error in design(dds) = ~assay + condition + assay:condition : 
  object 'dds' not found

Error in is(object, "DESeqDataSet") : object 'dds' not found

 

Also, where is should I include the colData?

 

 

ADD COMMENT
0
Entering edit mode

Just fyi, you can use the comment/reply buttons to make threaded discussions.

So the errors you are encountering are because of the order of operations. "'dds' not found" means that you did not define 'dds' yet. You can't store information (a design formula) to a slot of an object that hasn't been created yet, same for running the function DESeq(). This is equivalent to > x+1, when x hasn't been defined in your session yet.

You should: 1) make a matrix which combines all the counts, 2) make a data.frame called colData with all the sample information, 3) use the DESeqDataSetFromMatrix() constructor function to build an object called 'dds' 4) then run DESeq on the DESeqDataSet object.

ADD REPLY
0
Entering edit mode

Thanks Mike!

Now it works if I only have two conditions. But when I have more than two (6) it doesn't. So, I grouped condition and assay as you recommended in another thread, coding like this:

dds <- DESeqDataSetFromMatrix(countData = Nreads, colData=colData_all, design= ~condition)
dds$group <- factor(paste0(dds$condition, dds$assay))
design(dds) = ~ group 
dds = DESeq(dds, test = "LRT", reduced = ~ group)

But i get this error: 

Error in nbinomLRT(object, full = full, reduced = reduced, betaPrior = betaPrior,  : 
  less than one degree of freedom, perhaps full and reduced models are not in the correct order

 

 

ADD REPLY
0
Entering edit mode

Why design=~condition when constructing the DESeqDataSet? And why ~group?

You want design=~assay + condition + assay:condition for the analysis we are talking about here. And reduced=~assay + condition. The only issue above was that you had the order of operations mixed up, and were calling objects that weren't defined, but you definitely don't want to change the design from the interaction formula. Simply:

dds <- DESeqDataSetFromMatrix(countData, colData, 
         design=~assay + condition + assay:condition)
dds <- DESeq(dds, test="LRT", reduced=~assay + condition)
ADD REPLY
0
Entering edit mode

So, I re-coded it as you suggested, but while when I have only two conditions the results look good and are comparable with the numbers that I got in previous analysis without the input, when instead I have more than two conditions (6) the results are weird: whatever is the pairwise comparison that I try to make, the number of differentially bound regions is always the same in all of the comparisons.

This is the code:

dds <- DESeqDataSetFromMatrix(countData = Nreads, colData=colData_all, design=~assay + condition + assay:condition)
dds = DESeq(dds, test = "LRT", reduced = ~ assay + condition)

#Now I want to look at 'HS' vs 'CH' no fold change
res_HS_CH<-results(dds, contrast=c("condition","HS","CH"))

#Remove those genes for which a multiple correction p-value is "NA"

res_HS_CH_p=res_HS_CH[!is.na(res_HS_CH$padj), ] 

#Only keep genes with corrected p-values below 0.05

res_HS_CH_p=res_HS_CH_p[res_HS_CH_p$padj<0.05,]

And this is how colData looks like

 

condition  assay
HS_B16_H3K27Ac HS IP
HS_B2_H3K27Ac HS IP
HS_B48_H3K27Ac HS IP
CH_40191_H3K27Ac CH IP
CH_4X0523_H3K27Ac CH IP
RH_27415_H3K27Ac RH IP
RH_27425_H3K27Ac RH IP
ML_7022f_Basil_H3K27Ac ML IP
ML_ALFA_ALFA_H3K27Ac ML IP
MS_17086_H3K27Ac MS IP
MS_31480_H3K27Ac MS IP
MS_32842_H3K27Ac MS IP
BB_1413m_Tuff_H3K27Ac BB IP
BB_8012_H3K27Ac BB IP
input_HS_B16_INPUT HS INPUT
input_HS_B2_INPUT HS INPUT
input_HS_B48_INPUT HS INPUT
input_CH_40191_INPUT CH INPUT
input_CH_4X0523_INPUT CH INPUT
input_RH_27415_INPUT RH INPUT
input_RH_27425_INPUT RH INPUT
input_ML_7022f_Basil_INPUT ML INPUT
input_ML_ALFA_ALFA_INPUT ML INPUT
input_MS_17086_INPUT MS INPUT
input_MS_31480_INPUT MS INPUT
input_MS_32842_INPUT MS INPUT
input_BB_1413m_Tuff_INPUT BB INPUT
input_BB_8012_INPUT BB INPUT

 

 

ADD REPLY
0
Entering edit mode

Read in the manual page for ?results the section "For analyses using the likelihood ratio test..."

What you want to do is test each interaction term separately, and you can do this by setting name="..." and test="Wald".

ADD REPLY
0
Entering edit mode

Will do, thanks so much Mike for being so helpful!

ADD REPLY

Login before adding your answer.

Traffic: 711 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6