Entering edit mode
ribioinfo
▴
100
@ribioinfo-9434
Last seen 4.4 years ago
Hello,
I read this http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments about time course experiment.
Could you please explain me what is the meaning of these results obtained by resultsNames(ddsTC):
minute_30_vs_0
strain_mut_vs_wt
strainmut.minute15
I tried to read the ?results but it did not help me.
Thank you.
Thank you!
I have some difficulties to choose the correct model to use in DESeq2, this is the condition table:
I do not have a time 0 and the experiment can be summarized in this way:
WT is the wild type animal, A is the WT animal in which one gene was inserted in the genome and B is the WT animal in which a gene (different from A) was inserted in the genome. A molecule M was given to the animals in order to express the gene in A and in B. M was given to all the animals and after 2 days WT, A and B were sacrificed while other animals WT, A and B were treated for other 3 days (so 5 days) with M and after 5 days these animals were sacrificed.
1)Is this experiment a time course experiment that I have to analyzed as written here: http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments ?
2) or should I use as model: ~ condition + time?
3) Or should I use this condition table?
4) Is it important the order of the variable in the model?
For example ~ condition + time and ~ condition + time + condition:time are different from ~ condition:time + time + condition?
5) I would really appreciate if you could briefly explain me what DESeq2 does when I use the models that I proposed in 1), 2) or 3) an maybe could be also helpful for other people.
Thank you very much for your time.
It's easiest to provide help if you can state what you want to test. What kind of comparisons do you want to make?
The section of the vignette where I explain interactions with figures is the best I can do in terms of explaining interaction models in a general manner. I can hopefully point you on the right track if you let me know what kind of comparisons you are interested in here. In general, I also recommend investigators who are new to linear models to partner with local statisticians who can help to explain and interpret different linear models and contrasts.
Thanks Michael!
This is is what I would like to test:
A 2 days vs WT 2 days
A 5 days vs WT 5 days
B 2 days vs WT 2 days
B 5 days vs WT 5 days
A 2 days vs B 2 days
A 5 days vs B 5 days
Since I am interested in the pvalue and fold change of these specific comparison I have to use the Wald test after the LRT test as explained here: http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments because LRT tests the differences considering all the time points and all the conditions together, right?
I read in the vignette https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions that is possible to use a condition table like that:
B5
and then use the contrast to do the comparisons instead to use the interactions. Is this solution like the first solution that uses interaction, LRT and then Wald test?
Thank you.
You're right. For those comparisons, it's easiest to combine the condition and time into a factor and use a design of ~condition and then make these comparisons with the 'contrast' argument, e.g.:
Thanks Michael, I created the model as you suggested me and I performed a PCA rlog blind true:
http://imgur.com/a/cVAXB
As you can see Sample 6, 7 and should be near but 8 is in another group and also Sample 16, 17, 18 and 19 are split in two groups. Is it better to remove the replicates and use only two replicates or is it better to have more replicates even if they are not very similar?
Thank you
I wouldn't recommend removing samples when you have so few. Something seems to be going on with PC2, maybe batch related, but that's not a good reason to toss samples.
Thanks. When is it advisable to remove samples?
That's a really good question, hard to answer briefly.
I look for technical justification either in sequencing QC measures, or the sample is deviating from all the rest in a PCA plot, etc.
Here, the samples are not deviating away from all the others, only from the others within their biological condition. To me that implies either batches, or a potential sample swap, but I'd want to confirm the latter through other avenues, not just assuming it was a sample swap from the PCA alone. Do you know about batches?
One difference is that the RNA was extracted after 2 and 5 days because is a time course experiment. Should I use sva?
Thanks
The 2 vs 5 difference in addition to condition was encoded in color I thought. I can't see the legend anyway.
SVA would be for finding differences *other* than the design.
Hi Michael, I would like to generate a PCA after to have used sva, I used this code:
but the PCA is the same as without using sva, is it correct the generation of the new PCA?
Thank you
rlog does not remove variation associated with the design, for that you can use limma's removeBatchEffect() on the matrix in assay(rld).
Thanks, is this the correct way to use the information from DESeq2 and SVA in limma?
Thanks
Sorry I think that this is more correct:
Since my ddsva@colData is:
cond SV1 SV2 SV3
<factor> <numeric> <numeric> <numeric>
Sample-6 control2 -0.2427044 -0.12319065 0.36255386
Sample-7 control2 -0.2951574 0.01406966 0.20129016
and using
converts the factor in this way:
cond SV1 SV2 SV3
Sample-6 1 -0.242704406 -0.12319065 0.36255386
Sample-7 1 -0.295157408 0.01406966 0.20129016
and now the PCA looks better. Is it correct what I wrote?
Thank you.
yes
Thank you!
If I wanted the read counts normalized without batch effects could I use this command?
Thank you
Hi I think that there is a problem because some values are negative, what could be the problem?
I have also another question regarding this code:
Since the SV variables are calculated using the normalized read counts is it correct to use removeBatchEffect with the matrix that come from the function rlog and the SV variable that come from the matrix of the normalized read counts?
Thank you
Hi Michael, I think that I found the problem, removeBatchEffect wants the log counts so using the log2 of the counts normalized by DESeq2 it seems work, do you agree?
You can use either rlog() values, vst() values, or simply log2 normalized counts with removeBatchEffect() and using the SV calculated from normalized counts
Hi Michael, thank you for your help. My WT2 and WT5 are my controls at different times but what if later I will be only interested in the comparison of, for example, A2 vs B2 and A5 vs B5? Should I perform again the analysis excluding WT2 and WT5 from the table of counts used by DESeq2?
Or is it better to leave these samples, even if they will be not used in the comparison, because they add useful information to estimate the dispersion of the genes?
Thank you.
This is one of the FAQ in the vignette