Design formula in DESeq2
@998340ab
United Kingdom

Hello,

Thanks for developing an amazing tool like DESeq2 and making it available to the scientific community. I am new to RNA-seq analysis. I have read the vignettes and have some questions about my dataset. I looked online but could not find an answer to a similar question.

My study has two animal genotypes, control and genetically altered, and the animals were either treated or left untreated. This gives four groups: control treated, control untreated, genetically altered treated and genetically altered untreated. I collected tissue samples from different areas of the gut (small intestine and large intestine), as I am studying colorectal cancer.

My questions are as follows:

1) Should I split my analysis into two datasets (i.e. one for large intestine and one for small intestine)? I ran RNA-seq on all the samples. I tried to put everything into DESeqDataSetFromMatrix, but the design gets very complicated. I am interested only in each tissue separately (i.e. either small intestine or large intestine), not in the differences between them.

2) My second question is about the design. I have recorded gender, parent and batch, which can all be confounding variables. Should I include them in the design, or use sva to remove hidden batch effects? If I use sva, should I keep the confounding factors in my initial design formula and then change it according to the sva output (I can use the code given in the vignette as a reference)? I am not sure whether gender and parent have an effect.

Below is an example of a design I have used that worked. With more complicated designs, I get the error "Model matrix not full rank".


dds_colon <- DESeqDataSetFromMatrix(countData = countdata_colon, colData = coldata_colon, design = ~ Gender + Genotype + Treatment + Genotype:Treatment)
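For context, the "Model matrix not full rank" error typically appears when one design variable is a linear combination of others. The sketch below uses base R only and a hypothetical sample table (the `Parent` assignment and the helper `is_full_rank` are assumptions for illustration, not taken from the actual data) to show how a candidate design can be checked before it is passed to DESeq2.

```r
# Hypothetical sample table: every Parent contributes animals of a single
# Genotype, so Parent is nested within Genotype.
coldata <- data.frame(
  Genotype  = factor(rep(c("control", "altered"), each = 4),
                     levels = c("control", "altered")),
  Treatment = factor(rep(c("untreated", "treated"), times = 4),
                     levels = c("untreated", "treated")),
  Parent    = factor(rep(c("P1", "P2", "P3", "P4"), each = 2))
)

# Hypothetical helper: a design is full rank when the rank of its model
# matrix equals the number of columns (coefficients).
is_full_rank <- function(formula, data) {
  mm <- model.matrix(formula, data)
  qr(mm)$rank == ncol(mm)
}

# Genotype, Treatment and their interaction: four groups, four coefficients.
print(is_full_rank(~ Genotype + Treatment + Genotype:Treatment, coldata))  # TRUE

# Adding Parent: the Genotype column equals ParentP3 + ParentP4, so the
# matrix is rank deficient and DESeq2 would reject this design.
print(is_full_rank(~ Parent + Genotype + Treatment, coldata))              # FALSE
```

Running the same check on the real `colData` shows which added variable causes the loss of rank.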

3) My third and last question concerns one part of my analysis. For some animals I collected both tumour and healthy tissue, so those samples are paired. However, not all the samples are paired. Should I add a separate column to the metadata for this comparison, or should I fit a different model?

I would like to thank you in advance for all your suggestions and advice. I know some of my questions sound a bit basic but I am completely lost on how to proceed.

Cheers, Maria

@mikelove
United States

1) There is a FAQ on this in the vignette. With sufficient replicates, it's fine to have split analyses.
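As a rough sketch of what a split analysis could look like, the example below builds one object and subsets it by tissue before running DESeq2 on the colon samples only. The data are simulated with `makeExampleDESeqDataSet`, and the column names `Tissue`, `Genotype` and `Treatment`, their levels, and the two replicates per group are assumptions for illustration.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 500 genes x 16 samples (assumed layout:
# 2 tissues x 2 genotypes x 2 treatments x 2 replicates).
dds <- makeExampleDESeqDataSet(n = 500, m = 16)
dds$Tissue    <- factor(rep(c("small_intestine", "colon"), each = 8))
dds$Genotype  <- factor(rep(c("control", "altered"), each = 4, times = 2),
                        levels = c("control", "altered"))
dds$Treatment <- factor(rep(c("untreated", "treated"), each = 2, times = 4),
                        levels = c("untreated", "treated"))

# Keep only the colon samples and drop the unused tissue level.
dds_colon <- dds[, dds$Tissue == "colon"]
dds_colon$Tissue <- droplevels(dds_colon$Tissue)

# Within one tissue the design no longer needs a Tissue term.
design(dds_colon) <- ~ Genotype + Treatment + Genotype:Treatment
dds_colon <- DESeq(dds_colon)
print(resultsNames(dds_colon))
```

The same steps would be repeated for the small intestine samples. Whether two replicates per group count as "sufficient" is a judgment call; the number here is only chosen to keep the example small.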

2) This is a choice for the analyst. With a few variables, such as those you've listed, I just include them in the design (supposing the experimental design is not confounded, which is a larger problem). With many variables I prefer SVA / RUV. In that case you should include the surrogate variables in the design (this is covered in the workflow); do not remove the effects from the counts and provide the residuals to DESeq2 (it won't let you anyway).
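A minimal sketch of the SVA route, loosely following the pattern in the RNA-seq workflow, might look like the following. It runs on simulated data with a single `condition` factor; the choice of `n.sv = 2` and the filtering threshold are assumptions, and for the experiment in the question the full model would presumably contain Genotype, Treatment and their interaction instead of `condition`.

```r
library(DESeq2)
library(sva)

set.seed(1)
# Simulated counts with one known factor, 'condition' (levels A and B).
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds <- estimateSizeFactors(dds)

# Estimate surrogate variables from normalized counts of expressed genes.
dat  <- counts(dds, normalized = TRUE)
dat  <- dat[rowMeans(dat) > 1, ]
mod  <- model.matrix(~ condition, colData(dds))  # full model
mod0 <- model.matrix(~ 1, colData(dds))          # null model
svseq <- svaseq(dat, mod, mod0, n.sv = 2)        # n.sv = 2 is an assumption

# Add the surrogate variables to the design; the counts stay untouched.
dds$SV1 <- svseq$sv[, 1]
dds$SV2 <- svseq$sv[, 2]
design(dds) <- ~ SV1 + SV2 + condition

dds <- DESeq(dds)
res <- results(dds)  # by default tests the last design variable, condition
print(head(res))
```

The variable of interest is kept last in the design so that the default `results()` call reports it.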

3) DESeq2 cannot control for pairing among a subset, so you can either set this information aside, or you can use another package such as limma.
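One way this is commonly handled in limma (an assumption here, since the exact approach is not spelled out above) is to treat the animal as a random effect with `duplicateCorrelation` after `voom`. The sketch below uses simulated counts and a hypothetical layout in which only some animals contribute both tumour and healthy tissue.

```r
library(limma)
library(edgeR)

set.seed(1)
# Hypothetical layout: animals A1-A3 are paired, A4-A6 contribute one sample.
animal <- factor(c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A5", "A6"))
tissue <- factor(c("healthy", "tumour", "healthy", "tumour", "healthy",
                   "tumour", "healthy", "healthy", "tumour"),
                 levels = c("healthy", "tumour"))

# Simulated counts: 500 genes x 9 samples.
counts <- matrix(rnbinom(500 * 9, mu = 100, size = 10), nrow = 500,
                 dimnames = list(paste0("gene", 1:500),
                                 paste0("sample", 1:9)))

dge <- calcNormFactors(DGEList(counts))
design <- model.matrix(~ tissue)
v <- voom(dge, design)

# Estimate the within-animal correlation and use it in the linear model,
# so paired and unpaired samples are analysed together.
corfit <- duplicateCorrelation(v, design, block = animal)
fit <- lmFit(v, design, block = animal,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
tab <- topTable(fit, coef = "tissuetumour")
print(tab)
```

Other covariates such as genotype or treatment would be added to `design` in the usual way; they are left out here to keep the sketch short.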


Thanks Michael for the quick response! I will let you know if I run into other issues.
