Dear all,
I was wondering if I could get advice on a complicated model for DESeq2. Currently, I have different mutations in two tissue-relevant cell lines (neurons and glia), with multiple replicates for each. Essentially, for each mutated gene I have a homozygous (double) knockout (dosage 0), a heterozygous knockout (1), wild type (2), and a duplication (3), i.e. a gene-dosage series. I am trying to build a model in which I can compare each mutant to wild type while accounting for the different cell lines and/or mutated genes. I'm not sure whether I should analyze the cell lines separately and then look at the overlap, or whether I should include cell line in the model. I've listed my variables below. Condition is wild type (WT) or mutant.
Both GENE1 and GENE2 are relevant for the same disease.
Celltype condition dosage RIN MutationGene
1 WT 2 9 GENE1
1 WT 2 9 GENE1
1 WT 2 8.2 GENE1
1 Mutant 1 7 GENE1
1 Mutant 1 9 GENE1
1 Mutant 1 8.2 GENE1
1 Mutant 0 7 GENE1
1 Mutant 0 10 GENE1
1 Mutant 0 9.2 GENE1
1 Mutant 3 9.2 GENE1
1 Mutant 3 8.3 GENE1
1 Mutant 3 9 GENE1
2 WT 2 9 GENE1
2 WT 2 9 GENE1
2 WT 2 8.2 GENE1
2 Mutant 1 7 GENE1
2 Mutant 1 9 GENE1
2 Mutant 1 8.2 GENE1
2 Mutant 0 7 GENE1
2 Mutant 0 10 GENE1
2 Mutant 0 9.2 GENE1
2 Mutant 3 9.2 GENE1
2 Mutant 3 8.3 GENE1
2 Mutant 3 9 GENE1
3 WT 2 9 GENE2
3 WT 2 9 GENE2
3 WT 2 8.2 GENE2
3 Mutant 1 7 GENE2
3 Mutant 1 9 GENE2
3 Mutant 1 8.2 GENE2
3 Mutant 0 7 GENE2
3 Mutant 0 10 GENE2
3 Mutant 0 9.2 GENE2
3 Mutant 3 9.2 GENE2
3 Mutant 3 8.3 GENE2
3 Mutant 3 9 GENE2
4 WT 2 9 GENE2
4 WT 2 9 GENE2
4 WT 2 8.2 GENE2
4 Mutant 1 7 GENE2
4 Mutant 1 9 GENE2
4 Mutant 1 8.2 GENE2
4 Mutant 0 7 GENE2
4 Mutant 0 10 GENE2
4 Mutant 0 9.2 GENE2
4 Mutant 3 9.2 GENE2
4 Mutant 3 8.3 GENE2
4 Mutant 3 9 GENE2
My two main questions are:
#1) Should I separate them by gene and/or cell line? The cell lines are related (i.e. neuron and glia)
#2) Does this model make sense?
~condition + condition:dosage + condition:Celltype + RIN + dosage:MutationGene
Unfortunately, I get an error with this design.
# I've also tried
~dosage + dosage:Celltype + RIN + dosage:MutationGene
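A design matrix that is not full rank is one common reason DESeq2 rejects a design formula; whether that is the error here is not confirmed. The following is a minimal base R sketch for checking it, under stated assumptions: the data frame name `coldata` is hypothetical, the table is rebuilt from the rows listed above, and `Celltype` and `dosage` are treated as factors rather than numbers.

```r
# Sample table rebuilt from the rows listed above (48 samples).
# `coldata` is a hypothetical name; Celltype and dosage are assumed to be factors.
rin <- c(9, 9, 8.2, 7, 9, 8.2, 7, 10, 9.2, 9.2, 8.3, 9)
coldata <- data.frame(
  Celltype     = factor(rep(1:4, each = 12)),
  condition    = factor(rep(rep(c("WT", "Mutant"), times = c(3, 9)), times = 4),
                        levels = c("WT", "Mutant")),
  dosage       = factor(rep(rep(c(2, 1, 0, 3), each = 3), times = 4),
                        levels = c(2, 1, 0, 3)),
  RIN          = rep(rin, times = 4),
  MutationGene = factor(rep(c("GENE1", "GENE2"), each = 24))
)

# The two design formulas from the post.
designs <- list(
  first  = ~ condition + condition:dosage + condition:Celltype + RIN + dosage:MutationGene,
  second = ~ dosage + dosage:Celltype + RIN + dosage:MutationGene
)

# For each design, compare the number of columns with the matrix rank.
# A rank smaller than the column count means the design is not full rank.
check <- sapply(designs, function(f) {
  mm <- model.matrix(f, data = coldata)
  c(columns = ncol(mm), rank = qr(mm)$rank)
})
print(check)

# Cross-tabulations showing which variables are nested in each other:
# WT only ever has dosage 2, and each Celltype belongs to exactly one MutationGene.
print(table(coldata$condition, coldata$dosage))
print(table(coldata$Celltype, coldata$MutationGene))
```

In this table `condition` is fully determined by `dosage`, and `MutationGene` is fully determined by `Celltype`, so terms that combine them cannot all be estimated separately.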
Apologies for the complexity.
Thanks a lot!
Thanks for the response!
You're right, there is likely a different expression profile. Unfortunately, I'm taking over data that someone else generated and have little control over what they produced. They essentially generated the homozygous knockout, heterozygous knockout, and duplication in two separate brain cell types, and then repeated this for different genes.
I'm trying to see if there is some kind of convergence or overlap between all the different gene mutants. I would also expect the disrupted genes themselves to be affected in line with their dosage (i.e. we should see little to no expression for an affected gene in the homozygous knockout, slightly more in the het, and a positive fold change in the duplication). That would be almost a form of validation.
I'm thinking of just doing some kind of Venn diagram if I can't come up with an appropriate statistical model. I appreciate the thoughts.
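The overlap behind such a Venn diagram can be computed with base R set operations. This is a minimal sketch only: the gene IDs and list names are hypothetical placeholders, not results from this data set. In practice each vector would hold the significant genes from one mutant-versus-wild-type comparison.

```r
# Hypothetical placeholder gene IDs, one vector per mutant-vs-WT comparison.
de_sets <- list(
  GENE1_knockout = c("geneA", "geneB", "geneC", "geneD"),
  GENE1_het      = c("geneB", "geneC", "geneE"),
  GENE2_knockout = c("geneB", "geneC", "geneD", "geneF")
)

# Genes called in every comparison (the centre of the Venn diagram).
shared_all <- Reduce(intersect, de_sets)
print(shared_all)

# Pairwise overlap sizes between all comparisons.
overlap_counts <- sapply(de_sets, function(a) {
  sapply(de_sets, function(b) length(intersect(a, b)))
})
print(overlap_counts)
```

The pairwise matrix scales to more comparisons than a Venn diagram can display legibly, which may matter with several genes, dosages, and cell types.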