Dear all,
My question was partially addressed in similar threads, but I'd like to raise the point one more time.
I have read counts from RNA-seq data that were mapped to the diploid genome of a hybrid yeast (there are 3 biological replicates), so I have counts for both parental homologs.
Count matrix looks like:
       Par1_rep1  Par1_rep2  Par1_rep3  Par2_rep1  Par2_rep2  Par2_rep3
gene1       1405        697       1594        992        367       1081
gene2        219        220        259        246        229        272
gene3        896        799        814        498        438        410
gene4         27         59        106         36         62         40
gene5        853       1638       1995        877       2624       2239
What I want is to assess allele-specific expression between the two parents using DESeq2. My overall calculations look like:
groups <- factor(x=c(rep("par1",3), rep("par2",3)), levels=c("par1","par2"))
colData <- DataFrame(condition=groups)
dds_hybrid <- DESeqDataSetFromMatrix(countData, colData, formula(~condition))
sizeFactors(dds_hybrid) = c(rep(1,6))
dds_hybrid <- DESeq(dds_hybrid)
res_hybrid <- results(dds_hybrid, cooksCutoff=FALSE)
I'd very much appreciate it if someone could comment on this and point out anything that is wrong.
I am especially interested in whether the size factors should be set to 1 in this kind of design.
Cheers,
Hi Michael,
Thanks for the reply! As far as I understand, your example has two conditions, while in my case there are actually no conditions: I just need to compare ref and alt counts. So `condition` in my formula in fact represents the alleles. Do you think the above analysis is correct in this situation, and if not, what does it violate?
I would appreciate your comments and thoughts, since this topic is not covered well on the internet or in other forums.
Just take condition out of the formula, so the design is just: ~sample + counts (where the 'counts' variable denotes ref and alt)
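To make the suggested design concrete, here is a minimal sketch using the five genes from the question. The column names `sample` and `allele` and the helper `run_ase` are my own naming for illustration (the reply above calls the allele variable 'counts'), and keeping all size factors at 1 is carried over from the question as an assumption, not something confirmed in this thread. The design matrix is built with base R only; the DESeq2 calls are wrapped in a function that is not run here, because five genes are far too few for dispersion estimation.

```r
# Toy counts copied from the question: rows are genes, columns are allele x replicate.
countData <- matrix(
  c(1405,  697, 1594,  992,  367, 1081,
     219,  220,  259,  246,  229,  272,
     896,  799,  814,  498,  438,  410,
      27,   59,  106,   36,   62,   40,
     853, 1638, 1995,  877, 2624, 2239),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5),
                  c(paste0("Par1_rep", 1:3), paste0("Par2_rep", 1:3)))
)

# One row per column of countData. 'sample' pairs the two allele counts that
# come from the same replicate; 'allele' is the variable being tested.
colData <- data.frame(
  sample = factor(rep(paste0("rep", 1:3), times = 2)),
  allele = factor(rep(c("par1", "par2"), each = 3), levels = c("par1", "par2")),
  row.names = colnames(countData)
)

# Inspect the design: intercept, two replicate terms, and one allele term.
design <- model.matrix(~ sample + allele, data = colData)
print(design)

# Hypothetical helper: meant to be called with the full count matrix,
# not with the five-gene toy example above.
run_ase <- function(countData, colData) {
  stopifnot(requireNamespace("DESeq2", quietly = TRUE))
  dds <- DESeq2::DESeqDataSetFromMatrix(countData, colData,
                                        design = ~ sample + allele)
  # Assumption carried over from the question: no between-column normalization.
  DESeq2::sizeFactors(dds) <- rep(1, ncol(dds))
  dds <- DESeq2::DESeq(dds)
  # log2 fold change of par2 over par1, adjusted for replicate.
  DESeq2::results(dds, contrast = c("allele", "par2", "par1"))
}
```

With the full matrix, something like `res <- run_ase(fullCounts, colData)` would return the allele comparison, where `fullCounts` stands for your complete gene-by-column count matrix. Putting `sample` in the design means each replicate's two allele counts are compared within that replicate instead of being treated as six independent libraries.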