I am new to DESeq2, and I am a little puzzled by the outlier detection. I just finished an RNA-seq project that includes 4 donors, each with and without drug treatment, as follows:

          Donor  Treatment
sample 1  1      Untreated
sample 2  1      Treated
sample 3  2      Untreated
sample 4  2      Treated
sample 5  3      Untreated
sample 6  3      Treated
sample 7  4      Untreated
sample 8  4      Treated
My design is "design = ~Donor + Treatment". My first question is about the Cook's distance calculation, where p is the number of parameters including the intercept and m is the number of samples. In my case, will p be 3 (Donor, Treatment and intercept) and m be 8? Am I right?
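One way to check the parameter count is to count the columns of the design matrix for "~Donor + Treatment". The sketch below is only an illustration under stated assumptions: Donor and Treatment are both treated as categorical factors, dummy (treatment) coding is used with the first level as the reference, and "Untreated" is the reference treatment. The names `samples`, `design_columns` and `design_row` are hypothetical helpers written for this post and are not part of DESeq2.

```python
# Hypothetical sample sheet copied from the table in this post: (donor, treatment).
samples = [
    ("1", "Untreated"), ("1", "Treated"),
    ("2", "Untreated"), ("2", "Treated"),
    ("3", "Untreated"), ("3", "Treated"),
    ("4", "Untreated"), ("4", "Treated"),
]


def design_columns(samples, reference_treatment="Untreated"):
    """Column names of a dummy-coded design matrix for ~ Donor + Treatment.

    Assumption: both variables are categorical and the first donor level
    (after sorting) is the reference level.
    """
    donors = sorted({donor for donor, _ in samples})
    columns = ["Intercept"]
    # One indicator column per non-reference donor.
    columns += [f"Donor_{d}_vs_{donors[0]}" for d in donors[1:]]
    # One indicator column for the non-reference treatment level.
    columns.append(f"Treatment_Treated_vs_{reference_treatment}")
    return columns


def design_row(donor, treatment, samples, reference_treatment="Untreated"):
    """One row of the design matrix for a single sample."""
    donors = sorted({d for d, _ in samples})
    row = [1]  # intercept
    row += [1 if donor == d else 0 for d in donors[1:]]
    row.append(0 if treatment == reference_treatment else 1)
    return row


matrix = [design_row(d, t, samples) for d, t in samples]
p = len(design_columns(samples))  # number of parameters, including the intercept
m = len(matrix)                   # number of samples

print("columns:", design_columns(samples))
print("p =", p, " m =", m, " m - p =", m - p)
```

Under these assumptions the count is one intercept column, three donor columns and one treatment column, so p would be 5 rather than 3, with m = 8. If Donor were stored as a numeric variable instead of a factor, it would contribute a single column and p would be 3.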
My second question is about the Cook's distance cutoff value. The documentation says: "The default is to use the 99% quantile of the F(p,m-p) distribution (with p the number of parameters including the intercept and m number of samples)". Is this cutoff (the 99% quantile) applied per sample, i.e. across all genes in sample 1, or per gene, i.e. across all samples for gene i?
Thank you so much!