I have two conditions (A, B) and two levels for each condition (Positive, Negative). Each group (condition + level, e.g. APositive) has ~4-5 biological replicates. Based on exploratory analyses, I identified three outliers (A1Positive, B1Negative, A2Negative), which were removed from the downstream DESeq2 analyses.
I ran DESeq2 twice:
- 1) removing only the A1Positive sample
- 2) removing all three outliers (A1Positive, B1Negative, A2Negative)
I exported the DE results for the following contrasts:
- 1) APositive vs BPositive (excluding A1Positive outlier sample from the DE analyses)
- 2) APositive vs BPositive (excluding A1Positive, B1Negative, A2Negative outliers)
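A minimal sketch of the two runs described above, not the exact script used. It assumes a single combined `group` factor (condition + level) in the design. Simulated counts from `makeExampleDESeqDataSet` stand in for the real data, and the sample names, group sizes and the helper `run_contrast` are hypothetical.

```r
library(DESeq2)

# Simulated counts stand in for the real data (assumption: 18 samples).
set.seed(1)
dds_all <- makeExampleDESeqDataSet(n = 500, m = 18)

# Hypothetical sample names and a combined condition + level factor.
colnames(dds_all) <- c(paste0("A", 1:5, "Positive"), paste0("A", 1:4, "Negative"),
                       paste0("B", 1:4, "Positive"), paste0("B", 1:5, "Negative"))
dds_all$group <- factor(rep(c("APositive", "ANegative", "BPositive", "BNegative"),
                            times = c(5, 4, 4, 5)))
design(dds_all) <- ~ group

# Hypothetical helper: drop the named samples, refit, extract one contrast.
run_contrast <- function(dds, drop) {
  dds <- dds[, !colnames(dds) %in% drop]
  dds$group <- droplevels(dds$group)
  dds <- DESeq(dds, quiet = TRUE)
  results(dds, contrast = c("group", "APositive", "BPositive"))
}

res1 <- run_contrast(dds_all, "A1Positive")                                # run 1)
res2 <- run_contrast(dds_all, c("A1Positive", "B1Negative", "A2Negative")) # run 2)

# Side-by-side view of the columns being compared.
cmp <- data.frame(
  baseMean_1 = res1$baseMean,       baseMean_2 = res2$baseMean,
  log2FC_1   = res1$log2FoldChange, log2FC_2   = res2$log2FoldChange,
  lfcSE_1    = res1$lfcSE,          lfcSE_2    = res2$lfcSE,
  padj_1     = res1$padj,           padj_2     = res2$padj,
  row.names  = rownames(res1)
)
head(cmp)
```

In both runs the APositive and BPositive samples are identical; only the Negative samples included in the fit differ.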
While comparing the results from 1) and 2), I noticed that genes had the same log2FC and baseMean in both files, but lfcSE, stat, p-value and adjusted p-value differed. Note: the only difference between 1) and 2) was the number of outliers removed.
Can someone explain why the values (p-value, adjusted p-value, lfcSE) for the APositive vs BPositive contrast differ between 1) and 2)? Does removing the A2Negative and B1Negative outliers affect the contrast between the APositive and BPositive groups?
Any insight would be helpful.