Hi

I have RNA-seq data for six different treatments (A, B, C, D, E, F) of a model organism, with four biological (NOT technical) replicates per treatment.

FastQC revealed no abnormalities in the RNA-seq data. After normalization with DESeq2 (rlog transformation), I generated a PCA plot using the 500 most variable genes.

Based on the PCA plot (see link: http://imgur.com/NVcWv5j) and a hierarchical clustering (HC) analysis (not shown), I would think that the dots marked with a rectangle (1, 2, 3) can be considered outliers and might be left out of further differential expression analysis (between treatments).

However, this is just based on visual inspection of the PCA/HC analysis. I was wondering if there is any objective metric to determine whether an RNA-seq sample can be considered an outlier (instead of relying on visual inspection of the PCA alone, as most papers do).

In a recent paper, Conesa et al. 2016 (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0881-8) state the following:

"Reproducibility among technical replicates should be generally high (Spearman R2 > 0.9) [1], **but no clear standard exists for biological replicates, as this depends on the heterogeneity of the experimental system.**"

So, based on Conesa et al. 2016, one might include all replicates (including the suspected outliers), but then you might end up with fewer differentially expressed genes between treatments...

Any advice/help regarding this topic would be much appreciated.

Based on eyeballing your PCA plot, I am not sure you can justify excluding the marked points as outliers. Your sample size is quite low (statistically speaking; I know it's hard to have more) and the variability is not so small as to clearly flag the points as 'wild' outliers. But if you want to use a statistical rule for outlier removal, you can calculate each sample's mean (or median) pairwise distance to the other samples (within its group, or maybe for all groups pooled), and then the mean (or median) and standard deviation of those values. You can then flag the samples whose value lies more than 2 sd away from that mean/median. I'll note, though, that while this makes the rule consistent between groups, the threshold is still arbitrary (although frequently used).
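To make the rule above concrete, here is a minimal sketch in plain Python (standard library only). It is an illustration under stated assumptions, not a validated method: the coordinates are made-up toy values standing in for something like PC1/PC2 scores, the sample names and the function name `flag_outliers` are hypothetical, and it uses one of the variants mentioned above (mean distance to same-group replicates, with the mean + 2 sd cutoff computed over all groups pooled).

```python
import math
import statistics


def euclidean(a, b):
    """Euclidean distance between two coordinate tuples of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_outliers(samples, groups, n_sd=2.0):
    """Flag samples that sit unusually far from their own replicates.

    samples: dict of sample name -> coordinates (e.g. PCA scores)
    groups:  dict of sample name -> treatment label
    The score of a sample is its mean distance to the other replicates of the
    same treatment; the cutoff is mean + n_sd * sd of all scores pooled.
    """
    scores = {}
    for name, coords in samples.items():
        mates = [m for m in samples if m != name and groups[m] == groups[name]]
        scores[name] = statistics.mean(
            [euclidean(coords, samples[m]) for m in mates]
        )
    pooled = list(scores.values())
    cutoff = statistics.mean(pooled) + n_sd * statistics.stdev(pooled)
    flagged = sorted(name for name, s in scores.items() if s > cutoff)
    return flagged, scores, cutoff


# Toy data (invented for illustration): three treatments, four replicates each.
samples = {
    "A1": (0, 0), "A2": (1, 0), "A3": (0, 1), "A4": (10, 10),
    "B1": (20, 0), "B2": (21, 0), "B3": (20, 1), "B4": (21, 1),
    "C1": (0, 20), "C2": (1, 20), "C3": (0, 21), "C4": (1, 21),
}
groups = {name: name[0] for name in samples}

flagged, scores, cutoff = flag_outliers(samples, groups)
print("cutoff:", round(cutoff, 2))
print("flagged:", flagged)
```

The cutoff is pooled across groups on purpose. With only four replicates, a cutoff computed strictly within one group can never be exceeded: for n values, the largest possible deviation from the mean is (n - 1)/sqrt(n) sample standard deviations, which is 1.5 for n = 4. Pooling avoids that, but the 2 sd threshold itself remains an arbitrary choice.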