Question: Batch Variation-MDS Plot
Hi Guys, 

I need a suggestion about batch variation in an edgeR analysis.

Here are my samples:

1) NW1

2) NW2

3) NW3

After making an MDS plot for NW1, NW2 and NW3, I chose to keep NW1 and NW3, as they are more similar to each other than to NW2.

Similarly, I have another set of samples: NWS1, NWS2 and NWS3. After making the MDS plot, I kept NWS1 and NWS3 for the DE analysis using edgeR,

but my MDS plot for NW1, NW3, NWS1 and NWS3 looks weird, as NW1 is quite far away from NW3. Any suggestions for dealing with this batch effect?

Thanks a million


It is generally a Very Bad Idea to remove samples based on similarity in an MDS plot. If you have three replicates, it is inevitable that two of the samples will be closer together than the remaining sample; this is not an adequate reason for removing the latter. Indeed, by removing the least-similar sample, you are probably understating the variation between replicates. This will make the analysis anticonservative and result in a greater number of false positives.
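
To make the point about understated variation concrete, here is a rough simulation sketch. It is a deliberate simplification and not an edgeR analysis: it assumes a single measurement per replicate drawn from a standard normal distribution (true variance 1), whereas an MDS plot summarises many genes at once. The function name and parameters are made up for illustration.

```python
import random
import statistics


def simulate(trials=20000, seed=1):
    """Compare variance estimates from all 3 replicates vs. the closest 2.

    Assumption: each replicate is one draw from N(0, 1), so the true
    variance is 1. This is a toy model, not real expression data.
    """
    rng = random.Random(seed)
    var_all, var_pair = [], []
    for _ in range(trials):
        reps = sorted(rng.gauss(0.0, 1.0) for _ in range(3))
        # Unbiased sample variance using all three replicates.
        var_all.append(statistics.variance(reps))
        # Mimic "drop the least-similar sample": keep the two closest values.
        low_gap = reps[1] - reps[0]
        high_gap = reps[2] - reps[1]
        pair = reps[:2] if low_gap <= high_gap else reps[1:]
        var_pair.append(statistics.variance(pair))
    return statistics.mean(var_all), statistics.mean(var_pair)


mean_var_all, mean_var_pair = simulate()
print(f"mean variance, all three replicates: {mean_var_all:.3f}")
print(f"mean variance, closest pair only:    {mean_var_pair:.3f}")
```

Under these assumptions the three-replicate estimate averages close to the true variance, while the closest-pair estimate is systematically too small, which is the mechanism behind the extra false positives.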

I only remove samples if it is very clear that something went wrong, based on information other than the MDS plot. For example, if the library size is 100-fold lower than that of other samples, it is obvious that there was a technical problem with sequencing, so the corresponding sample should not be used in downstream analysis.
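
As a minimal sketch of that kind of check, the snippet below flags samples whose library size is at least 100-fold below the median of the experiment. The sample names and counts are invented for illustration, the helper `flag_low_libraries` is hypothetical, and using the median as the reference is my own assumption rather than a fixed rule.

```python
import statistics


def flag_low_libraries(lib_sizes, fold=100):
    """Return sample names whose library size is more than `fold` times
    smaller than the median library size (hypothetical helper)."""
    typical = statistics.median(lib_sizes.values())
    return [name for name, size in lib_sizes.items() if size * fold < typical]


# Made-up total read counts per sample, not data from this thread.
lib_sizes = {
    "sample_A": 21_000_000,
    "sample_B": 19_500_000,
    "sample_C": 150_000,
}

flagged = flag_low_libraries(lib_sizes)
print(flagged)
```

A flagged sample is only a prompt to look for a technical explanation, such as a failed library preparation; it is not a reason for removal on its own.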

