Question: EdgeR: Replicate samples diverge in MDS plot
2.1 years ago, fawazfebin wrote:


I am analysing RNA-seq datasets from an experiment selected from GEO. Reads were aligned to the reference genome with STAR, and transcripts were quantified with the Subread package. The output 'counts.txt' was fed into edgeR for differential expression analysis. The data exploration step (MDS plot) revealed considerable divergence between replicates of the same condition. Is this degree of divergence acceptable for an edgeR analysis? Can I proceed to the next steps of the differential expression analysis?


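> library(edgeR)
> # Read the count table; the first column (gene IDs) becomes the row names
> countdata <- read.table("counts.txt", header=TRUE, row.names=1)
> # Keep only the per-sample count columns (column 6 onwards)
> countdata <- countdata[ ,6:ncol(countdata)]
> colnames(countdata) <- c("sensitive1","sensitive2","resistant1","resistant2")
> condition <- c(1,1,2,2)
> dge <- DGEList(counts=countdata,group=condition)
> # Keep genes with CPM > 1 in at least 2 samples
> countsPerMillion <- cpm(dge)
> countCheck <- countsPerMillion > 1
> keep <- which(rowSums(countCheck) >= 2)
> dge <- dge[keep,]
> # TMM normalisation, then the MDS plot of the samples
> dge <- calcNormFactors(dge, method="TMM")
> plotMDS(dge)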

Here is the URL of the plot image: . I would be grateful for your guidance.


Tags: edger, plotmds
modified 2.0 years ago • written 2.1 years ago by fawazfebin
Many thanks for your guidance, and for the quick response as well!
written 2.0 years ago by fawazfebin
Answer: EdgeR: Replicate samples diverge in MDS plot
2.1 years ago, Ryan C. Thompson (The Scripps Research Institute, La Jolla, CA) wrote:

It's impossible to know for sure with only 4 samples, but one possible explanation is that dimension 1 reflects some sort of sample quality issue in the resistant2 sample that perturbed its gene expression measurements and caused it to diverge from the others, while dimension 2 represents the effect of interest, sensitive vs resistant. In any case, with so few samples, I don't think there's anything you can do to correct for this; you pretty much have to take the data as is. If you had more samples, I would have recommended using sva to correct for such problems.
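One way to look for signs of a quality problem in a single sample is to compare library sizes, TMM normalisation factors, MDS coordinates and pairwise correlations across the samples. The following is only a rough sketch under stated assumptions: the counts are simulated for illustration (they are not the poster's data), and which diagnostics are informative for a real dataset is a judgment call.

```r
library(edgeR)

# Simulated counts for illustration only: 1000 genes, 4 samples.
set.seed(1)
counts <- matrix(rnbinom(4000, mu = 100, size = 10), ncol = 4,
                 dimnames = list(paste0("gene", 1:1000),
                                 c("sensitive1", "sensitive2",
                                   "resistant1", "resistant2")))
dge <- DGEList(counts = counts, group = factor(c(1, 1, 2, 2)))
dge <- calcNormFactors(dge, method = "TMM")

# Library sizes and normalisation factors: a sample that stands out here
# may deserve a closer look.
print(dge$samples)

# MDS coordinates without drawing the plot, so they can be inspected directly.
mds <- plotMDS(dge, plot = FALSE)
coords <- data.frame(sample = colnames(dge), dim1 = mds$x, dim2 = mds$y)
print(coords)

# Pairwise correlations of log-CPM values between samples.
sample_cor <- cor(cpm(dge, log = TRUE))
print(round(sample_cor, 3))
```

With real data, a sample whose library size, normalisation factor or correlations differ markedly from its replicate would be consistent with, though not proof of, a quality issue.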

written 2.1 years ago by Ryan C. Thompson

Thanks Ryan! I shall check for the availability of more samples for the same experiment and use sva package if needed.


written 2.1 years ago by fawazfebin

Is sva correction possible with six samples?


written 2.1 years ago by fawazfebin

I'm not sure. It might be possible. It depends on the severity of the confounding effect and the number of samples affected. sva and similar methods work best with lots of samples.

written 2.1 years ago by Ryan C. Thompson
Answer: EdgeR: Replicate samples diverge in MDS plot
2.0 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

As Ryan says, you have to work with the data you have. You can set robust=TRUE when you run estimateDisp() in edgeR. If the problems with the resistant2 sample are confined to a certain group of genes, then the robust estimation will isolate those genes and the analysis will run fine. If the problems with resistant2 are widespread, then the extra variability of the resistant samples will simply decrease the number of DE genes you find at any given significance level.
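A minimal sketch of what this could look like in practice follows. It makes several assumptions: the counts are simulated purely for illustration, the design is a simple two-group comparison, and the quasi-likelihood steps after estimateDisp() (glmQLFit() and glmQLFTest()) are one common choice of downstream test rather than something prescribed in this thread.

```r
library(edgeR)

# Simulated counts for illustration only: 1000 genes, 2 sensitive + 2 resistant.
set.seed(42)
counts <- matrix(rnbinom(4000, mu = 100, size = 10), ncol = 4,
                 dimnames = list(paste0("gene", 1:1000),
                                 c("sensitive1", "sensitive2",
                                   "resistant1", "resistant2")))
group <- factor(c("sensitive", "sensitive", "resistant", "resistant"),
                levels = c("sensitive", "resistant"))

y <- DGEList(counts = counts, group = group)
y <- y[filterByExpr(y), , keep.lib.sizes = FALSE]  # drop lowly expressed genes
y <- calcNormFactors(y, method = "TMM")

design <- model.matrix(~ group)

# robust = TRUE makes the empirical Bayes step less sensitive to genes
# with outlying dispersions.
y <- estimateDisp(y, design, robust = TRUE)

# Assumed downstream test: quasi-likelihood F-test, resistant vs sensitive.
fit <- glmQLFit(y, design, robust = TRUE)
res <- glmQLFTest(fit, coef = 2)
top <- topTags(res)
print(top)
```

With only two replicates per group, the residual degrees of freedom are very limited, so the number of genes reported as DE may be small regardless of the robust setting.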

written 2.0 years ago by Gordon Smyth