I am analysing an RNA-seq dataset from an experiment selected from GEO. Reads were aligned to the reference genome with STAR, and transcripts were quantified with the Subread package. The resulting 'counts.txt' was loaded into edgeR for differential expression analysis. The data exploration step (MDS plot) showed considerable divergence between replicates of the same sample. Is this level of divergence acceptable for an edgeR analysis? Can I proceed to the next steps of the differential expression analysis?
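> library(edgeR)
> # read the count table; the first column (gene IDs) becomes the row names
> countdata <- read.table("counts.txt", header=TRUE, row.names=1)
> # keep only the per-sample count columns (columns 6 onward)
> countdata <- countdata[ ,6:ncol(countdata)]
> colnames(countdata) <- c("sensitive1","sensitive2","resistant1","resistant2")
> condition <- c(1,1,2,2)
> dge <- DGEList(counts=countdata,group=condition)
> # keep genes with CPM > 1 in at least 2 samples
> countsPerMillion <- cpm(dge)
> countCheck <- countsPerMillion > 1
> keep <- which(rowSums(countCheck) >= 2)
> dge <- dge[keep,]
> # TMM normalisation
> dge <- calcNormFactors(dge, method="TMM")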
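The MDS plot step itself is not shown above. A minimal sketch of how the plot and a rough measure of replicate variability (the biological coefficient of variation, BCV) could be obtained in edgeR is given below. The toy count matrix is simulated purely so that the snippet runs on its own; the object names `toy`, `y`, `group`, `design` and `bcv` are illustrative and not part of the original analysis.

```r
library(edgeR)

# Toy counts standing in for counts.txt (simulated, NOT the real data):
# 2000 genes x 4 samples, negative binomial with dispersion 0.1
set.seed(1)
toy <- matrix(rnbinom(2000 * 4, mu = 100, size = 10), ncol = 4,
              dimnames = list(paste0("gene", 1:2000),
                              c("sensitive1", "sensitive2",
                                "resistant1", "resistant2")))
group <- factor(c("sensitive", "sensitive", "resistant", "resistant"))

y <- DGEList(counts = toy, group = group)
y <- calcNormFactors(y, method = "TMM")

# MDS plot: distances between samples reflect typical log2 fold changes
mds <- plotMDS(y, col = as.integer(group))

# Estimate dispersions; sqrt(common dispersion) is the BCV,
# a single-number summary of variability between replicates
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
bcv <- sqrt(y$common.dispersion)
print(bcv)

# Gene-wise BCV against average log-CPM
plotBCV(y)
```

On the real data, `y` would be the filtered and normalised `dge` object from the commands above, with `group` taken from `condition`.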
Here is the URL of the plot image: https://imgur.com/XPDq93d. I would appreciate your guidance.