Question: edgeR_robust: where are moderately expressed genes?
0
gravatar for marek_koter
3.8 years ago by
Poland
marek_koter0 wrote:

Dear all,

I'm running this edgeR-robust code to analyse my miRNA-seq DE sequences. But I receive only highly/lowly expressed miRNAs with logFC values higher than 4/lower than -4. But nothing in between... Should I change  some parametres to get the whole logFC spectrum???

Thank you, Marek D. Koter

My code:

rm(list = ls())
library(edgeR)
y <- read.delim("3DPI_All_bez_K2_I2.txt", row.names="seq", header = T)
group <- factor(c(1,1,1,2,2,2))
keep<-rowSums(cpm(y)>5) >=1
y<-y[keep, ]
y <- DGEList(counts=y,group=group)
design <- model.matrix(~group)
y <- calcNormFactors(y)
y <- estimateGLMRobustDisp(y, design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit,coef=2)
topTags(lrt, n = 50)

Fragment of my results:

  logFC logCPM LR PValue FDR
CTACCCCTGAACTTATTTTAT -5,28003 2,902426 4,407249 0,035787 0,362601
TTTCGTTACTCTGAGGCACA -4,83913 2,484342 3,876514 0,048966 0,362601
AAAGAGACATAACTTTGGGCT 4,173604 1,771509 3,934848 0,047295 0,362601
AACAGGACACTCTAAATCGGG 4,173604 1,771509 3,934848 0,047295 0,362601
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250   
[3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
[5] LC_TIME=Polish_Poland.1250    
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] edgeR_3.10.5  limma_3.24.15
loaded via a namespace (and not attached):
[1] tools_3.2.2

 

 

 

 

rnaseq edger microrna • 855 views
ADD COMMENTlink modified 3.8 years ago by Aaron Lun24k • written 3.8 years ago by marek_koter0

You're using an old version of Bioconductor. I suggest you upgrade to BioC 3.2 and edgeR 3.12.0 and use that for your analyses, to make sure this behaviour isn't present in the latest version.

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by Aaron Lun24k
Answer: edgeR_robust: where are moderately expressed genes?
2
gravatar for Aaron Lun
3.8 years ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

Obviously, if you take the top set of DE genes with topTags, you'll be looking at the genes that have the strongest differences between conditions. So it's no surprise that you get lots of genes with extreme log-fold changes. If you want to get all the genes (including those with smaller log-FCs), you should set n=Inf when running topTags. This will report all genes in the data set, ordered by p-value.

As an aside, it seems that none of your top genes are significant at a FDR of 5%, even though they have strong log-fold changes. I'd suggest having a closer look at the counts and seeing if your large fold changes are being driven by an outlier sample. This would also inflate the dispersion and reduce the significance of any DE. If there is an outlier, you could try removing it to see whether it improves DE detection. Of course, the other possibility is that your data is just inherently variable, in which case there's not much that can be done.

ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by Aaron Lun24k

You forgot to mention that there is something you can do with variable data! The OP could try using voomWithQualityWeights!

ADD REPLYlink written 3.8 years ago by Steve Lianoglou12k

While that's true, I was thinking more of a situation where you have high biological variability across all samples, rather than variability induced by a number of poor-quality samples in each condition. In such cases, I don't think sample-based weighting would provide any benefit, as each sample would be of the same (relative) quality such that the weights would all be the same. Besides, you'd have to switch from edgeR to voom, which may or may not be a problem for miRNA data depending on the size of the counts, the mean-variance relationship, etc. I've never worked with that stuff, so I wouldn't know the details - I suspect that voom would be okay for miRNA counts, but the proof is in the pudding.

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by Aaron Lun24k

Thanks Steve :-)

ADD REPLYlink written 3.8 years ago by marek_koter0

Many thanks Aaron :-)

ADD REPLYlink written 3.8 years ago by marek_koter0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 259 users visited in the last hour