edgeR_robust: where are moderately expressed genes?
1
0
Entering edit mode
@marek_koter-8946
Last seen 9.2 years ago
Poland

Dear all,

I'm running this edgeR-robust code to analyse my miRNA-seq DE sequences. But I receive only highly/lowly expressed miRNAs with logFC values higher than 4/lower than -4. But nothing in between... Should I change  some parametres to get the whole logFC spectrum???

Thank you, Marek D. Koter

My code:

rm(list = ls())
library(edgeR)
y <- read.delim("3DPI_All_bez_K2_I2.txt", row.names="seq", header = T)
group <- factor(c(1,1,1,2,2,2))
keep<-rowSums(cpm(y)>5) >=1
y<-y[keep, ]
y <- DGEList(counts=y,group=group)
design <- model.matrix(~group)
y <- calcNormFactors(y)
y <- estimateGLMRobustDisp(y, design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit,coef=2)
topTags(lrt, n = 50)

Fragment of my results:

  logFC logCPM LR PValue FDR
CTACCCCTGAACTTATTTTAT -5,28003 2,902426 4,407249 0,035787 0,362601
TTTCGTTACTCTGAGGCACA -4,83913 2,484342 3,876514 0,048966 0,362601
AAAGAGACATAACTTTGGGCT 4,173604 1,771509 3,934848 0,047295 0,362601
AACAGGACACTCTAAATCGGG 4,173604 1,771509 3,934848 0,047295 0,362601
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250   
[3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
[5] LC_TIME=Polish_Poland.1250    
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] edgeR_3.10.5  limma_3.24.15
loaded via a namespace (and not attached):
[1] tools_3.2.2

 

 

 

 

edger rnaseq microrna • 1.9k views
ADD COMMENT
0
Entering edit mode

You're using an old version of Bioconductor. I suggest you upgrade to BioC 3.2 and edgeR 3.12.0 and use that for your analyses, to make sure this behaviour isn't present in the latest version.

ADD REPLY
2
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 2 hours ago
The city by the bay

Obviously, if you take the top set of DE genes with topTags, you'll be looking at the genes that have the strongest differences between conditions. So it's no surprise that you get lots of genes with extreme log-fold changes. If you want to get all the genes (including those with smaller log-FCs), you should set n=Inf when running topTags. This will report all genes in the data set, ordered by p-value.

As an aside, it seems that none of your top genes are significant at a FDR of 5%, even though they have strong log-fold changes. I'd suggest having a closer look at the counts and seeing if your large fold changes are being driven by an outlier sample. This would also inflate the dispersion and reduce the significance of any DE. If there is an outlier, you could try removing it to see whether it improves DE detection. Of course, the other possibility is that your data is just inherently variable, in which case there's not much that can be done.

ADD COMMENT
0
Entering edit mode

You forgot to mention that there is something you can do with variable data! The OP could try using voomWithQualityWeights!

ADD REPLY
0
Entering edit mode

While that's true, I was thinking more of a situation where you have high biological variability across all samples, rather than variability induced by a number of poor-quality samples in each condition. In such cases, I don't think sample-based weighting would provide any benefit, as each sample would be of the same (relative) quality such that the weights would all be the same. Besides, you'd have to switch from edgeR to voom, which may or may not be a problem for miRNA data depending on the size of the counts, the mean-variance relationship, etc. I've never worked with that stuff, so I wouldn't know the details - I suspect that voom would be okay for miRNA counts, but the proof is in the pudding.

ADD REPLY
0
Entering edit mode

Thanks Steve :-)

ADD REPLY
0
Entering edit mode

Many thanks Aaron :-)

ADD REPLY

Login before adding your answer.

Traffic: 466 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6