7 months ago by
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
These issues are discussed in the workflow article
You don't say which edgeR tutorial you are reading, but I guess it is the workflow published in F1000Research, or else the Bioconductor workflow version of the same article.
How to choose the filtering step (and whether or not to use glmTreat) is carefully discussed in that article. I'm a bit puzzled why you are not following the advice given in the article. How much of the article have you read? Was the advice not clear?
Just use filterByExpr
You seem to have copied the filtering step used for a different dataset without adapting it to your dataset. The filtering has to be adapted to your sample sizes and library sizes -- you can't just copy the code like that. In the current version of edgeR, we have made it easier for you. You can now simply use:
keep <- filterByExpr(y, design)
That function will then tune the filtering to be appropriate for your sequencing depth and experimental design.
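To show where that call sits in a typical analysis, here is a minimal sketch. The counts are simulated purely for illustration, and the names `counts`, `group` and `baseline` and the two-group design are assumptions rather than anything from your data.

```r
library(edgeR)

# Simulated counts for illustration only (hypothetical data)
set.seed(1)
ngenes <- 2000
group <- factor(rep(c("A", "B"), each = 3))
baseline <- 2^runif(ngenes, min = 0, max = 10)  # wide range of expression levels
counts <- matrix(rnbinom(ngenes * 6, size = 10, mu = baseline),
                 nrow = ngenes,
                 dimnames = list(paste0("gene", 1:ngenes), paste0("s", 1:6)))

y <- DGEList(counts = counts, group = group)
design <- model.matrix(~ group)

# Filtering is tuned to the library sizes and to the design matrix
keep <- filterByExpr(y, design)

# Drop the filtered genes and recompute the library sizes
y <- y[keep, , keep.lib.sizes = FALSE]

# Normalize after filtering
y <- calcNormFactors(y)

# How many genes were kept and how many were removed
table(keep)
```

The same design matrix that you pass to filterByExpr() should be the one you use later for dispersion estimation and model fitting.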
glmTreat is designed to reduce the number of DE genes
As explained in the workflow article, glmTreat is designed to reduce the number of DE genes that you get, and to prioritize the most biologically meaningful of them. I'm a bit puzzled why you would use glmTreat() and then complain that you don't have enough DE genes, considering that reducing the amount of DE is the whole purpose of the function. 1000 DE genes sounds like a lot to me. Why would you want any more DE genes than that?
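If you want to see the effect of glmTreat() for yourself, you can run it side by side with the ordinary quasi-likelihood F-test on the same fit. The following is only a sketch on simulated counts. The data, the 100 spiked-in genes and the fold-change threshold of 1.5 are all assumptions chosen for illustration, not recommendations for your dataset.

```r
library(edgeR)

# Simulated counts for illustration only (hypothetical data)
set.seed(1)
ngenes <- 2000
group <- factor(rep(c("A", "B"), each = 3))
mu <- matrix(2^runif(ngenes, min = 0, max = 10), nrow = ngenes, ncol = 6)
mu[1:100, 4:6] <- mu[1:100, 4:6] * 4   # first 100 genes are 4-fold up in group B
counts <- matrix(rnbinom(ngenes * 6, size = 10, mu = mu),
                 nrow = ngenes,
                 dimnames = list(paste0("gene", 1:ngenes), paste0("s", 1:6)))

y <- DGEList(counts = counts, group = group)
design <- model.matrix(~ group)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Ordinary test: is the log-fold-change different from zero?
qlf <- glmQLFTest(fit, coef = 2)

# TREAT: is the fold change significantly greater than 1.5 in absolute value?
# (the 1.5 threshold is an arbitrary choice for this sketch)
tr <- glmTreat(fit, coef = 2, lfc = log2(1.5))

# Compare the number of DE genes from the two tests
summary(decideTests(qlf))
summary(decideTests(tr))
```

If you need more DE genes, use glmQLFTest() or a smaller lfc threshold rather than loosening the filtering.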