I have almost exactly the same question, but I used "cpm" instead of "aveLogCPM", following p. 11 of the official book:
keep <- rowSums(cpm(y)>1) >= 3
y <- y[keep, , keep.lib.sizes=FALSE]
I chose "3" because I have 3 replicates per condition.
With this filter I go from 60,000 to 20,000 contigs. But I have trouble understanding why, later on, my DGE table for "condition 1 vs condition 2" still shows very low logCPM values (e.g. -0.86). Do I have to filter again at that stage, and if so, where should I set the cutoff on logCPM? At 0? At 1.585 (log2(3))?
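To make the question concrete, here is a toy sketch in base R of how a contig can pass the filter above and still have a low average log2 CPM. The counts, the contig names and the library sizes are invented for illustration. The CPM is computed by hand, and the last step is a plain log2 of the mean CPM, which is only a rough stand-in for the logCPM column that edgeR reports, not the same calculation.

```r
# Toy counts: 3 contigs x 6 samples (3 replicates for each of 2 conditions).
# All numbers are made up for illustration.
counts <- matrix(
  c( 30,  28,  35,   0,   0,   1,   # geneA: expressed in condition 1 only
      0,   1,   0,   0,   1,   0,   # geneB: essentially not expressed
    500, 480, 510, 490, 505, 495),  # geneC: expressed everywhere
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:6))
)

# Assumed library sizes (20 million reads each); in a real analysis these
# come from the data, not from a constant.
lib_size <- rep(2e7, 6)

# Plain counts per million, computed by hand: divide each column by its
# library size and scale to one million.
cpm_manual <- sweep(counts, 2, lib_size, "/") * 1e6

# Same rule as in my filter: CPM > 1 in at least 3 samples.
keep <- rowSums(cpm_manual > 1) >= 3

# Rough average expression over ALL 6 samples, on the log2 scale.
ave_log_cpm <- log2(rowMeans(cpm_manual))

print(keep)
print(round(ave_log_cpm[keep], 2))
```

In this toy case geneA passes the filter because its CPM is above 1 in the three condition 1 samples, but its average over all six samples is below 1 CPM, so its log2 average is negative. If that is what happens in my data, a negative logCPM in the DGE table may not contradict the filter.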
As a side note, I was a bit frightened by JW MacDonald's comment. I believe that the “open world”, and open-source projects in general, owe their richness to their communities and exchanges.
Consider this: “Do you have to be a car manufacturer to drive a car? No. You have to learn to drive, change a wheel and put in gas.”
I am also working with the help of a statistician, but my goal is to become more autonomous. Unfortunately, some scientists (often the good ones) lack the ability to talk to an “unsophisticated audience”.