Hi all: I am using edgeR to analyze differential gene expression, but today I ran into a problem I have never seen before. In the step that filters out low-expression genes with filterByExpr(x, group = group), all of my genes were filtered out.
My code is as follows:
library(edgeR)
files <- c("E11Rep1.txt", "E11Rep2.txt", "E14Rep1.txt", "E14Rep2.txt", "E18Rep1.txt", "E18Rep2.txt", "Adult_Rep1.txt", "Adult_Rep2.txt")
read.delim(files[1], nrows = 5)
group <- as.factor(c("E11Rep1", "E11Rep2", "E14Rep1", "E14Rep2", "E18Rep1", "E18Rep2", "AdultRep1", "AdultRep2"))
x$samples$group <- group
lane <- as.factor(c("Rep1", "Rep2", "Rep1", "Rep2","Rep1", "Rep2", "Rep1", "Rep2") )
x$samples$lane <- lane
library(Mus.musculus)
geneid <- row.names(x)
genes <- select(Mus.musculus, keys = geneid, columns = c("SYMBOL", "TXCHROM"), keytype = "ENTREZID")
genes <- genes[!duplicated(genes$ENTREZID), ]
x$genes <- genes
cpm <- cpm(x)
lcpm <- cpm(x, log=TRUE)
L <- mean(x$samples$lib.size)*1e-6
M <- median(x$samples$lib.size)*1e-6
c(L, M)
table(rowSums(x$counts==0)==8)
keep.exprs <- filterByExpr(x, group = group)
The output is:
> Warning message: In min(n[n > 1L]) : no non-missing arguments to min;
> returning Inf
I used x$counts to check the counts, and they are not that low:
head(x$counts)
>            Samples
> Tags        E11Rep1 E11Rep2 E14Rep1 E14Rep2 E18Rep1 E18Rep2 Adult_Rep1 Adult_Rep2
>   497097        199     190     617     889    1148    1761        961        884
>   100503874      82      97     234     281     395     504        607        527
>   100038431       0       0       1       8      18      25         19         21
>   19888           6      19       9      17       6       9         13          1
>   20671        1704    1334    1544    2321    1412    2160       1657       1539
>   27395        7984    5541    3778    5846    3126    5332       1736       1618
Can anyone tell me why this happened?
Thank you very much!
I found it. There is a mistake in this command:
group <- as.factor(c("E11Rep1", "E11Rep2", "E14Rep1", "E14Rep2", "E18Rep1", "E18Rep2", "AdultRep1", "AdultRep2"))
It creates 8 groups instead of 4, and each of these 8 groups has only 1 replicate, so no group has more than one sample. I am leaving this here in case other beginners make the same mistake.
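For anyone who wants to see the difference, here is a small base-R sketch of the two groupings. It assumes that the n in the warning message holds the number of samples per group (I have not checked the filterByExpr source), and the names bad_group, good_group, n_bad and n_good are made up for this illustration.

```r
# Sample names as in the post (file names without the .txt extension).
samples <- c("E11Rep1", "E11Rep2", "E14Rep1", "E14Rep2",
             "E18Rep1", "E18Rep2", "Adult_Rep1", "Adult_Rep2")

# Original grouping: every sample is its own level, so each group has size 1.
bad_group <- factor(samples)
n_bad <- as.vector(table(bad_group))   # assumed meaning of `n`: samples per group
# No group has more than one sample, so n_bad[n_bad > 1L] is empty and
# min() of an empty vector returns Inf (the warning is silenced here).
min_bad <- suppressWarnings(min(n_bad[n_bad > 1L]))
print(min_bad)    # Inf

# Corrected grouping: strip the replicate suffix so replicates share a level.
good_group <- factor(sub("_?Rep[0-9]+$", "", samples),
                     levels = c("E11", "E14", "E18", "Adult"))
n_good <- as.vector(table(good_group))
min_good <- min(n_good[n_good > 1L])
print(min_good)   # 2

# Four groups with two replicates each.
print(table(good_group))
```

With a factor like good_group, the call from my code would become filterByExpr(x, group = good_group), and x$samples$group should be set to the same factor.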
Thank you all the same for reading.