Hello Everyone,
I have counts data with 366 samples as columns and more than 50k ensembl ids as rownames. The data is stored in a dataframe "U".
Data looks like following:
S1 S2 S3
ENSG00000000003.13 5745 8742 7257
ENSG00000000005.5 4 1 0
ENSG00000000419.11 1394 2211 615
ENSG00000000457.12 344 1166 692
ENSG00000000460.15 405 480 258
ENSG00000000938.11 144 132 110
table(df$Status)
F2 F3 F4 NE
125 220 20 1
library(edgeR)
group <- factor(paste0(df$Status))
y <- DGEList(U, group=group)
keep <- rowSums(cpm(y) > 0.5) >= 2
table(keep)
summary(keep)
Mode FALSE TRUE NA's
logical 35503 24985 0
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y, method = "TMM")
design2 <- model.matrix(~ 0 + group)
colnames(design2) <- c("G2","G3","G4","NE")
design2
y <- estimateDisp(y, design2, robust=TRUE)
y$common.dispersion
[1] 1.030704
fit <- glmQLFit(y, design2, robust=TRUE)
head(fit$coefficients)
G2 G3 G4 NE
ENSG00000000003.13 -9.614857 -9.514155 -9.219875 -9.388728
ENSG00000000005.5 -17.008674 -16.908788 -16.166986 -16.630297
ENSG00000000419.11 -10.940426 -10.914407 -10.898501 -10.804819
ENSG00000000457.12 -11.582309 -11.493746 -11.423292 -12.203859
ENSG00000000460.15 -12.362215 -12.205290 -11.947569 -12.040664
ENSG00000000938.11 -12.514476 -12.640490 -13.160899 -13.074225
contrast.matrix <- makeContrasts(G2-G3, levels=design2)
contrast.matrix
Contrasts
Levels G2 - G3
EG2 1
EG3 -1
EG4 0
NE 0
qlf <- glmQLFTest(fit, contrast=contrast.matrix)
topTags(qlf)
Coefficient: 1*G2 -1*G3
logFC logCPM F PValue FDR
ENSG00000124232.9 7.351170 0.86857907 278.9784 6.083121e-47 1.519868e-42
ENSG00000102195.8 7.077038 -0.11326609 236.7369 1.532441e-41 1.914402e-37
ENSG00000165066.12 6.034396 -1.42387186 224.2053 7.269869e-40 6.054590e-36
ENSG00000138161.11 3.629804 0.03880203 210.2262 5.937722e-38 3.242534e-34
ENSG00000164434.10 5.533205 -0.92141498 209.9531 6.488962e-38 3.242534e-34
ENSG00000256463.7 5.278677 -2.52333685 201.0082 1.152294e-36 4.798346e-33
ENSG00000233887.1 4.054302 -4.49118571 200.4813 3.120264e-36 1.113711e-32
ENSG00000196431.3 4.767828 -1.79054718 192.6636 1.759004e-35 5.173259e-32
ENSG00000280109.1 5.872673 0.25250281 192.4883 1.863491e-35 5.173259e-32
ENSG00000180053.7 3.349894 -4.85962856 193.4720 2.948655e-35 7.367214e-32
summary(decideTests(qlf))
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
dims [product 99940] do not match the length of object [20]
Can anyone tell me what is this error. Did I go wrong anywhere?
Thank you
Code looks fine to me. Check if
decideTestsDGE
works, in which case it may be an S3 dispatch error. If that also fails, make a minimum working example that other people can run to reproduce the error.Hi Aaron,
Thanks for the reply. I used like following:
summary(decideTestsDGE(qlf))
[,1]
-1 949
0 22517
1 1519
It means 949 downregulated genes and 1519 upregulated genes b/w G2-G3 comparison. Am I right? And in topTags(qlf) Which are differentially expressed genes? On what basis should I consider them as differentially expressed?
And how to save that "qlf" into file?