Question

I need help with EdgeR!


marta.lagana • 0

@d6f6ebdc

Last seen 16 months ago

France

Hello!

Please, I need help. I do not understand what is going on with my code. Let me explain: I have RNA-seq results organized as gene counts.

My experimental design has 18 samples in total, covering 6 conditions. The samples must be compared within 3 different groups, but not between groups. I don't care about the first 6 samples and focus only on the remaining 12.

So in these remaining samples (columns 8 to 13 and columns 14 to 19) I need to compare the samples in columns 8 to 13 among themselves, and the samples in columns 14 to 19 among themselves, but no comparison between group 8to13 and group 14to19 should be done.

What I do does not work and I don't understand why.

Here is my code, simplified...

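# Load the packages that provide the functions used below
library(readxl)   # read_xlsx()
library(edgeR)    # DGEList(), filterByExpr(), glmQLFit(), ...
library(writexl)  # write_xlsx()

# Load the gene count file
RNAseq <- read_xlsx("path/gene_count.xlsx")
head(RNAseq)

#### GROUP 8-13: samples 8, 9 and 10 are triplicates; samples 11, 12 and 13 are triplicates ####
group <- factor(c(1,1,1,2,2,2))

# Build the DGEList from count columns 8 to 13; column 2 is passed as gene annotation
D <- DGEList(counts = RNAseq[,8:13], genes = RNAseq[2], group = group)
# Keep genes with enough counts, then recompute the library sizes
keep <- filterByExpr(D)
D <- D[keep, , keep.lib.sizes = FALSE]
# Record the library sizes, labelled by sample
lib_sizes <- as.matrix(D$samples$lib.size)
rownames(lib_sizes) <- rownames(D$samples)
# TMM normalization
D <- calcNormFactors(D, method = "TMM")
# Design with an intercept: the second coefficient is group 2 versus group 1
design <- model.matrix(~group)
DGM <- estimateDisp(D, design)
fit1 <- glmQLFit(DGM, design)
# By default glmQLFTest() tests the last coefficient of the design
qlf1 <- glmQLFTest(fit1)
# Export the full results table
outputEDGER <- topTags(qlf1, n = Inf)
MYRESULT <- as.data.frame(outputEDGER)
write_xlsx(MYRESULT, "mypath/results.xlsx")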

Please help a molecular biologist understand! I am going crazy: the output I obtain shows no differences by p-value, and I know that is wrong because I already have analysed results for these data that do show differences, and I need to obtain the same results.

I am at a loss, I do not understand 🙁 I hope someone can help.

Have a nice day!


edgeR • 596 views

ADD COMMENT • link updated 16 months ago by ATpoint ★ 4.0k • written 16 months ago by marta.lagana • 0


For the future, I recommend version-controlling your code on GitHub so that older analysis results can be reproduced from the code you used. It is also recommended to track software versions, at minimum by saving the sessionInfo() of the analysis run, better by saving a lockfile produced by software such as renv, or better still by running things inside a container such as Docker. After all, your "old" analysis might be wrong and your current results might be correct; who knows, without a reproducible analysis document at hand. The same goes for wet-lab experiments not documented in the lab book: it's bad practice.

ADD REPLY • link 16 months ago ATpoint ★ 4.0k
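As a minimal sketch of the "at minimum" option mentioned in the comment above, the session information can be written to a text file at the end of an analysis script. The file name `sessionInfo.txt` and the use of a temporary directory are assumptions for illustration only; in practice the file would sit next to the results.

```r
# Record the R version and the versions of all loaded packages.
# "sessionInfo.txt" is a made-up file name; tempdir() is used only so the
# sketch runs anywhere without touching a real project folder.
out_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(capture.output(sessionInfo()), out_file)

# With the renv package, a lockfile would instead be written by calling
# renv::snapshot() inside an renv-managed project.
```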

score 0 · Answer 1 · 2022-12-18

edgeR analyses all your data at once. It is not recommended to extract small subsets of your data (six samples at a time) and analyse them separately; doing so will reduce your ability to detect differential expression.
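One way to read this advice, sketched under assumptions: fit a single model to the 12 samples of interest and test only the within-set comparisons through contrasts. The condition labels `A1`, `A2`, `B1`, `B2` and the simulated counts are hypothetical stand-ins, not the poster's data, and the edgeR part runs only if the package is installed. The remaining samples could be added as further levels of the same factor.

```r
# Hypothetical layout: 12 samples, four conditions in triplicate.
# A1/A2 stand for the first set of six samples, B1/B2 for the second set.
group <- factor(rep(c("A1", "A2", "B1", "B2"), each = 3))

# One coefficient per condition (no intercept), so contrasts are easy to write.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrast vectors: compare only within each set, never across sets.
contrasts <- cbind(
  A2_vs_A1 = c(A1 = -1, A2 = 1, B1 = 0, B2 = 0),
  B2_vs_B1 = c(A1 = 0, A2 = 0, B1 = -1, B2 = 1)
)

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  # Simulated toy counts, standing in for the real count matrix.
  set.seed(1)
  counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 5), ncol = 12)
  rownames(counts) <- paste0("gene", seq_len(1000))

  y <- DGEList(counts = counts, group = group)
  keep <- filterByExpr(y, design)
  y <- y[keep, , keep.lib.sizes = FALSE]
  y <- calcNormFactors(y)
  y <- estimateDisp(y, design)
  fit <- glmQLFit(y, design)

  # One test per within-set comparison, both using the same fitted model.
  res_A <- topTags(glmQLFTest(fit, contrast = contrasts[, "A2_vs_A1"]), n = Inf)
  res_B <- topTags(glmQLFTest(fit, contrast = contrasts[, "B2_vs_B1"]), n = Inf)
}
```

Because the dispersion is estimated from all 12 samples rather than six at a time, each test has more residual degrees of freedom, which is the gain in power the answer refers to.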

Other than that, it's not possible to troubleshoot any possible problems from the information that you give. The analysis looks correct in principle if the data is correct. If your data has already been analysed by something else, and you have results from them, can you not check what analysis was done the first time? Does the analysis need to be redone at all?
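Since the analysis depends on the data being correct, a few basic checks on the imported count columns can be run before building the DGEList. This is only a sketch on a toy table with made-up column names; it does not diagnose the problem in the question, it just shows the kind of check that rules out an import issue.

```r
# Toy stand-in for the imported table; the column names are made up.
counts_df <- data.frame(
  gene = c("g1", "g2", "g3"),
  s1 = c(10, 0, 25),
  s2 = c(12, 3, 30)
)
count_cols <- c("s1", "s2")  # in the question these would be columns 8:13

# Each count column should have been imported as a number, not as text.
is_num <- vapply(counts_df[count_cols], is.numeric, logical(1))
m <- as.matrix(counts_df[count_cols])

checks <- c(
  all_numeric   = all(is_num),
  no_missing    = !anyNA(m),
  non_negative  = is.numeric(m) && all(m >= 0, na.rm = TRUE),
  whole_numbers = is.numeric(m) && all(m == round(m), na.rm = TRUE)
)
print(checks)
```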