Hello,
I am performing an RNAseq analysis using the limma package.
From the same cell culture 3 samples(lane) were extracted, these samples were analysed at 4 different time points(Time).
No treatment was applied to any of the samples.
The goal of the project is to compare the 4 time points to each other, initially with a DEG.
The result of my analysis produces far too many differentially expressed genes with very low adj_p_values (even more than 100 genes with an adj_p_value < 0.0001).
I am also having difficulty understanding how I should consider the replicates, whether as technical or biological replicates.
Am I doing something wrong in setting up my model?
This is an example of my data:
Samples | Lane | Time_Point |
---|---|---|
S1 | A | T1 |
S2 | B | T1 |
S3 | C | T1 |
S4 | A | T2 |
S5 | B | T2 |
S6 | C | T2 |
S7 | A | T3 |
S8 | B | T3 |
S9 | C | T3 |
S10 | A | T4 |
S11 | B | T4 |
S12 | C | T4 |
This is my code:
dge = DGEList(counts = exp.f, genes = rownames(exp.f))
### Preprocessing
dge$counts <- edgeR::cpm(dge, log = F)
dge = calcNormFactors(dge, method = "TMM")
### Define model
f = as.factor(dataset$Time)
l = as.factor(dataset$Lane)
design = model.matrix(~0+f)
colnames(design) <- c(levels(f))
v = voomWithQualityWeights(dge, design = design, plot = TRUE)
corfit <- duplicateCorrelation(v, design, block = l, ndups = 1)
v = voomWithQualityWeights(dge, design = design, plot = TRUE, correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(v, design, block = l, ndups = 1)
### Fit model
fit = lmFit(v, design, block = l, correlation = corfit$consensus.correlation)
contrast.matrix = makeContrasts(T4-T3,T3-T2,T2-T1,T4-T1, levels = design)
fit2 = contrasts.fit(fit, contrast.matrix)
fit2 = eBayes(fit2)
Thanks,
Alessandro
Hi gordon,
Sorry for only replying now.
Thank you very much for your reply, I have tried to do as you suggested.
If I can give you more information about what you are not clear about the experiment I can rewrite it to better understand if my method is correct.
Thank you very much.