Hi,
This question is about RNAseq analysis in EdgeR
to identify common and specific differentially expressed genes. I have 3 different Donor's in-vitro cultured tissue with two different infection status, one with infected virus (high dose 6hr) and another with un-infected (baseline 0hr). This was sequenced using RNAseq, then aligned and quantified. I now have a gene counts file ready to import into EdgeR RNAseq analysis pipeline. Using this data, I am interested in what are the common vs specific DEGs in response to virus per donor?
- Identifying common DEGs in response to virus between infected vs. un-infected samples from 3 different donors?
- Following, this perform donor specific analysis or response to virus. To pull out specific response per donor?
Sample info
#> Samples Donor Time Status
#> 1 S1 D1 0hr Uninfected
#> 2 S2 D1 6hr Infected
#> 3 S3 D2 0hr Uninfected
#> 4 S4 D2 6hr Infected
#> 5 S5 D3 0hr Uninfected
#> 6 S6 D3 6hr Infected
1. Identifying common DEGs in response to virus between infected vs. un-infected samples from 3 different donors?
library(edgeR)
group.Status <- factor(Sample_info$Status)
y <- DGEList(counts = gene_counts, group = group.Status, remove.zeros = TRUE)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y, method = "TMM")
cpm_with_log2 = cpm(y, prior.count=2, log=TRUE)
## Create design file
Donor_ID <- factor(Sample_info$Donor)
Status <- factor(Sample_info$Status, levels=c("Uninfected", "Infected"))
design <- model.matrix(~Donor_ID+Status)
## Dispersion estimation
y <- estimateDisp(y,design, robust=TRUE)
# Fit the model
fit <- glmQLFit(y,design, robust = TRUE)
## To detect genes that are differentially expressed in Infected vs Uninfected:
qlf.Infected_vs_Uninfected <- glmQLFTest(fit, coef=4)
topTags(qlf.Infected_vs_Uninfected, n=10, adjust.method = "BH", sort.by = "PValue", p.value = 1)
2. Following, this perform donor specific analysis or response to virus. To pull out specific response per donor?
To perform donor specific analysis, is there a way to extract DEGs specific to donor from the above glmQLFTest
?
OR, probably, just change the design formula to: design <- model.matrix(~0 + Donor_ID + Status + Donor_ID:Status)
Best Regards,
Toufiq
Gordon Smyth thank you very much for the inputs and suggestions. This approach seems to be very useful and address the questions that we are looking at the moment.