Question: DeSEQ2- What is the difference between Gene Clustering and using LRT for Time Series experiment WITHOUT treatments
0
3 months ago by
kaiser.karim0 wrote:

Hi, I am trying to analyse a time series experiment of neurons differentiated from human stem cells, to understand the differentiation process. I have sampled my cells at the following time points post-differentiation

Day 0, 6h, 12h, 24h, 36h, Day 2, Day 3, Day 4, Day 14 and Day 21. The reason for the gap between Day 4 and subsequent time points is because fate commitment happens by day 4 and several functional events occur at around Day 14 and Day 21. I have performed RNAseq and ATACseq for each timepoint. Firstly, with RNAseq, I am trying to figure out what is the right way to identify gene clusters across the time timepoints.

1) Should I use the LRT with a reduced design as described by Michael Love et al in https://bioconductor.org/packages/3.7/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments In which chase, I did the following:

timepoint <- factor(c(rep("D00H00", 3), rep("D00H06", 3), rep("D00H12", 3), rep("D01H00", 3), rep("D01H12", 3), rep("D02H00", 3), rep("D03H00", 3), rep("D04H00", 3), rep("D14H00", 3), rep("D21H00", 3)))
ddsMat<-DESeqDataSetFromMatrix(countData=RNAseq_genecounts_matrix, colData=coldata, design=~timepoint)
ddsMat <- ddsMat[ rowSums(counts(ddsMat)) > 1, ]
ddsLRT <- DESeq(ddsMat, test = "LRT", reduced = ~1)
resLRT <- results(ddsLRT)
betas <- coef(ddsLRT)
mat <- betas[topGenes, -1]
thr <- 3
mat[mat < -thr] <- -thr
mat[mat > thr] <- thr
pheatmap(mat, breaks=seq(from=-thr, to=thr, length=101),
cluster_col=FALSE)


2) or should I do gene clustering using transformed values as described in the following: http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#gene-clustering

rld<- rlog(ddsMat, blind= FALSE)
topVarGenes <- head(order(rowVars(assay(rld)), decreasing = TRUE), 1000)
mat  <- assay(rld)[ topVarGenes, ]
mat  <- mat - rowMeans(mat)
pdf ("plots/GeneCluster_50.pdf")
pheatmap(mat, clustering_distance_cols=sdist, clustering_distance_rows=sdist)


Ultimately, it would be great if you could explain the differences between the two approaches. I have looked at several other related posts on this matter, but can't seem to understand this difference. I know this a big ask, so I greatly appreciate any help you can offer!

modified 3 months ago by Michael Love24k • written 3 months ago by kaiser.karim0
Answer: DeSEQ2- What is the difference between Gene Clustering and using LRT for Time Se
1
3 months ago by
Michael Love24k
United States
Michael Love24k wrote:

The first one focuses on statistical significance while the second uses the variance stabilized data alone and so includes a different set of genes. There’s not really a “correct” choice.