Question: DESeq2: What is the difference between gene clustering and using the LRT for a time series experiment WITHOUT treatments?
6 weeks ago, kaiser.karim wrote:

Hi, I am trying to analyse a time series experiment of neurons differentiated from human stem cells, to understand the differentiation process. I have sampled my cells at the following time points post-differentiation:

Day 0, 6h, 12h, 24h, 36h, Day 2, Day 3, Day 4, Day 14 and Day 21. The gap between Day 4 and the later time points is there because fate commitment happens by Day 4, and several functional events occur at around Day 14 and Day 21. I have performed RNA-seq and ATAC-seq for each time point. First, with the RNA-seq data, I am trying to figure out the right way to identify gene clusters across the time points.

1) Should I use the LRT with a reduced design, as described by Michael Love et al. in https://bioconductor.org/packages/3.7/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments ? In that case, I did the following:

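library(DESeq2)
library(pheatmap)
# Three replicates per time point, from Day 0 to Day 21
timepoint <- factor(rep(c("D00H00", "D00H06", "D00H12", "D01H00", "D01H12",
                          "D02H00", "D03H00", "D04H00", "D14H00", "D21H00"), each = 3))
# coldata is not shown in this post; it must contain a 'timepoint' column
ddsMat <- DESeqDataSetFromMatrix(countData = RNAseq_genecounts_matrix, colData = coldata, design = ~ timepoint)
# Drop genes with at most one read in total
ddsMat <- ddsMat[rowSums(counts(ddsMat)) > 1, ]
# LRT of the full model (~ timepoint) against an intercept-only model
ddsLRT <- DESeq(ddsMat, test = "LRT", reduced = ~ 1)
resLRT <- results(ddsLRT)
betas <- coef(ddsLRT)
# The 1000 genes with the smallest adjusted p-values
topGenes <- head(order(resLRT$padj), 1000)
mat <- betas[topGenes, -1]  # drop the intercept column
# Clamp the coefficients to [-thr, thr] so outliers do not dominate the colour scale
thr <- 3
mat[mat < -thr] <- -thr
mat[mat > thr] <- thr
pheatmap(mat, breaks = seq(from = -thr, to = thr, length.out = 101),
         cluster_cols = FALSE)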

2) Or should I do gene clustering using transformed values, as described here: http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#gene-clustering

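library(genefilter)  # provides rowVars()
# rlog-transform the counts, using the design when estimating dispersions
rld <- rlog(ddsMat, blind = FALSE)
# The 1000 genes with the highest variance across samples
topVarGenes <- head(order(rowVars(assay(rld)), decreasing = TRUE), 1000)
mat <- assay(rld)[topVarGenes, ]
mat <- mat - rowMeans(mat)  # centre each gene on its mean
pdf("plots/GeneCluster_50.pdf")
# sdist is not defined in this post
pheatmap(mat, clustering_distance_cols = sdist, clustering_distance_rows = sdist)
dev.off()  # close the PDF device so the file is written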

Ultimately, it would be great if you could explain the differences between the two approaches. I have looked at several other related posts on this matter, but can't seem to understand the difference. I know this is a big ask, so I greatly appreciate any help you can offer!

Answer:
6 weeks ago, Michael Love (United States) wrote:

The first one focuses on statistical significance while the second uses the variance stabilized data alone and so includes a different set of genes. There’s not really a “correct” choice.
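
To see why the two gene lists can differ, here is a minimal sketch on simulated data. It is not the DESeq2 pipeline: it assumes log-scale expression values and swaps in an ordinary F-test (comparing `~ timepoint` against `~ 1`) as a stand-in for the negative binomial LRT. The number of genes, the effect sizes and all names (`expr`, `top_sig`, `top_var`) are made up for illustration.

```r
set.seed(42)

# Hypothetical design: 4 time points with 3 replicates each
timepoint <- factor(rep(c("T0", "T1", "T2", "T3"), each = 3))
n_genes   <- 200
n_samples <- length(timepoint)

# Baseline: every gene has modest noise and no time effect
noise_sd <- rep(0.2, n_genes)
trend    <- matrix(0, nrow = n_genes, ncol = n_samples)

# Genes 1-20: small but highly reproducible shift at each time point
trend[1:20, ] <- rep(0.3 * (as.integer(timepoint) - 1), each = 20)
noise_sd[1:20] <- 0.02

# Genes 21-40: no time effect, but very noisy replicates
noise_sd[21:40] <- 2

# Simulated log-scale expression (rows = genes, columns = samples)
expr <- trend + matrix(rnorm(n_genes * n_samples), nrow = n_genes) * noise_sd

# Ranking 1: significance of the time effect (F-test of ~ timepoint vs ~ 1)
p_time  <- apply(expr, 1, function(y) anova(lm(y ~ timepoint))[["Pr(>F)"]][1])
top_sig <- order(p_time)[1:20]

# Ranking 2: overall variance across samples, ignoring the design
top_var <- order(apply(expr, 1, var), decreasing = TRUE)[1:20]

# Number of genes shared by the two top-20 lists
length(intersect(top_sig, top_var))
```

In this toy setup, genes with a small but reproducible time effect should rank highest by significance, while noisy genes without any time effect should rank highest by variance, so heatmaps built from the two lists would show different genes.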


As simple as that! Thanks Michael!

written 6 weeks ago by kaiser.karim