Hello, everyone! I recently encountered an interesting problem that I asked the Bioinformatics Reddit about, which elicited a split opinion.
I have the following set of samples:
4x conditions: 0, 70, 90, and 100% stenosis
I have three biological samples for each condition. Within each specific biological sample, I separated the upstream region of a blood vessel and the downstream region of a blood vessel at the stenosis point into separate Eppendorf tubes to perform RNA-sequencing.
Question: If I am most interested in exploring the changes in genes between the upstream and downstream for each stenosis condition (e.g. 70% stenosis downstream vs. 70% stenosis upstream), what would my DeSeq2 design be?
Some supported the idea of subsetting my data into all stenosis conditions, and then running design(dds) <- ~ region. However, others believed that this followed an example in the vignette linked here, suggesting that I need to "nest" my specific samples together. This made a little more sense to me, because although my understanding of DESeq2 is ever evolving, it is likely that giving more data to work with builds more power in the system than if I split my data up.
Anyway, I tried this approach. I have this as my col data (I think I set it up correctly?):
> as.data.frame(coldata)
sample name occlusion region replicate replicate.n
0 occlusion downstream 1 0 downstream R1 1
0 occlusion downstream 2 0 downstream R2 2
0 occlusion downstream 3 0 downstream R3 3
70 occlusion downstream 1 70 downstream R1 1
70 occlusion downstream 2 70 downstream R2 2
70 occlusion downstream 3 70 downstream R3 3
90 occlusion downstream 1 90 downstream R1 1
90 occlusion downstream 2 90 downstream R2 2
90 occlusion downstream 3 90 downstream R3 3
100 occlusion downstream 1 100 downstream R1 1
100 occlusion downstream 2 100 downstream R2 2
100 occlusion downstream 3 100 downstream R3 3
0 occlusion upstream 1 0 upstream R1 1
0 occlusion upstream 2 0 upstream R2 2
0 occlusion upstream 3 0 upstream R3 3
70 occlusion upstream 1 70 upstream R1 1
70 occlusion upstream 2 70 upstream R2 2
70 occlusion upstream 3 70 upstream R3 3
90 occlusion upstream 1 90 upstream R1 1
90 occlusion upstream 2 90 upstream R2 2
90 occlusion upstream 3 90 upstream R3 3
100 occlusion upstream 1 100 upstream R1 1
100 occlusion upstream 2 100 upstream R2 2
100 occlusion upstream 3 100 upstream R3 3
And then I have the following as my design:
# Create DESeq2 dataset
dds <- DESeqDataSetFromMatrix(
countData = cts,
colData = coldata,
design = ~ occlusion + occlusion:vessel_n + occlusion:region
# DESeq2 suggests NO PRE-FILTERING, except keep <- rowSums(counts(dds) >=10) >=X; where X = smallest group size
keep <- rowSums(counts(dds) >=10) >=3
dds <- dds[keep, ]
# Run DESeq normalization
dds <- DESeq(dds)
But then I get this error, which states that:
83 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
I used a larger maxit and had no luck. Does anyone have any suggestions on optimizing my design setup? Is there a better approach between the two? I should also note that I am using lfcshrink() with "apeglm" and s-values if that makes any difference downstream. Many thanks for considering my request.

Hi Kevin,
Firstly, thank you so much for not only a very prompt response, but one that is incredibly insightful and helps me learn DESeq2 more. I appreciate it!
After being more aggressive with
maxit, I was still in a state of convergence. I also simplified the design, and the convergence was gone! So then I went to try the LRT instead of Wald, was able to successfully run LRT, and am now onto designing the contrasts for all of my occlusion levels. Also, quality control looks good. The dispersions look great based on what is expected with my pre-filtering step. I did have a follow-up here, though, that I wanted to clear up with you if you don't mind.I read this part of the vignette, and am starting to learn more about contrasts. I understand the comparison you are trying to make via:
However, I am getting an error saying that the ladder part of my contrast request is not existent. I ran:
After doing so, I noticed that I do not have any upstream coefficients. I am assuming this is a problem and I am assuming I cannot get to
lfcshrink90because "occlusion70.region_upstream" is not existent. I am assuming this is an issue with this line:# Run LRT, reduced = model without interaction dds <- DESeq(dds, test="LRT", reduced=~ occlusion + occlusion:replicate, parallel=TRUE)For the life of me, I cannot figure it out. I am going to try again in the morning when I am well rested, but wanted to see if it was something simple I was missing.
Again, this has been great assistance. I really appreciate this so much!