Question

Need help with creating and plotting temporal clustering analysis of DEGs


Emma • 0

@802f1654


United States

Hello Bioconductor community,

I have been struggling to work out how to approach my analysis. I have two data files: an Excel file with the sample metadata (time points and sample names, such as "stimulated vs. non-stimulated" controls) and a text file containing the gene expression data (RNA-seq). The analysis should be longitudinal: we have 25 participants, each with 8 samples (3 stimulated, 3 negative control, and 2 positive control). The exposure study lasts 50 days, and a subject who develops a persistent inhibitor then needs to undergo ITI treatment for 24 weeks.

Our current approach to the DEG analysis uses DESeq2, with contrasts for both stimulated vs. negative control and stimulated vs. positive control. The subjects fall into three groups: Group 1 (developed inhibitor + ITI extension), Group 2 (developed inhibitor), and Group 3 (no inhibitor). The difference between Group 1 and Group 2 is that Group 1 underwent 24 weeks of ITI treatment to eradicate the inhibitor, while the inhibitor in Group 2 was eradicated before the end.

We came across an article that shows temporal clustering of DEGs, but their data look different from ours because they do not have the two conditions. I believe this is the article:

Karim, A. F., Soltis, A. R., Sukumar, G., Königs, C., Ewing, N. P., Dalgard, C. L., Wilkerson, M. D., & Pratt, K. P. (2020). Hemophilia A Inhibitor Subjects Show Unique PBMC Gene Expression Profiles That Include Up-Regulated Innate Immune Modulators. Frontiers in immunology, 11, 1219. https://doi.org/10.3389/fimmu.2020.01219

I am now stuck trying to produce a graph for my own data similar to their Figure 2, because our output looks messy at each time point.

Example of our findings:

Subject 1 (F8 vs. Negative control DEGs):

Baseline: 233
Exposure Day 1: 1165
Exposure Day 5: 817
Exposure Day 10: 72
Exposure Day 40: 787
Exposure Day 50: 515

I hope I have explained my issue clearly. I am open to any suggestions or tutorials that could help us organize our output.

Thank you!

library(DESeq2)
# Sample table: one row per column of GeneMat, with its condition label
coldata <- data.frame(row.names = colnames(GeneMat), condition)
# Sample names must match between coldata and the count matrix; stops if they do not
stopifnot(all(rownames(coldata) == colnames(GeneMat)))
# DESeq2 requires integer counts, so round the values
integer_matrix <- round(GeneMat)
dds <- DESeqDataSetFromMatrix(countData = integer_matrix, colData = coldata, design = ~ condition)
# Likelihood ratio test against an intercept-only reduced model
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_LRT <- results(dds_lrt)

DESeq2 ctsGE • 168 views


score 1 · Answer 1 · 2024-03-18


Michael Love 41k

@mikelove


United States

To create something like their Fig 2 you would use vst followed by hierarchical clustering.
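As a rough illustration of "vst followed by hierarchical clustering", here is a minimal sketch. It runs on simulated counts from makeExampleDESeqDataSet rather than on the poster's data. The choices of the 200 most variable genes, a correlation-based distance, average linkage, and 4 clusters are assumptions made only for this example; in a real analysis the gene set would more likely be the DEGs from results().

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 2000 genes x 12 samples
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Variance-stabilizing transformation of the counts
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Assumption: keep the 200 most variable genes (swap in your DEG list here)
top_genes <- head(order(apply(mat, 1, var), decreasing = TRUE), 200)
mat_top <- mat[top_genes, ]

# Center and scale each gene so clustering reflects shape, not magnitude
z <- t(scale(t(mat_top)))

# Hierarchical clustering of genes using 1 - Pearson correlation as distance
gene_dist <- as.dist(1 - cor(t(z)))
hc <- hclust(gene_dist, method = "average")

# Cut the tree into an assumed number of clusters
clusters <- cutree(hc, k = 4)
table(clusters)
```

The vector clusters assigns each selected gene to a cluster, which can then be used to group genes for plotting.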

To make the plot you would need to write custom code.
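One possible shape for such plotting code is sketched below in base R graphics. Everything in it is a hypothetical stand-in: the matrix expr is simulated (rows are genes, columns are the time points listed in the question), and the two trends, the number of clusters, and the panel layout are assumptions for illustration, not a reconstruction of the published figure.

```r
set.seed(42)
time_points <- c(0, 1, 5, 10, 40, 50)  # baseline and exposure days from the question
n_genes <- 60

# Hypothetical data: half the genes rise over time, half fall, plus noise.
# With real data, each column could hold the mean transformed expression
# of the samples at that time point.
trend <- scale(seq_along(time_points))[, 1]
expr <- rbind(
  t(replicate(n_genes / 2,  trend + rnorm(length(time_points), sd = 0.3))),
  t(replicate(n_genes / 2, -trend + rnorm(length(time_points), sd = 0.3)))
)
rownames(expr) <- paste0("gene", seq_len(n_genes))
colnames(expr) <- paste0("day", time_points)

# Scale each gene, then cluster genes on 1 - Pearson correlation
z <- t(scale(t(expr)))
hc <- hclust(as.dist(1 - cor(t(z))), method = "average")
clusters <- cutree(hc, k = 2)

# One panel per cluster: individual genes in grey, cluster mean in black
op <- par(mfrow = c(1, 2))
for (k in sort(unique(clusters))) {
  zk <- z[clusters == k, , drop = FALSE]
  matplot(seq_along(time_points), t(zk), type = "l", lty = 1, col = "grey70",
          xaxt = "n", xlab = "Time point", ylab = "Scaled expression",
          main = sprintf("Cluster %d (n = %d)", k, nrow(zk)))
  axis(1, at = seq_along(time_points), labels = colnames(z))
  lines(seq_along(time_points), colMeans(zk), lwd = 2)
}
par(op)
```

Plotting the time points at equal spacing is a design choice here; using the actual day values on the x-axis would compress the early time points.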

I have also recently written a tutorial on plotting expression curves for multiple genes with tidyomics:

https://tidyomics.github.io/tidy-ranges-tutorial/rna-seq-eda.html
