Question: DESeq2 Gene Expression Set up: Behavior vs Age
wallace413 wrote, 22 days ago:

Hello All,

I'm something of a neophyte to DESeq2 and want to be sure I'm setting up my analysis appropriately given my sample groups.

First, some background: I have a gene expression dataset that captures five distinct age stages (1 through 5, 3 replicates apiece) as well as 2 clear behavioral states from the earliest and latest age stages (i.e. State A and B from stage 1, 4 replicates apiece; State A and B from stage 5, 5 replicates apiece). I can thus address questions regarding aging overall as well as the effects of age on behavioral state.

Thus far, I've been splitting this total sample set into separate GLM runs to approach each question independently (i.e. Age: stages 1-5 together in run 1; Behavior: behavioral states A and B from stages 1 and 5 together in run 2). I am wondering whether this is acceptable, or whether it would be more statistically appropriate to combine all samples, from both age and behavior, in a single GLM, assigning a compound condition (age plus behavioral state) to each sample and extracting results from this larger setup.

What follows is an example setup for my 'single-question' GLM addressing age:

library(DESeq2)

# Read the count matrix (genes in rows, samples in columns)
total_counts <- read.table(file = "TimeCourseIndividualsOnly.txt", header = TRUE, row.names = 1)

# Sample table: three replicates for each of the five time points, in column order
expt_design <- data.frame(rows = colnames(total_counts),
                          condition = factor(rep(paste0("Time", 1:5), each = 3)))

dds <- DESeqDataSetFromMatrix(countData = total_counts, colData = expt_design, design = ~ condition)

# DESeq() already runs estimateSizeFactors(), estimateDispersions() and nbinomWaldTest(),
# so those steps do not need to be called again afterwards
dds <- DESeq(dds)

colData(dds)

# Default results table: last factor level vs. the reference level (Time5 vs Time1)
res <- results(dds)

# Example result: Time1 vs Time2
Time1vsTime2 <- results(dds, contrast = c("condition", "Time1", "Time2"))
Time1vsTime2 <- as.data.frame(Time1vsTime2)

write.csv(Time1vsTime2, "T01_Time1vsTime2_DESeq2test_09202018.csv", row.names = TRUE)

Thanks very kindly in advance!

modified 21 days ago by Michael Love • written 22 days ago by wallace413
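The "compound condition" setup asked about in the question could be sketched roughly as follows. This is only an illustration in base R, not code from the thread: the sample order, the `rest` label for the 15 time-course samples, and the names `age`, `state` and `group` are all assumptions. It builds one label per sample by pasting age and behavioral state together and checks that a `~ group` model matrix is estimable.

```r
# Hypothetical annotation for all 33 samples; the real column order of the
# count matrix is not given in the thread, so this ordering is an assumption.
age <- c(rep(paste0("Time", 1:5), each = 3),            # 15 time-course samples
         rep(c("Time1", "Time5"), times = c(8, 10)))    # 18 behavior samples
state <- c(rep("rest", 15),                             # "rest" = neither A nor B (assumed label)
           rep(c("A", "B"), each = 4),                  # Time1: 4 A, 4 B
           rep(c("A", "B"), each = 5))                  # Time5: 5 A, 5 B

# One compound level per observed age/state combination
group <- factor(paste(age, state, sep = "_"))
print(table(group))

# Model matrix for a design of the form ~ group
mm <- model.matrix(~ group)
cat("columns:", ncol(mm), " rank:", qr(mm)$rank, "\n")
```

With DESeq2, a formula like `~ group` would be passed as the `design` argument, and pairs of levels could then be compared with `results(dds, contrast = c("group", "Time5_A", "Time1_A"))`, using the hypothetical level names produced above.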
Michael Love (United States) wrote, 21 days ago:

Are these two groups of samples (the 15 and the 18) separate batches? Were the libraries prepared separately?

You didn’t mention if the 15 samples are A or B or both. Which are they?

written 21 days ago by Michael Love

Hey Michael,

Thank you very much for your attention and questions! I hope the following will address your questions and better clarify my sample set.

The libraries for all 33 samples were prepared in one go (we sent away for the work). The fifteen samples collected for age were gathered during a steady state at timepoints 1, 2, 3, 4, and 5, so they are neither A nor B. The eighteen behavior-associated samples reflect the gene expression of individuals performing either behavior A or B at either timepoint 1 (N = 4 for each behavior) or timepoint 5 (N = 5 for each behavior). As such, individuals that are A or B are also either 1 or 5.

Given this, I believe expression data for individuals collected at timepoints 1 and 5 could be used to address the question of gradual aging, but could also be used to compare against our two conspicuous behavioral states.

Again, hoping this helps, and do let me know if I can provide further info. Thank you again for your time!

written 21 days ago by wallace413

I’m not sure yet how the extra 1 and 5 samples help because they are A and B while the other samples are some other category (neither A nor B) and so it’s not so easy to lump them in. Presumably there is some difference between A, B and neither A nor B?

written 21 days ago by Michael Love

Ah, I see. To be more explicit, we're looking at nestmates of a social insect. We have samples that capture brain gene expression of young females who are just resting in the nest (timepoint 1) as well as young foragers (Young A) and young nest guards (Young B). We've also collected old females (timepoint 5), as well as old foragers (Old A) and old nest guards (Old B). I'm interested in exploring the ways in which age may affect gene expression underlying each behavioral state (i.e. foraging and guarding), and figured part of that process would involve a comparison to age-matched individuals that were not engaged in either task.

written 21 days ago by wallace413

I think it's easiest to analyze the datasets separately here, as you don't want to assume that the difference between resting, foraging and guarding is the same at time 1 and 5, and then you have three missing time points for the resting. It will make the analysis more straightforward, and you have plenty of degrees of freedom to estimate the dispersion (sometimes it is recommended to put all samples together to aid in estimation of dispersion, but here you have many samples).

written 21 days ago by Michael Love
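To illustrate why the separate analyses are more straightforward, here is a minimal base-R sketch under assumed sample labels (the `rest` label, the sample ordering and all variable names are hypothetical). It compares a full age-by-state interaction design over all 33 samples with the same formula applied to the 18 behavior samples only. Because states A and B were not collected at timepoints 2 to 4, the combined interaction design has more coefficients than observed age/state combinations, whereas the behavior-only design is a complete 2 x 2 layout.

```r
# Hypothetical sample annotation (ordering and labels are assumptions)
age <- factor(c(rep(paste0("Time", 1:5), each = 3),
                rep(c("Time1", "Time5"), times = c(8, 10))))
state <- factor(c(rep("rest", 15),
                  rep(c("A", "B"), each = 4),
                  rep(c("A", "B"), each = 5)),
                levels = c("rest", "A", "B"))
samples <- data.frame(age, state)

# All 33 samples, full interaction: many age/state cells are empty
mm_all <- model.matrix(~ age + state + age:state, data = samples)
rank_all <- qr(mm_all)$rank

# Behavior samples only (A and B at Time1 and Time5): a complete 2 x 2 layout
beh <- droplevels(samples[samples$state != "rest", ])
mm_beh <- model.matrix(~ age + state + age:state, data = beh)
rank_beh <- qr(mm_beh)$rank

cat("combined: ", ncol(mm_all), "coefficients, rank", rank_all, "\n")
cat("behavior: ", ncol(mm_beh), "coefficients, rank", rank_beh, "\n")
```

A design whose rank is lower than its number of coefficients cannot be fit as specified, and DESeq2 reports an error for model matrices that are not full rank. In the behavior-only run, a design of the form `~ age + state + age:state` lets the interaction term capture whether the A versus B difference changes between timepoints 1 and 5.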

Thank you very much, Michael!

written 20 days ago by wallace413