Question

In a DE experiment with multiple contrasts, is it possible to merge only the relevant count columns back into the results, rather than all count columns (the default)?


JP Carter ▴ 40

@jp-carter-15371


Nashville, TN

Greetings,

Our setup consists of 4 comparisons (using the contrast argument). Our counts table contains 20 samples (columns), which we normalized as a whole. We loop through the various contrasts to build graphs, results, and so forth.

When preparing DESeq2 results, we typically merge the counts data back into the DE results, something like:

resdata <- merge(as.data.frame(results), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)

For any particular contrast, we are only interested in the samples being contrasted/tested, so we don't need all 20 count columns attached to the results. The snippet above merges all 20 samples onto a results table that effectively tested 3 vs 3 samples (6 in total, not 20).

Is there a way, via DESeq2 or otherwise, to create a new data frame containing only the samples tested in the contrast that the results belong to? That output is easier for the end user to interpret, and it would also let us filter out low/bad count data with the rowSums approach on just those samples (which we don't want to do across all 20 samples as a whole).

Cheers,

JP

deseq2 • 634 views


score 0 · Answer 1 · 2018-03-29

I think I've come up with a generic (R) solution. Below, assume that the approach is:

results(dds, contrast=c("condition", condition[1], condition[2]))

Notes:

condition[1] and condition[2] are assigned in a 'for' loop
the column search, via grep, depends entirely on how you've named your conditions and samples
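For context, the 'for' loop mentioned in the notes might look roughly like the sketch below. The list name `contrast_pairs` and the condition labels are made up for illustration, and the `results()` call is left as a comment because no `dds` object is built here.

```r
# Hypothetical condition pairs; in practice these come from your own design.
contrast_pairs <- list(c("drugA", "ctrl"), c("drugB", "ctrl"))

tested <- character(0)
for (condition in contrast_pairs) {
  # With a fitted DESeqDataSet `dds` (not created here), the call would be:
  # res <- results(dds, contrast = c("condition", condition[1], condition[2]))
  tested <- c(tested, paste(condition[1], "vs", condition[2]))
}
print(tested)
```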

# Extract the count columns relevant to the two contrast conditions
dds_counts <- counts(dds) # raw counts via the accessor; use counts(dds, normalized=TRUE) for normalized values
results_columns <- as.data.frame(colData(dds)$condition) # condition of each sample; values match condition[1] and condition[2] (not used below)

search1 <- paste0(condition[1], " (") # dds_counts colnames look like 'condition (sample #)'
dds_counts_reduced_1 <- dds_counts[, grep(search1, colnames(dds_counts), fixed=TRUE), drop=FALSE]
search2 <- paste0(condition[2], " (") # same naming pattern for the second condition
dds_counts_reduced_2 <- dds_counts[, grep(search2, colnames(dds_counts), fixed=TRUE), drop=FALSE]

# Merging on row names adds a 'Row.names' column holding the gene identifiers
dds_counts_reduced_merged <- merge(dds_counts_reduced_1, dds_counts_reduced_2, by="row.names")
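
A possible variation, offered only as a sketch: rather than grepping column names, select columns by each sample's condition label, which avoids depending on the naming scheme. The example below uses toy base-R stand-ins so it runs without DESeq2. The helper `merge_contrast_counts`, its `min_row_sum` argument, and all data are hypothetical; with a real object, the inputs would presumably be `counts(dds, normalized=TRUE)`, `colData(dds)$condition`, and `as.data.frame()` of the results for that contrast.

```r
# Toy stand-ins for counts(dds, normalized=TRUE) and colData(dds)$condition.
# All names and values are illustrative, not from a real experiment.
set.seed(1)
sample_condition <- rep(c("ctrl", "drugA", "drugB"), each = 3)
norm_counts <- matrix(rpois(5 * 9, lambda = 20), nrow = 5,
                      dimnames = list(paste0("gene", 1:5),
                                      paste0(sample_condition, " (", 1:9, ")")))
# Toy stand-in for as.data.frame(results(dds, contrast = ...))
res_df <- data.frame(log2FoldChange = rnorm(5), padj = runif(5),
                     row.names = rownames(norm_counts))

# Hypothetical helper: keep only samples from the two contrasted conditions,
# optionally drop genes with a low total count, then merge onto the results.
merge_contrast_counts <- function(res_df, counts_mat, sample_condition,
                                  pair, min_row_sum = 0) {
  keep_cols <- sample_condition %in% pair
  sub_counts <- counts_mat[, keep_cols, drop = FALSE]
  sub_counts <- sub_counts[rowSums(sub_counts) >= min_row_sum, , drop = FALSE]
  merged <- merge(res_df, as.data.frame(sub_counts),
                  by = "row.names", sort = FALSE)
  names(merged)[1] <- "gene" # merge() calls this column 'Row.names'
  merged
}

resdata <- merge_contrast_counts(res_df, norm_counts, sample_condition,
                                 pair = c("drugA", "ctrl"), min_row_sum = 10)
```

Because the rowSums filter is applied after subsetting, it only considers the 6 samples in the contrast rather than all 20. Genes removed by the filter also drop out of the merged table, since merge() keeps only row names present in both inputs.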