I'm using deseq2 to find differentially expressed genes across 7 different cancer types. My goal is to obtain a list of genes differentially expressed when comparing each cancer to all others. For instance, comparing BRCA to all others. My question is about setting the "listValues" argument in the results() function. I know the default is set to c(1,-1), but since I want to compare 1 group to 6 groups for each contrast, should I change that to c(1,-1/6)? Example code given below. Also, is this the right way of running deseq for this kind of analysis or should I stick to pairwise comparisons (e.g. BRCA vs. COAD only)?
tissue <- c(rep("BRCA",1222), rep("COAD",521), rep("LIHC",424), rep("LUAD",594), rep("LUSC",551), rep("0V",379), rep("PAAD",182))
sampleTable <- data.frame(sampleName = sampleFiles,
fileName = sampleFiles,
tissue = tissue)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
design= ~ tissue)
ddsHTSeq <- ddsHTSeq[ rowSums(counts(ddsHTSeq)) > 1, ]
register(MulticoreParam(20))
dds <- DESeq(ddsHTSeq, parallel=TRUE)
res <- results(dds, parallel=TRUE)
results(dds, contrast = list("BRCA", c("LUAD","LUSC", "LIHC", "OV", "COAD", "PAAD")),
listValues=c(1, -1))
Thanks exactly what i was looking for!
Dear Michael,
With reference to this post and the one at: A: lfcshrink in DESeq2 with grouping variable
I want to understand if what I did is correct.
Here is the condition_df:
disease_state condition
HC1_BND Healthy BND
HC1_MN Healthy MN
HC2_BND Healthy BND
HC2_MN Healthy MN
HC3_BND Healthy BND
HC3_MN Healthy MN
HC4_BND Healthy BND
HC4_MN Healthy MN
S1_BND Diseased BND
S1_MN Diseased MN
S2_BND Diseased BND
S2_MN Diseased MN
S3_BND Diseased BND
S3_MN Diseased MN
S4_BND Diseased BND
S4_MN Diseased MN
Now, my goal is to find genes that are different for "Diseased and BND" category as opposed to all others. To accomplish that, I did the following:
dds <- DESeqDataSetFromMatrix(countData = htseq_counts[rownames(condition_df)],
colData = condition_df,
design = ~ disease_state + condition + disease_state:condition)
dds <- DESeq(dds)
Running the following command gives me the bulleted list below
resultsNames(dds)
Then I did this:
result_A <- results(dds, contrast=list( c("disease_state_Healthy_vs_Diseased","condition_MN_vs_BND") ))
Running "
results
" function gave me a few (30) genes that clustered well for Diseased and BND option along with other(50) genes that clustered for other categories.Now my questions for you are:
resultsNames
are generated and why isn't there a combination such as:disease_stateDiseased.conditionBND
- as this is what I would be interested in; or how could I get that?Is there a better way to just get the genes that are differentially expressed in the Diseased_BND category as opposed to any other?
Thanks much,
Ashu
Would you mind posting this as a new question, it helps to organize the conversation thread.
Hi Michael,
I just saw this, Thanks a lot for your reply, Here is the link to my new post:
Need help with DESeq2's contrast feature
Thanks
Ashu