Question: Finding genes that are expressed only in one condition within a contrast
9 weeks ago by
charles.foster0 wrote:

Hi,

I have carried out differential expression analyses comparing conditions using DESeq2. Intuitively, I have considered genes to be expressed if they have a count of at least 10 in at least some libraries (sensu Chen et al: https://f1000research.com/articles/5-1438). Hence, I carried out a filtering step before the DE analysis using the filterByExpr function of edgeR. In my results, in addition to the p-values, LFC, etc., I have columns with the per-condition baseMeans:

Gene    sampleA  sampleB  baseMeanA_cond1_vs_cond2  baseMeanB_cond1_vs_cond2
Gene1   cond1    cond2    0                         70.0618858219621
Gene2   cond1    cond2    0                         13.8155035471724

(apologies if the tab-delimited table shows up poorly)

To get these, I did (e.g.):

baseMeanA_cond1_vs_cond2 <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$Tissue == "cond1"])
baseMeanB_cond1_vs_cond2 <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$Tissue == "cond2"])

Now, I am looking to further refine my results to find any genes that are expressed in one condition and not expressed at all in the other. In this case, I do not want to know that Gene1 is upregulated in Condition2 relative to Condition1 but is still expressed in Condition1. I would just like to know that Gene1 is expressed in Condition2 and is not expressed in Condition1.

What would be the best way to do this?

From reading this site and the DESeq2 vignette, I know that the baseMean is "the mean of normalized counts of all samples, normalizing for sequencing depth." However, I'm a bit confused about 1) how my criterion on counts having to be >=10 to be expressed has been factored into the final baseMean results, and 2) how to subset my DE results to get expressed vs not expressed.

Is it as simple as getting all genes where the baseMean for condition1 = 0, and the baseMean for condition2 > 0? Or would it be genes where the baseMean for condition1 < 10, and the baseMean for condition2 >= 10?

Also, if it's easier to do this separately to the DESeq2 results, I'm happy to do so, e.g. by subsetting a matrix of count values or TPM values or TMM values.
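For illustration, here is a minimal base-R sketch of the "subset a matrix" route mentioned above. It is only one possible reading of "expressed in one condition and not the other": a gene is kept if every sample in one condition reaches an "on" cutoff and every sample in the other stays below an "off" cutoff. The toy matrix, the helper name `only_in`, and both cutoffs (`on_min`, `off_max`) are hypothetical choices made for this example, not anything defined by DESeq2 or edgeR.

```r
# Toy normalized-count matrix (hypothetical values): 4 genes x 6 samples
norm_counts <- matrix(
  c( 0,  0,  0, 65, 72, 73,   # Gene1: absent in cond1, present in cond2
     0,  0,  0, 12, 15, 14,   # Gene2: absent in cond1, low but present in cond2
    30, 28, 35, 90, 85, 99,   # Gene3: present in both conditions
     0,  1,  0,  2,  0,  1),  # Gene4: near zero everywhere
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:4), paste0("S", 1:6))
)
condition <- factor(rep(c("cond1", "cond2"), each = 3))

# Assumed cutoffs; these are illustrative and should be chosen from the data
off_max <- 1    # every sample in the "off" condition must be below this
on_min  <- 10   # every sample in the "on" condition must reach this

# Hypothetical helper: genes "on" in all samples of one condition
# and "off" in all samples of the other
only_in <- function(mat, cond, on, off, on_min, off_max) {
  is_on  <- rowSums(mat[, cond == on,  drop = FALSE] >= on_min) == sum(cond == on)
  is_off <- rowSums(mat[, cond == off, drop = FALSE] <  off_max) == sum(cond == off)
  rownames(mat)[is_on & is_off]
}

cond2_only <- only_in(norm_counts, condition, on = "cond2", off = "cond1",
                      on_min = on_min, off_max = off_max)
cond1_only <- only_in(norm_counts, condition, on = "cond1", off = "cond2",
                      on_min = on_min, off_max = off_max)
print(cond2_only)
print(cond1_only)
```

With a real dataset, the same helper could presumably be fed counts(dds, normalized=TRUE) and colData(dds)$Tissue in place of the toy objects. Requiring all samples to agree is deliberately strict; relaxing it to "at least k samples" is a judgment call.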

Thanks!

Charles

deseq2 filter • 77 views
modified 9 weeks ago by Michael Love23k • written 9 weeks ago by charles.foster0
Answer: Finding genes that are expressed only in one condition within a contrast
9 weeks ago by
Michael Love23k (United States) wrote:

We don’t have a way in DESeq2 to determine what counts or TPM correspond to “expressed” and what to “not expressed”. When students or collaborators want to do this, I typically recommend looking at histograms of abundance (TPM) over all genes.
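A rough sketch of what such a histogram could look like in base R is given here. The TPM values are simulated (a mixture of near-zero "background" genes and log-normally distributed "expressed" genes), and the log2(TPM + 1) transform is an assumption made for readability, not something the answer prescribes. With real data, `tpm` would be replaced by the per-gene TPM of one sample or by a per-condition average.

```r
set.seed(1)

# Simulated per-gene TPM values (hypothetical data for illustration only)
n_genes   <- 2000
expressed <- rbinom(n_genes, 1, 0.6) == 1
tpm <- ifelse(expressed,
              rlnorm(n_genes, meanlog = 3, sdlog = 1),  # "expressed" genes
              rexp(n_genes, rate = 10))                 # near-zero background

# Log scale with a pseudo-count of 1 (an assumed, commonly used transform)
log_tpm <- log2(tpm + 1)

# Compute the histogram, then draw it
h <- hist(log_tpm, breaks = 40, plot = FALSE)
plot(h, main = "Abundance over all genes", xlab = "log2(TPM + 1)")
```

If the distribution turns out to be bimodal, the trough between the two modes is one candidate for a data-driven "expressed" cutoff, but whether such a trough exists depends on the dataset.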
