Manually calculating log2 fold change values from DESeq2 normalized counts
@49806f54

I need to calculate log2 fold change values for many experimental conditions, each compared to its corresponding control. I am not going to use these for differential expression analysis, only for downstream analyses such as clustering. Traditionally in my field, counts are normalized by the TPM method and fold changes are then calculated as log2(TPM_exp + 1) - log2(TPM_control + 1), using 1 or 0.5 as the pseudocount for the log transformation. In my case, TPM is not a good way to normalize the data, because a few of my samples have a large share of reads mapping to only one or two genes (RNA composition bias). DESeq2's median-of-ratios normalization seems to take care of that issue, so I prefer it.

However, I cannot use DESeq2 to get log2 fold change values, because I don't have replicates for some of the experimental conditions and DESeq2 needs replicates to estimate log2 fold changes. So I want to calculate them manually from DESeq2 normalized counts, as log2(DESeq2norm_exp + 0.5) - log2(DESeq2norm_control + 0.5). I am not sure whether this is a good idea, or whether the choice of pseudocount is critical here.
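To make the pseudocount question concrete, here is a minimal NumPy sketch of the formula above. It is only an illustration: the function name `log2_fold_change`, the three genes, and their normalized counts are made up for this example and do not come from DESeq2 or from any real data set.

```python
import numpy as np


def log2_fold_change(norm_exp, norm_ctrl, pseudocount=0.5):
    """log2(exp + pseudocount) - log2(ctrl + pseudocount), element-wise per gene."""
    norm_exp = np.asarray(norm_exp, dtype=float)
    norm_ctrl = np.asarray(norm_ctrl, dtype=float)
    return np.log2(norm_exp + pseudocount) - np.log2(norm_ctrl + pseudocount)


# Hypothetical normalized counts for three genes (low, medium, high expression).
# Each gene has the same 4-fold ratio between condition and control.
norm_ctrl = np.array([1.0, 50.0, 5000.0])
norm_exp = np.array([4.0, 200.0, 20000.0])

# Compare the estimates under a few pseudocount choices.
results = {pc: log2_fold_change(norm_exp, norm_ctrl, pc) for pc in (0.5, 1.0, 5.0)}
for pc, lfc in results.items():
    print(f"pseudocount={pc}: {np.round(lfc, 2)}")
```

In this toy case the highly expressed gene stays very close to a log2 fold change of 2 whatever the pseudocount, while the estimate for the low-count gene moves by about one log2 unit between a pseudocount of 0.5 and 5. That suggests the choice matters mainly for genes with small counts.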

The other option, I guess, is to apply VST to the raw counts. I believe VST first performs median-of-ratios normalization (important for me, to remove the RNA composition bias) and then applies a variance-stabilizing transformation to the normalized counts. This should also reduce the inflation of log fold change values for genes with small counts. If this is okay, I believe I should not run VST on all samples (all experimental and control samples) together, but separately on each set of samples consisting of one condition and its control. I can then use the VST data to calculate log fold change values for that condition relative to its control, and repeat this for every condition.

But I naturally have very few samples (2-4) for each condition and its control, so should I prefer the rlog transformation over VST?

Any comments or help would be really appreciated.

DESeq2 DifferentialExpression
@mikelove

If you want to apply the median-of-ratios method to TPM, you can use estimateSizeFactorsForMatrix. This same question was asked just recently on the Bioc support site.
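For readers who want to see what the median-of-ratios calculation does, here is a minimal NumPy sketch of the idea. It is not DESeq2's `estimateSizeFactorsForMatrix`, which is an R function. It is a re-implementation written for illustration, assuming a genes-by-samples matrix and assuming that genes with a zero in any sample are left out of the calculation. The function name `median_of_ratios_size_factors` and the toy matrix are hypothetical.

```python
import numpy as np


def median_of_ratios_size_factors(counts):
    """Per-sample size factors from a genes x samples matrix (illustrative sketch)."""
    counts = np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore"):
        log_counts = np.log(counts)  # zeros become -inf
    # Log of the per-gene geometric mean across samples.
    log_geo_means = log_counts.mean(axis=1)
    # Assumption: skip genes with a zero in any sample (their geometric mean is 0).
    usable = np.isfinite(log_geo_means)
    if not usable.any():
        raise ValueError("every gene has a zero count in at least one sample")
    # Per-sample median of log(count / geometric mean), taken over usable genes.
    log_ratios = log_counts[usable] - log_geo_means[usable, None]
    return np.exp(np.median(log_ratios, axis=0))


# Toy matrix: 5 genes x 2 samples. The samples are identical except for the
# last gene, which dominates the reads in sample 2 (composition bias).
counts = np.array([
    [100, 100],
    [200, 200],
    [50, 50],
    [400, 400],
    [10, 10000],
])

size_factors = median_of_ratios_size_factors(counts)
normalized = counts / size_factors

# For comparison: scaling by total counts, relative to sample 1.
totals = counts.sum(axis=0)
library_size_factors = totals / totals[0]

print("median-of-ratios size factors:", size_factors)
print("total-count scaling factors:  ", library_size_factors)
```

In this toy matrix the median-of-ratios size factors are equal for both samples, so the four unchanged genes keep equal normalized values. Scaling by total counts would instead shrink every gene in sample 2 by more than tenfold because of the single dominant gene.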

Thanks for the suggestion.
