Manually calculating log2 fold change values from DESeq2 normalized counts
@49806f54
United States

I need to calculate log2 fold changes for many different experimental conditions, each compared to its corresponding control. I am not going to use these for differential expression analysis, only for downstream analyses such as clustering. Traditionally in my field, counts are normalized to TPM and fold changes are calculated as log2(TPM_exp + 1) - log2(TPM_control + 1), using 1 or 0.5 as the pseudocount for the log transformation. In my case, TPM is not a good way to normalize the data, because a few samples have a large share of their reads mapping to only one or two genes (RNA composition bias). DESeq2's median-of-ratios normalization seems to take care of that issue, so I prefer it. However, I cannot use DESeq2 to get the log2 fold changes, because I don't have replicates for some of the experimental conditions and DESeq2 needs replicates to estimate them. So I want to calculate log2 fold changes manually from the DESeq2 normalized counts, using log2(DESeq2norm_exp + 0.5) - log2(DESeq2norm_control + 0.5). I am not sure whether this is a good idea, or whether the choice of pseudocount is critical here.
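To make the pseudocount question concrete, here is a minimal sketch in plain Python (not DESeq2 code) of the formula above, evaluated with several pseudocounts. The gene names and normalized counts are hypothetical, and `log2_fold_change` is an illustrative helper, not a library function.

```python
import math

def log2_fold_change(norm_exp, norm_ctrl, pseudocount=0.5):
    """log2(exp + pc) - log2(ctrl + pc) for one gene's normalized counts."""
    return math.log2(norm_exp + pseudocount) - math.log2(norm_ctrl + pseudocount)

# Hypothetical normalized counts: (experiment, control) per gene.
genes = {
    "low_count_gene": (3.0, 0.0),
    "high_count_gene": (3000.0, 1000.0),
}

# Compare how strongly each gene's log2 fold change depends on the pseudocount.
for name, (exp, ctrl) in genes.items():
    for pc in (0.5, 1.0, 5.0):
        lfc = log2_fold_change(exp, ctrl, pc)
        print(f"{name}\tpseudocount={pc}\tlog2FC={lfc:.3f}")
```

In this toy case the high-count gene stays close to log2(3) for every pseudocount, while the low-count gene moves from about 2.8 (pseudocount 0.5) to about 0.68 (pseudocount 5). So the choice mostly matters for genes with small counts.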

The other option, I guess, is to perform VST on the raw counts. I believe VST first applies median-of-ratios normalization (which I need in order to remove the RNA composition bias) and then applies a variance stabilizing transformation to the normalized counts. This should also reduce the inflation of log fold changes for genes with small counts. If this is okay, I believe I should not run VST on all samples (the different experimental and control samples) together, but separately on each set of samples that corresponds to one condition and its control. I can then use the VST data to calculate the log fold change of a particular experimental condition relative to its control, and repeat this for all conditions.

But in that case I have very few samples (2-4) for each condition and its control, so should I prefer the rlog transformation over VST?

Any comments or help would be really appreciated.

DESeq2 DifferentialExpression
@mikelove
United States

If you want to apply the median-of-ratios method to TPM, you can use estimateSizeFactorsForMatrix. This same question was asked just recently on the Bioc support site.
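For readers who want to see what a median-of-ratios size factor is, here is a rough sketch in plain Python. It illustrates the idea only and is not DESeq2's implementation. In R you would simply call estimateSizeFactorsForMatrix on the matrix. The function name `median_of_ratios_size_factors` and the toy matrix are hypothetical, and the sketch assumes that genes with a zero in any sample are left out of the reference.

```python
import math
from statistics import median

def median_of_ratios_size_factors(matrix):
    """matrix: list of gene rows, each a list of per-sample values (counts or TPM).
    Returns one size factor per sample."""
    n_samples = len(matrix[0])
    # Keep only genes that are positive in every sample, so log geometric means are finite.
    log_rows = [[math.log(v) for v in row] for row in matrix if all(v > 0 for v in row)]
    if not log_rows:
        raise ValueError("every gene has a zero in at least one sample")
    # Per-gene log geometric mean across samples acts as a pseudo-reference sample.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - ref for row, ref in zip(log_rows, log_geo_means)]
        # The median is robust to a few genes that dominate a sample.
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical data: sample 2 is sequenced twice as deeply and dominated by one gene.
toy = [
    [100.0, 200.0],
    [50.0, 100.0],
    [10.0, 20.0],
    [5.0, 5000.0],  # gene that soaks up most reads in sample 2
    [0.0, 8.0],     # has a zero, so it is excluded from the reference
]
size_factors = median_of_ratios_size_factors(toy)
normalized = [[v / s for v, s in zip(row, size_factors)] for row in toy]
print(size_factors)
print(normalized[0])
```

Because the median ignores the one dominating gene, the first three genes come out equal across the two samples after dividing by the size factors. This is the behaviour that addresses the composition bias described in the question.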


Thanks for the suggestion.