DESeq2: FPMs varying based on sample groups used
Jay • 0
@2034d2f5
Austria

I recently analysed a transcriptomic dataset with 3 sample groups (4 samples each) and performed pairwise comparisons between the groups. I exported the FPMs for each pairwise comparison and noticed that the same gene in the same sample has a slightly different FPM depending on the comparison.

Example: a gene has the following FPMs in one comparison: Sample 1: 476.992795157303, Sample 2: 472.464072441368, Sample 3: 488.11759330905, Sample 4: 461.634140423592

and the following in the second comparison: Sample 1: 448.229377708702, Sample 2: 449.795231722178, Sample 3: 460.560159012059, Sample 4: 431.705745786016

Is there a reason this would happen? Is it expected?

Example commands run once data was in DESeq:

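library(DESeq2)
library(dplyr)   # provides %>%
library(tibble)  # provides rownames_to_column()

# build the dataset for this pairwise comparison and set the reference level
dds_LactuG <- DESeqDataSetFromMatrix(countData=FCcounts_clean_LactuG, colData=Meta_LactG, design=~condition, tidy = TRUE)
dds_LactuG$condition <- relevel(dds_LactuG$condition, ref="Lactose")

# run DESeq
dds_LactuG <- DESeq(dds_LactuG)

# get results, ordered by p-value
res <- results(dds_LactuG)
resOrdered <- res[order(res$pvalue),]

# FPMs for the samples in this dds object only
FPM_table <- fpm(dds_LactuG) %>% as.data.frame() %>% rownames_to_column("Geneid")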

Thank you for any help!

DESeq2 • 509 views
ATpoint ★ 4.1k
@atpoint-13662
Germany

If the dds object contains different samples when the normalization is computed, then yes, this is expected. Normalization is relative to all samples involved, so it will change (usually only slightly) when you add or remove samples. By default, fpm() in DESeq2 is not a "naive" FPM: it uses a robust version based on the DESeq2 size factors, and these, as said, depend on which samples are present when they are calculated. Naive per-million scaling, by contrast, would always give the same value for a given sample, but it is a poor technique that does not correct for composition bias.
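To make this concrete, here is a minimal base-R sketch (no DESeq2 needed) of median-of-ratios size factors on a made-up count matrix. The counts, the helper names `size_factors`, `robust_fpm` and `naive_fpm`, and the exact per-million rescaling in `robust_fpm` are my own assumptions for illustration, not code taken from the package. It only shows the principle: the size factor of a sample moves when other samples are dropped, while naive per-million scaling does not.

```r
# Toy count matrix: 5 genes x 6 samples (made-up numbers, illustration only).
counts <- matrix(
  c(100, 120,  90, 300, 310, 280,
     50,  55,  45, 160, 150, 170,
    200, 210, 190, 180, 220, 200,
     30,  35,  25,  28,  33,  31,
    500, 520, 480, 510, 530, 490),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("S", 1:6))
)

# Median-of-ratios size factors written out by hand.
# Assumption: every gene has a non-zero count in every sample.
size_factors <- function(m) {
  log_geo_means <- rowMeans(log(m))  # per-gene log geometric mean over samples
  apply(m, 2, function(col) exp(median(log(col) - log_geo_means)))
}

# Size-factor-based FPM. Assumption: the size factors are rescaled by the
# geometric mean of the column sums to get a "library size" per sample.
robust_fpm <- function(m) {
  lib <- size_factors(m) * exp(mean(log(colSums(m))))
  sweep(m, 2, lib, "/") * 1e6
}

# Naive per-million scaling: uses only each sample's own column sum.
naive_fpm <- function(m) {
  sweep(m, 2, colSums(m), "/") * 1e6
}

sf_all    <- size_factors(counts)          # all six samples
sf_subset <- size_factors(counts[, 1:4])   # only S1..S4

# Same sample, different size factor depending on which samples are present
print(rbind(all = sf_all[1:4], subset = sf_subset))

# Size-factor-based FPM of gene5 in S1 differs between the two sample sets
print(c(all    = robust_fpm(counts)["gene5", "S1"],
        subset = robust_fpm(counts[, 1:4])["gene5", "S1"]))

# Naive FPM is identical whichever samples are included
print(all.equal(naive_fpm(counts)[, 1:4], naive_fpm(counts[, 1:4])))
```

The per-gene geometric mean is taken over whatever samples are in the matrix, which is why the reference, and with it every size factor, changes when the sample set changes.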


Thanks for the response. I thought that was the situation but wasn't sure.

I would like to get FPMs/FPKMs across all treatments and use those for visuals. Would it be alright to build a single dataset containing all samples, with a dummy metadata table, and run DESeq on it just to get the FPMs (without using the actual DESeq results)? Or is there another way to get one FPM value per sample in this situation?

I would like to use them for visualizations and in supplementary data.


You don't need to subset your data for a pairwise analysis; see the section on contrasts in the DESeq2 vignette. Unless there is a good reason to subset, just run DESeq once on all samples, then use contrasts for the pairwise comparisons and take the FPMs/FPKMs or any other normalized counts from that single analysis.
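A rough sketch of what that could look like, assuming three groups of four samples as in the question. The data here are simulated with makeExampleDESeqDataSet() so the snippet runs on its own, and the group names "GroupB" and "GroupC" are placeholders I made up; only "Lactose" comes from the original post. With real data you would build the dds from the full count matrix and the real metadata, so no dummy metadata is needed.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the real data: 500 genes x 12 samples.
# The samples are relabelled into three groups of four; "GroupB" and
# "GroupC" are hypothetical names, replace them with your own conditions.
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$condition <- factor(rep(c("Lactose", "GroupB", "GroupC"), each = 4))
dds$condition <- relevel(dds$condition, ref = "Lactose")
design(dds) <- ~ condition

# One fit on all samples: size factors and dispersions are estimated once
dds <- DESeq(dds)

# Pairwise comparisons pulled from the same fit via contrasts:
# c(factor name, numerator level, denominator level)
res_B_vs_lactose <- results(dds, contrast = c("condition", "GroupB", "Lactose"))
res_C_vs_lactose <- results(dds, contrast = c("condition", "GroupC", "Lactose"))
res_C_vs_B       <- results(dds, contrast = c("condition", "GroupC", "GroupB"))

# One FPM value per gene per sample, shared by all comparisons
fpm_all <- fpm(dds)

# Size-factor-normalized counts from the same object, as an alternative
norm_counts <- counts(dds, normalized = TRUE)
```

Because every comparison comes from the same object, `fpm_all` and `norm_counts` give a single value per gene and sample that can be used for all plots and for the supplementary tables.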
