Behavior of most genes in Ribosome-footprints dataset deviates from expected
Anastasiia • 0

Hello,

I have an RNA and Ribo-seq dataset for a biological process, covering 5 time points with 2 replicates each.

I am running three likelihood ratio tests (LRT): one for RNA, one for Ribo-seq (RPF), and one for translation efficiency (TE = Ribo / RNA + pseudo-count).

dds_rna <- DESeqDataSetFromMatrix(countData = counts_rna,
                                  colData = metadata_rna,
                                  design = ~replicate + stage)
dds_rna <- DESeq(dds_rna, test="LRT", reduced = ~replicate)

dds_rpf <- DESeqDataSetFromMatrix(countData = counts_rpf,
                                  colData = metadata_rpf,
                                  design = ~replicate + stage)
dds_rpf <- DESeq(dds_rpf, test="LRT", reduced = ~replicate)

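# TE model: the stage:type interaction captures stage-dependent changes
# in Ribo-seq relative to RNA
dds_te <- DESeqDataSetFromMatrix(countData = counts,
                                 colData = metadata,
                                 design = ~replicate + type + stage + stage:type)
dds_te <- DESeq(dds_te, test="LRT", reduced = ~replicate + type + stage)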

Looking at the TPM values of some genes that interested me, I noticed a problem. Genes that I assumed to be housekeeping showed decreasing TPM from the first to the last time point. DESeq2 detected some of them as differentially expressed. In the RPF data this trend was not clear.

I also checked genes with stable TPM, and they had significant adjusted p-values. On the other hand, genes with p-value > 0.9 showed a decrease of about 4-fold from the first to the last sample.

Then I calculated the ratio between the maximum and minimum TPM value for every gene and plotted its distribution. The distribution had a peak around 4. I removed tRNA genes from my analysis and repeated everything. The DESeq2 p-values did not change much. However, the visible TPM trend became a bit weaker, and the mode of the ratio dropped to approximately 2. That still seems like a lot to me, although I do not know whether this ratio calculation is statistically sound.
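The max/min ratio described above could be computed along these lines. This is only a sketch: the matrix `tpm`, its toy values, and the pseudo-count are assumptions for illustration, not the actual data or the exact calculation that was run.

```r
# Toy TPM matrix: rows are genes, columns are time points (hypothetical values)
tpm <- matrix(
  c(10, 12, 11,  9, 10,   # roughly stable gene
    40, 30, 20, 15, 10,   # gene decreasing about 4-fold
     0,  0,  1,  0,  2),  # lowly expressed gene with zeros
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("t", 1:5))
)

# Assumed pseudo-count so that genes with a zero TPM do not give Inf
pseudo <- 1

# Per-gene ratio of the maximum to the minimum TPM across samples
max_min_ratio <- apply(tpm, 1, function(x) (max(x) + pseudo) / (min(x) + pseudo))

# Summarise on the log2 scale (a 4-fold ratio corresponds to log2 = 2);
# hist(log2(max_min_ratio)) would plot the distribution instead
print(summary(log2(max_min_ratio)))
```

Note that without a pseudo-count (or an expression filter), lowly expressed genes can dominate such a distribution, so the position of the peak depends on that choice.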

An additional problem is that my samples are a combination of two tissues that are difficult to separate. For one of them, the one I am actually interested in, a bulk decrease in RNA levels over time has been reported.

Hence my questions.

Can such behavior be an error in DESeq2? Or is it a problem with sample preparation? Or did I make a mistake in preprocessing? Is it always necessary to discard tRNA, and can that be the reason? Or is it a problem with TPM normalization, which does not account for library composition? Should I use DESeq2 size factor normalization for visualizing the results? Or might it be a real biological effect?

I am also confused about the choice of housekeeping genes. Should I take genes that decrease over time, following the general behavior? Or should I find genes that are as stable as possible, in line with the definition?

This was a long post. Hope for your help and comments!

DESeq2 timecourse time • 239 views
@mikelove

Genes that I assumed to be housekeeping showed decreasing TPM from the first to the last time point.

You can specifically set controlGenes in the estimateSizeFactors step. Search the support site for other related Qs.
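As a rough illustration of that suggestion, the sketch below passes a set of control genes to `estimateSizeFactors` through its `controlGenes` argument and then extracts size-factor-normalized counts. The simulated dataset from `makeExampleDESeqDataSet` and the choice of the first 50 all-nonzero genes as "controls" are assumptions made only so the example runs; with real data, the index would come from a vetted list of genes believed to be stable across time points.

```r
library(DESeq2)

# Simulated counts standing in for the real data (assumption)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 10)

# Hypothetical control set: here simply the first 50 genes without zero counts;
# in practice, use gene IDs you trust to be stable across all samples
nonzero <- which(rowSums(counts(dds) == 0) == 0)
control_ids <- rownames(dds)[head(nonzero, 50)]
is_control <- rownames(dds) %in% control_ids

# Default behaviour: median-of-ratios size factors from all genes
sf_all <- sizeFactors(estimateSizeFactors(dds))

# Alternative: size factors estimated from the control genes only
dds <- estimateSizeFactors(dds, controlGenes = is_control)
sf_ctrl <- sizeFactors(dds)

# Size-factor-normalized counts, e.g. for plotting a gene over time
norm_counts <- counts(dds, normalized = TRUE)

print(round(cbind(all_genes = sf_all, control_genes = sf_ctrl), 3))
```

Size factors set this way are kept when `DESeq()` is run afterwards, so the same normalization is used for testing and for plotting.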
