
Question: Some questions about using the rlog function of DESeq2
7 months ago, 137737756 wrote:

I am trying to use the TCGA HTSeq counts data to do some analysis.

I plan to follow the process below.

First, use the RNA-seq dataset containing both tumor and normal tissue to find differentially expressed genes with DESeq2::DESeq.

Second, because I only want to normalize the tumor data, use DESeq2::rlog to transform the dataset containing only the tumor samples.

Third, use the rlog-transformed tumor data for survival analysis and unsupervised clustering analysis.

Fourth, after clustering the rlog-transformed tumor data, the tumors will fall into several groups. At that point I may want to compare the tumor groups with each other, and also compare each group with the normal tissue. So I want to go back to the original data containing both tumor and normal tissue, assign the tumors to their groups, and use DESeq2::DESeq for differential expression analysis.

Here are my questions:

1. Are there any mistakes in my process?

2. I worry that the normal tissue samples will disturb the transformation of the tumor data, so I run the rlog transform on the tumor-only data, since I think the survival and clustering analyses do not depend on the normal tissue data. Does the rlog transform give different results for a dataset A and a dataset B that is a subset of A?

3. When I analyze the differentially expressed genes between the tumor groups, is it correct to go back to the original, untransformed data and run DESeq2::DESeq on it?

Thank you, I hope you can give me some advice.

deseq2 rna-seq • 165 views
Answer: Some questions about using the rlog function of DESeq2
7 months ago, Michael Love wrote:

I would recommend the VST for transforming data, especially for large datasets. In any case, if you run rlog() on a large dataset, it will print a message recommending this. The vst() function is much faster than rlog(), and it turns out to be more robust. If you have large differences between groups, I'd recommend vst(dds, blind=FALSE), as discussed in the vignette. You can include the normal samples; it shouldn't be a problem.
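A minimal sketch of that call, assuming simulated counts from `makeExampleDESeqDataSet()` in place of the TCGA data, and assuming its condition `A` plays the role of normal tissue and `B` the role of tumor. The transformation is run on all samples, and the tumor columns are pulled out afterwards for clustering or survival analysis:

```r
library(DESeq2)

# Simulated counts stand in for the TCGA HTSeq counts (assumption).
# 5000 genes so that enough rows pass vst()'s internal mean-count filter.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)  # 6 samples "A", 6 samples "B"

# blind = FALSE lets the dispersion estimation use the design (~ condition),
# so large group differences are not treated as noise.
vsd <- vst(dds, blind = FALSE)

# Keep only the "tumor" columns (condition B here) for downstream analysis.
vsd_tumor <- vsd[, vsd$condition == "B"]
mat <- assay(vsd_tumor)  # genes x tumor samples, on a log2-like scale
dim(mat)
```

With real data, `dds` would instead be built from the count matrix and sample table, for example with `DESeqDataSetFromMatrix()`.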

In general, yes, we recommend variance-stabilized data for tasks such as calculating sample distances or PCA, and the original counts for differential analysis.
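As a rough illustration of that split, the sketch below clusters tumor samples on VST values and then returns to the raw counts for testing. It assumes simulated data (condition `A` as normal, `B` as tumor), hierarchical clustering with two clusters, and a hypothetical `group` column; none of these choices come from the thread:

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)  # A = normal, B = tumor (assumption)

# 1. Transformed values for distances and clustering.
vsd <- vst(dds, blind = FALSE)
tumor <- dds$condition == "B"
hc <- hclust(dist(t(assay(vsd)[, tumor])))
cluster <- cutree(hc, k = 2)  # k = 2 is an arbitrary choice for illustration

# 2. Original counts for differential analysis, with a grouping variable
#    that combines the normal samples and the tumor clusters.
group <- rep("normal", ncol(dds))
group[tumor] <- paste0("tumor", cluster)
dds$group <- factor(group)  # levels: normal, tumor1, tumor2
design(dds) <- ~ group
dds <- DESeq(dds)

# Each tumor cluster against normal, and the clusters against each other.
res_t1_vs_normal <- results(dds, contrast = c("group", "tumor1", "normal"))
res_t1_vs_t2 <- results(dds, contrast = c("group", "tumor1", "tumor2"))
```

The `dds` object passed to `DESeq()` still holds untransformed counts; the VST values are used only to derive the cluster labels.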



7 months ago, 137737756 replied:

Thank you for the constructive advice. I will try the VST transformation.