I am trying to use the TCGA HTSeq counts data to do some analysis.
I plan to follow the process below.
First, use the RNA-seq dataset containing both tumor and normal tissue to find differentially expressed genes with DESeq2::DESeq.
Second, because I only want to normalize the tumor data, use DESeq2::rlog to transform a dataset containing only the tumor samples.
Third, use the rlog-transformed tumor data for survival analysis and unsupervised clustering.
Fourth, after clustering the rlog-transformed tumor data, the tumors will fall into several groups. At that point I may want to compare the tumor groups with each other, and also compare each group with the normal tissue. So I want to go back to the original data containing both tumor and normal tissue, assign each tumor sample to its group, and use DESeq2::DESeq for the differential expression analysis.
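The relabeling in the fourth step can be sketched as follows. This is a minimal plain-Python illustration, not DESeq2 code: the sample IDs, the cluster assignments, and the helper names `build_group_labels` and `pairwise_contrasts` are all hypothetical. It only shows how cluster labels found on the tumor-only data could be mapped back onto the full sample table, with normal samples keeping their own level.

```python
from itertools import combinations

# Hypothetical sample annotation for the full dataset (tumor + normal).
sample_type = {
    "S1": "tumor", "S2": "tumor", "S3": "tumor", "S4": "tumor",
    "S5": "normal", "S6": "normal",
}

# Hypothetical cluster assignments from step three (tumor samples only).
tumor_cluster = {"S1": 1, "S2": 2, "S3": 1, "S4": 2}


def build_group_labels(sample_type, tumor_cluster):
    """Return one group label per sample: 'normal' or 'cluster<k>'."""
    labels = {}
    for sample, kind in sample_type.items():
        if kind == "normal":
            labels[sample] = "normal"
        else:
            # Every tumor sample must have been clustered in step three.
            if sample not in tumor_cluster:
                raise KeyError(f"no cluster assignment for {sample}")
            labels[sample] = f"cluster{tumor_cluster[sample]}"
    return labels


def pairwise_contrasts(labels):
    """List every pair of group levels that could be compared."""
    levels = sorted(set(labels.values()))
    return list(combinations(levels, 2))


group = build_group_labels(sample_type, tumor_cluster)
contrasts = pairwise_contrasts(group)

for level_a, level_b in contrasts:
    print(f"compare {level_a} vs {level_b}")
```

In the real workflow, a label vector like `group` would become the grouping column of the sample table used in the design, and the comparisons would be run on the original raw counts rather than on the transformed values.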
Here are my questions:
1. Are there any mistakes in my process?
2. I worry that including the normal tissue samples will distort the transformation of the tumor data, so I apply the rlog transform to the tumor-only data, since I think the survival analysis and clustering analysis do not depend on the normal tissue data. Does the rlog transform give different results for a dataset A and a dataset B that is a subset of A?
3. When I analyze the differentially expressed genes between the tumor groups, is it correct to go back to the original, untransformed data and run DESeq2::DESeq?
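To make question 2 concrete, here is a minimal plain-Python sketch of median-of-ratios size factors, the normalization step that DESeq2 is based on. It is not rlog itself and not DESeq2 code. The counts are made up, and `size_factors` is a hypothetical helper written for this illustration. It shows only that a normalization estimated from all samples can give the tumor samples different factors than the same estimate on the tumor samples alone.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of per-sample counts.
    Genes with a zero count in any sample are skipped.
    """
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across the samples provided.
    log_geo_means = [sum(math.log(c) for c in row) / len(row) for row in usable]
    n_samples = len(counts[0])
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Made-up counts; columns are T1, T2 (tumor), N1, N2 (normal).
full_counts = [
    [100, 200, 50, 60],
    [30, 60, 90, 100],
    [500, 900, 400, 450],
    [10, 25, 40, 35],
    [80, 150, 20, 25],
]
# Dataset B: the tumor columns of dataset A.
tumor_counts = [row[:2] for row in full_counts]

full_factors = size_factors(full_counts)
tumor_factors = size_factors(tumor_counts)

print("tumor factors from full data: ", full_factors[:2])
print("tumor factors from tumor-only:", tumor_factors)
```

In this toy example the two sets of tumor factors differ, because the per-gene reference (the geometric mean) changes when the normal samples are removed. Whether the difference matters in practice for real TCGA data is exactly what I am asking.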
Thank you, I hope you can give me advice.
Thank you for the constructive advice. I will try the VST transform.