What is the correct way to split genex_high and genex_low groups for DE analysis?
rohanphn • 0
@c7686fef
Last seen 12 months ago
Brazil

Hey guys. Good afternoon.

I would like to split the TCGA-STAD samples into two groups, one with high expression of gene x and one with low expression of gene x. I want to define these groups by quartiles, taking the upper and lower quartiles. I would then run a DESeq2 analysis between the two groups, since I hypothesize that they have different biological characteristics.

However, I am unsure which data I should use for the quartile split: the raw counts, the TPM-normalized data, or the data normalized by DESeq2.

This is what I had in mind. Since DESeq2 needs a design as input, I would create a "fake" variable, randomly assigning 1 or 2 to each sample, and use it as the group variable just to obtain the normalized data, since the design does not affect normalization. After obtaining the normalized data, I would compute the quartiles and identify the patients that fall into each one. I would then use that information to filter the raw data and to define the design for the actual DESeq2 run. However, this feels wrong.
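
For illustration, the workflow described above can be sketched outside of DESeq2. This is a minimal sketch in plain Python, not DESeq2's own code: `size_factors` and `quartile_groups` are hypothetical helpers, the first re-implementing median-of-ratios scaling (the idea behind DESeq2's size factors) and the second picking the lower and upper quartiles. The sample names, gene names, and counts are invented. Note that the scaling step uses only the counts, with no group variable involved.

```python
import math
import statistics

# Invented raw counts: 8 samples, alternating shallow and deep sequencing.
samples = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
counts = {
    "geneA": [100, 200, 100, 200, 100, 200, 100, 200],
    "geneB": [50, 100, 50, 100, 50, 100, 50, 100],
    "geneC": [200, 400, 200, 400, 200, 400, 200, 400],
    "geneX": [10, 40, 30, 80, 50, 120, 70, 160],
}


def size_factors(count_table):
    """Median-of-ratios size factors; no design or group labels are used."""
    n_samples = len(next(iter(count_table.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in count_table.values():
        if min(row) <= 0:
            continue  # genes with a zero count cannot enter the geometric mean
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]


def quartile_groups(values, sample_ids):
    """Label samples at or below Q1 as 'low' and at or above Q3 as 'high'."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    groups = {}
    for sid, v in zip(sample_ids, values):
        if v <= q1:
            groups[sid] = "low"
        elif v >= q3:
            groups[sid] = "high"
    return groups  # samples in the middle two quartiles are left out


sf = size_factors(counts)
normalized_x = [c / s for c, s in zip(counts["geneX"], sf)]

raw_groups = quartile_groups(counts["geneX"], samples)
norm_groups = quartile_groups(normalized_x, samples)

print("raw counts:       ", raw_groups)
print("normalized counts:", norm_groups)
```

In this toy data the two splits disagree, because the raw counts of gene x partly reflect sequencing depth. In DESeq2 itself, an intercept-only design (`~ 1`) should be enough to obtain normalized counts, so a random placeholder variable is probably not needed for that step.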

Could anyone give me a suggestion? I couldn't find a thread about it.

Thanks in advance.

DESeq2 RNASeq • 559 views
@mikelove
Last seen 4 hours ago
United States

We don't have any statistical procedures for detecting high or low expression. You might try taking the abundance (TPM) and performing model-based clustering on the log of abundance, across samples.
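
One possible reading of "model-based clustering on the log of abundance" is a two-component Gaussian mixture fitted to log TPM values of the gene across samples. The sketch below is only an assumption about what that could look like: the answer does not name a method or package, the EM routine `fit_two_gaussians` is a hypothetical hand-written helper rather than any library's API, and the TPM values are invented.

```python
import math

# Invented TPM values for gene x across 10 samples.
tpm = [1.0, 1.5, 2.0, 1.2, 0.8, 40.0, 55.0, 60.0, 48.0, 52.0]
log_tpm = [math.log2(v + 1.0) for v in tpm]  # pseudocount of 1 is an assumption


def normal_pdf(v, mu, var):
    return math.exp(-((v - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(x, n_iter=100):
    """Fit a 1-D, two-component Gaussian mixture by EM."""
    xs = sorted(x)
    half = len(xs) // 2
    # Initialize from the lower and upper halves of the sorted values.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for v in x:
            p = [weight[k] * normal_pdf(v, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: update weights, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weight[k] = nk / len(x)
            mu[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            spread = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, x)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the variance positive
    return weight, mu, var, resp


weight, mu, var, resp = fit_two_gaussians(log_tpm)
high = 0 if mu[0] > mu[1] else 1  # component with the larger mean
labels = ["high" if r[high] >= 0.5 else "low" for r in resp]
print(labels)
```

Unlike a quartile cut, this assigns every sample to a group and lets the data decide where the boundary falls. Whether two components fit well for a given gene would need to be checked on the real data.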
