Question: Going crazy normalizing TCGA raw RSEM gene counts
2.6 years ago
United States
ezz0 wrote:

1- I obtained the RNAseqV2 raw counts, because another post advised against using RSEM.GENE.NORMALIZED: those values still show irregularities on the diagnostic boxplot.

2- I tried different normalization methods so that I can start the clustering analysis.

3- I used quantile normalization from the preprocessCore package, the EDASeq withinLaneNormalization function, DESeq2 rlog with design ~ 1, and edgeR cpm and calcNormFactors, and they all give different values. I don't know which one to use, or whether I can apply quantile normalization directly to the normalized RSEM gene counts.

4- After clustering for data exploration and obtaining, for example, 3 groups, can I renormalize (based on the new grouping) and assess differential gene expression? The conditions or design are not yet known at the time of the initial normalization, so I will depend on clustering to create these groupings.
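
To make point 3 concrete, here is a minimal sketch of what quantile normalization does, written in Python with NumPy rather than with preprocessCore. The function name `quantile_normalize`, the toy count matrix, and the log2(count + 1) transform are assumptions for illustration only. Ties are broken by position here, which may differ from how a given package handles them.

```python
import numpy as np

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Hypothetical helper for illustration; not the preprocessCore implementation.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Rank of every value within its own column (ties broken by position).
    order = np.argsort(matrix, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the sorted values across samples.
    reference = np.sort(matrix, axis=0).mean(axis=1)
    # Replace each value by the reference value at the same rank.
    return reference[ranks]

# Toy genes x samples matrix of raw counts (made-up numbers).
counts = np.array([[5, 4, 3],
                   [2, 1, 4],
                   [3, 4, 6],
                   [4, 2, 8]], dtype=float)

log_counts = np.log2(counts + 1.0)   # assumed transform before normalizing
normalized = quantile_normalize(log_counts)
```

After this step every sample has exactly the same distribution of values, which is why the result differs from scaling-based methods that only adjust each sample by a single factor.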


Your help is greatly appreciated.

2.6 years ago
Dario Strbenac wrote:

DGEclust is a software package that clusters read-counts, and then uses the clusters to do differential expression analysis. One problem with the method is that it does not normalise read-counts for gene length, so it only works correctly for CAGE-seq read-counts. I don't know how the journal's reviewers didn't notice such a serious problem with the method.
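
As a rough illustration of what normalising for gene length means, the sketch below divides each gene's counts by its length in kilobases. The array names, counts, and lengths are made up for the example, and this is not part of DGEclust.

```python
import numpy as np

# Toy genes x samples count matrix and gene lengths in base pairs (assumed values).
counts = np.array([[100.0, 200.0],
                   [100.0, 200.0]])
gene_lengths_bp = np.array([1000.0, 2000.0])

# Reads per kilobase: longer genes collect more reads at equal expression,
# so divide each gene's counts by its length in kb.
reads_per_kb = counts / (gene_lengths_bp[:, None] / 1000.0)
```

Both genes have identical raw counts here, but the gene that is twice as long ends up with half the length-normalised value.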

