Question

Using geNetClassifier with RNA seq data

nhaus ▴ 70

Hello,

I am currently trying out geNetClassifier to build a classifier for bulk RNA-seq data. However, I am unsure how exactly I should preprocess my counts before passing them to the geNetClassifier method. The vignette says:

Note that since the ranking is built though package EBarrays, the data in the expression set should be normalized intensity values (positive and on raw scale, not on a logarithmic scale).

I am using RNA-seq rather than a microarray, so I do not have intensity values. In the accompanying publication I read:

The preprocessed RNA-Seq expression data matrices containing the reads per kilobase per million mapped reads (RPKM) were downloaded from the TCGA data portal and were log2 transformed (log2(RPKM+1)) prior to be analysed with geNetClassifier.

So now I am unsure what exactly I should do. Right now I am using VST-transformed counts on which I ran limma::removeBatchEffect. If I understand the VST correctly, it yields normalized counts on a log-like scale, which is not the raw scale the vignette asks for. I put this matrix into an ExpressionSet (eset), which I then use to run geNetClassifier.
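A minimal sketch of the pipeline described above, run on simulated data rather than the real counts. Everything named here is a placeholder and not taken from the actual analysis: the object names, the `condition` and `batch` columns, and the use of DESeq2's `varianceStabilizingTransformation()` as the VST step are all assumptions. The final geNetClassifier call is left commented out, since whether this input scale is appropriate is exactly the open question.

```r
# Sketch only: simulated counts stand in for the real count matrix.
library(DESeq2)   # also attaches SummarizedExperiment (assay, colData)
library(limma)
library(Biobase)

set.seed(1)
# Simulated dataset: 1000 genes x 12 samples, with a 'condition' factor (A/B)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
# Hypothetical batch label, alternating so it is not confounded with condition
dds$batch <- factor(rep(c("b1", "b2"), times = 6))

# Variance-stabilizing transformation: values end up on a log2-like scale
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)

# Remove the batch effect while protecting the condition effect
pheno_df <- as.data.frame(colData(dds))
design <- model.matrix(~ condition, data = pheno_df)
mat <- removeBatchEffect(assay(vsd), batch = dds$batch, design = design)

# Wrap the corrected matrix and sample annotation in an ExpressionSet
eset <- ExpressionSet(assayData = mat,
                      phenoData = AnnotatedDataFrame(pheno_df))

# The step in question (not run here):
# library(geNetClassifier)
# clf <- geNetClassifier(eset = eset, sampleLabels = "condition")
```

Note that `exprs(eset)` here holds log-scale values, whereas the vignette passage quoted above asks for positive values on the raw scale.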

Is this approach correct, or should I use a different normalization?

I would appreciate any comments, or tips on other packages/methods that I could use to classify samples based on whole-genome RNA-seq data!

geNetClassifier Normalization • 760 views

Posted 3.0 years ago by nhaus