Using geNetClassifier with RNA-seq data
nhaus • 0


I am currently trying out geNetClassifier to build a classifier for bulk RNA-seq data. However, I am unsure how I should preprocess my counts before passing them to geNetClassifier. The vignette says:

Note that since the ranking is built though package EBarrays, the data in the expression set should be normalized intensity values (positive and on raw scale, not on a logarithmic scale).

I am using RNA-seq rather than a microarray, so I do not have intensity values. In the accompanying publication I read:

The preprocessed RNA-Seq expression data matrices containing the reads per kilobase per million mapped reads (RPKM) were downloaded from the TCGA data portal and were log2 transformed (log2(RPKM+1)) prior to be analysed with geNetClassifier.
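To make the difference between the two scales concrete, here is a minimal Python sketch of how an RPKM value, its log2(RPKM+1) transform, and the back-transformed raw-scale value relate. The function names and the example numbers are made up for illustration; nothing here is taken from the geNetClassifier vignette or the paper, and it does not say which scale the package actually needs.

```python
import math

def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)

def to_log2(value):
    """Log-scale value as described in the publication: log2(RPKM + 1)."""
    return math.log2(value + 1)

def to_raw(log_value):
    """Inverse of to_log2: back to the positive, non-logarithmic scale."""
    return 2 ** log_value - 1

# Hypothetical gene: 500 reads, 2 kb long, 20 million mapped reads in the library
raw = rpkm(500, 2000, 20_000_000)  # 500 / 2 / 20 = 12.5
logged = to_log2(raw)              # log2(13.5)
restored = to_raw(logged)          # recovers the raw-scale RPKM

print(raw, logged, restored)
```

The point of the sketch is only that the two descriptions are one invertible transformation apart, so a matrix on one scale can be converted to the other before building the expression set.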

So now I am unsure what to do. Right now I am using VST-transformed counts on which I ran limma::removeBatchEffect. If I understand the VST correctly, it yields normalized counts on a log-like scale, which matches the log-scale input described in the publication but not the raw-scale input the vignette asks for. I put this matrix into an eset, which I then use to run geNetClassifier.

Is this approach correct, or should I use a different normalization?

I would appreciate any comments, or tips on other packages or methods that I could use to classify samples based on whole-genome RNA-seq data!

geNetClassifier Normalization
