Question about limma removeBatchEffect and feature selection using elastic net
hxlei613 • 0

Hi there:

I have hundreds of samples with reads counted over gene bodies or 2 kb sliding windows. I'd like to use these matrices to build binary classification and prediction models using elastic net.

I plan to do the following things:

First, normalize (logCPM) across all samples to check whether a batch effect exists, then split the matrix into a training set and a test set.

Second, use limma to find genes or windows that differ between the two classes in the training set.

Third, use elastic net on the training set to select important features among those differential genes and windows, construct binary classification models, and make predictions on the test set.
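
As a rough illustration of the logCPM step in the plan above, here is a minimal sketch in plain Python. The pseudocount of 0.5 and the `+ 1` on the library size are assumptions made for this example; in R, `edgeR::cpm(..., log=TRUE)` uses a prior count scaled by library size, so its values will differ slightly. The toy counts are made up.

```python
import math

def log_cpm(counts, pseudocount=0.5):
    """Simplified log2 counts-per-million.

    counts: list of samples, each a list of raw read counts per feature.
    A pseudocount is added so that zero counts do not produce log(0).
    This is an illustrative formula, not edgeR's exact implementation.
    """
    result = []
    for sample in counts:
        lib_size = sum(sample)  # total reads in this sample
        result.append([
            math.log2((c + pseudocount) / (lib_size + 1.0) * 1e6)
            for c in sample
        ])
    return result

# Hypothetical toy data: 3 samples x 4 features (genes or windows)
counts = [
    [10, 0, 90, 900],
    [20, 5, 175, 1800],
    [0, 1, 49, 450],
]
logcpm = log_cpm(counts)
```

Because each sample is scaled only by its own library size, this step can be computed on all samples before splitting without information passing between the training and test sets.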

However, in step 1, t-SNE plots showed an obvious batch effect. limma can handle batch by including it as a covariate in the design formula for differential expression analysis. If I include batch in the design formula when doing differential expression analysis on the training set, what values should I use for model construction and prediction? Could I use the output of removeBatchEffect() to build the model?
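
To make the question concrete, here is a minimal sketch of one possible way to handle it: estimate a per-feature offset for each batch on the training set only, then subtract the same offsets from the test set. This is not limma's implementation. It only mimics the simplest case of removing a single batch factor by centering batch means, and the function names `fit_batch_offsets` and `apply_batch_offsets` are hypothetical. It assumes logCPM-like values, that every test batch also appears in the training set, and that batch is not confounded with the class label.

```python
def fit_batch_offsets(train, train_batches):
    """Estimate, per feature, how far each batch mean sits from the
    average of all batch means. Uses training samples only."""
    n_features = len(train[0])
    batches = sorted(set(train_batches))
    means = {}
    for b in batches:
        rows = [row for row, rb in zip(train, train_batches) if rb == b]
        means[b] = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    # Unweighted average of the batch means, per feature
    grand = [sum(means[b][j] for b in batches) / len(batches)
             for j in range(n_features)]
    return {b: [means[b][j] - grand[j] for j in range(n_features)]
            for b in batches}

def apply_batch_offsets(data, batches, offsets):
    """Subtract previously fitted offsets; raises KeyError for unseen batches."""
    return [[x - o for x, o in zip(row, offsets[b])]
            for row, b in zip(data, batches)]

# Hypothetical toy values: 4 training samples, 1 test sample, 2 features
train = [[1.0, 5.0], [3.0, 7.0], [11.0, 5.0], [13.0, 9.0]]
train_batches = ["A", "A", "B", "B"]
test = [[12.0, 6.0]]
test_batches = ["B"]

offsets = fit_batch_offsets(train, train_batches)
train_corrected = apply_batch_offsets(train, train_batches, offsets)
test_corrected = apply_batch_offsets(test, test_batches, offsets)
```

Fitting the offsets on the training set alone keeps test samples out of the correction, which matters when the test set is meant to estimate prediction performance. If batch and class are partly confounded, plain centering like this would also remove class signal.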

Thanks, Xinlei Hu

limma • 269 views

Hi Xinlei, can you provide more evidence to support this statement: "However, in step1, t-SNE plots showed that there was obvious batch effect."? An MDS plot or PCA bi-plot would be a better way to gauge a batch effect than t-SNE.

Further, if a batch effect exists and you also want to separate the data into training and test sets, could the observed batch effect be related to how you ultimately wish to divide the samples? Are the samples from different studies? Were they processed using different library preparation protocols?

We do not have all the sample metadata at our disposal here, so we can only guess at what you are observing.

