[DESeq2] Normalization of External Test Dataset for Machine Learning (ML) Application
@4bea8831

Dear Community,

My current research involves developing an ML-based predictor model, for which I have chosen DESeq2 for normalization. I would appreciate any advice regarding some challenges I am facing.

In my study, I trained the model on RNA from blood samples of healthy donors (the model has been validated on an additional healthy cohort). I then tested the model using RNA from virally infected patients to quantify the degree of change.

For normalization, I used all the samples (both training and test) together in order to account for global RNA perturbations caused by infection, as suggested by our prior studies. Given this, I used all genes as "control genes," and normalizing only the healthy donors wasn't a viable option for me.

However, I am now encountering issues with using external datasets. Normalizing each external dataset separately, with its own RNA composition, does not seem sensible for testing. Alternatively, combining them with my current dataset and redoing the normalization would change the model (both for this and future data).

I would be very grateful for any suggestions to resolve this problem.

Cheers,

Alan

@mikelove

DESeq2 has a way of fixing the reference pseudo-sample used for normalization.

In the estimateSizeFactors man page, you will see an argument geoMeans. If you provide the geometric means of the original data (you can compute these per gene with log -> rowMeans -> exp), it will use them as the reference when scaling the new data. You can leave the -Inf values produced by log: they turn into 0 after exp, which is correct (any row with a zero isn't used in the median ratio method).
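To make the arithmetic concrete, here is a minimal plain-Python sketch of a median-of-ratios calculation against a fixed reference. It is not DESeq2's code: the function names geometric_means and size_factors and the toy counts are hypothetical, invented for illustration. In DESeq2 itself the equivalent step is passing the stored vector as geoMeans to estimateSizeFactors, and the package may apply further steps (such as rescaling the resulting size factors), so check the man page before relying on exact values.

```python
import math
from statistics import median


def geometric_means(counts):
    """Per-gene geometric mean over samples (log -> mean -> exp).

    A gene with any zero count gets 0, mirroring exp(-Inf) = 0.
    """
    means = []
    for row in counts:
        if any(c == 0 for c in row):
            means.append(0.0)
        else:
            means.append(math.exp(sum(math.log(c) for c in row) / len(row)))
    return means


def size_factors(counts, geo_means):
    """Median-of-ratios size factor for each sample against a fixed reference.

    Genes whose reference is 0, or whose count in the sample is 0, are skipped.
    """
    n_samples = len(counts[0])
    factors = []
    for j in range(n_samples):
        log_ratios = [
            math.log(row[j]) - math.log(g)
            for row, g in zip(counts, geo_means)
            if g > 0 and row[j] > 0
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data (genes x samples), made up for illustration only.
train_counts = [[10, 20], [100, 200], [5, 0]]
reference = geometric_means(train_counts)   # computed once, then frozen

new_counts = [[20], [200], [7]]             # one external sample
new_sf = size_factors(new_counts, reference)
print(new_sf)
```

The point of the design is that reference is computed once from the original data and stored, so external samples are scaled against the same pseudo-sample without touching the original normalization.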


Thanks for the advice Michael. I will try this with my future test datasets.

@james-w-macdonald-5106

Your question is off-topic for this support site. You might try over on biostars.org instead.


Sorry, I requested that this be posted here; I will answer.


Hi James,

I will try to post only relevant questions on the Bioconductor forum from now on.

Thanks for the note.

Cheers,

Alan


I requested that it be posted here, so it's not a problem.
