Normalization of RNA-seq data before DESeq2 and PCA in case of strong batch effects
marslena • 0

I have one dataset of 4 RNA-seq healthy tissue samples prepared with an unstranded Illumina library, and another dataset of 8 RNA-seq healthy tissue samples prepared with a reverse-stranded Illumina library. I want to combine these two datasets and run differential gene expression (DGE) analysis against a third dataset of 100 tumor tissue samples. But DESeq2 takes raw counts as input. Can I normalize each dataset separately and then merge them to run DESeq2?

Also, should I run ComBat or sva after DESeq2 and then use the corrected output as input for PCA, t-SNE, etc.?

DESeq2 Normalization sva • 800 views
ATpoint ★ 4.6k

The vignette recommends including any "correction" factors, be they categorical batch covariates or factors derived from the likes of svaseq/RUVSeq, in the design. The general rule is that samples to be analysed together must be prepared together (in the lab) to avoid batch effects, and if multiple batches are present, each batch should contain samples of all groups. What you cannot do is take controls from batch1 and tumors from batch2 and combine them: batch and group are then "fully confounded", so the batch effect cannot be separated from the biological effect. For PCA and other exploratory analyses the vignette recommends vst/rlog, and you can regress out batch effects (if not nested with experimental groups) with something like limma::removeBatchEffect(). Please be sure to read previous posts; this has been asked many times before, and the support site is not intended to provide hands-on guidance through your analysis.
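To make the "each batch should contain samples of all groups" rule concrete, here is a minimal sketch in plain Python (not DESeq2 code) that tabulates a sample sheet and flags confounding. The batch labels `unstranded`, `reverse_stranded` and `tumor_dataset` are hypothetical names for the three datasets in the question, and "fully confounded" is simplified here to mean that no batch contains more than one group.

```python
from collections import defaultdict


def groups_by_batch(samples):
    """Map each batch label to the set of groups that have a sample in it."""
    table = defaultdict(set)
    for batch, group in samples:
        table[batch].add(group)
    return dict(table)


def missing_groups(samples):
    """For each batch, list the groups that have no sample in that batch."""
    table = groups_by_batch(samples)
    all_groups = set().union(*table.values())
    return {batch: sorted(all_groups - present) for batch, present in table.items()}


def fully_confounded(samples):
    """Simplified check (an assumption of this sketch): True when every
    batch holds a single group, so batch and group cannot be separated."""
    return all(len(groups) == 1 for groups in groups_by_batch(samples).values())


# Hypothetical sample sheet mirroring the layout described in the question.
question_layout = (
    [("unstranded", "healthy")] * 4
    + [("reverse_stranded", "healthy")] * 8
    + [("tumor_dataset", "tumor")] * 100
)

# A layout in which every batch contains every group.
balanced_layout = [
    ("b1", "healthy"), ("b1", "tumor"),
    ("b2", "healthy"), ("b2", "tumor"),
]

print(missing_groups(question_layout))
print(fully_confounded(question_layout))
print(fully_confounded(balanced_layout))
```

For the layout in the question, every batch is missing one of the two groups, so adding batch to the design would not allow a tumor-versus-healthy effect to be estimated separately from the dataset effect.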
