Question

RNA-Seq normalization and batch effect removal

Alexandre • 0

@f6a18207

Canada

Our lab has accumulated many RNA-Seq experiments with the same cell line over the years. We want to use all the control samples from these experiments to estimate the expression levels of genes in this cell line. FASTQ files for all 8 experiments were processed with HISAT2 and featureCounts to get gene counts. Each experiment has 1 to 3 replicates. To complicate matters, half the experiments were performed using ribodepletion and half using polyA enrichment.

Almost all documentation online is about differential expression. In this case, we simply want to estimate the mean expression level of every gene (and maybe also the standard deviation, to estimate how much it varies).

Can we use polyA and ribodepletion experiments together? The literature suggests this should be avoided. If not, which is better, polyA or ribodepletion?
What would be the best way to normalize this data? I imported the raw counts into DESeq2 and used the vst (or rlog) function. Should I use these normalized counts to compute the FPKM for each gene?
When should I remove the batch effect from this data, before or after loading it into DESeq2?

Thanks!

RNASeq BatchEffect DESeq2 • 694 views

ADD COMMENT • link updated 22 months ago by Michael Love 41k • written 22 months ago by Alexandre • 0

score 1 · Answer 1 · 2022-06-10

Michael Love 41k

@mikelove

United States

If you only need expression levels, I don't think you need DESeq2. You can work with TPM, and you can remove the batch effect with the removeBatchEffect function from limma.
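To make the TPM and batch steps concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the limma or DESeq2 implementation. It assumes gene lengths in bp are available (for example, the Length column that featureCounts writes alongside the counts), it uses log2(TPM + 1) with a pseudocount of 1 chosen only for illustration, and it treats each library protocol as a batch. The helper names `counts_to_tpm` and `center_batches` and all the numbers are hypothetical. The batch step is a simple per-batch centering on the log scale, a simplified stand-in for what one would do with removeBatchEffect in R.

```python
import math
from statistics import mean, stdev


def counts_to_tpm(counts, lengths_bp):
    """Convert one sample's raw gene counts to TPM.

    counts:     dict mapping gene -> read count
    lengths_bp: dict mapping gene -> gene length in bp (assumed input)
    """
    # Reads per kilobase: corrects for gene length.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    # Rescale so that the values of one sample sum to one million.
    scale = sum(rpk.values()) / 1e6
    return {g: v / scale for g, v in rpk.items()}


def center_batches(log_values, batches):
    """Remove per-batch shifts for one gene by centering each batch.

    log_values: log-scale expression, one value per sample
    batches:    batch label of each sample, in the same order
    """
    batch_means = {
        b: mean([v for v, lab in zip(log_values, batches) if lab == b])
        for b in set(batches)
    }
    # Unweighted mean of the batch means, so large batches do not dominate.
    grand = mean(list(batch_means.values()))
    return [v - batch_means[b] + grand for v, b in zip(log_values, batches)]


# Toy data: four control samples, two per library protocol (made-up numbers).
lengths = {"geneA": 1000, "geneB": 3000}
samples = {
    "polyA_1": {"geneA": 100, "geneB": 300},
    "polyA_2": {"geneA": 120, "geneB": 280},
    "ribo_1": {"geneA": 50, "geneB": 400},
    "ribo_2": {"geneA": 60, "geneB": 390},
}
batches = ["polyA", "polyA", "ribo", "ribo"]

tpm = {s: counts_to_tpm(c, lengths) for s, c in samples.items()}

for gene in lengths:
    log_tpm = [math.log2(tpm[s][gene] + 1) for s in samples]
    corrected = center_batches(log_tpm, batches)
    print(gene, round(mean(corrected), 3), round(stdev(corrected), 3))
```

Note that the standard deviation computed after centering reflects only within-batch variation. Whether that is the quantity of interest depends on whether the protocol difference is considered noise or signal.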
