Regressing effect of treatment on RNA-seq expected count from rsem.
1
0
Entering edit mode
hrishi27n ▴ 20
@hrishi27n-11821
Last seen 3 months ago
United States

Hello All,

I have RNA-seq data collected by sequencing around 30 individuals(similar phenotypes) and my goal is to  group/cluster these patients based upon their expected counts obtained from rsem.  Most of these patients are on a few different medications and some are untreated, my PCA shows a clear separation between these patients based upon their medication types. I was wondering if there was some way of regressing the medication effect, so that both medicated and unmedicated individuals look relatively consistent. Any input or suggestion is highly appreciated.

Thanks.

RNA rna-seq science bioinformatics • 1.1k views
3
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

You will want to first convert those data to something more amenable (than counts) for clustering. Alternatives include voom or cpm in edgeR, or rlog or vst in DESeq2. I won't go into arguments for or against any of those choices other than to say that they exist.

Once you have converted using the tool of your choice, you could use either removeBatchEffect from limma, or you could use ComBat from sva to regress out the medication effect.

0
Entering edit mode

Thank you James.

0
Entering edit mode

James, so I used both removeBatchEffect and ComBat separately to see what worked better for me, it seems that removeBatchEffect regressed most of the medication effect.  Just to be sure and to see if we can improve this, I am pasting my code snippet below.  For removeBatchEffect is it necessary to provide a design matrix? considering the fact that I don't have any grouping and any other condition worth regressing?

 myDGElist <-DGEList(counts=CountFrame) # CountFrame is my rsem dataFrame
myNormalized <- calcNormFactors(myDGElist)
design <- model.matrix(~1, data=myPheno) # myPheno includes treatment and other information
v <-voom(myNormalized)
treatment <- myPheno$Medications combatProcess <- ComBat(dat=v$E,batch=treatment,design,par.prior=TRUE, prior.plots=FALSE)
usinglimma <- removeBatchEffect(v\$E, treatment)

0
Entering edit mode

From ?removeBatchEffect:

 design: optional design matrix relating to treatment conditions to be
preserved
0
Entering edit mode

James, thank you for responding.

My medication vector is something like below, after running removeBatchEffect it seems from the PCA that the medication effect is gone but the untreated points have also switched a little bit which I think should not have happened. Is there a way to prevent this from happening? (Also, does removeBatchEffect protects the biological variability?)

meds <-c("medication1","medication2",...."untreated","untreated","medicationX","untreated")

0
Entering edit mode

All removeBatchEffect does is regress out the mean expression for each of the batches you specify. It cannot 'protect' anything, as it doesn't know what should or should not be protected. If you have a poorly designed experiment, then removeBatchEffect may do things you don't like, which is why it is important to have a well designed experiment.

You can use a design matrix to attempt to preserve treatment conditions, but if the treatment conditions are confounded with the batch effects, then you have problems that no amount of statistical wizardry will be able to fix.