Question: Using RUVs with known batches
10 weeks ago, Frederik Ziebell (EMBL Heidelberg) wrote:

I have a large bulk RNA-seq data set with ~1500 samples, i.e. ~30 multiplexing runs (library prep and sequencing) covering ~500 different conditions in total, each in triplicate (conditions are somewhat randomized across runs). The ultimate goal is to test all 500 conditions against the wildtype (WT) controls, which are present in many but not all multiplexing runs, while accounting for 1) the batch effect originating from the multiplexing runs and 2) an additional unwanted biological source of variation for which we have negative controls: our cells are haploid by default but tend to become diploid, and this affects even the WT-controls. That's why we also have known haploid WT-controls and known diploid WT-controls.

Can I use RUVs with the known haploid and diploid WT-controls as negative controls to account for the diploidization effect and the multiplexing runs?

My approaches so far:

A) Running RUVs() with 'condition' as the indicator for replicate samples, where the known haploid and diploid WT-controls share the same condition label:

RUVs(x=counts(dds), cIdx=rownames(dds), k=10, scIdx=makeGroups(dds$condition))

This results in the first 8 latent factors being correlated with multiplexing runs, and factors 9 and 10 nicely separating known haploid from known diploid WT-controls.
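
One way to put a number on "correlated with multiplexing runs" is the R² of each latent factor regressed on run. The following is only a sketch on simulated toy data, not the actual data set: `run` and `W` are hypothetical stand-ins for the run annotation and for the factor matrix returned by RUVs().

```r
# Hypothetical toy data: 12 samples in 3 runs, 2 latent factors
set.seed(1)
run <- factor(rep(c("run1", "run2", "run3"), each = 4))
W <- cbind(
  W_1 = as.numeric(run) + rnorm(12, sd = 0.1),  # tracks run by construction
  W_2 = rnorm(12)                               # unrelated to run by construction
)

# R^2 of the one-way model factor ~ run, computed per latent factor
run_r2 <- apply(W, 2, function(w) summary(lm(w ~ run))$r.squared)
round(run_r2, 2)
```

A factor with R² close to 1 is essentially a re-encoding of the run, whereas a low value suggests it captures something else (e.g. the diploidization).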

B) Running RUVs() on batch-corrected vst-transformed data:

vsd <- vst(dds)
# remove the multiplexing-run effect from the vst-transformed data
assay(vsd) <- limma::removeBatchEffect(assay(vsd), batch = vsd$run)
RUVs(x=assay(vsd), cIdx=rownames(vsd), k=10, scIdx=makeGroups(vsd$condition), isLog=TRUE)

Here, the first 6 latent factors separate a few strong phenotypes, while factor 7 captures the diploidization.

What is a good design for differential testing?

  • A) with design ~condition + run + factor9 + factor10
  • A) with design ~condition + factor1 + ... + factor10
  • B) with design ~condition + run + factor7
  • something else
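
To make the options concrete, here is a minimal sketch of how RUV factors can be attached to the sample table and used in a DESeq2 design. It runs on simulated toy data from makeExampleDESeqDataSet(), not on the real experiment; the `run` column, the choice k = 2, and the column names `factor1` and `factor2` are assumptions made purely for illustration.

```r
library(DESeq2)
library(RUVSeq)

# Toy stand-in for the real data: 12 samples, 2 conditions (A, B), 3 runs
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$run <- factor(rep(c("run1", "run2", "run3"), times = 4))

# Estimate k = 2 factors of unwanted variation from the replicate samples
ruv <- RUVs(x = counts(dds), cIdx = rownames(dds), k = 2,
            scIdx = makeGroups(dds$condition))

# Attach the factors as sample-level covariates
dds$factor1 <- ruv$W[, 1]
dds$factor2 <- ruv$W[, 2]

# Known batch plus one selected RUV factor ...
design(dds) <- ~ condition + run + factor2
# ... or RUV factors only, relying on them to capture the batch:
# design(dds) <- ~ condition + factor1 + factor2

dds <- DESeq(dds)
# Name the contrast explicitly, since condition is not the last design term
res <- results(dds, contrast = c("condition", "B", "A"))
```

With the real data, k and the selected factors would follow from approach A) or B) above, and the contrast would compare each condition with the WT-controls.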
Answer: Using RUVs with known batches
10 weeks ago, Michael Love (United States) wrote:

Hi Frederik,

I don't have any specific recommendations from the DESeq2 side on whether it's better to have RUV detect the batches, or to remove them first so that RUV detects the remaining variation in batch-corrected data. In either case, the design will include either batch itself or RUV factors that are highly correlated with batch. Off the top of my head, I can't anticipate how these two approaches would differ in practice.

Given that you are at EMBL Heidelberg, you may want to connect with Wolfgang Huber's group or Bernd Klaus, who can give some pointers on working with large-scale RNA-seq datasets.
