Question

limma + voom combine voomWithQualityWeights and duplicateCorrelation

0

Entering edit mode

Tyler Sagendorf ▴ 10

@df6c68e9

Last seen 12 months ago

United States

Suppose I want to combine voomWithQualityWeights and duplicateCorrelation. Would the modification of voomWithQualityWeights below be the recommended approach? I only added lines for duplicateCorrelation (based on using duplicateCorrelation with limma+voom for RNA-seq data) and included the correlation as part of the output list to be used with lmFit later, but I wasn't sure about which set of weights to pass to each instance of duplicateCorrelation:

The array weights (aw, what I am currently using)
The voom observation-level weights (v$weights, the default when weights = NULL)
The product of the two t(aw * t(v$weights))

Note: voomWithCorrelationAndWeights(..., ndups = 0) (no blocking) is equivalent to voomWithQualityWeights() with the addition of the correlation (value of 0) in the output.

voomWithCorrelationAndWeights <-
  function(counts, design = NULL, lib.size = NULL, normalize.method = "none",
           plot = FALSE, span = 0.5, var.design = NULL, var.group = NULL,
           method = "genebygene", maxiter = 50, tol = 1e-05,
           trace = FALSE, col = NULL,
           # from duplicateCorrelation
           ndups = 2, spacing = 1, trim = 0.15, block = NULL,
           ...)
  {
    if (plot) {
      oldpar <- par(mfrow = c(1, 2))
      on.exit(par(oldpar))
    }

    # Iter 1
    v <- voom(counts, design = design, lib.size = lib.size,
              normalize.method = normalize.method, plot = FALSE, span = span,
              ...)
    aw <- arrayWeights(v, design = design, method = method,
                       maxiter = maxiter, tol = tol,
                       var.design = var.design, var.group = var.group)
    dupecor <- duplicateCorrelation(v, design = design, block = block,
                                    ndups = ndups, spacing = spacing, trim = trim,
                                    weights = aw) # update v$weights or use aw here?

    # Iter 2
    v <- voom(counts, design = design, lib.size = lib.size,
              normalize.method = normalize.method, plot = plot, span = span,
              # Changes from first iteration
              block = block, correlation = dupecor$consensus.correlation,
              weights = aw,
              ...)
    aw <- arrayWeights(v, design = design, method = method,
                       maxiter = maxiter, tol = tol, trace = trace,
                       var.design = var.design, var.group = var.group)
    dupecor <- duplicateCorrelation(v, design = design, block = block,
                                    ndups = ndups, spacing = spacing, trim = trim,
                                    weights = aw) # same question as above

    # Update weights and add correlation to output (needed for lmFit)
    v$weights <- t(aw * t(v$weights))
    v$targets$sample.weights <- aw
    v$correlation <- dupecor$consensus.correlation

    if (plot) {
      barplot(aw, names = 1:length(aw), main = "Sample-specific weights",
              ylab = "Weight", xlab = "Sample", col = col)
      abline(h = 1, col = 2, lty = 2)
    }

    return(v)
  }

Highly simplified version of how this function may be used:

# dge = DGEList object, design = design matrix, block = blocking factor, 
# var.group = each level of this factor is assigned a different sample weight
v <- voomWithCorrelationAndWeights(dge, design = design, 
                                   block = block, var.group = var.group)
fit <- lmFit(v, design = design, block = block, correlation = v$correlation)
fit.smooth <- eBayes(fit)

Edit: added voom tag and changed counts in last code chunk to dge.

voom RNASeq limma • 1.4k views

ADD COMMENT • link 23 months ago Tyler Sagendorf ▴ 10

score 2 · Accepted Answer · 2023-11-16

2

Entering edit mode

Gordon Smyth 53k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

Just use edgeR::voomLmFit() with sample.weights=TRUE and block=yourblockfactor and it will do it all for you. It does what you're trying to implement, and in addition takes account of any groups with all-zero counts. The function will also accept var.group or var.design instead of sample.weights.

The reason why voomLmFit is part of edgeR rather than part of limma is because it needs to use edgeR functionality to identify the counts with exact zero fitted values.

ADD COMMENT • link 23 months ago Gordon Smyth 53k

0

Entering edit mode

Thanks, Gordon! Exactly what I was looking for. I'm putting together a series of wrapper functions for some limma proteomics data analysis pipelines that my workplace commonly uses or is likely to use in the future, and this will simplify my (probably over-engineered) code quite a bit.

ADD REPLY • link 23 months ago Tyler Sagendorf ▴ 10

0

Entering edit mode

What sort of proteomics data?

Please be mindful that voom is designed for sequencing data and for RNA-seq in particular. We have never tested voom on proteomics data and I would not expect it to be ideal for that sort of data.

ADD REPLY • link 23 months ago Gordon Smyth 53k

0

Entering edit mode

I apologize for the confusion. I meant that the wrapper functions will mainly handle TMT proteomics data, since that is what myself and colleagues encounter most often, but it will also include the ability to analyze RNA-Seq data. It will be similar to what I've done in this limma_full function, but only the MArrayLM object from eBayes will be output, and I will be including more robust checks for user input, among other improvements. The portion of the code that analyzes RNA-Seq data follows what is outlined in Law et al. (2018), but I will switch to using voomLmFit.

ADD REPLY • link 23 months ago Tyler Sagendorf ▴ 10