Question

RUV4 for microarrays

Antonio • 0

@f228e3c4

Spain

Hi everyone, I am trying to remove an unknown batch effect. After reading several articles and posts, I found RUV (Remove Unwanted Variation). I first used the RUVnormalize package with 11 housekeeping genes as negative controls, but the batch effect remained. I then turned to the ruv package and its RUV4 function, since I could derive control genes from my own data and use them for the correction.

However, RUV4 doesn't return an adjusted data matrix; instead, it returns a list of different values. I don't know how to obtain the adjusted data matrix needed for downstream analysis such as DE.

Can anyone help me?


# Log transformation
exprlog <- log2(exprbygene)


res <- diffExprAnalysis(dat = exprlog,ed = ed,condition = "title")
sum(res$`CONTROL-RIF`$adj.P.Val>0.999) # 30

control_genes <- rownames(res$`CONTROL-RIF`[res$`CONTROL-RIF`$adj.P.Val>0.999,])

# Normalization with RUV

cIdx <- which(rownames(exprlog)%in%control_genes)
k <- 3

design <- model.matrix(~0 + ed$title)

nsY <- RUV4(Y = t(exprlog), ctl = cIdx, X = ed$title, k = 1)

RUV4 RUV Microarray • 1.3k views


score 1 · Answer 1 · 2021-04-29

The ruv package isn't a Bioconductor package, but maybe it's Bioconductor-adjacent? Technically since it's a CRAN package you should be asking on R-help or biostars or whatever. Anyway, this is an example of the need to perform a close reading of the help pages for any package you might want to use - the information is there, but it's usually pretty terse and every word counts.

You are asking how to get the adjusted data matrix for downstream analysis, while apparently not understanding that RUV4 is carrying out the analysis for you. From ?RUV4:

Arguments:

       Y: The data.  A m by n matrix, where m is the number of samples
          and n is the number of features.

       X: The factor(s) of interest.  A m by p matrix, where m is the
          number of samples and p is the number of factors of interest.
          Very often p = 1.  Factors and dataframes are also
          permissible, and converted to a matrix by 'design.matrix'.

## and further down

Details:

     Implements the RUV-4 algorithm as described in Gagnon-Bartsch,
     Jacob, and Speed (2013), using the SVD as the factor analysis
     routine.  Unwanted factors W are estimated using control genes.  Y
     is then regressed on the variables X, Z, and W.

This pretty clearly states that the function does the regression for you. That said, ruv could use a vignette, because the object RUV4 returns isn't very useful on its own. If you look at ?ruv_summary it becomes somewhat clearer:

RUV Summary

Description:

     Post-process and summarize the results of call to RUV2, RUV4,
     RUVinv, or RUVrinv.

Usage:

     ruv_summary(Y, fit, rowinfo=NULL, colinfo=NULL, colsubset=NULL, sort.by="F.p", 
                 var.type=c("ebayes", "standard", "pooled"),
                 p.type=c("standard", "rsvar", "evar"), min.p.cutoff=10e-25)

## and further down

Details:

     This function post-processes the results of a call to
     RUV2/4/inv/rinv and then nicely summarizes the output.  The
     post-processing step primarily consists of a call to
     variance_adjust, which computes various adjustments to variances,
     t-statistics, and p-values.  See variance_adjust for details.
     The 'var.type' and 'p.type' options determine which of these
     adjustments are used.  An additional post-processing step is that
     the column means of the 'Y' matrix are computed, both before and
     after the call to 'RUV1' (if 'eta' was specified).

     After post-processing, the results are summarized into a list
     containing 4 objects: 1) the data matrix 'Y'; 2) a dataframe 'R'
     containing information about the rows (samples); 3) a dataframe
     'C' containing information about the columns (features, e.g.
     genes), and 4) a list 'misc' of other information returned by
     RUV2/4/inv/rinv.
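To make the two help pages above concrete, here is a minimal sketch of the RUV4 followed by ruv_summary workflow on simulated data. Everything in it is made up for illustration: the matrix sizes, the 0/1 coding of the factor of interest, and the choice of the last 50 genes as negative controls are all assumptions, not anything taken from the original post. The point is only that Y goes in with samples as rows, and that the per-gene results end up in the 'C' element of the summary.

```r
library(ruv)

set.seed(1)
m <- 12   # number of samples
n <- 200  # number of features (genes)

# Simulated log-expression matrix with samples in rows, as ?RUV4 asks for
Y <- matrix(rnorm(m * n), nrow = m, ncol = n,
            dimnames = list(paste0("s", 1:m), paste0("g", 1:n)))

# Factor of interest: two groups of six samples, coded 0/1 (an assumption)
X <- matrix(rep(c(0, 1), each = m / 2), ncol = 1)

# Hypothetical negative controls: simply the last 50 genes here
ctl <- rep(FALSE, n)
ctl[151:200] <- TRUE

# RUV4 estimates the unwanted factor(s) and does the regression in one go
fit <- RUV4(Y = Y, X = X, ctl = ctl, k = 1)

# Post-process into the list described in ?ruv_summary
summ <- ruv_summary(Y, fit)

# 'C' holds the per-gene (column) results
head(summ$C)
```

With real data you would pass the transposed expression matrix (genes are usually rows in microarray objects) and your own control index, and then rank or filter genes from summ$C rather than running a separate DE step.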

There are other functions in ruv that are presumably useful for doing things, but I leave it to you to do your own further exploration.
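If it helps to see what "estimate W from control genes, then regress Y on X and W" means, here is a base R sketch on simulated data. To be clear about the assumptions: this is a simplified RUV-2-style scheme (SVD on the control genes only), not the exact RUV-4 estimator that the ruv package implements, and all names and simulated values are hypothetical. It also shows how an "adjusted" matrix could be formed by subtracting the fitted W component. My understanding is that fitting X and W together is generally the safer route than running a separate DE analysis on such an adjusted matrix, so treat the last step as something for plots and diagnostics.

```r
set.seed(42)
m <- 10   # samples
n <- 100  # genes
k <- 1    # number of unwanted factors to estimate

x <- rep(c(0, 1), each = m / 2)    # factor of interest
w_true <- rnorm(m)                 # hidden batch-like factor
alpha_true <- rnorm(n, sd = 2)     # per-gene response to the hidden factor

# Samples in rows, genes in columns; only genes 1-20 respond to x
beta <- c(rep(1.5, 20), rep(0, n - 20))
Y <- outer(x, beta) + outer(w_true, alpha_true) +
  matrix(rnorm(m * n, sd = 0.3), m, n)

ctl <- 81:100                      # hypothetical negative controls

# Step 1: estimate W from the centered control genes only, via SVD
Yc <- scale(Y[, ctl], center = TRUE, scale = FALSE)
W <- svd(Yc)$u[, 1:k, drop = FALSE]

# Step 2: regress every gene on intercept, x and W in one multivariate fit
fit <- lm(Y ~ x + W)
coefs <- coef(fit)                 # one column of coefficients per gene

# Step 3 (optional): remove only the fitted W component from the data
w_rows <- grep("^W", rownames(coefs))
Y_adj <- Y - W %*% coefs[w_rows, , drop = FALSE]
```

The coefficients in the "x" row of coefs are the effect estimates after accounting for W, which is the same kind of quantity RUV4 reports, while Y_adj is the sort of matrix the question was asking for.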

score 0 · Answer 2 · 2021-05-05
