Question

Obtaining clean data after adjusting for batch effects using SVA

Momeneh Foroutan ▴ 10

@momeneh-foroutan-7398

Last seen 8.2 years ago

Australia

Hi all (and Andrew Jaffe),

I know there is a related post here, "Back-estimating batch variables from SVA for ComBat?", but my question is about the answer Andrew gave to that post.

Andrew kindly suggested using the function below to obtain clean data adjusted for surrogate variables:

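cleaningY = function(y, mod, svaobj) {
  X = cbind(mod, svaobj$sv)           # full design: model terms plus surrogate variables
  Hat = solve(t(X) %*% X) %*% t(X)    # least-squares operator (X'X)^-1 X'
  beta = Hat %*% t(y)                 # coefficients for every gene
  P = ncol(mod)                       # number of model columns to keep
  # subtract only the part of the fit explained by the surrogate variables
  cleany = y - t(as.matrix(X[, -c(1:P)]) %*% beta[-c(1:P), ])
  return(cleany)
}
# and implement it like this:
mod = model.matrix(~[whatever your model is]) # specify the model
svaobj = sva(y, mod) # y is your expression matrix
cleany = cleaningY(y, mod, svaobj)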
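
To make clear what this function does, here is a minimal sketch in base R on simulated data. Everything apart from cleaningY itself is an assumption for illustration: the data are random, and `fake_svaobj` is a hypothetical stand-in for the object returned by sva(), holding two made-up surrogate variables. The check refits the full design on the cleaned data and confirms that the surrogate-variable coefficients vanish while the model coefficients are unchanged.

```r
# cleaningY as quoted above, only tidied
cleaningY <- function(y, mod, svaobj) {
  X <- cbind(mod, svaobj$sv)
  Hat <- solve(t(X) %*% X) %*% t(X)
  beta <- Hat %*% t(y)
  P <- ncol(mod)
  y - t(as.matrix(X[, -c(1:P)]) %*% beta[-c(1:P), ])
}

set.seed(42)
n_genes <- 200
n_samples <- 12

# Assumed two-group design
group <- factor(rep(c("ctrl", "trt"), each = n_samples / 2))
mod <- model.matrix(~ group)

# Hypothetical stand-in for svaobj$sv: two random "surrogate variables"
sv <- matrix(rnorm(n_samples * 2), ncol = 2)
fake_svaobj <- list(sv = sv)

# Simulated expression matrix (genes x samples) with a group effect
# and an unwanted effect driven by the first surrogate variable
y <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
y <- y + outer(rnorm(n_genes), as.numeric(group) - 1)
y <- y + outer(rnorm(n_genes), sv[, 1])

cleany <- cleaningY(y, mod, fake_svaobj)

# Refit the full design before and after cleaning
X <- cbind(mod, sv)
P <- ncol(mod)
beta_before <- solve(crossprod(X), crossprod(X, t(y)))
beta_after <- solve(crossprod(X), crossprod(X, t(cleany)))

# Surrogate-variable coefficients are expected to be numerically zero
max(abs(beta_after[-(1:P), ]))

# Model (group) coefficients are expected to be preserved
all.equal(beta_before[1:P, ], beta_after[1:P, ])
```

So the function only removes the fitted surrogate-variable component; which surrogate variables get removed depends entirely on how sva() was called.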

So my question is about the sva() call in the example above. Why did he not pass mod0 and n.sv to sva() when generating svaobj? It makes a huge difference for my data set. The num.sv() function estimated two surrogate variables for my data, so I suppose I should run sva() this way:

svobj = sva(m, mod, mod0, n.sv = n.sv)   ## m is the expression matrix

Whereas if I run sva() without specifying mod0 and n.sv, it gives me 77 surrogate variables! Don't we have to supply mod0, since we need a null model to compare against the model matrix used to fit the data?
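
For reference, the explicit call can be sketched end to end as follows. This is only an illustration under stated assumptions: the expression matrix `m`, the `pheno` table and the `hidden` factor are simulated stand-ins, not real data, and the sva calls are skipped when the sva package is not installed. Both `mod0` and `n.sv` default to NULL in sva(), so leaving them out means the number of surrogate variables is estimated inside sva() rather than taken from your own num.sv() call.

```r
set.seed(1)
n <- 20

# Assumed phenotype table with one variable of interest
pheno <- data.frame(group = factor(rep(c("A", "B"), each = n / 2)))

# Simulated expression matrix: 500 genes x 20 samples (stand-in data)
m <- matrix(rnorm(500 * n), nrow = 500)

# Hypothetical unmodelled factor affecting the first 150 genes
hidden <- rep(c(0, 1), times = n / 2)
m[1:150, ] <- m[1:150, ] + outer(rnorm(150, sd = 2), hidden)

# Group effect on genes 101 to 200
m[101:200, ] <- m[101:200, ] +
  outer(rnorm(100, sd = 2), as.numeric(pheno$group) - 1)

# Full model (variable of interest) and null model (intercept only)
mod <- model.matrix(~ group, data = pheno)
mod0 <- model.matrix(~ 1, data = pheno)

svobj <- NULL
if (requireNamespace("sva", quietly = TRUE)) {
  # Estimate the number of surrogate variables explicitly ...
  n.sv <- sva::num.sv(m, mod, method = "be")
  # ... and pass both the null model and that number to sva()
  if (n.sv > 0) {
    svobj <- sva::sva(m, mod, mod0, n.sv = n.sv)
  }
}
```

If num.sv() and a bare sva(m, mod) call disagree on the number of surrogate variables, one thing worth checking is whether num.sv() was run with a different `method` ("be" or "leek") than the `numSVmethod` that sva() uses internally. That is a guess at a possible cause, not a confirmed explanation for the 2 versus 77 seen here.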

Thanks in advance for any explanation.

8.2 years ago, Momeneh Foroutan

Hi,

I have the exact same question. Any updates on this?

Thanks.

6.1 years ago, wamiqsaifi