Problem with sva() probably stemming from model.matrix()
0
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Hi, I am trying to do surrogate variable analysis. I strongly suspect that I have not set the model matrices up correctly with model.matrix(), but I can't see what I am doing wrong. My code looks like this : #!/usr/bin/Rscript library(Biobase) library(sva) library(limma) library(bladderbatch) library(lumi) # ICSfc - gender, ethnicity and smoking as covariates in SVA ################################################# setwd('/Volumes/cuChangeDS2413/Data/Methylation/scratch') exprsFile <- "testMQs.csv" pDataFile <- "Alcohol27k_new_n309rev2.csv" outFile <- "Alcohol27k_n309icsfc_genderethnicity_bcdata13.csv" pdfFile <- "batchCorrect_icsfc_genderethnicity13.pdf" options(warn=1) options(showNcalls=400) # Read the data. exprs <- as.matrix (read.table(file=exprsFile, header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE, row.names=1, as.is=TRUE)) pData <- read.table(pDataFile, header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE, row.names=1, as.is=TRUE) mod <- model.matrix(~as.integer(icsfc), data=pData ) print(quote=F,"MODEL mod :") head(mod) Nsv = num.sv(exprs, mod,method="leek") # This is the null model. We are including the adjustment variables (covariates), namely ethnicity, gender and smoking print("MODEL mod0 :") mod0 = model.matrix(~as.factor(ethnic) + ~as.factor(gender) + ~as.integer(tlfb_avgc), data=pData ) head(mod0) svobj = sva(exprs, mod, mod0, method="irw", n.sv=Nsv) ------- And the output I get is : [1] MODEL mod : (Intercept) as.integer(icsfc) M87101936 1 31 M87101983 1 30 M87102549 1 20 M87102857 1 33 M87103240 1 16 M87103605 1 33 [1] "MODEL mod0 :" (Intercept) as.factor(ethnic)2 as.factor(ethnic)3 as.factor(ethnic)4 M87101936 1 0 0 1 M87101983 1 0 0 1 M87102549 1 0 0 0 M87102857 1 0 0 0 M87103240 1 0 0 0 M87103605 1 0 0 0 as.factor(ethnic)5 as.factor(ethnic)6 as.factor(gender)2 M87101936 0 0 0 M87101983 0 0 1 M87102549 0 0 0 M87102857 0 1 0 M87103240 0 0 0 M87103605 0 1 0 as.integer(tlfb_avgc) M87101936 0 M87101983 0 M87102549 7 M87102857 0 M87103240 8 M87103605 3 Number of significant surrogate variables is: 36 Iteration (out of 5 ):Warning in pf(q, df1, df2, lower.tail, log.p) : NaNs produced Error in density.default(x, adjust = adj) : 'x' contains missing values Calls: sva ... irwsva.build -> edge.lfdr -> density -> density.default Execution halted -- As you can see, it crashes when it gets to the call to sva(). I'm thinking I have not set up the models correctly, can anyone see what is wrong? I apologize as I'm quite new to this part of R. If anyone could suggest a good place to read about model.matrix(), that might help, too. Thanks, Niles Oien, University of Colorado at Boulder -- output of sessionInfo(): [1] MODEL mod : (Intercept) as.integer(icsfc) M87101936 1 31 M87101983 1 30 M87102549 1 20 M87102857 1 33 M87103240 1 16 M87103605 1 33 [1] "MODEL mod0 :" (Intercept) as.factor(ethnic)2 as.factor(ethnic)3 as.factor(ethnic)4 M87101936 1 0 0 1 M87101983 1 0 0 1 M87102549 1 0 0 0 M87102857 1 0 0 0 M87103240 1 0 0 0 M87103605 1 0 0 0 as.factor(ethnic)5 as.factor(ethnic)6 as.factor(gender)2 M87101936 0 0 0 M87101983 0 0 1 M87102549 0 0 0 M87102857 0 1 0 M87103240 0 0 0 M87103605 0 1 0 as.integer(tlfb_avgc) M87101936 0 M87101983 0 M87102549 7 M87102857 0 M87103240 8 M87103605 3 Number of significant surrogate variables is: 36 Iteration (out of 5 ):Warning in pf(q, df1, df2, lower.tail, log.p) : NaNs produced Error in density.default(x, adjust = adj) : 'x' contains missing values Calls: sva ... irwsva.build -> edge.lfdr -> density -> density.default Execution halted -- Sent via the guest posting facility at bioconductor.org.
sva sva • 2.3k views
ADD COMMENT

Login before adding your answer.

Traffic: 973 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6