Has anyone successfully used sva packages?
1
0
Entering edit mode
@fabrice-tourre-4394
Last seen 10.2 years ago
Dear list, Fellow the paper "Tackling the widespread and critical impact of batch effects in high-throughput data" and the code at http://rafalab.jhsph.edu/batch/, I used sva in my analysis. But I always get more than 90% of gene with p.adjust<0.05 (compared in two cell line). When I try limma package without sva, it give me about 3000 genes with p.adjust<0.05. It seems limma is reasonable. But it seems that limma can only remove tow batch each time using removeBatchEffect. removeBatchEffect(x,batch,batch2=NULL,design=matrix(1,ncol(x),1)) Does anyone have suggestions? library(preprocessCore) #cel=read.csv("celfies/cel_files.txt",header=F,skip=1) tab=read.table("output/ExpressionInput/groups.dis.txt",as.is=TRUE) mat<-read.table("gene/apt/rma-sketch.summary.txt",check.names = FALSE,header=T,row.names=1) colnames(mat)->c rownames(mat)->r merge(c,tab,by.y=1,by.x=1)->b tab<-b mat<-normalize.quantiles(as.matrix(mat)) colnames(mat)<-c rownames(mat)<-r dates=vector("character",ncol(mat)) for(i in seq(along=dates)){ tmp=affyio::read.celfile.header(file.path("celfies",tab[i,1]), info="full")$ScanDate dates[i]=strsplit(tmp,"T")[[1]][1] } dates=as.Date(dates) batch=as.numeric(dates-min(dates)) batch=as.numeric(cut(batch,c(unique(batch),1))) batch[is.na(batch)]=1 library(corpcor) library(affy) library("sva") library(limma) f<-factor(tab[,3]) mod = model.matrix(~f) mod0=model.matrix(~1,data=tab) colnames(mod)<-levels(f) n.sv = num.sv(mat,mod,method="leek") svobj = sva(mat,mod,mod0,n.sv=n.sv) modSv = cbind(mod,svobj$sv) mod0Sv = cbind(mod0,svobj$sv) pValuesSv = f.pvalue(mat,modSv,mod0Sv) qValuesSv = p.adjust(pValuesSv,method="BH") fit = lmFit(edata,modSv) contrast.matrix<- ???? How to construct this matrix? fitContrasts = contrasts.fit(fit,contrast.matrix) eb = eBayes(fitContrasts) topTableF(eb, adjust="BH") head(tab) x V2 V3 1 2A6_Hex_07Jul_09.CEL 1 dis 2 2A7_Hex_07Jul_09.CEL 1 dis 3 2A8_Hex_07Jul_09.CEL 1 dis 4 2A9_Hex_07Jul_09.CEL 1 norm 5 2AA_Hex_07Jul_09.CEL 1 norm 6 2AB_Hex_07Jul_09.CEL 1 norm
limma sva limma sva • 1.7k views
ADD COMMENT
0
Entering edit mode
@jhuatgenorg-5011
Last seen 10.2 years ago
Depending on the cell lines you are using, two cell lines could be very different. Can you do some test on each batch (without any batch effect removal), and see how many differentially expressed genes? Also I assume you have similar proportion of two cell lines each every batch, i.e., fully random design. No batch effects removal is perfect, you should minimize the effects by randomization. Jianping > ------------------------------ > > Message: 10 > Date: Thu, 15 Dec 2011 15:53:47 +0800 > From: Fabrice Tourre <fabrice.ciup at="" gmail.com=""> > To: Bioconductor mailing list <bioconductor at="" stat.math.ethz.ch=""> > Subject: [BioC] Has anyone successfully used sva packages? > Message-ID: > <can31xkcf+f0qxnof0hwqxlanvotpxhhc=g03jbhwdm2vjqp8nw at="" mail.gmail.com=""> > Content-Type: text/plain; charset="ISO-8859-1" > > Dear list, > > Fellow the paper "Tackling the widespread and critical impact of batch > effects in high-throughput data" and the code at > http://rafalab.jhsph.edu/batch/, I used sva in my analysis. But I > always get more than 90% of gene with p.adjust<0.05 (compared in two > cell line). When I try limma package without sva, it give me about > 3000 genes with p.adjust<0.05. It seems limma is reasonable. But it > seems that limma can only remove tow batch each time using > removeBatchEffect. > > removeBatchEffect(x,batch,batch2=NULL,design=matrix(1,ncol(x),1)) > > Does anyone have suggestions? > > library(preprocessCore) > #cel=read.csv("celfies/cel_files.txt",header=F,skip=1) > tab=read.table("output/ExpressionInput/groups.dis.txt",as.is=TRUE) > mat<-read.table("gene/apt/rma-sketch.summary.txt",check.names = > FALSE,header=T,row.names=1) > colnames(mat)->c > rownames(mat)->r > merge(c,tab,by.y=1,by.x=1)->b > tab<-b > mat<-normalize.quantiles(as.matrix(mat)) > colnames(mat)<-c > rownames(mat)<-r > dates=vector("character",ncol(mat)) > for(i in seq(along=dates)){ > tmp=affyio::read.celfile.header(file.path("celfies",tab[i,1]), info="full")$ScanDate > dates[i]=strsplit(tmp,"T")[[1]][1] > } > dates=as.Date(dates) > batch=as.numeric(dates-min(dates)) > batch=as.numeric(cut(batch,c(unique(batch),1))) > batch[is.na(batch)]=1 > > library(corpcor) > library(affy) > library("sva") > library(limma) > f<-factor(tab[,3]) > mod = model.matrix(~f) > mod0=model.matrix(~1,data=tab) > colnames(mod)<-levels(f) > n.sv = num.sv(mat,mod,method="leek") > svobj = sva(mat,mod,mod0,n.sv=n.sv) > modSv = cbind(mod,svobj$sv) > mod0Sv = cbind(mod0,svobj$sv) > pValuesSv = f.pvalue(mat,modSv,mod0Sv) > qValuesSv = p.adjust(pValuesSv,method="BH") > > fit = lmFit(edata,modSv) > contrast.matrix<- ???? How to construct this matrix? > fitContrasts = contrasts.fit(fit,contrast.matrix) > eb = eBayes(fitContrasts) > topTableF(eb, adjust="BH") > > > > > head(tab) > x V2 V3 > 1 2A6_Hex_07Jul_09.CEL 1 dis > 2 2A7_Hex_07Jul_09.CEL 1 dis > 3 2A8_Hex_07Jul_09.CEL 1 dis > 4 2A9_Hex_07Jul_09.CEL 1 norm > 5 2AA_Hex_07Jul_09.CEL 1 norm > 6 2AB_Hex_07Jul_09.CEL 1 norm > > > > ------------------------------ > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > End of Bioconductor Digest, Vol 106, Issue 14 > *********************************************
ADD COMMENT

Login before adding your answer.

Traffic: 859 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6