DESeq (normalize using all samples?)
Guest User ★ 13k
@guest-user-4897
I have a file of read count values for eight samples (A1, A2, B1, B2, C1, C2, D1, D2), and I want to find the differentially expressed genes between A and B. Normally, I would extract samples A1, A2, B1, and B2 from the file. Instead, I used all samples to normalize the read counts and fit the model, and I found more DE genes. Is my code correct, and why does this happen? I find 322 genes using all samples, but only 77 genes using only the A and B samples.

-- output of sessionInfo():

### each: analysis with specific columns ###
library('DESeq')
x <- read.delim("readcount.xls", row.names=1)
x <- round(x[,1:4])
group <- factor(c("A","A","B","B"))
cds <- newCountDataSet(x, group)
cds <- estimateSizeFactors(cds)
cds <- estimateDispersions(cds)
res <- nbinomTest(cds, 'A', 'B')
a <- subset(res, padj < 0.05)
dim(a)
write.table(a[,1], "each.txt", quote=F, col.names=F, row.names=F)

### union: analysis with all columns ###
library('DESeq')
x <- read.delim("readcount.xls", row.names=1)
x <- round(x)
group <- c("A","A","B","B","C","C","D","D")
cds <- newCountDataSet(x, group)
cds <- estimateSizeFactors(cds)
cds <- estimateDispersions(cds)
res <- nbinomTest(cds, 'A', 'B')
a <- subset(res, padj < 0.05)
dim(a)
write.table(a[,1], "union.txt", quote=F, col.names=F, row.names=F)
@mikelove
Hi Hui Zhao,

I can explain why this happens, but it's hard to say which result is 'true'.

Including the other samples affects the number of genes passing an FDR threshold mostly through the estimation of dispersion. If the other samples tend to have counts with small within-group variance, then the estimate of dispersion for each gene will be reduced. If the other samples had larger within-group variance, you would expect the opposite effect: higher estimates of dispersion and fewer genes passing an FDR threshold.

We recommend using all samples to estimate the dispersion, as more samples generally reduce the variance of estimators. Note, though, that the model assumes the dispersion parameter for a given gene is the same across the groups.

Mike

On Sun, Dec 8, 2013 at 10:38 PM, Hui Zhao [guest] <guest@bioconductor.org> wrote:
> I have a file about readcount values with eight
> samples(A1,A2,B1,B2,C1,C2,D1,D2),I want to know the differential genes
> between A and B. [...] In this part I find 322 genes using all samples,
> while I find 77 genes using specific samples.
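The point about more samples reducing the variance of the dispersion estimate can be illustrated with a small simulation. This is only a sketch in base R, not DESeq's actual estimator: it uses a simple method-of-moments estimate, and the names `mom_dispersion`, `alpha`, and `mu`, as well as all simulated values, are hypothetical. It assumes every group shares the same true dispersion and the same mean, which is the favourable case described above.

```r
# Sketch only: a method-of-moments stand-in, NOT the estimator used by DESeq.
set.seed(1)
n_genes <- 2000
alpha   <- 0.1                                    # assumed true dispersion, equal in all groups
mu      <- c(A = 100, B = 100, C = 100, D = 100)  # hypothetical group means (no true DE)

# Two replicates per group, giving columns A, A, B, B, C, C, D, D.
group  <- rep(names(mu), each = 2)
counts <- sapply(group, function(g) rnbinom(n_genes, mu = mu[[g]], size = 1 / alpha))

# Pool the within-group variances, then solve var = m + alpha * m^2 for alpha.
mom_dispersion <- function(counts, group) {
  groups <- unique(group)
  # Per-gene sample variance within each group (one column per group).
  v <- sapply(groups, function(g) apply(counts[, group == g, drop = FALSE], 1, var))
  pooled_var <- rowMeans(v)   # groups have equal size, so a plain mean pools them
  m <- rowMeans(counts)       # valid here because all group means are equal
  (pooled_var - m) / m^2
}

est_ab  <- mom_dispersion(counts[, 1:4], group[1:4])  # A and B samples only
est_all <- mom_dispersion(counts, group)              # all eight samples

# Spread of the per-gene estimates around the true value in each setting.
print(c(sd_AB_only = sd(est_ab), sd_all = sd(est_all)))
```

In this setup the per-gene estimates from all eight samples should scatter less around the true value than those from the four A and B samples, because the pooled variance rests on four residual degrees of freedom instead of two. If groups C and D were simulated with a different dispersion, the pooled estimate for the A vs B comparison would be pulled up or down accordingly, which is the caveat about the shared-dispersion assumption.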