DESeq Question
1
0
Entering edit mode
OD ▴ 40
@od-5180
Last seen 8.8 years ago
United States
Hello All, I am trying to use DESeq, a sample of my data looks like below (X.1 & X.2 are replicates, Y1 & Y2 are replicates). *GeneNo X.1 X.2 Y.1 Y.2* gene00001 4 3 3 1 gene00002 17 45 39 30 gene00003 25 40 29 24 gene00004 8 8 9 3 gene00009 1 0 0 0 gene00011 1 2 4 2 gene00012 0 0 0 0 gene00014 17 7 8 21 I used the following commands to find the differentially expressed genes. I have two questions, 1. Are the following commands correct ? or am i missing something? 2. How can i determine which genes are differentially expressed? induced? suppressed? unchanged? library( "DESeq" ) exampleFile = system.file( "extra/X_Y_Reads.txt", package="DESeq" ) countsTable <- read.delim( exampleFile, header=TRUE, stringsAsFactors=TRUE ) rownames( countsTable ) <- countsTable$gene countsTable <- countsTable[ , -1 ] conds <- c( "X","X","Y","Y") cds <- newCountDataSet( countsTable, conds ) libsizes <- clibsizes <- c(X.1=18143070, X.2=13544150, Y.1=17853990, Y.2=15038501) sizeFactors(cds) <- libsizes[-1] cds <- estimateSizeFactors( cds ) sizeFactors( cds ) cds <- estimateDispersions( cds ) res <- nbinomTest( cds, "X", "Y" ) plot(res$baseMean,res$log2FoldChange,log="x", pch=20, cex=.1,col = ifelse( res$padj < .001, "red", "black" ) ) resSig <- res[ res$padj < .001, ] head( resSig[ order(resSig$pval), ] ) write.table(res,"C:\\Users\\OMD\\Desktop\\XY_DEG.txt") -- Thank you, [[alternative HTML version deleted]]
DESeq DESeq • 1.2k views
ADD COMMENT
0
Entering edit mode
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.7 years ago
Zentrum für Molekularbiologie, Universi…
Hi On 03/21/2012 12:26 AM, Omar Darwish wrote: > I used the following commands to find the differentially expressed genes. > I have two questions, > > 1. Are the following commands correct ? or am i missing something? Yes, except for two oddities: > libsizes<- clibsizes<- c(X.1=18143070, X.2=13544150, Y.1=17853990, > Y.2=15038501) > sizeFactors(cds)<- libsizes[-1] > cds<- estimateSizeFactors( cds ) Why do you set the size factors manually, and the use 'estimateSizefactors' to overwrite this information? The first two commands are gratuitous (and don't even work becaus of the "[-1]"). Omit them. > resSig<- res[ res$padj< .001, ] Do your really want an FDR of only 0.1%? This is extremely stringent. Do you have any reason for this unusual choice? > 2. How can i determine which genes are differentially expressed? > induced? suppressed? unchanged? Those with and adjusted p value (column "padj") below your chosen FDR threshold are significant, and the column "log2FoldChange" tells you the log2 fold change and hence whether they are up- or downregulated. Simon
ADD COMMENT
0
Entering edit mode
Thanks Simon, Well, so i omitted the two commands you mentioned, my commands now: library( "DESeq" ) exampleFile = system.file( "extra/X_Y_Reads.txt", package="DESeq" ) countsTable <- read.delim( exampleFile, header=TRUE, stringsAsFactors=TRUE ) rownames( countsTable ) <- countsTable$gene countsTable <- countsTable[ , -1 ] conds <- c( "X","X","Y","Y") cds <- newCountDataSet( countsTable, conds ) cds <- estimateSizeFactors( cds ) sizeFactors( cds ) cds <- estimateDispersions( cds ) res <- nbinomTest( cds, "X", "Y" ) plot(res$baseMean,res$log2FoldChange,log="x", pch=20, cex=.1,col = ifelse( res$padj < .001, "red", "black" ) ) resSig <- res[ res$padj < .001, ] head( resSig[ order(resSig$pval), ] ) write.table(res,"C:\\Users\\OMD\\Desktop\\XY_DEG.txt") Regarding the FDR, honestly I'm not really sure what is the best cut off value to use, I picked 0.001 based on what I saw in one of the examples published in the DESeq paper! Any recommendations! I have one more question, in the output file i can see only integer ids (1,2,3,4 ... etc) instead of the (gene0001, gene0023 ... etc ), I am not sure if these integers match the gene numbers as in my input Regards, On Wed, Mar 21, 2012 at 10:18 AM, Simon Anders <anders@embl.de> wrote: > Hi > > > On 03/21/2012 12:26 AM, Omar Darwish wrote: > >> I used the following commands to find the differentially expressed genes. >> I have two questions, >> >> 1. Are the following commands correct ? or am i missing something? >> > > Yes, except for two oddities: > > > > libsizes<- clibsizes<- c(X.1=18143070, X.2=13544150, Y.1=17853990, > > Y.2=15038501) > > sizeFactors(cds)<- libsizes[-1] > > cds<- estimateSizeFactors( cds ) > > Why do you set the size factors manually, and the use > 'estimateSizefactors' to overwrite this information? The first two commands > are gratuitous (and don't even work becaus of the "[-1]"). Omit them. > > > > resSig<- res[ res$padj< .001, ] > > Do your really want an FDR of only 0.1%? This is extremely stringent. Do > you have any reason for this unusual choice? > > 2. How can i determine which genes are differentially expressed? >> induced? suppressed? unchanged? >> > > Those with and adjusted p value (column "padj") below your chosen FDR > threshold are significant, and the column "log2FoldChange" tells you the > log2 fold change and hence whether they are up- or downregulated. > > Simon > > ______________________________**_________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.et="" hz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: http://news.gmane.org/gmane.** > science.biology.informatics.**conductor<http: news.gmane.org="" gmane.="" science.biology.informatics.conductor=""> > -- Regards, Omar [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 801 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6