estimation of size factors in DESeq2 analysis
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Germany
Hi all,

in relation to my mail from January this year, I followed Simon's advice to do my analyses in DESeq2 instead of DESeq.

I am working on an RNA-Seq data set from C. elegans. I mapped the data against the Ensembl genome build WBcel215, using tophat2 to map and featureCounts to count the reads (both with default parameters).

I have two conditions, control and a knock-out, with three replicates each. Now I am trying to find differentially regulated genes between the two conditions using DESeq2.

This is the script I am using to read my raw count table into DESeq2:

    featureCountTable <- read.table("featureCountTable_RawCounts.txt", sep="\t", quote=F)

    colData <- data.frame(row.names=names(featureCountTable),
                          condition = c(rep("wt",3), rep("cpb3", 3)))

    cds <- DESeqDataSetFromMatrix (
        countData = featureCountTable,
        colData = colData,
        design = ~ condition
    )

    fit = DESeq(cds)
    res = results(fit)

But I am getting the same problem with DESeq2 as I had with DESeq. When I run the DESeq command I get these warnings:

    Warning messages:
    1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
    2: step size truncated due to divergence

So again I tried changing the fitType:

    fit = DESeq(cds, fitType="local")

This then came back without any warnings.

The two dispersion plots can be found here <http://s23.postimg.org/pvopnmtxj/desesq2_local.png> (local fit) and here <http://s23.postimg.org/uk4pitj47/desesq2_parametric.png> (default/parametric fit). The red line goes through the point cloud in both cases (which is how Simon defined a good fit in the last communication; I wish it were that easy :-) ).

With the local fit type there are more outliers, and the right end of the fitted curve goes up again. I am not sure whether this is a good thing or not.

So my question is: which of the two options is better? I understand that in general the parametric (default) option is better, but here it gives me a warning, so something in the fit calculation is going wrong.

How should I interpret these plots?

Thanks for the help,
Assa
RNASeq DESeq DESeq2
@mikelove
United States
Hi Assa, I started to answer you at biostars; I think our messages crossed: https://www.biostars.org/p/105192/

On Wed, Jul 2, 2014 at 8:37 AM, Assa Yeroslaviz wrote:
> [...]