estimation of size factors in DESeq2 analysis

Assa Yeroslaviz ★ 1.5k

@assa-yeroslaviz-1597

Germany

Hi all,

in relation to my mail from January this year, I followed Simon's advice to do my analyses in DESeq2 instead of DESeq.

I am working on an RNA-Seq data set from C. elegans. I mapped the data to the Ensembl genome build WBcel215, using tophat2 to map and featureCounts to count the reads (both with the default parameters).

I have two conditions, control and a knock-out, with three replicates each. Now I am trying to find differentially regulated genes between the two conditions using DESeq2.

This is the script I am using to read my raw count table into DESeq2:

    featureCountTable <- read.table("featureCountTable_RawCounts.txt", sep="\t", quote=F)

    colData <- data.frame(row.names=names(featureCountTable),
                          condition = c(rep("wt",3), rep("cpb3", 3)))

    cds <- DESeqDataSetFromMatrix (
        countData = featureCountTable,
        colData = colData,
        design = ~ condition
    )

    fit = DESeq(cds)
    res = results(fit)

But I am getting the same problem with DESeq2 as I had with DESeq. When I run the DESeq command I get these warnings:

    Warning messages:
    1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
    2: step size truncated due to divergence

So again I tried changing the fitType:

    fit = DESeq(cds, fitType="local")

which then came back without any warnings.

The two dispersion plots can be found here http://s23.postimg.org/pvopnmtxj/desesq2_local.png (local fit) and here http://s23.postimg.org/uk4pitj47/desesq2_parametric.png (default/parametric fit). The red line goes through the point cloud in both cases (which is how Simon defined a good fit in the last communication; I wish it had been that easy :-) ).

In the local fit there are more outliers, and the right end of the curve goes up again. I am not sure whether this is a good thing or not.

So, my question is: which of the two options is better? I understand that in general the parametric (default) option is preferred, but here it gives me a warning, which suggests that something in the fit calculation went wrong.

How should I interpret these plots?

Thanks for the help
Assa

RNASeq DESeq DESeq2 • 1.6k views

updated 9.8 years ago by Michael Love 41k • written 9.8 years ago by Assa Yeroslaviz ★ 1.5k

Michael Love 41k

@mikelove

United States

hi Assa,

I started to answer you at biostars, I think our messages crossed: https://www.biostars.org/p/105192/

On Wed, Jul 2, 2014 at 8:37 AM, Assa Yeroslaviz <frymor at gmail.com> wrote:
> [...]
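For readers who want to compare the two dispersion trend fits side by side, here is a minimal sketch. It is not taken from the linked answer. It runs on simulated data from `makeExampleDESeqDataSet()` as a stand-in for the real count table (an assumption), and the helper `trend_residual()` is hypothetical, not part of DESeq2. It only summarises how far the gene-wise dispersion estimates sit from the fitted trend on the log scale; a smaller value suggests a closer trend but is not a formal test.

```r
library(DESeq2)

# Simulated stand-in for the real data: 2 conditions x 3 replicates (assumption)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 6)

# Size factors are estimated once and shared by both dispersion fits
dds <- estimateSizeFactors(dds)
print(sizeFactors(dds))

# Estimate dispersions with each trend type on the same data
dds_par   <- estimateDispersions(dds, fitType = "parametric")
dds_local <- estimateDispersions(dds, fitType = "local")

# Hypothetical helper (not a DESeq2 function): median absolute log residual
# between the gene-wise estimates and the fitted trend
trend_residual <- function(x) {
  gene_est <- mcols(x)$dispGeneEst
  fitted   <- mcols(x)$dispFit
  # drop all-zero genes (NA) and estimates stuck near the lower bound
  keep <- !is.na(gene_est) & !is.na(fitted) & gene_est > 1e-6
  median(abs(log(gene_est[keep]) - log(fitted[keep])))
}

resid_par   <- trend_residual(dds_par)
resid_local <- trend_residual(dds_local)
print(c(parametric = resid_par, local = resid_local))

# Dispersion plots next to each other for a visual check
par(mfrow = c(1, 2))
plotDispEsts(dds_par);   title("parametric")
plotDispEsts(dds_local); title("local")
```

With a real data set, `dds` would be the object built by `DESeqDataSetFromMatrix()` in place of the simulated one; the rest of the sketch stays the same.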

written 9.8 years ago by Michael Love 41k
