DESeq normalization


sudeep s ▴ 60


Hi all,

I was checking whether DESeq normalization worked for my samples, following the suggestions in this post: https://stat.ethz.ch/pipermail/bioconductor/2010-October/035933.html. Filtering down to the genes that are present in both case and control samples improved the normalization (judged by plotting as described in the post), but for a few samples (three of them) these procedures did not help. I also tried calculating the shorth estimator (again as per the post), and this didn't help either. My question is: what should I follow for the normalization of these samples?

Regards,
Sudeep.


Simon Anders ★ 3.7k


Zentrum für Molekularbiologie, Universi…

Hi Sudeep

On 2012-08-01 11:47, sudeep s wrote:
> I was checking DESeq normalization success for my samples following
> the suggestions in this post:
> https://stat.ethz.ch/pipermail/bioconductor/2010-October/035933.html.
> I observed that filtering down to the genes that are present in both
> case and control samples improved normalization (visualized by
> plotting as per the post), but for a few samples, say 3 samples,
> these procedures did not help, and I tried calculating shorth
> estimator (again as per the post) and this didn't help either. My
> question is what should I follow for the normalization of these
> samples?

You would need to tell us what exactly you did not like about the histograms.

Anyway, I revised my opinion a bit, and I now think that MA plots are more helpful. Starting as in the post that you cited, plot an MA plot instead of a histogram, with

    plot( geomeans, log2( counts(cds)[,j] / geomeans ), log="x", pch=".", col="#00000060" )
    abline( h=log2( sizeFactors(cds)[ j ] ), col="red" )

Simon
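For readers who want to try the MA-plot check without a DESeq object at hand, here is a minimal base-R sketch on simulated counts. It is only an illustration under stated assumptions: the count matrix, the depth factors, and the names `counts`, `size_factors` and `true_depth` are invented here, `geomeans` is assumed to be the per-gene geometric mean across samples, and the size factors are computed as a median of ratios rather than taken from `sizeFactors(cds)`.

```r
# Illustrative sketch only: simulated data, no DESeq object involved.
set.seed(1)
n_genes    <- 2000
true_depth <- c(0.5, 1, 1.5, 2)              # assumed per-sample depth factors
base_mean  <- rexp(n_genes, rate = 1 / 200)  # assumed per-gene expression level
counts <- sapply(true_depth, function(d) rpois(n_genes, lambda = base_mean * d))

# Per-gene geometric mean across samples (assumption: this is what
# 'geomeans' stands for). A gene with a zero count in any sample gets 0.
geomeans <- exp(rowMeans(log(counts)))
ok <- geomeans > 0

# Median-of-ratios size factor per sample, using genes with geomean > 0.
size_factors <- apply(counts, 2, function(cnt) median(cnt[ok] / geomeans[ok]))

# Avoid writing Rplots.pdf when the script is run non-interactively.
if (!interactive()) pdf(NULL)

# MA-style plot for sample j: per-gene log2 ratio against the geometric mean.
# The red line marks the log2 size factor; with a successful normalization
# it should run through the bulk of the points.
j <- 1
plot(geomeans[ok], log2(counts[ok, j] / geomeans[ok]),
     log = "x", pch = ".", col = "#00000060",
     xlab = "geometric mean of counts", ylab = "log2(count / geometric mean)")
abline(h = log2(size_factors[j]), col = "red")
```

If the red line sits clearly above or below the dense band of points for a sample, that is the kind of mismatch discussed in this thread.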


sudeep s ▴ 60


Hi Simon,

Thank you for the reply. As you said in the post, I was plotting to see whether the size factor hits the median. In three samples the size factor was way off the median, so I tried the shorth, but this didn't help either. I also tried filtering the genes on row sums (with cutoffs of 10, 15, 20, 25, ...), but the size factor in the plot was still way off, so I was wondering what I should follow.

Regards,
Sudeep.
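The two things tried here, a shorth-based centre and filtering genes on row sums, can be sketched in base R as follows. This is a hedged illustration, not the code from the cited post: the data are simulated, `shorth_simple` and `size_factor_check` are hypothetical helpers written for this example, and `shorth_simple` is a simplified stand-in (mean of the shortest interval covering about half of the values) for a packaged shorth estimator such as the one in genefilter.

```r
# Illustrative sketch only: simulated data and hypothetical helper names.
set.seed(2)
n_genes   <- 2000
depth     <- c(0.5, 1, 1.5, 2)               # assumed per-sample depth factors
base_mean <- rexp(n_genes, rate = 1 / 200)   # assumed per-gene expression level
counts <- sapply(depth, function(d) rpois(n_genes, lambda = base_mean * d))

# Simplified shorth: mean of the values inside the shortest interval
# that contains floor(n / 2) + 1 of the sorted data points.
shorth_simple <- function(x) {
  x <- sort(x)
  n <- length(x)
  h <- floor(n / 2) + 1
  widths <- x[h:n] - x[1:(n - h + 1)]
  i <- which.min(widths)
  mean(x[i:(i + h - 1)])
}

# For a given row-sum cutoff, compare the median and the shorth of the
# per-sample log2 ratios (count / per-gene geometric mean).
size_factor_check <- function(counts, min_row_sum = 0) {
  kept <- counts[rowSums(counts) >= min_row_sum, , drop = FALSE]
  geomeans <- exp(rowMeans(log(kept)))
  ok <- geomeans > 0                          # drop genes with any zero count
  log_ratios <- log2(kept[ok, , drop = FALSE] / geomeans[ok])
  data.frame(median_log2 = apply(log_ratios, 2, median),
             shorth_log2 = apply(log_ratios, 2, shorth_simple))
}

# Row-sum cutoffs in the spirit of the ones mentioned in the post.
for (cutoff in c(0, 10, 25)) {
  cat("row-sum cutoff:", cutoff, "\n")
  print(size_factor_check(counts, min_row_sum = cutoff))
}
```

On well-behaved data like this simulation, the median and the shorth agree closely at every cutoff. A sample where neither estimate lines up with the bulk of the log ratios, whatever the cutoff, points to a problem upstream of the normalization step.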


Hi Sudeep

On 2012-08-08 08:28, sudeep s wrote:
> As you said in the post, I was plotting to see if the size factor was
> hitting the median, in some samples (3), the size factor was way off
> the median and hence I tried shorth, but this didn't help either, and
> I tried filtering down the genes (filtering genes on row sums for
> 10,15,20, 25...) but again in the plot size factor was way off, so I
> was wondering what I should follow.

I guess you'd need to post your plots to the list. It's hard to advise without seeing them.

Simon


Hi Simon,

Thank you for your interest. I was able to trace the 'normalization problem' back to some bad-quality reads that had not been trimmed off from the FASTQ file. By the way, your suggestion to use an MA plot instead of the histogram was very helpful; it gave me a much clearer idea of the read count mappings.

Regards,
Sudeep.
