DESeq normalization
sudeep s ▴ 60
@sudeep-s-5416
Last seen 9.7 years ago
Hi all,

I was checking whether DESeq normalization had worked for my samples, following the suggestions in this post: https://stat.ethz.ch/pipermail/bioconductor/2010-October/035933.html. I observed that filtering down to the genes that are present in both case and control samples improved the normalization (judged from the plots described in the post). For a few samples (say, 3), however, this did not help, and calculating the shorth estimator (again as per the post) did not help either. My question is: what should I do to normalize these samples?

Regards,
Sudeep.
Normalization DESeq • 969 views
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.8 years ago
Zentrum für Molekularbiologie, Universi…
Hi Sudeep,

On 2012-08-01 11:47, sudeep s wrote:
> I was checking DESeq normalization success for my samples following
> the suggestions in this post :
> https://stat.ethz.ch/pipermail/bioconductor/2010-October/035933.html.
> I observed that filtering down to the genes that are present in both
> case and control samples improved normalization (visualized by
> plotting as per the post), but for a few samples ,say 3 samples
> these procedures did not help, and I tried calculating shorth
> estimator (again as per the post) and this did n't help either. My
> question is what should I follow for the normalization of these
> samples ?

You would need to tell us what exactly you did not like about the histograms.

Anyway, I have revised my opinion a bit, and I now think that MA plots are more helpful. Starting as in the post that you cited, plot an MA plot instead of a histogram, with

   plot( geomeans, log2( counts(cds)[,j] / geomeans ), log="x", pch=".", col="#00000060" )
   abline( h=log2( sizeFactors(cds)[ j ] ), col="red" )

The red line marks the log2 size factor of sample j; it should run through the middle of the point cloud.

Simon
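For readers without a `cds` object at hand, the diagnostic above can be tried on made-up data. The following is a minimal sketch in base R, not DESeq's own code: the count matrix is simulated, `geomeans` is assumed to be the per-gene geometric mean across samples, and `size_factors` is a hand-written median-of-ratios estimate standing in for `sizeFactors(cds)`. All variable names here are illustrative.

```r
# Simulated counts (hypothetical data): 2000 genes x 4 samples,
# each sample sequenced at a different depth.
set.seed(1)
n_genes   <- 2000
depth     <- c(0.5, 1, 2, 4)
base_mean <- rgamma(n_genes, shape = 0.5, scale = 200) + 1
counts <- sapply(depth, function(d) rnbinom(n_genes, mu = base_mean * d, size = 10))
colnames(counts) <- paste0("sample", seq_along(depth))

# Per-gene geometric mean across samples.
# A gene with a zero count in any sample gets a geometric mean of 0.
geomeans <- exp(rowMeans(log(counts)))

# Median-of-ratios size factors, restricted to genes with a non-zero
# geometric mean (an assumption of this sketch).
ok <- geomeans > 0
size_factors <- apply(counts, 2, function(cnt) median((cnt / geomeans)[ok]))

# MA-style plot for sample j: log2 ratio to the geometric mean against the
# geometric mean, with the log2 size factor drawn as a red line.
j <- 3
plot(geomeans[ok], log2(counts[ok, j] / geomeans[ok]),
     log = "x", pch = ".", col = "#00000060",
     xlab = "geometric mean of counts", ylab = "log2( count / geometric mean )")
abline(h = log2(size_factors[j]), col = "red")
```

If the red line sits clearly above or below the bulk of the points, especially at high counts, the size factor for that sample does not describe most genes well.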
sudeep s ▴ 60
@sudeep-s-5416
Last seen 9.7 years ago
Hi Simon,

Thank you for the reply.

As you said in the post, I was plotting to see whether the size factor hits the median. In some samples (3), the size factor was way off the median, so I tried the shorth, but this didn't help either. I also tried filtering the genes on row sums (thresholds of 10, 15, 20, 25, ...), but the size factor was still way off in the plot, so I was wondering what I should follow.

Regards,
Sudeep.
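The two things tried above, filtering genes by row sum and swapping the median for the shorth, can be compared side by side. The sketch below runs on simulated data and rests on several assumptions: `shorth_simple` is a simplified stand-in (the mean of the shortest interval holding about half the values), not the `shorth` function from the genefilter package, and `compare_estimators` is a hypothetical helper written for this illustration.

```r
# Simplified shorth: mean of the values in the shortest interval that
# contains about half of the data (illustrative, not genefilter's code).
shorth_simple <- function(x) {
  x <- sort(x[is.finite(x)])
  n <- length(x)
  stopifnot(n > 0)
  half   <- floor(n / 2) + 1
  starts <- seq_len(n - half + 1)
  widths <- x[starts + half - 1] - x[starts]
  i <- which.min(widths)
  mean(x[i:(i + half - 1)])
}

# Simulated counts (hypothetical data): 2000 genes x 4 samples.
set.seed(2)
n_genes   <- 2000
depth     <- c(1, 1.5, 0.8, 2)
base_mean <- rgamma(n_genes, shape = 0.5, scale = 200) + 1
counts <- sapply(depth, function(d) rnbinom(n_genes, mu = base_mean * d, size = 10))

# For sample j, keep genes with row sum >= min_row_sum and estimate the
# log2 size factor with both the median and the simplified shorth.
compare_estimators <- function(counts, j, min_row_sum) {
  sub      <- counts[rowSums(counts) >= min_row_sum, , drop = FALSE]
  geomeans <- exp(rowMeans(log(sub)))
  lr       <- log2(sub[, j] / geomeans)
  lr       <- lr[is.finite(lr)]   # drops genes with a zero in any sample
  c(min_row_sum = min_row_sum, n_genes = length(lr),
    median = median(lr), shorth = shorth_simple(lr))
}

result <- t(sapply(c(10, 15, 20, 25),
                   function(th) compare_estimators(counts, j = 4, min_row_sum = th)))
print(result)
```

On well-behaved data the two estimates should stay close to each other across thresholds. If they disagree strongly, or neither lines up with the bulk of the points in the plot, the cause is more likely in the data than in the choice of estimator.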
Hi Sudeep,

On 2012-08-08 08:28, sudeep s wrote:
> As you said in the post, I was plotting to see if the size factor was
> hitting the median, in some samples (3), the size factor was way off
> the median and hence I tried shorth, but this didn't help either, and
> I tried filtering down the genes (filtering genes on row sums for
> 10,15,20, 25...) but again in the plot size factor was way off, so I
> was wondering what I should follow.

I guess you'd need to post your plots to the list. It's hard to advise without seeing them.

Simon
Hi Simon,

Thank you for your interest. I was able to trace the 'normalization problem' back to some bad-quality reads that had not been trimmed from the fastq file. By the way, your suggestion to use an MA plot instead of a histogram was very helpful and gave me a much clearer idea of the read count mappings.

Regards,
Sudeep.