preprocess core for quantile normalization
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
I need to use preprocessCore for quantile normalization of my data. The data have protein names in columns and peak intensities in rows. I have data from different time points, and each time point has three replicates. I have done all-protein normalization for now, but two time points are not behaving as expected. So I would like to quantile-normalize the data, and I want this done between time points rather than between replicates. Please advise me how to do this. I can run R if you explain how to prepare the matrix. Best regards, Yashwant
Normalization • 1.6k views
Tim Triche ★ 4.2k
@tim-triche-3561
Last seen 3.6 years ago
United States
If you have different numbers of peaks, bins, or whatever per run, you will need to do kernel smoothing to get the appropriate distribution of quantiles, then normalize to that. Pan Du implemented this in the 'lumi' package, FWIW. Personally I think it's a clever enough idea (kernel-smoothed quantile normalization) that you should read it anyway, but...

If you just want to normalize equal numbers of bins, it is pretty straightforward. For example, suppose you have 100 runs of 100 bins/peaks/whatever (the key point being that every run has the same number of things whose ranks get normalized):

    library(preprocessCore)

    par(mfrow=c(1,2))
    # simulated counts: 100 bins (rows) x 100 runs (columns)
    foo <- matrix(rpois(10000, rgamma(1.5, 90)), ncol=100)
    plot(density(foo[,1]), xlab='reads', main='before')
    for(i in 2:100) lines(density(foo[,i]))

    # normalize.quantiles() normalizes the columns against each other
    bar <- normalize.quantiles(foo)
    plot(density(bar[,1]), xlab='reads', main='after')
    for(i in 2:100) lines(density(bar[,i]))

    sessionInfo()
    ## R version 3.0.0 (2013-04-03)
    ## Platform: x86_64-unknown-linux-gnu (64-bit)
    ##
    ## attached base packages:
    ## [1] stats graphics grDevices datasets utils methods base
    ##
    ## other attached packages:
    ## [1] preprocessCore_1.23.0 BiocInstaller_1.11.1 gtools_2.7.1
    ## [4] devtools_1.2
    ##
    ## loaded via a namespace (and not attached):
    ## [1] digest_0.6.3 evaluate_0.4.3 httr_0.2 memoise_0.1 parallel_3.0.0
    ## [6] RCurl_1.95-4.1 stringr_0.6.2 tools_3.0.0 whisker_0.3-2
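On the matrix preparation the question asks about, here is a minimal base-R sketch. Everything in it is an assumption for illustration: the toy data, the column naming scheme (`t1_rep1` and so on), and the helper `quantile_normalize`, which is a simplified stand-in with no tie or NA handling and not the preprocessCore implementation. It also takes one possible reading of "between time points, not between replicates": average the replicates within each time point, then quantile-normalize the resulting time-point columns.

```r
# Hypothetical toy data: 5 proteins (rows) x 6 samples (columns),
# i.e. 2 time points x 3 replicates. All names are made up.
set.seed(1)
x <- matrix(rlnorm(30, meanlog = 10), nrow = 5,
            dimnames = list(paste0("prot", 1:5),
                            paste0(rep(c("t1", "t2"), each = 3), "_rep", 1:3)))

# If proteins are in columns (as in the question), transpose first:
# x <- t(x)

# Simplified quantile normalization of columns (illustration only).
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")  # rank within each column
  target <- unname(rowMeans(apply(m, 2, sort)))       # mean of the k-th smallest values
  out <- apply(ranks, 2, function(r) target[r])       # map each rank to the target value
  dimnames(out) <- dimnames(m)
  out
}

# Assumed reading of the question: collapse replicates per time point,
# then normalize the time-point columns against each other.
timepoint <- sub("_rep[0-9]+$", "", colnames(x))
tp_means  <- sapply(unique(timepoint),
                    function(tp) rowMeans(x[, timepoint == tp, drop = FALSE]))
tp_norm   <- quantile_normalize(tp_means)
```

With preprocessCore installed, `normalize.quantiles(tp_means)` could replace the hand-written helper; the point of the sketch is only that samples go in columns and every column needs the same number of rows.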
Incidentally, this would be a really dumb idea (qnorm'ing read counts) if you wanted to do sensitive differential (expression/DNAse/ChIP/whatever) because it destroys count information. But that's another matter.
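To make the point about lost count information concrete, here is a small sketch on made-up data. The two simulated runs and their names (`shallow`, `deep`) are hypothetical, and the normalization is done by hand in base R rather than with preprocessCore, so treat it as an illustration of the general effect only.

```r
set.seed(42)
# Hypothetical counts: two runs with very different sequencing depth.
counts <- cbind(shallow = rpois(1000, 5), deep = rpois(1000, 50))
colSums(counts)                 # totals differ roughly tenfold

# Quantile-normalize by hand: every column is mapped onto the mean
# sorted profile across columns (ties broken by order of appearance).
target <- rowMeans(apply(counts, 2, sort))
qn <- apply(counts, 2, function(v) target[rank(v, ties.method = "first")])

colSums(qn)                     # totals are now the same for both runs
any(qn != round(qn))            # values are no longer all integers
```

After normalization both columns contain the same set of values, so per-run depth is gone and the entries are averages rather than counts, which is what count-based differential methods would need.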