HTqPCR - error in scale.rank (rank-invariant) normalisation
@andreia-fonseca-3796
Dear all,

I am analysing qPCR data from Exiqon cards. I have one card per sample, and each card holds one observation per miRNA. In total I have 8 cards: 2 for treatment 1, 3 for treatment 2 and 3 for treatment 3. Each card has one endogenous gene, which I would rather not use to normalise the Ct values because it is affected by the type of treatment. So I would like to use scale.rank instead, but I am getting the following error:

    sr.norm <- normalizeCtData(raw.cat, norm = "scale.rank")
    Error in smooth.spline(ref[i.set], data[i.set]) :
      need at least four unique 'x' values

Does this mean I don't have enough replicates?

Thanks for the help,
Andreia

--
Andreia J. Amaral
Unidade de Imunologia Clínica
Instituto de Medicina Molecular
Universidade de Lisboa
email: andreiaamaral@fm.ul.pt, andreia.fonseca@gmail.com
miRNA qPCR
Heidi Dvinge ★ 2.0k
@heidi-dvinge-2195
Dear Andreia,

> sr.norm <- normalizeCtData(raw.cat, norm = "scale.rank")
> Error in smooth.spline(ref[i.set], data[i.set]) :
>   need at least four unique 'x' values

It sounds like there aren't enough rank-invariant genes across your 8 cards. If that's the case, then this is admittedly not the most useful error message, and it should be changed. What does it say when you run traceback() following the error?

The parameter "scale.rank.samples" in normalizeCtData() lets you set how many of the samples each gene has to be rank-invariant across in order to be included in the rank-invariant set. By default this is the number of samples minus 1. You can try lowering that number, keeping in mind that the lower it is, the less robust your resulting rank-invariant genes are. If your samples are all highly variable across all genes, it might not be possible for you to use this normalisation method.

If this does not seem to be the problem, something else might be going on with the function. In that case, please report back here and I can perhaps have a look at your data.

I have been considering adding an additional parameter to normalizeCtData(), so that genes just have to be rank-invariant within a certain interval, e.g. be located within -/+5 of each other on the ranked list. For rather low-throughput qPCR cards that could mess things up though.

HTH
\Heidi
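To make "rank-invariant across N samples" a bit more concrete, here is a toy sketch in base R. It is only an illustration under simplifying assumptions, not HTqPCR's internal selection, which is more involved. The function rank_invariant_genes(), its arguments and the example matrix are all made up: min_samples plays the role that scale.rank.samples plays in normalizeCtData() (default: number of samples minus 1), and window mimics the "within -/+5 on the ranked list" idea with a much smaller tolerance.

```r
# Toy rule: a gene counts as rank-invariant in a sample when its rank in that
# sample is within `window` positions of its rank in the reference profile.
# Hypothetical helper for illustration only; this is not HTqPCR code.
rank_invariant_genes <- function(ct, window = 0, min_samples = ncol(ct) - 1) {
  ref_rank <- rank(apply(ct, 1, median))   # reference: per-gene median Ct
  sample_ranks <- apply(ct, 2, rank)       # rank the genes within each sample
  close <- abs(sample_ranks - ref_rank) <= window
  rownames(ct)[rowSums(close) >= min_samples]
}

# Made-up Ct values: 5 miRNAs (rows) on 4 cards (columns)
ct <- cbind(
  s1 = c(20, 24, 30, 35, 28),
  s2 = c(21, 31, 26, 36, 29),
  s3 = c(19, 24, 33, 34, 27),
  s4 = c(22, 32, 25, 37, 30)
)
rownames(ct) <- paste0("miR", 1:5)

strict  <- rank_invariant_genes(ct)                   # default: samples - 1
relaxed <- rank_invariant_genes(ct, min_samples = 2)  # fewer samples required
windowed <- rank_invariant_genes(ct, window = 1)      # ranks may differ by 1
print(strict)
print(relaxed)
print(windowed)
```

Lowering min_samples or widening window lets more genes through, at the cost of a less trustworthy reference set, which is the trade-off described above.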
Dear Heidi,

thanks for the quick reply. After traceback() I get:

    traceback()
    5: stop("need at least four unique 'x' values")
    4: smooth.spline(ref[i.set], data[i.set])
    3: FUN(newX[, i], ...)
    2: apply(data, 2, normalize.invariantset, ref = ref.data)
    1: normalizeCtData(raw.cat, norm = "scale.rank")

Information about the session:

    sessionInfo()
    R version 2.11.1 (2010-05-31)
    i386-apple-darwin9.8.0

    locale:
    [1] C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] statmod_1.4.8 HTqPCR_1.2.0 limma_3.4.4 RColorBrewer_1.0-2 Biobase_2.8.0

    loaded via a namespace (and not attached):
    [1] affy_1.26.1 affyio_1.16.0 gdata_2.7.2 gplots_2.8.0 gtools_2.6.2 preprocessCore_1.10.0

Andreia
Hello Andreia,

I can reproduce the error you get if I say:

    > data(qPCRraw)
    > temp <- normalizeCtData(qPCRraw, norm="scale.rankinvariant")
    Scaling Ct values
        Using rank invariant genes: Gene1 Gene29
        Scaling factors: 1.00 1.06 1.00 1.03 1.00 1.00
    # Select just the first genes so that Gene29 is excluded
    > normalizeCtData(qPCRraw[1:10,], norm="scale.rankinvariant")
    Error in smooth.spline(ref[i.set], data[i.set]) :
      need at least four unique 'x' values

After looking into the code, the problem occurs when there is only a single rank-invariant gene (or none) between any individual sample and the reference sample (the mean or median across all samples). At least two rank-invariant genes are required between the reference and each sample. I'll make a note of this in the help file.

This means that a rank-invariant method is not going to be robust enough for your normalisation. Instead, you'll have to go with ddCt or quantile. In the future there might be other options available in HTqPCR (e.g. scale by arithmetic or geometric mean) depending on demand.

The likely cause of this is that your samples are quite different. Have you tried investigating them with e.g. plotCtCor or clusterCt to see if they group as expected, or if there's any marked difference in the distribution of Ct values (plotCtDensity)? Even a relatively harsh method such as quantile normalisation might be suitable for your data.

Cheers
\Heidi
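The wording of the message comes from base R's smooth.spline(), which stops when it is handed fewer than four distinct x values. A minimal sketch, independent of HTqPCR and using made-up numbers, that triggers the same message and then shows the call succeeding once more points are available:

```r
# Three distinct x values: smooth.spline() stops with the message from the thread.
msg <- tryCatch(
  {
    smooth.spline(x = c(20, 25, 30), y = c(20.4, 25.1, 29.8))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With more distinct x values the same kind of call goes through.
x <- c(18, 20, 22, 25, 27, 30, 32, 35)
y <- x + c(0.3, -0.2, 0.1, 0.4, -0.3, 0.2, -0.1, 0.0)
fit <- smooth.spline(x = x, y = y)
print(class(fit))
```

In the normalisation call the x values are the reference Ct values of the genes flagged as rank-invariant for one sample, so a sample with too few such genes is enough to stop the whole run.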
Dear Heidi,

thanks for your reply. Indeed, I am comparing cell types which have huge differences between their miRNA profiles, and unfortunately the qPCR assay only has one endogenous gene, which is affected by cell type, so the dCt method is not adequate.

I have tried quantile. The reason I wanted to find another method is that, as you can see in the distribution of Ct values, the S1 cells have many miRs which are not expressed and which I am analysing as Ct=40. So these cells are very different, and with quantile some differences will not show up in the analyses because I am forcing S1 to have a distribution similar to the other cells. Still, I think this approach is conservative, given that some differences do appear, as you can see in the files after quantile normalisation.

Implementing other methods that can deal with cell types that behave very differently, as in my case, and that lack endogenous genes for normalisation could be a suggestion for your package.

Kind regards,
Andreia

PS: attached are two files with the correlations and data distribution.

Attachments (scrubbed by the list archive):
Name: correlation_between_qnorm_data_cell_code.pdf, Type: application/pdf, Size: 414972 bytes
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20110125/7545a2fc/attachment-0002.pdf
Name: correlation_between_raw_data_cellcode.pdf, Type: application/pdf, Size: 415066 bytes
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20110125/7545a2fc/attachment-0003.pdf
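The concern about quantile normalisation forcing one cell type onto the distribution of the others can be seen on a small made-up matrix. The sketch below is a bare-bones quantile normalisation in base R, not the routine HTqPCR calls. quantile_normalise() and the Ct values are hypothetical, ties are simply broken by row order, and missing values are not handled.

```r
# Bare-bones quantile normalisation: every column is given the same set of
# values (the mean of each quantile across columns), assigned by within-column rank.
quantile_normalise <- function(ct) {
  ord <- apply(ct, 2, order)      # row index of each column's i-th smallest value
  sorted <- apply(ct, 2, sort)    # column-wise sorted Ct values
  target <- rowMeans(sorted)      # shared target distribution
  out <- ct
  for (j in seq_len(ncol(ct))) {
    out[ord[, j], j] <- target
  }
  out
}

# Made-up data: two similar samples and one where most miRNAs are undetected
# and entered as Ct = 40.
ct <- cbind(
  C1 = c(22, 25, 28, 30, 33, 36),
  C2 = c(23, 26, 27, 31, 32, 35),
  S1 = c(24, 40, 40, 29, 40, 40)
)
qn <- quantile_normalise(ct)
print(round(qn, 2))
```

After normalisation all three columns contain exactly the same values, so the four wells of S1 that were all entered as Ct = 40 are spread over the range of the expressed miRNAs in C1 and C2. That is the flattening of real differences described above.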
Heidi Dvinge ★ 2.0k
@heidi-dvinge-2195
Hi Andreia,

if your samples are indeed very different, then that's why a rank-invariant scaling fails. Quantile normalisation might be quite conservative, but at least it seems to bring the C3 sample together with the other C samples, based on your plots.

Depending on how adventurous you feel, you can also try some other scaling/normalisation methods yourself. For example, this article http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718498/ recommends scaling to the mean of all Ct values when dealing with miRNA qPCR values.

Such a method might not work if you expect large overall differences in expression level between your samples. However, it's easy to implement and test this. Say for example that you want to use the geometric mean of all expressed genes (Ct<35), and use the first sample as a reference, you could do something like this:

    library(HTqPCR)

    # Load some example data
    data(qPCRraw)

    # Define the normalisation function (or just use the individual commands directly)
    my.norm.method <- function(q, Ct.max=35, ref=1)
    {
        # Get the data
        data <- exprs(q)
        # For each column, calculate the geometric mean of Ct values<Ct.max
        geo.mean <- apply(data, 2, function(x) {
            xx <- log2(subset(x, x<Ct.max))
            2^mean(xx)})
        # Calculate the scaling factor that brings each sample to the reference
        geo.scale <- geo.mean[ref]/geo.mean
        # Adjust the data accordingly
        data.norm <- t(t(data) * geo.scale)
        # Return the normalised object
        exprs(q) <- data.norm
        q
    }

    # Normalise
    q.norm <- my.norm.method(qPCRraw)

    # Plot raw versus normalised data
    plot(exprs(qPCRraw), exprs(q.norm),
         col=rep(1:n.samples(q.norm), each=n.wells(q.norm)))
    # Followed by the usual QC and sanity check of your data...

If you decide to give that (or something similar) a go, I'd be interested in hearing whether it works for your data or not.

Cheers
\Heidi
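As a quick sanity check of geometric-mean scaling that does not need HTqPCR, the sketch below applies the same idea to a plain matrix. geo_mean_scale(), its arguments and the numbers are illustrative assumptions. The scaling factor used here is the reference sample's geometric mean divided by each sample's own, so that a sample running uniformly higher is pulled down to the reference.

```r
# Scale each column so that the geometric mean of its expressed wells
# (Ct below ct_max) matches that of the reference column.
# Hypothetical helper on a plain matrix, for illustration only.
geo_mean_scale <- function(ct, ct_max = 35, ref = 1) {
  geo_mean <- apply(ct, 2, function(x) {
    expressed <- x[!is.na(x) & x < ct_max]
    exp(mean(log(expressed)))
  })
  scale <- geo_mean[ref] / geo_mean
  sweep(ct, 2, scale, "*")   # multiply every column by its own factor
}

# Made-up data: B runs 10% higher than A in the expressed wells,
# and both have one undetected well entered as Ct = 40.
ct <- cbind(
  A = c(20, 25, 30, 40),
  B = c(22, 27.5, 33, 40)
)
scaled <- geo_mean_scale(ct)
print(round(scaled, 2))
```

Note that wells entered as Ct = 40 are rescaled along with everything else (B's 40 becomes roughly 36.4 here), so it may be worth deciding beforehand how undetected wells should be treated.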
Hi Heidi, thanks for the tips. I will try and give you feed back. Kind regards, Andreia On Tue, Jan 25, 2011 at 7:23 PM, Heidi Dvinge <heidi@ebi.ac.uk> wrote: > Hi Andreia, > > if your samples are indeed very different, then that's why a rank > invariant scaling fails. Quantile normalisation might be quite > conservative, but at least it seems to bring the C3 sample together with > the other C samples, based on your plots. > > Depending on how adventurous you feel, you can also try some other > scaling/normalisation methods yourself. For example, this article > http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718498/ recommends scaling to > the mean of all Ct values when dealing with miRNA qPCR values. > > Such a method might not work if you expect large overall differences in > expression level between your samples. However, it's easy to implement and > test this. Say for example that you want to use the geometric mean of all > expressed genes (Ct>35), and use the first sample as a reference, you > could do something like this: > > # Load some example data > data(qPCRraw) > > # Define plotting function (or just use the individual commands directly) > my.norm.method <- function(q, Ct.max=35, ref=1) > { > # Get the data > data <- exprs(q) > # For each column, calculate the geometric mean of Ct values<ct.max> geo.mean <- apply(data, 2, function(x) { > xx <- log2(subset(x, x<ct.max))> 2^mean(xx)}) > # Calculate the scaling factor > geo.scale <- geo.mean/geo.mean[ref] > # Adjust the data accordingly > data.norm <- t(t(data) * geo.scale) > # Return the normalised object > exprs(q) <- data.norm > q > } > > # Normalise > q.norm <- my.norm.method(qPCRraw) > > # Plot raw versus normalised data > plot(exprs(qPCRraw), exprs(q.norm), col=rep(1:n.samples(q.norm), > each=n.wells(q.norm))) > # Followed by the usual QC and sanity check of your data... > > If you decide to give that (or something similar) a go, I'd be interested > in hearing whether it works for your data or not. 
> > Cheers > \Heidi > > > > > > Dear Heidi, > > > > thanks for your reply. Indeed I am comparing cell types which have huge > > differences between miRNAs profiles and unfortunately the qPCR assay only > > has one endogenous gene which is being affected by cell type and > therefore > > dCt method is not adequate. I have tried quantile. The reason why I > wanted > > to find another method is because has you can see in the distribution of > > Ct > > values, the cells S1 have many miRs which are not expressed and that I am > > analyzing as Ct=40. So these cells are very different and with quantile > > some > > differences will not pop up in the analyses because I am forcing it to > > have > > a distribution similar do the other cells. Still I think that is approach > > is > > conservative, given that some differences do appear as you can see in the > > files after quantile normalization. Implementing other methods that could > > deal with this problems of working with cell types which have different > > behavior like my case and lacking endogenous genes to normalize could be > a > > suggestion to your package. > > Kind regards, > > Andreia > > > > PS: in attach are two files with the correlations and data distribution. 
> >
> > On Mon, Jan 24, 2011 at 11:11 PM, Heidi Dvinge <heidi@ebi.ac.uk> wrote:
> >
> >> Hello Andreia,
> >>
> >> I can reproduce the error you get if I say:
> >>
> >> > data(qPCRraw)
> >> > temp <- normalizeCtData(qPCRraw, norm="scale.rankinvariant")
> >> Scaling Ct values
> >> Using rank invariant genes: Gene1 Gene29
> >> Scaling factors: 1.00 1.06 1.00 1.03 1.00 1.00
> >> # Select just the first genes so that Gene29 is excluded
> >> > normalizeCtData(qPCRraw[1:10,], norm="scale.rankinvariant")
> >> Error in smooth.spline(ref[i.set], data[i.set]) :
> >> need at least four unique 'x' values
> >>
> >> After looking into the code, the problem occurs when there is only a
> >> single (or no) rank-invariant gene between any individual sample and
> >> the reference sample (the mean or median across all samples). At least
> >> two rank-invariant genes are required between the reference and each
> >> sample. I'll make a note of this in the help file.
> >>
> >> This means that a rank-invariant method is not going to be robust
> >> enough for your normalisation. Instead, you'll have to go with ddCt or
> >> quantile. In the future there might be other options available in
> >> HTqPCR (e.g. scale by arithmetic or geometric mean) depending on demand.
> >>
> >> The likely cause of this is that your samples are quite different. Have
> >> you tried investigating them with e.g. plotCtCor or clusterCt to see if
> >> they group as expected, or if there's any marked difference in the
> >> distribution of Ct values (plotCtDensity)? Even a relatively harsh
> >> method such as quantile normalisation might be suitable for your data.
> >>
> >> Cheers
> >> \Heidi
> >>
> >> > Dear Heidi,
> >> >
> >> > thanks for the quick reply,
> >> >
> >> > after traceback() I get
> >> >
> >> > traceback()
> >> > 5: stop("need at least four unique 'x' values")
> >> > 4: smooth.spline(ref[i.set], data[i.set])
> >> > 3: FUN(newX[, i], ...)
> >> > 2: apply(data, 2, normalize.invariantset, ref = ref.data)
> >> > 1: normalizeCtData(raw.cat, norm = "scale.rank")
> >> >
> >> > information about the session
> >> > sessionInfo()
> >> > R version 2.11.1 (2010-05-31)
> >> > i386-apple-darwin9.8.0
> >> >
> >> > locale:
> >> > [1] C
> >> >
> >> > attached base packages:
> >> > [1] stats graphics grDevices utils datasets methods base
> >> >
> >> > other attached packages:
> >> > [1] statmod_1.4.8 HTqPCR_1.2.0 limma_3.4.4 RColorBrewer_1.0-2 Biobase_2.8.0
> >> >
> >> > loaded via a namespace (and not attached):
> >> > [1] affy_1.26.1 affyio_1.16.0 gdata_2.7.2 gplots_2.8.0 gtools_2.6.2 preprocessCore_1.10.0

--
Andreia J. Amaral
Unidade de Imunologia Clínica
Instituto de Medicina Molecular
Universidade de Lisboa
email: andreiaamaral@fm.ul.pt
andreia.fonseca@gmail.com
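
For anyone who wants to try the geometric-mean scaling discussed in this thread without going through a qPCRset object, here is a minimal base-R sketch on a plain Ct matrix. The function name `scale_to_geo_mean`, its arguments and the toy values are made up for illustration and are not part of HTqPCR. Note that this sketch divides the reference's geometric mean by each sample's (the inverse of the ratio in the quoted snippet), on the assumption that the goal is for every column to end up with the same geometric mean as the reference.

```r
# Hypothetical helper (not part of HTqPCR): scale each sample (column) of a
# Ct matrix so that its geometric mean over expressed features matches the
# geometric mean of a reference column.
scale_to_geo_mean <- function(ct, ct_max = 35, ref = 1) {
  stopifnot(is.matrix(ct), is.numeric(ct))
  # Geometric mean per column, using only values below the cut-off
  geo_mean <- apply(ct, 2, function(x) {
    expressed <- x[!is.na(x) & x < ct_max]
    exp(mean(log(expressed)))
  })
  # Assumption: factor = reference / sample, so that columns are pulled
  # towards the reference rather than away from it
  geo_scale <- geo_mean[ref] / geo_mean
  # sweep() multiplies column j by geo_scale[j]; all values are rescaled,
  # including those at or above the cut-off
  ct_norm <- sweep(ct, 2, geo_scale, `*`)
  list(ct = ct_norm, geo_mean = geo_mean, scale = geo_scale)
}

# Made-up Ct values: 5 features x 3 samples; 40 stands for "not expressed"
ct <- matrix(c(20, 25, 30, 40, 22,
               22, 27, 33, 40, 24,
               19, 24, 28, 31, 40),
             nrow = 5,
             dimnames = list(paste0("miR", 1:5), c("S1", "S2", "S3")))

res <- scale_to_geo_mean(ct)
print(round(res$scale, 3))
```

The usual caveat from the thread still applies: if the samples differ a lot in overall expression level, forcing their means together can hide real differences, so the result should be checked with the same QC plots as any other normalisation.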