HTqPCR normalization issues - third posting
Guest User
@guest-user-4897
Dear all,

this is our third posting without a real reply, so we wonder whether this package is still maintained. If it is not, it would be useful for us to know.

We are using HTqPCR to analyze a set of cards which we transformed into the following format, which is accepted by HTqPCR:

  2  Run05  41  Passed  sample 41  ABCC5   Target  30
  3  Run05  41  Passed  sample 41  ADM     Target  31.3
  4  Run05  41  Passed  sample 41  CEBPB   Target  29.8
  5  Run05  41  Passed  sample 41  CSF1R   Target  31.2
  6  Run05  41  Passed  sample 41  CXCL16  Target  26.9
  7  Run05  41  Passed  sample 41  CYC1    Target  25.7

  [...]

The total number of files and groups is as follows, summarized in the file "Elenco_1.txt", which is used below:

  File    Group
  41.txt  Sano
  39.txt  Sano
  37.txt  Sano
  35.txt  Sano
  43.txt  Sano
  34.txt  Sano
  44.txt  Sano
  38.txt  Sano
  48.txt  Sano
  40.txt  Sano
  47.txt  Sano
  6.txt   Non Responder DISEASE
  26.txt  Non Responder DISEASE
  2.txt   Non Responder DISEASE
  69.txt  Non Responder DISEASE
  68.txt  Non Responder DISEASE
  5.txt   Non Responder DISEASE
  71.txt  Responder DISEASE
  3.txt   Responder DISEASE
  17.txt  Responder DISEASE
  1.txt   Responder DISEASE
  19.txt  Responder DISEASE

The comparison is DISEASE vs. non-DISEASE, but what leaves us in doubt is the normalization step. Note that sample 41 is the *first* in the list.
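As an aside, the row layout shown above can be checked against the column indices that this post passes to readCtData (feature=6, type=7, Ct=8). The following is a minimal base-R sketch, not part of the original post. It assumes the card files are tab-separated, so that "sample 41" is a single field; the object names `rows` and `card` are hypothetical.

```r
# Two rows copied from the post, assumed to be tab-separated
# (the fifth field, "sample 41", contains a space).
rows <- c(
  "2\tRun05\t41\tPassed\tsample 41\tABCC5\tTarget\t30",
  "3\tRun05\t41\tPassed\tsample 41\tADM\tTarget\t31.3"
)

# header = FALSE mirrors the readCtData() call in the post;
# the columns are then named V1 to V8.
card <- read.delim(text = rows, header = FALSE, stringsAsFactors = FALSE)

ncol(card)  # number of fields per row
card[[6]]   # column passed as feature = 6 (gene names)
card[[7]]   # column passed as type = 7
card[[8]]   # column passed as Ct = 8 (numeric Ct values)
card[[3]]   # the third column holds the sample ID, 41
```

Under that assumption, the indices used in the post line up with the gene name, type and Ct fields. The third column carries the sample number, which may be worth keeping in mind when reading the feature.pos values reported further down; the thread itself does not confirm any connection.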
Here is the code up to the dump of the normalized value matrices:

  library("HTqPCR")
  path <- "whatever/"
  # Elenco_1.txt lists the input files and their groups
  files <- read.delim(file.path(path, "Elenco_1.txt"))
  files
  filelist <- as.character(files$File)
  filelist
  # Read the cards: column 6 = feature name, 7 = type, 8 = Ct
  raw <- readCtData(files = filelist, path = path, n.features = 46, type = 7, flag = NULL, feature = 6, Ct = 8, header = FALSE, n.data = 1)
  featureNames(raw)
  raw.cat <- setCategory(raw, Ct.max = 36, Ct.min = 9, replicates = FALSE, quantile = 0.9, groups = files$Group, verbose = TRUE)

  # Rank-invariant scaling
  s.norm <- normalizeCtData(raw.cat, norm = "scale.rank")
  exprs(s.norm)
  write.table(exprs(s.norm), file = "Ct norm scaling.txt")

  # Geometric mean normalization
  g.norm <- normalizeCtData(raw.cat, norm = "geometric.mean")
  exprs(g.norm)
  write.table(exprs(g.norm), file = "Ct norm media geometrica.txt")

Now, if we look at the content of the two expression value files, it looks like the first column (corresponding to the first sample) is always unchanged, while all the others have been normalized.

In this case the first dataset is sample 41, so you can compare the corresponding column above and below to see what is happening.
We do not include all the columns here; however, you can see that all the samples *except the first (number 41)* have had their values normalized.

Ct norm scaling:

          41    39           37           35           43           34           44           38
  ABCC5   30    27.37706161  26.47393365  29.7721327   31.20189573  26.39260664  26.32436019  27.54274882
  ADM     31.3  30.36540284  28.51753555  32.31241706  34.40473934  26.29800948  29.82796209  28.60208531
  CEBPB   29.8  28.53383886  26.65971564  27.84151659  30.06540284  27.3385782   27.36597156  26.29080569
  CSF1R   31.2  27.66625592  28.05308057  37.18976303  36.98767773  31.0278673   34.56255924  29.75772512
  CXCL16  26.9  27.56985782  24.15165877  30.28018957  28.82559242  25.91962085  26.89251185  26.96492891

Ct norm geometric:

          41    39           37           35           43           34           44           38
  ABCC5   30    27.73443878  26.93934246  29.88113261  30.76352197  26.51166676  26.8989347   27.49219508
  ADM     31.3  30.76178949  29.01887064  32.4307173   33.92136694  26.41664286  30.47900874  28.5495872
  CEBPB   29.8  28.90631647  27.12839047  27.94344824  29.64299633  27.46190571  27.96328103  26.24254985
  CSF1R   31.2  28.0274082   28.5462506   37.32591991  36.46801611  31.16783762  35.31694663  29.70310587
  CXCL16  26.9  27.92975172  24.57624224  30.39104955  28.42060473  26.03654728  27.47948724  26.91543574

This looks odd: why does the first sample seem to be taken as a 'reference' for both normalization methods and hence left unchanged?

This happens with ANY normalization procedure selected.

Another (related?) oddity is that in the final differential analysis result the same sample ID is always reported in the feature.pos field, as you can see below:

      genes   feature.pos  t.test        p.value      adj.p.value
  22  NUCB1   41           -1.998838921  0.077900837  0.251381346
  8   ERH     41           -1.958143348  0.091329532  0.251381346
  16  MAFB    41           -1.887142703  0.09421993   0.251381346
  28  RNF130  41           -1.904866754  0.099644523  0.251381346
  3   CEBPB   41           -1.853176708  0.103563968  0.251381346
  18  MSR1    41           -1.80887129   0.10432619   0.251381346

Are we doing something wrong in the data input or in the subsequent processing here? Can we actually trust these normalizations?
Many thanks in advance. Kind regards,

Alessandro & Elena

-- output of sessionInfo():

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] HTqPCR_1.14.0      limma_3.16.8       RColorBrewer_1.0-5 Biobase_2.20.1
[5] BiocGenerics_0.6.0

loaded via a namespace (and not attached):
[1] affy_1.38.1           affyio_1.28.0         BiocInstaller_1.10.3
[4] gdata_2.13.2          gplots_2.11.3         gtools_3.0.0
[7] preprocessCore_1.22.0 stats4_3.0.1          zlibbioc_1.6.0

--
Sent via the guest posting facility at bioconductor.org.
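The "first column unchanged" observation in this post can be illustrated with a minimal base-R sketch. It is not taken from HTqPCR: it only shows that if a normalization rescales every sample relative to one reference sample, the reference column comes out unchanged by construction. Whether HTqPCR's "scale.rank" and "geometric.mean" methods work this way is not confirmed anywhere in this thread. `scale_to_reference` and `unchanged_samples` are hypothetical helpers, and every value outside the sample 41 column is invented for illustration.

```r
# Hypothetical raw Ct values: rows are genes, columns are samples.
# Only the "41" column is taken from the post; the rest are made up.
raw_ct <- matrix(
  c(30.0, 31.3, 29.8,
    28.1, 31.2, 29.3,
    27.0, 29.1, 27.2),
  nrow = 3,
  dimnames = list(c("ABCC5", "ADM", "CEBPB"), c("41", "39", "37"))
)

# Toy reference-based scaling (NOT HTqPCR's code): one factor per sample,
# expressed relative to the geometric mean of a reference sample.
scale_to_reference <- function(ct, ref = 1) {
  geo_mean <- apply(ct, 2, function(x) 2^mean(log2(x)))
  factors <- geo_mean / geo_mean[ref]  # the reference gets a factor of 1
  sweep(ct, 2, factors, "/")
}

# Names of the columns that are numerically identical in two matrices.
unchanged_samples <- function(before, after) {
  same <- vapply(
    seq_len(ncol(before)),
    function(j) isTRUE(all.equal(before[, j], after[, j])),
    logical(1)
  )
  colnames(before)[same]
}

norm_ct <- scale_to_reference(raw_ct, ref = 1)
unchanged_samples(raw_ct, norm_ct)  # "41"
```

Running `unchanged_samples(exprs(raw.cat), exprs(s.norm))` on the real objects from the post would be one way to confirm which samples are left untouched. An unchanged reference column is not by itself evidence of a bug in a scheme of this kind.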
@james-w-macdonald-5106
Hi Alessandro,

I believe this package is still maintained, and it is unfortunate that you have not received a reply. The expectation is that package maintainers will subscribe (and pay attention) to the Bioc listserv, but the list is fairly high traffic, so it never hurts to add a CC to the maintainer as well (which I have done for you).

Best,

Jim

On Thursday, October 10, 2013 8:35:06 AM, Alessandro Guffanti [guest] wrote:
> [...]

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
Thanks, much appreciated!

It is important for us to understand whether we are doing something fundamentally wrong, or whether there actually is a bug in the software (it happens), because we are using this package heavily for validating NGS gene expression analysis findings.

Thank you so much for the excellent work!

Keep in touch,

Alessandro & Elena

On 10/10/2013 3:56 PM, James W. MacDonald wrote:
> [...]

--
Alessandro Guffanti
Head, Bioinformatics
Genomnia srl
On 10/10/2013 07:40, Alessandro Guffanti wrote:
> [...]

Hi Alessandro,

my apologies for the lack of a reply. I recently had to take a hiatus from Bioconductor, but will resume work on HTqPCR following the impending Bioconductor release.

best,
\Heidi
On 10/10/2013 09:35, heidi wrote:
> [...]
Hi Alessandro,

my apologies for the lack of a response. I've recently had to take a hiatus from Bioconductor, but will resume work on HTqPCR following the impending Bioconductor release.

Best,
\Heidi
Thanks Heidi - maybe you have already seen what we describe, and hence there is an easy explanation which does not require further work on your side?

Best wishes,
Alessandro
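One check that can be run on the numbers already posted, as a minimal base-R sketch rather than an answer from the maintainer: for sample 39, the ratio between the "scale.rank" and "geometric.mean" outputs can be computed gene by gene. If that ratio is constant, both methods would appear to rescale each sample by a single multiplicative factor. The second half is a purely hypothetical illustration (the object `toy`, its made-up second sample, and the "reference = first column" rule are assumptions, not confirmed HTqPCR internals) of how such a rescaling would leave the first sample untouched.

```r
# Values for sample 39, copied from the two tables in the original post.
genes      <- c("ABCC5", "ADM", "CEBPB", "CSF1R", "CXCL16")
scale_rank <- c(27.37706161, 30.36540284, 28.53383886, 27.66625592, 27.56985782)
geo_mean   <- c(27.73443878, 30.76178949, 28.90631647, 28.02740820, 27.92975172)

# Gene-by-gene ratio of the two normalized outputs for the same sample.
ratio <- setNames(scale_rank / geo_mean, genes)
print(ratio)

# TRUE if the ratio is the same for every gene (up to rounding).
ratio_is_constant <- diff(range(ratio)) < 1e-6
print(ratio_is_constant)

# Hypothetical illustration (assumption, not the package's documented code):
# rescale every sample so that its geometric mean matches the first sample's.
raw41 <- c(30, 31.3, 29.8, 31.2, 26.9)             # sample 41, raw Ct from the post
toy   <- cbind(`41` = raw41, other = raw41 * 1.05) # second sample is made up

geo      <- apply(toy, 2, function(x) exp(mean(log(x)))) # per-sample geometric mean
factors  <- geo[1] / geo                                  # factor is exactly 1 for column 1
toy_norm <- sweep(toy, 2, factors, `*`)                   # multiply each column by its factor

print(toy_norm)
```

On real data, the same comparison could be made directly with `exprs(raw.cat)[, 1]` against `exprs(s.norm)[, 1]`, using the objects from the original post. Whether HTqPCR actually uses the first sample as its reference is something only the package source or the maintainer can confirm.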
