Re: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db
@manca-marco-path-4295
Dear Sean,

thank you for your note on the difference between Ingenuity Pathway Analysis and GO/KEGG analysis.

I would also appreciate feedback on my question, which was rather different from that of the person you answered. It concerns the more "raw" annotation of probes (LocusLink/Entrez IDs, UniProt, etc.) that I am having a hard time with: how would you explain (if you have any insight, of course) the difference in the number of annotated probes according to Bioconductor (around 27000, with either illuminaHumanv3BeadID.db or lumiHumanAll.db) and Ingenuity (around 47000-48000, according to my colleague's experience), at least for the Illumina human ht12 v3 beadchip?

And, most important, how is an investigator supposed to decide which tool to use? Of course I could test them and then go to the bench to verify my results, but this would increase my expenses and delay my publications, and most likely would not give me a general principle to rely on the next time I have to perform a microarray analysis.

Thank you in advance for your attention.

Best regards, Marco

--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)
Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht
E-mail: m.manca at maastrichtuniversity.nl
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka

________________________________________
From: Manca Marco (PATH)
Sent: Wednesday, 24 November 2010 17:05
To: J.Oosting at lumc.nl; bioconductor at stat.math.ethz.ch
Cc: mark.dunning at gmail.com
Subject: Re: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

Dear Jan,

thank you for your prompt reply.

May I bluntly ask why these probes are included on the chip, then?

Also, I have been comparing the results I obtain in Bioconductor with those of a colleague who is analyzing data from the same chip with Ingenuity Pathway Analysis, and she is apparently missing only a few hundred annotations rather than my tens of thousands. Is Ingenuity doing something wrong here (such as assigning annotations based on imperfect alignments?), or should I abide by those results and leave R/Bioconductor for these datasets?

Thank you in advance for any insight you will share with me.

Best regards, Marco

________________________________________
From: J.Oosting at lumc.nl
Sent: Wednesday, 24 November 2010 16:47
To: Manca Marco (PATH); bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

This is normal for this chip type. The HT12 contains a lot of probes that have no proper annotation. These are mostly ESTs that have been submitted to GenBank at some time, but have never been properly attributed to any gene.

For most types of analysis these extra probes are basically worthless, so I usually exclude them before the statistical analysis.

Jan

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch On Behalf Of Manca Marco (PATH)
> Sent: Wednesday, 24 November 2010 16:01
> To: bioconductor mailing list
> Subject: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db
> Importance: High
>
> Dearest BioConductors,
>
> good afternoon.
>
> I request your assistance on an issue about which I have found a few old
> posts, but for which I cannot find a solution.
>
> I am analyzing an experiment performed with Illumina's "human ht12 v3
> beadchip" and I am now trying to perform some GO and pathway analysis to
> make sense of the results I have obtained.
>
> The package "of choice" for annotating this chip should be
> illuminaHumanv3BeadID.db (but I get similar results with lumiHumanAll.db
> after converting Illumina probe IDs to nuIDs): the chip apparently has
> 48803 probes, while I can obtain annotations for roughly 27500 of them.
>
> > qcdata = capture.output(illuminaHumanv3BeadID())
> > head(qcdata, 35)
>  [1] "Quality control information for illuminaHumanv3BeadID:"
>  [2] ""
>  [3] ""
>  [4] "This package has the following mappings:"
>  [5] ""
>  [6] "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)"
>  [7] "illuminaHumanv3BeadIDALIAS2PROBE has 67274 mapped keys (of 109070 keys)"
>  [8] "illuminaHumanv3BeadIDCHR has 25726 mapped keys (of 27570 keys)"
>  [9] "illuminaHumanv3BeadIDCHRLENGTHS has 25 mapped keys (of 25 keys)"
> [10] "illuminaHumanv3BeadIDCHRLOC has 24916 mapped keys (of 27570 keys)"
> [11] "illuminaHumanv3BeadIDCHRLOCEND has 24916 mapped keys (of 27570 keys)"
> [12] "illuminaHumanv3BeadIDENSEMBL has 24681 mapped keys (of 27570 keys)"
> [13] "illuminaHumanv3BeadIDENSEMBL2PROBE has 17009 mapped keys (of 19892 keys)"
> [14] "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)"
> [15] "illuminaHumanv3BeadIDENZYME has 2890 mapped keys (of 27570 keys)"
> [16] "illuminaHumanv3BeadIDENZYME2PROBE has 857 mapped keys (of 901 keys)"
> [17] "illuminaHumanv3BeadIDGENENAME has 25726 mapped keys (of 27570 keys)"
> [18] "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)"
> [19] "illuminaHumanv3BeadIDGO2ALLPROBES has 11021 mapped keys (of 11236 keys)"
> [20] "illuminaHumanv3BeadIDGO2PROBE has 8010 mapped keys (of 8245 keys)"
> [21] "illuminaHumanv3BeadIDMAP has 25589 mapped keys (of 27570 keys)"
> [22] "illuminaHumanv3BeadIDOMIM has 17688 mapped keys (of 27570 keys)"
> [23] "illuminaHumanv3BeadIDPATH has 7029 mapped keys (of 27570 keys)"
> [24] "illuminaHumanv3BeadIDPATH2PROBE has 220 mapped keys (of 220 keys)"
> [25] "illuminaHumanv3BeadIDPFAM has 25262 mapped keys (of 27570 keys)"
> [26] "illuminaHumanv3BeadIDPMID has 25101 mapped keys (of 27570 keys)"
> [27] "illuminaHumanv3BeadIDPMID2PROBE has 231447 mapped keys (of 248847 keys)"
> [28] "illuminaHumanv3BeadIDPROSITE has 25262 mapped keys (of 27570 keys)"
> [29] "illuminaHumanv3BeadIDREFSEQ has 25726 mapped keys (of 27570 keys)"
> [30] "illuminaHumanv3BeadIDSYMBOL has 25726 mapped keys (of 27570 keys)"
> [31] "illuminaHumanv3BeadIDUNIGENE has 25256 mapped keys (of 27570 keys)"
> [32] "illuminaHumanv3BeadIDUNIPROT has 24730 mapped keys (of 27570 keys)"
> [33] ""
> [34] ""
> [35] "Additional Information about this package:"
>
> My sessionInfo() is as follows:
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=en_US.utf8 LC_NAME=C
>  [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] SPIA_1.4.0 RCurl_1.4-3
>  [3] bitops_1.0-4.1 annaffy_1.18.0
>  [5] KEGG.db_2.3.5 GO.db_2.3.5
>  [7] limma_3.2.3 illuminaHumanv3BeadID.db_1.4.1
>  [9] org.Hs.eg.db_2.3.6 lumi_1.12.4
> [11] MASS_7.3-7 RSQLite_0.9-2
> [13] DBI_0.2-5 preprocessCore_1.8.0
> [15] mgcv_1.6-2 affy_1.24.2
> [17] annotate_1.24.1 AnnotationDbi_1.8.2
> [19] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0 grid_2.10.1 lattice_0.18-3 Matrix_0.999375-44
> [5] nlme_3.1-97 tools_2.10.1 xtable_1.5-6
>
> Thank you in advance for any hints on how to solve the problem (or on why
> I see this discrepancy).
>
> Best regards, Marco
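The coverage figures in the QC output quoted above can be tabulated from the printed lines themselves. The following is a minimal base-R sketch, assuming the lines keep the "has N mapped keys (of M keys)" form shown in the post; the helper name `parse_qc_line` is hypothetical, and the three sample lines and the probe count of 48803 are copied from the post.

```r
# Three lines copied from the QC output in the original post.
qc_lines <- c(
  "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)",
  "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)",
  "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)"
)

# Hypothetical helper: extract the map name and both counts from one line.
parse_qc_line <- function(line) {
  pattern <- "^([^ ]+) has ([0-9]+) mapped keys \\(of ([0-9]+) keys\\)$"
  parts <- regmatches(line, regexec(pattern, line))[[1]]
  data.frame(map = parts[2],
             mapped = as.integer(parts[3]),
             total = as.integer(parts[4]),
             stringsAsFactors = FALSE)
}

qc <- do.call(rbind, lapply(qc_lines, parse_qc_line))

# Number of probes on the chip, as stated in the post.
chip_probes <- 48803L

# Mapped keys as a fraction of the package's keys and of the whole chip.
qc$frac_of_keys <- qc$mapped / qc$total
qc$frac_of_chip <- qc$mapped / chip_probes

print(qc)
```

Read this way, the package's 27570 keys correspond to roughly 56% of the 48803 probes, and the 25726 probes with an Entrez Gene ID to roughly 53%, which is the gap the thread is about.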
Microarray Annotation GO
@sean-davis-490
On Wed, Nov 24, 2010 at 1:48 PM, Manca Marco (PATH) <m.manca@maastrichtuniversity.nl> wrote:

> Dear Sean,
>
> thank you for your note on the difference between Ingenuity Pathway
> Analysis and GO/KEGG analysis.
>
> I would also appreciate feedback on my question, which was rather
> different from that of the person you answered. It concerns the more
> "raw" annotation of probes (LocusLink/Entrez IDs, UniProt, etc.) that I am
> having a hard time with: how would you explain (if you have any insight,
> of course) the difference in the number of annotated probes according to
> Bioconductor (around 27000, with either illuminaHumanv3BeadID.db or
> lumiHumanAll.db) and Ingenuity (around 47000-48000, according to my
> colleague's experience), at least for the Illumina human ht12 v3 beadchip?

Hi, Manca.

The annotation process is pretty straightforward for Bioconductor (although one of the Seattle folks or Pan may correct my comments slightly). The sequence identifiers (typically RefSeq or GenBank identifiers) from the manufacturer are mapped via Entrez Gene resources to an Entrez Gene ID. The other information available in the annotation packages is then derived from the Entrez Gene ID. If a sequence identifier does not map to an Entrez Gene ID, then no further information will be available.

In this particular case, there are many EST sequences represented on the array. If you take the GenBank accession of one of the probes that has no further information (you can get these from the xxxxACCNUM object in the annotation packages) and put it into the NCBI Entrez website, you will likely see that it does not map to an Entrez Gene. No rich annotation information will be obtained in this case.

I do not know how IPA does its annotation, but it could be significantly different from the process described for Bioconductor. Perhaps someone with more experience with IPA will be able to help here. Alternatively, you may contact IPA technical support to see how it is done.

> And, most important, how is an investigator supposed to decide which tool
> to use? Of course I could test them and then go to the bench to verify my
> results, but this would increase my expenses and delay my publications,
> and most likely would not give me a general principle to rely on the next
> time I have to perform a microarray analysis.

Unfortunately, what is a good tool (in the sense that it teaches you something of biological importance) in one situation might not be a good one for the next. And, of course, you may need to verify results at the bench. As for "general principles" to rely on, you will need to evaluate the tools and methods you use in the context of the experiment; there is not always one right answer. Finally, two tools that purport to perform the same function are often doing two very different things in practice, making a priori evaluation of effectiveness difficult.

Hope this helps a bit.

Sean
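The mapping chain Sean describes (probe, then GenBank/RefSeq accession, then Entrez Gene ID, then everything else) and Jan's practice of dropping unannotated probes before analysis can be sketched in R. The block below uses a small made-up lookup so that it runs without any annotation package; the probe names, Entrez IDs, and expression values are all hypothetical. With the real package, a comparable named vector could presumably be built with something like `unlist(mget(probe_ids, illuminaHumanv3BeadIDENTREZID, ifnotfound = NA))`, though that call is an assumption and has not been checked against the package version in the thread.

```r
# Hypothetical probe-to-Entrez mapping; NA marks probes (for example
# EST-derived ones) whose accession does not resolve to an Entrez Gene ID.
probe_to_entrez <- c(
  probe_A = "1234",
  probe_B = NA,
  probe_C = "5678",
  probe_D = NA
)

# Toy expression matrix: one row per probe, two samples.
expr <- matrix(c(7.1, 6.9,
                 5.2, 5.4,
                 9.8, 9.9,
                 4.0, 4.1),
               ncol = 2, byrow = TRUE,
               dimnames = list(names(probe_to_entrez), c("s1", "s2")))

# Keep only probes with an Entrez Gene ID before GO/pathway analysis.
annotated <- !is.na(probe_to_entrez[rownames(expr)])
expr_annotated <- expr[annotated, , drop = FALSE]

# The dropped probes are the ones worth looking up by accession at NCBI.
unannotated_probes <- rownames(expr)[!annotated]

print(rownames(expr_annotated))
print(unannotated_probes)
```

Filtering on the Entrez Gene ID rather than on any single downstream map reflects the point made above: in these packages the other annotations are derived from that ID, so a probe without one has nothing further to offer.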
Hi Sean, thank you for your insightful reply. Best regards, Marco -- Marco Manca, MD University of Maastricht Faculty of Health, Medicine and Life Sciences (FHML) Cardiovascular Research Institute (CARIM) Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands) Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht E-mail: m.manca at maastrichtuniversity.nl Office telephone: +31(0)433874633 Personal mobile: +31(0)626441205 Twitter: @markomanka ********************************************************************** *********************************************** This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA ********************************************************************** *********************************************** ________________________________________ Da: seandavi at gmail.com [seandavi at gmail.com] per conto di Sean Davis [sdavis2 at mail.nih.gov] Inviato: mercoled? 24 novembre 2010 21.27 A: Manca Marco (PATH) Cc: bioconductor at stat.math.ethz.ch Oggetto: Re: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db On Wed, Nov 24, 2010 at 1:48 PM, Manca Marco (PATH) <m.manca at="" maastrichtuniversity.nl<mailto:m.manca="" at="" maastrichtuniversity.nl="">> wrote: Dear Sean, thank you for your note on the difference between Ingenuity Pathway analysis and GO/KEGG analysis. 
I would hope to receive also a feedback to my question, which was rather different from that of the lady you answered to: concerning the somewhat more "raw" annotation of probes (LocusLink/EtrezIDs, UniProt, etc) I am having a hard time with, how would you explain (if you have any insight at all of course) the difference in the number of annotated probes according to Bioconductor (around 27000 either by illuminaHumanv3BeadID.db or by lumiHumanAll.db) or Ingenuity (around 47000-48000 according to my colleague's experience), at least for the Illumina human ht12 v3 beadchip? Hi, Manca. The annotation process is pretty straightforward for Bioconductor (although one of the Seattle folks or Pan may correct my comments slightly). The sequence identifiers (typically RefSeq or GenBank identifiers) from the manufacturer are mapped via Entrez Gene resources to an Entrez Gene Identifier. The other information available in the annotation packages is then derived from the Entrez Gene Id. If a sequence identifier does not map to an Entrez Gene ID, then there will not be further information available. In this particular case, there are many EST sequences represented on the array. If you take the Genbank accession of one of those probes that does not have further information (you can get these from the xxxxACCNUM object in the annotation packages) and put it into the NCBI Entrez website, you will likely see that it does not map to an Entrez Gene. No rich annotation information will be obtained in this case. I do not know how IPA does its annotation, but it could be significantly different from the one described for Bioconductor. Perhaps someone with more experience with IPA will be able to help here. Alternatively, you may contact the IPA technical support to see how it is done. And most important, how is an investigator supposed to decide which tool should be used? 
Of course I could test them and then go to the bench to verify my results, but this would increase my expenses and would delay my publications... and most likely wouldn't give me a general principle to rely on in this situation the next time I have to perform a microarray analysis... Unfortunately, what is a good tool (in the sense that it teaches you something of biological importance) in one situation might not be a good one for the next. And, of course, you may need to verify results at the bench. And as for "general principles" to rely on, you will need to evaluate the tools and methods you use in the context of the experiment--there is not always one right answer. Finally, often two tools that purport to perform the same function are actually doing two very different things in practice, making a priori evaluation of effectiveness difficult. Hope this helps a bit. Sean Thank you in advance for your attention. best regards, Marco -- Marco Manca, MD University of Maastricht Faculty of Health, Medicine and Life Sciences (FHML) Cardiovascular Research Institute (CARIM) Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands) Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht E-mail: m.manca at maastrichtuniversity.nl<mailto:m.manca at="" maastrichtuniversity.nl=""> Office telephone: +31(0)433874633 Personal mobile: +31(0)626441205 Twitter: @markomanka ********************************************************************** *********************************************** This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. 
If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA ********************************************************************** *********************************************** ________________________________________ Da: Manca Marco (PATH) Inviato: mercoled? 24 novembre 2010 17.05 A: J.Oosting at lumc.nl<mailto:j.oosting at="" lumc.nl="">; bioconductor at stat.math.ethz.ch<mailto:bioconductor at="" stat.math.ethz.ch=""> Cc: mark.dunning at gmail.com<mailto:mark.dunning at="" gmail.com=""> Oggetto: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db Dear Jan, thank you for your prompt reply. Can I blatantly ask why are these probes included in the chip if so? Yet I have been comparing the results I obtain in BioConductor with those of a colleague who is analyzing data obtained with the same chip but using Ingenuity Pathway Analysis and she is apparently missing only a few hundreds annotation rather than my tens of thousands... Is Ingenuity doing something wrong here (like attributing annotations based on imperfect alignments?) or shall I abide to those results and leave R/BioConductor for these datasets? Thank you in advance for any insight you will share with me. Best regards, Marco -- Marco Manca, MD University of Maastricht Faculty of Health, Medicine and Life Sciences (FHML) Cardiovascular Research Institute (CARIM) Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands) Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. 
Debyelaan 25, 6229 HX Maastricht E-mail: m.manca at maastrichtuniversity.nl<mailto:m.manca at="" maastrichtuniversity.nl=""> Office telephone: +31(0)433874633 Personal mobile: +31(0)626441205 Twitter: @markomanka ********************************************************************** *********************************************** This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA ********************************************************************** *********************************************** ________________________________________ Da: J.Oosting at lumc.nl<mailto:j.oosting at="" lumc.nl=""> [J.Oosting at lumc.nl<mailto:j.oosting at="" lumc.nl="">] Inviato: mercoled? 24 novembre 2010 16.47 A: Manca Marco (PATH); bioconductor at stat.math.ethz.ch<mailto:bioconductor at="" stat.math.ethz.ch=""> Oggetto: RE: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db This is normal for this chiptype. The HT12 contains a lot of probes that have no proper annotation. These are mostly ESTs that have been submitted to Genbank at some time, but have never been properly attributed to any gene. For most types of analysis these extra probes are basically worthless, so I usually exclude them before the statistical analysis. 
Jan

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch On Behalf Of Manca Marco (PATH)
> Sent: Wednesday, 24 November 2010 16:01
> To: bioconductor mailing list
> Subject: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db
> Importance: High
>
> Dearest BioConductors,
>
> good afternoon.
>
> I request your assistance with an issue about which I have found a few old
> posts, but for which I cannot find the solution.
>
> I am analyzing an experiment performed with Illumina's "human ht12 v3
> beadchip", and I am now trying to perform some GO and pathway analysis to
> make sense of the results I have obtained.
>
> The package "of choice" for annotating this chip should be
> illuminaHumanv3BeadID.db (but I get similar results with lumiHumanAll.db
> after converting Illumina probes' IDs to NuIDs): the chip apparently has
> 48803 probes, while I can obtain annotations for roughly 27500 of them.
>
> > qcdata = capture.output(illuminaHumanv3BeadID())
> > head(qcdata, 35)
>  [1] "Quality control information for illuminaHumanv3BeadID:"
>  [2] ""
>  [3] ""
>  [4] "This package has the following mappings:"
>  [5] ""
>  [6] "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)"
>  [7] "illuminaHumanv3BeadIDALIAS2PROBE has 67274 mapped keys (of 109070 keys)"
>  [8] "illuminaHumanv3BeadIDCHR has 25726 mapped keys (of 27570 keys)"
>  [9] "illuminaHumanv3BeadIDCHRLENGTHS has 25 mapped keys (of 25 keys)"
> [10] "illuminaHumanv3BeadIDCHRLOC has 24916 mapped keys (of 27570 keys)"
> [11] "illuminaHumanv3BeadIDCHRLOCEND has 24916 mapped keys (of 27570 keys)"
> [12] "illuminaHumanv3BeadIDENSEMBL has 24681 mapped keys (of 27570 keys)"
> [13] "illuminaHumanv3BeadIDENSEMBL2PROBE has 17009 mapped keys (of 19892 keys)"
> [14] "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)"
> [15] "illuminaHumanv3BeadIDENZYME has 2890 mapped keys (of 27570 keys)"
> [16] "illuminaHumanv3BeadIDENZYME2PROBE has 857 mapped keys (of 901 keys)"
> [17] "illuminaHumanv3BeadIDGENENAME has 25726 mapped keys (of 27570 keys)"
> [18] "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)"
> [19] "illuminaHumanv3BeadIDGO2ALLPROBES has 11021 mapped keys (of 11236 keys)"
> [20] "illuminaHumanv3BeadIDGO2PROBE has 8010 mapped keys (of 8245 keys)"
> [21] "illuminaHumanv3BeadIDMAP has 25589 mapped keys (of 27570 keys)"
> [22] "illuminaHumanv3BeadIDOMIM has 17688 mapped keys (of 27570 keys)"
> [23] "illuminaHumanv3BeadIDPATH has 7029 mapped keys (of 27570 keys)"
> [24] "illuminaHumanv3BeadIDPATH2PROBE has 220 mapped keys (of 220 keys)"
> [25] "illuminaHumanv3BeadIDPFAM has 25262 mapped keys (of 27570 keys)"
> [26] "illuminaHumanv3BeadIDPMID has 25101 mapped keys (of 27570 keys)"
> [27] "illuminaHumanv3BeadIDPMID2PROBE has 231447 mapped keys (of 248847 keys)"
> [28] "illuminaHumanv3BeadIDPROSITE has 25262 mapped keys (of 27570 keys)"
> [29] "illuminaHumanv3BeadIDREFSEQ has 25726 mapped keys (of 27570 keys)"
> [30] "illuminaHumanv3BeadIDSYMBOL has 25726 mapped keys (of 27570 keys)"
> [31] "illuminaHumanv3BeadIDUNIGENE has 25256 mapped keys (of 27570 keys)"
> [32] "illuminaHumanv3BeadIDUNIPROT has 24730 mapped keys (of 27570 keys)"
> [33] ""
> [34] ""
> [35] "Additional Information about this package:"
>
> My sessionInfo() is as follows:
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=en_US.utf8       LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] SPIA_1.4.0            RCurl_1.4-3
>  [3] bitops_1.0-4.1        annaffy_1.18.0
>  [5] KEGG.db_2.3.5         GO.db_2.3.5
>  [7] limma_3.2.3           illuminaHumanv3BeadID.db_1.4.1
>  [9] org.Hs.eg.db_2.3.6    lumi_1.12.4
> [11] MASS_7.3-7            RSQLite_0.9-2
> [13] DBI_0.2-5             preprocessCore_1.8.0
> [15] mgcv_1.6-2            affy_1.24.2
> [17] annotate_1.24.1       AnnotationDbi_1.8.2
> [19] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0 grid_2.10.1  lattice_0.18-3 Matrix_0.999375-44
> [5] nlme_3.1-97   tools_2.10.1 xtable_1.5-6
>
> Thank you in advance for any hints on how to solve the problem (or on why
> I see this discrepancy).
>
> Best regards, Marco
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
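For readers who want to put numbers on the discrepancy described in the original post, here is a minimal base-R sketch. It only parses four of the QC lines quoted in the post; nothing is queried from the annotation package itself. The column names `pct_of_package` and `pct_of_chip` are made up for this illustration. One thing it makes visible: the denominators in the QC lines are the package's own 27570 keys, not the 48803 probes the post reports for the chip.

```r
# A few lines copied verbatim from the QC output quoted in the post
qc <- c(
  "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)",
  "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)",
  "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)",
  "illuminaHumanv3BeadIDPATH has 7029 mapped keys (of 27570 keys)"
)

# Extract the mapping name and the two counts from each line
pattern <- "^([^ ]+) has ([0-9]+) mapped keys \\(of ([0-9]+) keys\\)$"
parts <- regmatches(qc, regexec(pattern, qc))

coverage <- data.frame(
  mapping = sapply(parts, `[`, 2),
  mapped  = as.integer(sapply(parts, `[`, 3)),
  keys    = as.integer(sapply(parts, `[`, 4)),
  stringsAsFactors = FALSE
)

# Probe count for the chip as stated in the post
chip_probes <- 48803

# Coverage relative to the package's keys and relative to the whole chip
coverage$pct_of_package <- round(100 * coverage$mapped / coverage$keys, 1)
coverage$pct_of_chip    <- round(100 * coverage$mapped / chip_probes, 1)

print(coverage)
```

The same arithmetic can be applied to any other mapping line of the QC output by adding it to `qc`.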
Yes - thank you, Marco, for expanding on my question, and thank you, Sean, for your prompt answer (unfortunately it is what I expected, so it seems we cannot afford to give up on Ingenuity for now...).

Ina

----- Original Message -----
From: "Manca Marco (PATH)" <m.manca at maastrichtuniversity.nl>
To: "Sean Davis" <sdavis2 at mail.nih.gov>
Cc: bioconductor at stat.math.ethz.ch
Sent: Wednesday, November 24, 2010 3:35:10 PM
Subject: [BioC] R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

Hi Sean,

thank you for your insightful reply.

Best regards,
Marco

________________________________________
From: seandavi at gmail.com on behalf of Sean Davis [sdavis2 at mail.nih.gov]
Sent: Wednesday, 24 November 2010 21:27
To: Manca Marco (PATH)
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

On Wed, Nov 24, 2010 at 1:48 PM, Manca Marco (PATH) <m.manca at maastrichtuniversity.nl> wrote:

> Dear Sean,
>
> thank you for your note on the difference between Ingenuity Pathway
> Analysis and GO/KEGG analysis.
>
> I would also hope to receive feedback on my question, which was rather
> different from that of the lady you answered: concerning the somewhat more
> "raw" annotation of probes (LocusLink/Entrez IDs, UniProt, etc.) that I am
> having a hard time with, how would you explain (if you have any insight at
> all, of course) the difference in the number of annotated probes according
> to Bioconductor (around 27000, either by illuminaHumanv3BeadID.db or by
> lumiHumanAll.db) and Ingenuity (around 47000-48000 according to my
> colleague's experience), at least for the Illumina human ht12 v3 beadchip?

Hi, Manca. The annotation process is pretty straightforward for Bioconductor (although one of the Seattle folks or Pan may correct my comments slightly). The sequence identifiers (typically RefSeq or GenBank identifiers) from the manufacturer are mapped via Entrez Gene resources to an Entrez Gene identifier. The other information available in the annotation packages is then derived from the Entrez Gene ID. If a sequence identifier does not map to an Entrez Gene ID, then no further information will be available.

In this particular case, there are many EST sequences represented on the array. If you take the GenBank accession of one of those probes that does not have further information (you can get these from the xxxxACCNUM object in the annotation packages) and put it into the NCBI Entrez website, you will likely see that it does not map to an Entrez Gene. No rich annotation information will be obtained in this case.

I do not know how IPA does its annotation, but it could be significantly different from the process described for Bioconductor. Perhaps someone with more experience with IPA will be able to help here. Alternatively, you may contact IPA technical support to see how it is done.

> And most important, how is an investigator supposed to decide which tool
> should be used? Of course I could test them and then go to the bench to
> verify my results, but this would increase my expenses and would delay my
> publications... and most likely wouldn't give me a general principle to
> rely on in this situation the next time I have to perform a microarray
> analysis...

Unfortunately, what is a good tool (in the sense that it teaches you something of biological importance) in one situation might not be a good one in the next. And, of course, you may need to verify results at the bench. As for "general principles" to rely on, you will need to evaluate the tools and methods you use in the context of the experiment--there is not always one right answer. Finally, two tools that purport to perform the same function are often doing two very different things in practice, making a priori evaluation of effectiveness difficult.

Hope this helps a bit.

Sean
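The two-step mapping Sean describes (accession to Entrez Gene ID, then Entrez Gene ID to everything else), together with Jan's habit of dropping unannotated probes before the statistics, can be sketched in base R as follows. This is only a toy illustration under stated assumptions: every probe ID, accession, Entrez Gene ID and symbol below is hypothetical, and the lookup vectors merely stand in for the real annotation package, which is not loaded here.

```r
# Hypothetical probes and their manufacturer accessions (illustration only)
probe_accession <- c(probe_A = "NM_000001", probe_B = "NM_000002",
                     probe_C = "EST_000001", probe_D = "EST_000002")

# Hypothetical accession -> Entrez Gene ID lookup; the ESTs are absent,
# mimicking sequences that were never attributed to a gene
accession_entrez <- c(NM_000001 = "1001", NM_000002 = "1002")

# Hypothetical Entrez Gene ID -> symbol lookup, standing in for all the
# gene-level annotation that is derived from the Entrez Gene ID
entrez_symbol <- c("1001" = "GENE1", "1002" = "GENE2")

# Step 1: accession -> Entrez Gene ID (NA when the accession does not map)
entrez <- unname(accession_entrez[probe_accession])

# Step 2: Entrez Gene ID -> symbol (an NA from step 1 stays NA)
symbol <- unname(entrez_symbol[entrez])

annot <- data.frame(
  probe     = names(probe_accession),
  accession = unname(probe_accession),
  entrez    = entrez,
  symbol    = symbol,
  stringsAsFactors = FALSE
)
print(annot)

# Toy expression matrix: rows are probes, columns are two samples
set.seed(1)
expr <- matrix(rnorm(8), nrow = 4,
               dimnames = list(annot$probe, c("sample1", "sample2")))

# Keep only probes that reached an Entrez Gene ID before any statistics
expr_annotated <- expr[!is.na(annot$entrez), , drop = FALSE]
print(rownames(expr_annotated))
```

Because every downstream annotation hangs off the Entrez Gene ID in this scheme, a probe that fails step 1 necessarily has `NA` in every later column, which is the pattern visible in the QC counts of the original post.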
Dear Ina, maybe other fellow BioConductors will correct me, but I would say that it is not completely true you HAVE TO depend on Ingenuity for the task of recovering "pathology relatedness" data. One way in BioConductor could be to make use of the information available within the annotation packages through links to OMIM. By inputting the line > ls("package:YourAnnotationPackageHere.db" you will receive a list of pockets of your annotation package among which YourAnnotationPackageHereOMIM which is described as (I'm copying and pasting the description of hgu133plus2OMIM as an example): "Each manufacturer identifier is mapped to a vector of OMIM identifiers. The vector length may be one or longer, depending on how many OMIM identifiers the manufacturer identifier maps to. An NA is reported for any manufacturer identifier that cannot be mapped to an OMIM identifier at this time. OMIM is based upon the book Mendelian Inheritance in Man (V. A. McKusick) and focuses primarily on inherited or heritable genetic diseases. It contains textual information, pictures, and reference information that can be searched using various terms, among which the MIM number is one. Mappings were based on data provided by: Entrez Gene ftp://ftp.ncbi.nlm.nih.gov/gene/DATA With a date stamp from the source of: 2010-Mar1 " OMIM nowadays interpretes " inherited or heritable genetic diseases" in the largest meaning as to include also susceptibilities &Co. I would dare saying that then you can treat this identifiers for your queries and any further enrichment analysis you have in mind... I hope this helps. Best regards, Marco. -- Marco Manca, MD University of Maastricht Faculty of Health, Medicine and Life Sciences (FHML) Cardiovascular Research Institute (CARIM) Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands) Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. 
Debyelaan 25, 6229 HX Maastricht E-mail: m.manca at maastrichtuniversity.nl Office telephone: +31(0)433874633 Personal mobile: +31(0)626441205 Twitter: @markomanka ********************************************************************** *********************************************** This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA ********************************************************************** *********************************************** ________________________________________ Da: Ina Hoeschele [inah at vbi.vt.edu] Inviato: mercoled? 24 novembre 2010 21.46 A: Manca Marco (PATH) Cc: bioconductor at stat.math.ethz.ch; Sean Davis Oggetto: Re: [BioC] R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db yes - thank you, Marco, for expanding on my question, and thank you, Sean, for your prompt answer (unfortunately what I expected, so it seems I/we cannot afford giving up on Ingenuity for now ...) Ina ----- Original Message ----- From: "Manca Marco (PATH)" manca@maastrichtuniversity.nl> To: "Sean Davis" nih.gov> Cc: bioconductor at stat.math.ethz.ch Sent: Wednesday, November 24, 2010 3:35:10 PM Subject: [BioC] R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db Hi Sean, thank you for your insightful reply. 
Best regards, Marco -- Marco Manca, MD University of Maastricht Faculty of Health, Medicine and Life Sciences (FHML) Cardiovascular Research Institute (CARIM) Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands) Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht E-mail: m.manca at maastrichtuniversity.nl Office telephone: +31(0)433874633 Personal mobile: +31(0)626441205 Twitter: @markomanka ********************************************************************** *********************************************** This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA ********************************************************************** *********************************************** ________________________________________ Da: seandavi at gmail.com [seandavi at gmail.com] per conto di Sean Davis [sdavis2 at mail.nih.gov] Inviato: mercoled? 24 novembre 2010 21.27 A: Manca Marco (PATH) Cc: bioconductor at stat.math.ethz.ch Oggetto: Re: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db On Wed, Nov 24, 2010 at 1:48 PM, Manca Marco (PATH) manca at maastrichtuniversity.nl<mailto:m.manca at="" maastrichtuniversity.nl="">> wrote: Dear Sean, thank you for your note on the difference between Ingenuity Pathway analysis and GO/KEGG analysis. 
I would hope to receive also a feedback to my question, which was rather different from that of the lady you answered to: concerning the somewhat more "raw" annotation of probes (LocusLink/EtrezIDs, UniProt, etc) I am having a hard time with, how would you explain (if you have any insight at all of course) the difference in the number of annotated probes according to Bioconductor (around 27000 either by illuminaHumanv3BeadID.db or by lumiHumanAll.db) or Ingenuity (around 47000-48000 according to my colleague's experience), at least for the Illumina human ht12 v3 beadchip? Hi, Manca. The annotation process is pretty straightforward for Bioconductor (although one of the Seattle folks or Pan may correct my comments slightly). The sequence identifiers (typically RefSeq or GenBank identifiers) from the manufacturer are mapped via Entrez Gene resources to an Entrez Gene Identifier. The other information available in the annotation packages is then derived from the Entrez Gene Id. If a sequence identifier does not map to an Entrez Gene ID, then there will not be further information available. In this particular case, there are many EST sequences represented on the array. If you take the Genbank accession of one of those probes that does not have further information (you can get these from the xxxxACCNUM object in the annotation packages) and put it into the NCBI Entrez website, you will likely see that it does not map to an Entrez Gene. No rich annotation information will be obtained in this case. I do not know how IPA does its annotation, but it could be significantly different from the one described for Bioconductor. Perhaps someone with more experience with IPA will be able to help here. Alternatively, you may contact the IPA technical support to see how it is done. And most important, how is an investigator supposed to decide which tool should be used? 
> Of course I could test them and then go to the bench to verify my results, but this would increase my expenses and would delay my publications... and most likely wouldn't give me a general principle to rely on in this situation the next time I have to perform a microarray analysis...

Unfortunately, a tool that works well in one situation (in the sense that it teaches you something of biological importance) might not work well in the next. And, of course, you may need to verify results at the bench. As for "general principles" to rely on, you will need to evaluate the tools and methods you use in the context of the experiment; there is not always one right answer. Finally, two tools that purport to perform the same function are often doing two very different things in practice, which makes a priori evaluation of their effectiveness difficult.

Hope this helps a bit.

Sean

> Thank you in advance for your attention.
> best regards, Marco

________________________________________
From: Manca Marco (PATH)
Sent: Wednesday, 24 November 2010 17:05
To: J.Oosting at lumc.nl; bioconductor at stat.math.ethz.ch
Cc: mark.dunning at gmail.com
Subject: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

Dear Jan,

thank you for your prompt reply. May I bluntly ask why these probes are included on the chip, then?

I have been comparing the results I obtain in Bioconductor with those of a colleague who is analyzing data from the same chip with Ingenuity Pathway Analysis, and she is apparently missing only a few hundred annotations rather than my tens of thousands... Is Ingenuity doing something wrong here (such as assigning annotations based on imperfect alignments?), or should I accept those results and abandon R/Bioconductor for these datasets?

Thank you in advance for any insight you will share with me.

Best regards, Marco

________________________________________
From: J.Oosting at lumc.nl
Sent: Wednesday, 24 November 2010 16:47
To: Manca Marco (PATH); bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

This is normal for this chip type. The HT12 contains a lot of probes that have no proper annotation. These are mostly ESTs that were submitted to GenBank at some point but have never been properly attributed to any gene. For most types of analysis these extra probes are basically worthless, so I usually exclude them before the statistical analysis.
Jan

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch On Behalf Of Manca Marco (PATH)
> Sent: Wednesday, 24 November 2010 16:01
> To: bioconductor mailing list
> Subject: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db
> Importance: High
>
> Dearest BioConductors,
>
> good afternoon.
>
> I request your assistance on an issue about which I have found a few old
> posts, but for which I cannot find the solution.
>
> I am analyzing an experiment performed with Illumina's "human ht12 v3
> beadchip" and I am now trying to perform some GO and pathway analysis to
> make sense of the results I have obtained.
>
> The package "of choice" for annotating this chip should be
> illuminaHumanv3BeadID.db (I get similar results with lumiHumanAll.db
> after converting Illumina probe IDs to NuIDs): the chip apparently has
> 48803 probes, while I can obtain annotations for only about 27500 of them.
>
> > qcdata = capture.output(illuminaHumanv3BeadID())
> > head(qcdata, 35)
>  [1] "Quality control information for illuminaHumanv3BeadID:"
>  [2] ""
>  [3] ""
>  [4] "This package has the following mappings:"
>  [5] ""
>  [6] "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)"
>  [7] "illuminaHumanv3BeadIDALIAS2PROBE has 67274 mapped keys (of 109070 keys)"
>  [8] "illuminaHumanv3BeadIDCHR has 25726 mapped keys (of 27570 keys)"
>  [9] "illuminaHumanv3BeadIDCHRLENGTHS has 25 mapped keys (of 25 keys)"
> [10] "illuminaHumanv3BeadIDCHRLOC has 24916 mapped keys (of 27570 keys)"
> [11] "illuminaHumanv3BeadIDCHRLOCEND has 24916 mapped keys (of 27570 keys)"
> [12] "illuminaHumanv3BeadIDENSEMBL has 24681 mapped keys (of 27570 keys)"
> [13] "illuminaHumanv3BeadIDENSEMBL2PROBE has 17009 mapped keys (of 19892 keys)"
> [14] "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)"
> [15] "illuminaHumanv3BeadIDENZYME has 2890 mapped keys (of 27570 keys)"
> [16] "illuminaHumanv3BeadIDENZYME2PROBE has 857 mapped keys (of 901 keys)"
> [17] "illuminaHumanv3BeadIDGENENAME has 25726 mapped keys (of 27570 keys)"
> [18] "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)"
> [19] "illuminaHumanv3BeadIDGO2ALLPROBES has 11021 mapped keys (of 11236 keys)"
> [20] "illuminaHumanv3BeadIDGO2PROBE has 8010 mapped keys (of 8245 keys)"
> [21] "illuminaHumanv3BeadIDMAP has 25589 mapped keys (of 27570 keys)"
> [22] "illuminaHumanv3BeadIDOMIM has 17688 mapped keys (of 27570 keys)"
> [23] "illuminaHumanv3BeadIDPATH has 7029 mapped keys (of 27570 keys)"
> [24] "illuminaHumanv3BeadIDPATH2PROBE has 220 mapped keys (of 220 keys)"
> [25] "illuminaHumanv3BeadIDPFAM has 25262 mapped keys (of 27570 keys)"
> [26] "illuminaHumanv3BeadIDPMID has 25101 mapped keys (of 27570 keys)"
> [27] "illuminaHumanv3BeadIDPMID2PROBE has 231447 mapped keys (of 248847 keys)"
> [28] "illuminaHumanv3BeadIDPROSITE has 25262 mapped keys (of 27570 keys)"
> [29] "illuminaHumanv3BeadIDREFSEQ has 25726 mapped keys (of 27570 keys)"
> [30] "illuminaHumanv3BeadIDSYMBOL has 25726 mapped keys (of 27570 keys)"
> [31] "illuminaHumanv3BeadIDUNIGENE has 25256 mapped keys (of 27570 keys)"
> [32] "illuminaHumanv3BeadIDUNIPROT has 24730 mapped keys (of 27570 keys)"
> [33] ""
> [34] ""
> [35] "Additional Information about this package:"
>
> My sessionInfo() is as follows:
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.utf8        LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8         LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=en_US.utf8        LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8  LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] SPIA_1.4.0          RCurl_1.4-3
>  [3] bitops_1.0-4.1      annaffy_1.18.0
>  [5] KEGG.db_2.3.5       GO.db_2.3.5
>  [7] limma_3.2.3         illuminaHumanv3BeadID.db_1.4.1
>  [9] org.Hs.eg.db_2.3.6  lumi_1.12.4
> [11] MASS_7.3-7          RSQLite_0.9-2
> [13] DBI_0.2-5           preprocessCore_1.8.0
> [15] mgcv_1.6-2          affy_1.24.2
> [17] annotate_1.24.1     AnnotationDbi_1.8.2
> [19] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0 grid_2.10.1 lattice_0.18-3 Matrix_0.999375-44
> [5] nlme_3.1-97 tools_2.10.1 xtable_1.5-6
>
> Thank you in advance for any hints on how to solve the problem (or on why
> I see this discrepancy).
>
> Best regards, Marco
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
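The coverage figures in the QC output, and the exclusion step Jan describes, can be sketched in base R as follows. This is only an illustration under stated assumptions: the three counts are copied from the post above, while the probe names, Entrez IDs and expression values are invented toy data. With the real annotation package loaded, the set of annotated probes would more likely come from something like mappedkeys(illuminaHumanv3BeadIDENTREZID) (mappedkeys() is provided by AnnotationDbi) rather than from a hand-written vector.

```r
# Counts copied from the post: probes on the chip, keys in the annotation
# package, and keys that map to an Entrez Gene ID.
n_on_chip <- 48803
n_in_pkg  <- 27570
n_entrez  <- 25726

# Fraction of the chip covered by the package and by Entrez annotations.
coverage <- c(in_package  = n_in_pkg / n_on_chip,
              with_entrez = n_entrez / n_on_chip)
print(round(coverage, 3))

# Toy stand-in for a probe -> Entrez ID lookup (all IDs are hypothetical).
# NA marks a probe with no annotation.
probe_to_entrez <- c(probe_A = "1001", probe_B = NA,
                     probe_C = "1003", probe_D = NA)

# Toy expression matrix: one row per probe, one column per sample.
expr <- matrix(seq_len(8), nrow = 4,
               dimnames = list(names(probe_to_entrez),
                               c("sample1", "sample2")))

# Keep only the probes that carry an Entrez ID before any statistics.
annotated      <- names(probe_to_entrez)[!is.na(probe_to_entrez)]
expr_annotated <- expr[rownames(expr) %in% annotated, , drop = FALSE]
print(dim(expr_annotated))
```

Filtering on Entrez ID is just one possible criterion; which mapping to filter on (Entrez, symbol, GO) depends on the downstream analysis.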