hugene10sttranscriptclusterACCNUM has no mappings


Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 8.0 years ago

United States

Hi. Sorry for the delay; I was not in the office for almost a week (and I left the day before this question popped up).

Part of the reason for the confusion here is that the ACCNUM field is supposed to represent the source accessions that were used when designing the package. In that sense ACCNUM is kind of an anachronism, since I don't think people really design chips this way very much anymore, so the reason that bimap is even present is largely for backwards compatibility more than anything else. This is why the man page for the ACCNUM mapping says this:

"For chip packages such as this, the ACCNUM mapping comes directly from the manufacturer. This is different from other mappings which are mapped onto the probes via an Entrez Gene identifier."

Anyhow, the code that builds the ChipDb package is proceeding under the notion that you would only "have" those special ACCNUM values if those were listed in your primary (fileName) set of keys. That is, if the probes are not really based on GenBank accessions, then you don't really have any ACCNUM values anyway, and that field should (in that case) probably be left out entirely.

So if you don't have legitimate ACCNUM values (that is, you are not dealing with an old chip where these really are the primary initial keys that everything was based off of), then I don't think you should fake them into the package by including them first. Effectively, what you would be doing is inadvertently resurrecting old retired IDs from the dead. I mean, yes, you can extract them out like that with old dead accession numbers, but I don't think it's best practice to do that. Those IDs were presumably retired for a reason.

I hope this helps to explain things better,

Marc

On 08/29/2014 07:15 AM, James W. MacDonald wrote:
> Hi Thomas,
>
> I built that package, and as you note, there are no accession numbers.
> But maybe that is because I misunderstand something, so I am directly
> including Marc Carlson in this conversation.
>
> Since the annotation packages are Gene ID-centric, I create two files,
> one with probeid->GeneID, and one with probeid->GenBank/RefSeq ID. I
> then use the first file as the primary annotation file, and the second
> as the 'otherSrc' file. If I then run makeDBPackage(), I get this output:
>
>     baseMapType is eg
>     Prepending Metadata
>     Creating Genes table
>     Appending Probes
>     Found 0 Probe Accessions
>     Appending Gene Info
>     Found 19962 Gene Names
>     Found 19962 Gene Symbols
>     <snip>
>
> But if I then reverse the source files, using the second file as the
> primary annotation file, and the GeneID file as the 'otherSrc' file, I get:
>
>     baseMapType is gb or gbNRef
>     Prepending Metadata
>     Creating Genes table
>     Appending Probes
>     Found 21941 Probe Accessions
>     Appending Gene Info
>     Found 20195 Gene Names
>     Found 20195 Gene Symbols
>     <snip>
>
> From my understanding of the SQLForge vignette, I should be able to
> use either ordering and get identical results, but obviously this is
> not the case. Marc, can you shed some light on this? Evidently I
> should re-make the packages using gbNRef rather than eg as the
> baseMapType.
>
> Best,
>
> Jim
>
> On Fri, Aug 29, 2014 at 4:30 AM, Thomas Pfau <thomas.pfau at uni.lu> wrote:
>
>     Hello,
>
>     I just tried to get a probe-to-accession matching from the above
>     annotation database. In particular, it does not yield any mappings
>     for accessions, i.e.
>
>     x <- hugene10sttranscriptclusterACCNUM
>     mapped_probes <- mappedkeys(x)
>
>     yields an empty mapped_probes list.
>
>     I'm running R 3.1.1 on Ubuntu.
>     The loaded packages are:
>
>     [1] oligo_1.28.2 Biostrings_2.32.1 XVector_0.4.0
>     [4] IRanges_1.22.10 oligoClasses_1.26.0 hugene10sttranscriptcluster.db_8.1.0
>     [7] org.Hs.eg.db_2.14.0 RSQLite_0.11.4 DBI_0.2-7
>     [10] AnnotationDbi_1.26.0 GenomeInfoDb_1.0.2 Biobase_2.24.0
>     [13] BiocGenerics_0.10.0 BiocInstaller_1.14.2
>
>     and capture.output(hugene10sttranscriptcluster()) yields:
>
>     [1] "Quality control information for hugene10sttranscriptcluster:"
>     [2] ""
>     [3] ""
>     [4] "This package has the following mappings:"
>     [5] ""
>     [6] "hugene10sttranscriptclusterACCNUM has 0 mapped keys (of 33297 keys)"
>     [7] "hugene10sttranscriptclusterALIAS2PROBE has 60778 mapped keys (of 103510 keys)"
>     [8] "hugene10sttranscriptclusterCHR has 19962 mapped keys (of 33297 keys)"
>     [9] "hugene10sttranscriptclusterCHRLENGTHS has 93 mapped keys (of 93 keys)"
>     [10] "hugene10sttranscriptclusterCHRLOC has 19424 mapped keys (of 33297 keys)"
>     [11] "hugene10sttranscriptclusterCHRLOCEND has 19424 mapped keys (of 33297 keys)"
>     [12] "hugene10sttranscriptclusterENSEMBL has 19416 mapped keys (of 33297 keys)"
>     [13] "hugene10sttranscriptclusterENSEMBL2PROBE has 20590 mapped keys (of 28046 keys)"
>     [14] "hugene10sttranscriptclusterENTREZID has 19962 mapped keys (of 33297 keys)"
>     [15] "hugene10sttranscriptclusterENZYME has 2201 mapped keys (of 33297 keys)"
>     [16] "hugene10sttranscriptclusterENZYME2PROBE has 958 mapped keys (of 975 keys)"
>     [17] "hugene10sttranscriptclusterGENENAME has 19962 mapped keys (of 33297 keys)"
>     [18] "hugene10sttranscriptclusterGO has 17412 mapped keys (of 33297 keys)"
>     [19] "hugene10sttranscriptclusterGO2ALLPROBES has 17930 mapped keys (of 18078 keys)"
>     [20] "hugene10sttranscriptclusterGO2PROBE has 13970 mapped keys (of 14134 keys)"
>     [21] "hugene10sttranscriptclusterMAP has 19832 mapped keys (of 33297 keys)"
>     [22] "hugene10sttranscriptclusterOMIM has 13778 mapped keys (of 33297 keys)"
>     [23] "hugene10sttranscriptclusterPATH has 5768 mapped keys (of 33297 keys)"
>     [24] "hugene10sttranscriptclusterPATH2PROBE has 229 mapped keys (of 229 keys)"
>     [25] "hugene10sttranscriptclusterPFAM has 18146 mapped keys (of 33297 keys)"
>     [26] "hugene10sttranscriptclusterPMID has 19726 mapped keys (of 33297 keys)"
>     [27] "hugene10sttranscriptclusterPMID2PROBE has 396421 mapped keys (of 412133 keys)"
>     [28] "hugene10sttranscriptclusterPROSITE has 18146 mapped keys (of 33297 keys)"
>     [29] "hugene10sttranscriptclusterREFSEQ has 19873 mapped keys (of 33297 keys)"
>     [30] "hugene10sttranscriptclusterSYMBOL has 19962 mapped keys (of 33297 keys)"
>     [31] "hugene10sttranscriptclusterUNIGENE has 19578 mapped keys (of 33297 keys)"
>     [32] "hugene10sttranscriptclusterUNIPROT has 18193 mapped keys (of 33297 keys)"
>     [33] ""
>     [34] ""
>     [35] "Additional Information about this package:"
>     [36] ""
>     [37] "DB schema: HUMANCHIP_DB"
>     [38] "DB schema version: 2.1"
>     [39] "Organism: Homo sapiens"
>     [40] "Date for NCBI data: 2014-Mar13"
>     [41] "Date for GO data: 20140308"
>     [42] "Date for KEGG data: 2011-Mar15"
>     [43] "Date for Golden Path data: 2010-Mar22"
>     [44] "Date for Ensembl data: 2014-Feb26"
>
>     It seems like something is broken there, as shown in line [6]:
>     [6] "hugene10sttranscriptclusterACCNUM has 0 mapped keys (of 33297 keys)"
>
>     Any ideas on how to solve this? Or whether this is a bug on my
>     side or on the package side?
>
>     Kind Regards
>
>     Thomas
>
>     --
>     Université du Luxembourg
>     Faculté des Sciences, de la Technologie et de la Communication
>     Campus Limpertsberg, BRB 2.13
>     162a, avenue de la Faïencerie
>     L-1511 Luxembourg
>     Email: thomas.pfau at uni.lu
>
>     _______________________________________________
>     Bioconductor mailing list
>     Bioconductor at r-project.org
>     https://stat.ethz.ch/mailman/listinfo/bioconductor
>     Search the archives:
>     http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
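A toy base-R model of the behaviour Marc describes may make the two build logs quoted above easier to compare. This is only a sketch, not the AnnotationForge code: the function `toy_accnum()`, the two data frames, and all probe and accession IDs are invented for illustration. Only the `baseMapType` values `eg`, `gb` and `gbNRef` come from the thread.

```r
# Toy model (NOT the real makeDBPackage() internals): the ACCNUM map is
# filled only when the primary file (fileName) is keyed on accessions.
toy_accnum <- function(primary, base_map_type) {
  if (!base_map_type %in% c("gb", "gbNRef")) {
    # Gene-ID-based build: no manufacturer accessions are recorded.
    return(setNames(character(0), character(0)))
  }
  has_id <- !is.na(primary$id)
  setNames(primary$id[has_id], primary$probe[has_id])
}

# Hypothetical inputs: the same probes, keyed once on Entrez Gene IDs
# and once on accessions. All IDs are made up.
probe_to_gene <- data.frame(probe = c("p1", "p2", "p3"),
                            id = c("1000", "2000", NA),
                            stringsAsFactors = FALSE)
probe_to_acc <- data.frame(probe = c("p1", "p2", "p3"),
                           id = c("NM_000001", "NM_000002", NA),
                           stringsAsFactors = FALSE)

acc_from_eg <- toy_accnum(probe_to_gene, "eg")      # primary file = gene IDs
acc_from_gb <- toy_accnum(probe_to_acc, "gbNRef")   # primary file = accessions

cat("Found", length(acc_from_eg), "Probe Accessions (eg build)\n")
cat("Found", length(acc_from_gb), "Probe Accessions (gbNRef build)\n")
```

In this model, swapping which file is primary changes whether any accessions are recorded at all, which mirrors the difference between "Found 0 Probe Accessions" and "Found 21941 Probe Accessions" in Jim's two runs.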

Annotation GO ChipDb probe • 1.8k views

ADD COMMENT • link updated 9.9 years ago by Thomas Pfau ▴ 20 • written 9.9 years ago by Marc Carlson ★ 7.2k


Thomas Pfau ▴ 20

@thomas-pfau-6715

Last seen 8.9 years ago

Luxembourg

Hi Marc,

Thanks for the clarification. I just stumbled over this as I read that newer chips often have transcript-specific probes, and since Entrez Gene IDs do not reflect those probes, I was kind of hoping that these accessions would allow me a more precise mapping (or at least the potential to then get other database IDs that match the specific transcripts out of the accessions).

Learning that the accessions are not the way to go, I'm wondering whether there is any linkage to transcripts.

Best,

Thomas

On 09/04/2014 02:38 AM, Marc Carlson wrote:
> [...]

ADD COMMENT • link 9.9 years ago Thomas Pfau ▴ 20


Hi Thomas,

You are correct that the current ChipDb packages (as generated by the makeDBPackage function) are not designed for transcript-level specificity at all. They are meant to be gene-centric only. A popular Bioconductor object that works at the transcript level would be something like the TranscriptDb object (which has a different use case).

It's possible that it's finally time to think about making some transcript-centric ChipDb style of objects. And at first blush these might not even be all that tricky to make and use. But before we go too far down that path, I am curious how many platforms would actually be able to take advantage of that (and how many probes on those platforms could even detect with that level of specificity)?

Marc

On 09/03/2014 11:25 PM, Thomas Pfau wrote:
> [...]
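To illustrate what "gene centric" means for transcript lookups, here is a small base-R sketch. It assumes, as a simplification, that transcript IDs reach a probe only by joining through the gene ID; the tables and the IDs (`p1`, `g1`, `tx1`, and so on) are hypothetical and do not come from any real package.

```r
# Hypothetical toy tables; every ID here is invented for illustration.
# Suppose p1 physically targets an exon found only in tx1, and p2 an exon
# found only in tx2. A gene-centric annotation cannot record that.
probe_to_gene <- data.frame(probe = c("p1", "p2"),
                            gene = c("g1", "g1"),
                            stringsAsFactors = FALSE)
gene_to_tx <- data.frame(gene = c("g1", "g1", "g2"),
                         transcript = c("tx1", "tx2", "tx3"),
                         stringsAsFactors = FALSE)

# Gene-centric join: probe -> gene -> every transcript of that gene.
probe_to_tx <- merge(probe_to_gene, gene_to_tx, by = "gene")
tx_per_probe <- split(probe_to_tx$transcript, probe_to_tx$probe)

print(tx_per_probe)
```

Both probes end up with the full transcript list of `g1`, so a transcript ID obtained this way says which gene the probe belongs to, not which splice variant it measures.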

ADD REPLY • link 9.9 years ago Marc Carlson ★ 7.2k


Hi Marc,

As far as the offerings from Affymetrix go, the only possible contender for measuring transcript abundance (where by transcript abundance I mean the ability to infer different splice variants for a given gene, and then infer relative abundances for each) is the HTA 2.0 array. And having some small experience with that array, I am not convinced that it is even possible with the HTA 2.0 without having absurd amounts of replication and really big differences in the fractional representation of transcripts.

It is true that Affymetrix has transcript-level annotation data in their annotation files for, e.g., the Gene ST arrays, but I think it is a mistake to think that these data mean anything other than that the probes for a given probeset interrogate regions that are considered to be a part of each listed transcript. In other words, the expression value from a given probeset might be a combination of the abundances of multiple underlying splice variants for a particular gene, but those data are completely confounded, so all we can reasonably do is attribute the data to the gene rather than to one or more transcript variants.

Best,

Jim

On Thu, Sep 4, 2014 at 1:04 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
> [...]
"hugene10sttranscriptclusterMAP has 19832 mapped keys (of > >>> 33297 keys)" > >>> [22] "hugene10sttranscriptclusterOMIM has 13778 mapped keys (of > >>> 33297 keys)" > >>> [23] "hugene10sttranscriptclusterPATH has 5768 mapped keys (of > >>> 33297 keys)" > >>> [24] "hugene10sttranscriptclusterPATH2PROBE has 229 mapped keys > >>> (of 229 keys)" > >>> [25] "hugene10sttranscriptclusterPFAM has 18146 mapped keys (of > >>> 33297 keys)" > >>> [26] "hugene10sttranscriptclusterPMID has 19726 mapped keys (of > >>> 33297 keys)" > >>> [27] "hugene10sttranscriptclusterPMID2PROBE has 396421 mapped > >>> keys (of 412133 keys)" > >>> [28] "hugene10sttranscriptclusterPROSITE has 18146 mapped keys > >>> (of 33297 keys)" > >>> [29] "hugene10sttranscriptclusterREFSEQ has 19873 mapped keys > >>> (of 33297 keys)" > >>> [30] "hugene10sttranscriptclusterSYMBOL has 19962 mapped keys > >>> (of 33297 keys)" > >>> [31] "hugene10sttranscriptclusterUNIGENE has 19578 mapped keys > >>> (of 33297 keys)" > >>> [32] "hugene10sttranscriptclusterUNIPROT has 18193 mapped keys > >>> (of 33297 keys)" > >>> [33] "" > >>> [34] "" > >>> [35] "Additional Information about this package:" > >>> [36] "" > >>> [37] "DB schema: HUMANCHIP_DB" > >>> [38] "DB schema version: 2.1" > >>> [39] "Organism: Homo sapiens" > >>> [40] "Date for NCBI data: 2014-Mar13" > >>> [41] "Date for GO data: 20140308" > >>> [42] "Date for KEGG data: 2011-Mar15" > >>> [43] "Date for Golden Path data: 2010-Mar22" > >>> [44] "Date for Ensembl data: 2014-Feb26" > >>> > >>> It seems like something is broken there showing in line 4: > >>> [6] "hugene10sttranscriptclusterACCNUM has 0 mapped keys (of > >>> 33297 keys)" > >>> > >>> Any ideas on how to solve this? Or whether this is a bug on my > >>> side or on the package side? > >>> > >>> Kind Regards > >>> > >>> Thomas > >>> > >>> > >>> -- > >>> Universit? du Luxembourg > >>> Facult? 
des Sciences, de la Technologie et de la Communication > >>> Campus Limpertsberg, BRB 2.13 > >>> 162a, avenue de la Fa?encerie > >>> L-1511 Luxembourg > >>> Email: thomas.pfau at uni.lu <mailto:thomas.pfau at="" uni.lu=""> > >>> > >>> _______________________________________________ > >>> Bioconductor mailing list > >>> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> Search the archives: > >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > >>> > >>> > >>> > >>> -- > >>> James W. MacDonald, M.S. > >>> Biostatistician > >>> University of Washington > >>> Environmental and Occupational Health Sciences > >>> 4225 Roosevelt Way NE, # 100 > >>> Seattle WA 98105-6099 > >> > > > > -- > > Universit? du Luxembourg > > Facult? des Sciences, de la Technologie et de la Communication > > Campus Limpertsberg, BRB 2.13 > > 162a, avenue de la Fa?encerie > > L-1511 Luxembourg > > Email:thomas.pfau at uni.lu > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.9 years ago James W. MacDonald 66k


Thanks Jim,

I suspected that things might look like that; the technical problems to overcome in making a platform like that are considerable. But it's always good to hear clear verification from those of you who have to deal with these platforms first hand about how things are going.

Marc

On 09/04/2014 10:52 AM, James W. MacDonald wrote:
> Hi Marc,
>
> As far as the offerings from Affymetrix go, the only possible
> contender for measuring transcript abundance (where by transcript
> abundance, I mean the ability to infer different splice variants for a
> given gene, and then infer relative abundances for each) is the HTA
> 2.0 array. And having some small experience with that array, I am not
> convinced that it is even possible with the HTA 2.0, without having
> absurd amounts of replication and really big differences in the
> fractional representation of transcripts.
>
> It is true that Affymetrix has transcript-level annotation data in
> their annotation files for e.g., the Gene ST arrays, but I think it is
> a mistake to think that these data mean anything other than to imply
> that the probes for a given probeset interrogate regions that are
> considered to be a part of each listed transcript. In other words, the
> expression value from a given probeset might be a combination of the
> abundances of multiple underlying splice variants for a particular
> gene, but those data are completely confounded, so all we can
> reasonably do is to attribute the data to the gene rather than one or
> more transcript variants.
>
> Best,
>
> Jim
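
As a rough illustration of the confounding Jim describes, here is a toy sketch in base R. The numbers, the sample names, and the assumption that a probeset's signal simply tracks the sum of the variant abundances are all invented for illustration; none of this is data or a model taken from any Affymetrix array.

```r
# Toy illustration only: the abundances below are invented, not measured.
# Two hypothetical splice variants of one gene, in two hypothetical samples.
abundance <- rbind(
  sample_a = c(variant1 = 80, variant2 = 20),
  sample_b = c(variant1 = 20, variant2 = 80)
)

# Assumption: all probes in the probeset fall in a region shared by both
# variants, so the probeset signal tracks the sum of the two abundances.
probeset_signal <- rowSums(abundance)

# The fraction contributed by variant1 differs between the two samples...
variant1_fraction <- abundance[, "variant1"] / probeset_signal

# ...yet the probeset reports the same total for both samples.
print(probeset_signal)
print(variant1_fraction)
```

Under these assumptions the two samples have different splice-variant mixtures but an identical probeset signal, which is why the signal can only sensibly be attributed to the gene.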

ADD REPLY • link 9.9 years ago Marc Carlson ★ 7.2k
