annotation package ?
Jing Huang ▴ 380
@jing-huang-4737
Last seen 10.2 years ago
Dear all,

I need to analyze a dataset from the GEO database. The data were generated on platform GPL1528 <http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528>: NCI/ATC Hs-OperonV2. If the data had been generated on an Affymetrix platform, I would use hgu133plus2.db.

Can somebody advise which R annotation package I should use in this case?

Many thanks,
Jing
Annotation hgu133plus2 • 1.8k views
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.3 years ago
United States
Oops, pasted the wrong link before. You want this one:

http://www.bioconductor.org/packages/2.8/bioc/vignettes/AnnotationDbi/inst/doc/SQLForge.pdf

Marc
@sean-davis-490
Last seen 3 months ago
United States
Hi, Jing.

You could try:

http://bioconductor.org/packages/release/data/annotation/html/OperonHumanV3.db.html

Note that this might not be right, but the Operon set was in common use a few years ago.

If this isn't what you need, note that GEOquery automatically grabs the annotation data from NCBI GEO. For an example using a GSE from GPL1528, see below. You can use the AnnotationDbi package to make your own annotation packages based on these annotations. In particular, for GPL1528, the Unigene IDs are included.

Hope that helps.

Sean

    > library(GEOquery)
    Loading required package: Biobase

    Welcome to Bioconductor

      Vignettes contain introductory material. To view, type
      'browseVignettes()'. To cite Bioconductor, see
      'citation("Biobase")' and for packages 'citation("pkgname")'.

    Setting options('download.file.method.GEOquery'='curl')
    > gse = getGEO("GSE2020")
    Found 1 file(s)
    GSE2020_series_matrix.txt.gz
    trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2020/GSE2020_series_matrix.txt.gz'
    ftp data connection made, file length 518963 bytes
    opened URL
    ==================================================
    downloaded 506 Kb

    File stored at:
    /tmp/Rtmpdgx7wJ/GPL1528.soft
    > gse
    $GSE2020_series_matrix.txt.gz
    ExpressionSet (storageMode: lockedEnvironment)
    assayData: 21794 features, 10 samples
      element names: exprs
    protocolData: none
    phenoData
      sampleNames: GSM36482 GSM36483 ... GSM36491 (10 total)
      varLabels: title geo_accession ... data_row_count (31 total)
      varMetadata: labelDescription
    featureData
      featureNames: 1140849_1 1140850_1 ... 1298880_1 (21794 total)
      fvarLabels: ID MADB_WELL_ID ... SPOT_ID (8 total)
      fvarMetadata: Column Description labelDescription
    experimentData: use 'experimentData(object)'
    Annotation: GPL1528

    > head(fData(gse[[1]]))
                     ID MADB_WELL_ID   OLIGO_ID GENE UNIGENE
    1140849_1 1140849_1      1140849 SptRpt-2a1
    1140850_1 1140850_1      1140850 SptRpt-2a2
    1140851_1 1140851_1      1140851 SptRpt-2a3
    1140852_1 1140852_1      1140852 SptRpt-2a4
    1140853_1 1140853_1      1140853 SptRpt-2a5
    1140854_1 1140854_1      1140854 SptRpt-2a6

              DESCRIPTION
    1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
    1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
    1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
    1140852_1 3XSSC
    1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
    1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6

              GB_LIST
    1140849_1
    1140850_1
    1140851_1
    1140852_1
    1140853_1
    1140854_1

              SPOT_ID
    1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
    1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
    1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
    1140852_1 3XSSC
    1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
    1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6
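One way to act on the suggestion of building an annotation package from the GEO annotations is to first reduce the feature table to a two-column probe-to-identifier file, which is the kind of input the SQLForge workflow expects (tab-separated, no header). The following is a minimal sketch under stated assumptions: `fd` is a toy stand-in for `fData(gse[[1]])`, the UniGene values in it are made up for illustration, and `probe_mapping()` is a hypothetical helper, not a GEOquery or AnnotationDbi function.

```r
# Toy stand-in for fData(gse[[1]]); the UNIGENE values below are invented
# for illustration and are NOT taken from GPL1528.
fd <- data.frame(
  ID      = c("1140849_1", "1140850_1", "1140851_1", "1140852_1"),
  UNIGENE = c("", "Hs.0001", NA, "Hs.0002"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only probes that carry a non-empty identifier
# in `target_col`, and return them as a two-column data frame.
probe_mapping <- function(fd, id_col = "ID", target_col = "UNIGENE") {
  target <- trimws(as.character(fd[[target_col]]))
  keep <- !is.na(target) & nzchar(target)
  data.frame(
    probe = as.character(fd[[id_col]])[keep],
    id    = target[keep],
    stringsAsFactors = FALSE
  )
}

mapping <- probe_mapping(fd)

# Write two tab-separated columns with no header, no quotes, no row names.
map_file <- tempfile(fileext = ".txt")
write.table(mapping, file = map_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

With a real dataset, `fd` would presumably be replaced by `fData(gse[[1]])`. Control spots such as the six rows shown in the output above have an empty UNIGENE field and would be dropped by the filter.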
Hi Jing,

If you need a chip package that is not presently hosted, you can 1) retrieve the probe-to-gene mappings from the people who made the platform and then 2) follow the instructions in this vignette to generate a custom package:

http://www.bioconductor.org/packages/2.8/bioc/vignettes/AnnotationDbi/inst/doc/SQLForge.R

Marc
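The two steps described above could look roughly like the sketch below. It follows the shape of the human chip example in the SQLForge vignette, but every value in `args` (prefix, chip name, manufacturer, URL, the placeholder mapping row) is an illustrative assumption, and `build_chip_package()` is a hypothetical wrapper. `makeDBPackage()` was part of AnnotationDbi when this thread was written and is provided by AnnotationForge in current Bioconductor; it also needs the `human.db0` package. UniGene has since been retired by NCBI, so whether `baseMapType = "ug"` still works in a current release would need to be checked.

```r
# Hypothetical two-column mapping file (probe ID <TAB> UniGene ID, no header).
# The single row is a placeholder, not real GPL1528 content.
map_file <- tempfile(fileext = ".txt")
writeLines("1140850_1\tHs.0001", map_file)

# Argument names follow the SQLForge vignette example; the values are
# assumptions chosen for illustration only.
args <- list(
  schema          = "HUMANCHIP_DB",
  affy            = FALSE,         # input is not an Affymetrix annotation file
  prefix          = "hsOperonV2",  # resulting package would be hsOperonV2.db
  fileName        = map_file,
  baseMapType     = "ug",          # second column holds UniGene IDs
  outputDir       = tempdir(),
  version         = "1.0.0",
  manufacturer    = "NCI/ATC",
  chipName        = "NCI/ATC Hs-OperonV2",
  manufacturerUrl = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528"
)

# Hypothetical wrapper: refuses to run unless the required packages are
# installed, then hands the arguments to AnnotationForge::makeDBPackage().
build_chip_package <- function(args) {
  needed <- c("AnnotationForge", "human.db0")
  ok <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
  if (!all(ok)) {
    stop("Missing package(s): ", paste(needed[!ok], collapse = ", "))
  }
  do.call(AnnotationForge::makeDBPackage, args)
}
```

The build is deliberately not run here. Calling `build_chip_package(args)` with a real mapping file should write a package source directory under `outputDir`, which can then be installed with `install.packages(..., repos = NULL, type = "source")`.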
Thank you, Sean. It really helps!

Jing