(no subject)
Tine Casneuf ▴ 80
@tine-casneuf-1773
Last seen 9.6 years ago
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060810/4b232209/attachment.pl
Nianhua Li ▴ 870
@nianhua-li-1606
Last seen 9.6 years ago
Dear Tine and Björn,

Thanks a lot for your detailed replies; I really appreciate them. I would like to summarize them to make sure we are on the same page.

I now understand that we should use the AGI locus as the gene identifier, and that it can be missing for some probesets. It also seems that the EntrezGene ID is unnecessary. I was actually more interested in the *source*: should we use *Affymetrix's annotation* (https://www.affymetrix.com/support/technical/byproduct.affx?product=arab) or *TAIR's* (ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2006-07-14.txt) for the probeset-to-gene mapping? You both prefer TAIR's, don't you? The current implementation (athPkgBuilder) is based on Affymetrix's.

Thanks for the PubMed source (ftp://ftp.arabidopsis.org/home/tair/User_Requests/LocusPublished.08012006.txt). Should I make it the default in athPkgBuilder then?

It is fairly easy to obtain KEGG annotation. The file ftp://ftp.genome.jp/pub/kegg/genomes/ath/ath_tair.list maps AGI locus to KEGG Gene ID; if you look at the file, the two identifiers always have the same value. The file ftp://ftp.genome.jp/pub/kegg/pathways/ath/ath_gene_map.tab then maps KEGG Gene ID to KEGG pathway ID. Finally, the file ftp://ftp.genome.jp/pub/kegg/pathways/map_title.tab maps KEGG pathway ID to pathway title.

Another detail is that KEGG has two "genome codes" for Arabidopsis: ath and eath. "ath" contains mappings between pathways and CDS (real genes), whereas "eath" maps pathways to ESTs. For example, "eath00051" and "ath00051" show the same pathway graph, but link to ESTs and CDS respectively:
http://www.genome.jp/dbget-bin/show_pathway?eath00051
http://www.genome.jp/dbget-bin/show_pathway?ath00051
Should we use "ath" or "eath"?

Also, it seems the gene description part (ath1121501GENENAME) should keep the current implementation (based on ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR_sequenced_genes).

thanks again

nianhua
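The three-file KEGG chain described above (AGI locus to KEGG Gene ID, Gene ID to pathway ID, pathway ID to title) could be joined roughly as follows. This is only a sketch in Python, not how athPkgBuilder does it: the file layouts are assumptions (two tab-separated columns, with pathway IDs space-separated in the second column of the gene map), the function names are hypothetical, and the locus names and locus-to-pathway assignments in the toy data are made up.

```python
import csv
import io
from collections import defaultdict


def read_pairs(handle):
    """Yield (left, right) from a two-column tab-separated file, skipping blank rows."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2 and row[0].strip():
            yield row[0].strip(), row[1].strip()


def locus_to_pathway_titles(locus_to_gene, gene_to_pathways, pathway_titles):
    """Join AGI locus -> KEGG gene ID -> pathway IDs -> pathway titles."""
    result = defaultdict(list)
    for locus, gene in locus_to_gene.items():
        for pathway in gene_to_pathways.get(gene, []):
            result[locus].append((pathway, pathway_titles.get(pathway, "")))
    return dict(result)


# Toy stand-ins for the three KEGG files (ASSUMED layout, made-up assignments).
TAIR_LIST = "LOCUS_A\tLOCUS_A\nLOCUS_B\tLOCUS_B\n"
GENE_MAP = "LOCUS_A\t00010 00051\n"
MAP_TITLE = (
    "00010\tGlycolysis / Gluconeogenesis\n"
    "00051\tFructose and mannose metabolism\n"
)

locus_to_gene = dict(read_pairs(io.StringIO(TAIR_LIST)))
gene_to_pathways = {
    gene: ids.split() for gene, ids in read_pairs(io.StringIO(GENE_MAP))
}
pathway_titles = dict(read_pairs(io.StringIO(MAP_TITLE)))

annotation = locus_to_pathway_titles(locus_to_gene, gene_to_pathways, pathway_titles)
print(annotation)
```

Loci without any pathway (LOCUS_B here) are simply absent from the result, which mirrors the fact that many genes have no KEGG pathway assignment.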
Nianhua,

I suggest using the probeset-to-gene mappings from TAIR, since they are in charge of the annotation of this genome. This way one can be sure the probeset-to-gene mappings align with new annotation releases of this genome.

Also, I would consider including the gene/locus-to-GO mappings from TAIR. This data set is downloadable directly from GO.org:

http://geneontology.org/GO.current.annotations.shtml
http://www.geneontology.org/cgi-bin/downloadGOGA.pl/gene_association.tair.gz

Thanks for taking care of this.

Thomas

On Thu 08/10/06 10:25, Nianhua Li wrote:
> [...]

--
Thomas Girke, Ph.D.
1008 Noel T. Keen Hall
Center for Plant Cell Biology (CEPCEB)
University of California
Riverside, CA 92521
E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437
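For the gene/locus-to-GO mappings mentioned above, a gene association file can be read with a few lines of Python. The sketch below relies on the general GO annotation file layout (tab-separated, comment lines starting with "!", GO ID in column 5, evidence code in column 7). Which column carries the AGI locus in the TAIR file is an assumption that should be checked against the actual download, so it is left as a parameter; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def read_go_annotations(lines, id_column=2):
    """Collect (GO ID, evidence code) pairs per gene from GO annotation lines.

    id_column is 1-based and selects the gene identifier column (ASSUMED
    default: column 2, the database object ID). Column 5 holds the GO ID
    and column 7 the evidence code.
    """
    go_by_gene = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed row, ignore
        go_by_gene[fields[id_column - 1]].add((fields[4], fields[6]))
    return dict(go_by_gene)


# Made-up rows in the annotation file layout, for illustration only.
SAMPLE_GAF = (
    "!comment line\n"
    "TAIR\tOBJ_1\tSYM_1\t\tGO:0008150\tREF:1\tND\t\tP\n"
    "TAIR\tOBJ_1\tSYM_1\t\tGO:0003674\tREF:1\tND\t\tF\n"
    "TAIR\tOBJ_2\tSYM_2\t\tGO:0005575\tREF:2\tND\t\tC\n"
)

go_by_gene = read_go_annotations(SAMPLE_GAF.splitlines())
print(go_by_gene)
```

For the compressed download, the same function should work on a handle opened with `gzip.open(path, "rt")`, where `path` is wherever the file was saved locally.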
Hi guys,

I agree with Thomas. TAIR took over from TIGR last year and is now taking care of the annotation (that's why the previous releases of the genome were called TIGRx and the latest one is TAIR6).

About ath and eath in KEGG: we map probesets to locus IDs and not to the transcripts themselves (otherwise you could distinguish between splice variants, for example), so I guess 'ath' will be all right too.

About the PMIDs: I mailed the lady who told me about the file, asking whether they are going to keep the file where it is (ftp://ftp.arabidopsis.org/home/tair/User_Requests/LocusPublished.08012006.txt). Maybe they made it available because I asked for it (since it is in the User_Requests directory).

The ath1121501GENENAME works fine for me!

Best wishes and thank you,
tine

Thomas Girke wrote:
> [...]
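Mapping probesets to locus IDs rather than to transcripts, as discussed above, amounts to dropping the gene-model suffix from AGI identifiers (for example, AT1G01010.1 becomes AT1G01010) and keeping probesets that have no locus at all. A minimal Python sketch, assuming the input is a list of (probeset, identifier) pairs; the function names, probeset names, and pairings are hypothetical.

```python
import re

# AGI locus: AT + chromosome (1-5, C, M) + G + five digits, with an
# optional ".N" gene-model (splice variant) suffix.
AGI_ID = re.compile(r"^(AT[1-5CM]G\d{5})(?:\.\d+)?$", re.IGNORECASE)


def to_locus(identifier):
    """Return the locus-level AGI ID, or None if the string is not an AGI ID."""
    match = AGI_ID.match(identifier.strip())
    return match.group(1).upper() if match else None


def probeset_to_loci(pairs):
    """Collapse (probeset, identifier) pairs to sorted locus lists per probeset."""
    mapping = {}
    for probeset, identifier in pairs:
        loci = mapping.setdefault(probeset, set())  # keep unmapped probesets too
        locus = to_locus(identifier)
        if locus is not None:
            loci.add(locus)
    return {probeset: sorted(loci) for probeset, loci in mapping.items()}


# Illustrative input only; the pairings are not taken from any annotation file.
pairs = [
    ("probeset_1", "AT1G01010.1"),
    ("probeset_1", "AT1G01010.2"),  # two splice variants, one locus
    ("probeset_2", ""),             # no locus assigned
    ("probeset_3", "AT3G01040"),
    ("probeset_3", "AT3G01050.1"),  # probeset matching two loci
]

loci_by_probeset = probeset_to_loci(pairs)
print(loci_by_probeset)
```

Keeping the empty entries makes the "missing for some probesets" case explicit instead of silently dropping those probesets.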