annotation files for agilent: a bit off-topic


Weiwei Shi ★ 1.2k

@weiwei-shi-1407

Last seen 9.7 years ago

Hi there,

I know this is a bit off-topic, but I hope someone has knowledge to share. I found 4 zipped annotation files from Agilent:

Human 1A(v2)
Human Genome CGH 44A
Human Genome CGH 44B
Human Genome, Whole

I assume I can use the last one for my arrays, but without knowing the difference between them, I am not quite sure.

Thanks,

-- Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?" "No, I did not. But I believed..." ---Matrix III

Annotation CGH • 1.6k views

ADD COMMENT • link updated 16.8 years ago by Gaj Stan BIGCAT ▴ 100 • written 16.8 years ago by Weiwei Shi ★ 1.2k


Sean Davis 21k

@sean-davis-490

Last seen 4 months ago

United States

Weiwei Shi wrote:
> I assume I can use the last one for my arrays but w/o knowing the difference b/w them, I am not quite sure.

You will need to find out what platform your arrays use, or do some probe ID matching between your arrays and the annotation packages. The former is preferred.

Sean

ADD COMMENT • link 16.8 years ago Sean Davis 21k
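The probe ID matching Sean mentions can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the probe IDs and the two candidate annotation sets are made up, and the names `array_ids`, `candidates`, and `match_rate` are hypothetical. With real data, the candidate sets would come from the probe ID column of each annotation file.

```r
# Hypothetical probe IDs taken from the array data files (values are made up).
array_ids <- c("A_23_P144999", "A_23_P399001", "A_32_P104088", "Hs132898.3")

# Hypothetical probe IDs covered by each candidate annotation file.
candidates <- list(
  human_1a_v2  = c("A_23_P144999"),
  whole_genome = c("A_23_P144999", "A_23_P399001",
                   "A_32_P104088", "A_24_P913609")
)

# Fraction of the array's probes that each candidate annotation covers.
match_rate <- sapply(candidates, function(ids) mean(array_ids %in% ids))
print(match_rate)

# The candidate with the highest coverage is the most plausible platform.
best <- names(which.max(match_rate))
print(best)
```

A high match rate only suggests a platform; as Sean notes, confirming the platform directly is preferable.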


I am doing the latter now because I don't know the answer to the first question. The data provider is very slow to reply.

ADD REPLY • link 16.8 years ago Weiwei Shi ★ 1.2k


Hi Weiwei,

I'd assume the last one:

1st is a very old chip
2nd and 3rd are for CGH, not expression
4th is their basic human gene expression array.

Keep in mind that they now have 4x44 arrays that use the same non-control probes but have them in different positions. If the chips were purchased recently, they are likely the 4x44 ones, as they end up being a lot cheaper.

The quick and dirty way of finding out: look in the header of the feature extraction file, where you'll see a column called FeatureExtractor_DesignFileName. Its value should be a file name that looks like 014868_D_F_20060807.xml. The first part (014868) gives the chip type (the design ID, actually), while the second (20060807) gives the annotation release date. Then go to http://www.chem.agilent.com/cag/bsp/array_list.asp and search the list for the design ID. In this case, you'd see that this is the 4x44 whole genome mouse chip.

There is a Bioconductor package for the human whole genome chips: hgug4112a. It annotates only the non-control probes, which both formats share, so it should work with both the 1x44 and the 4x44.

Also, read.maimages should grab the gene annotation that is included by the feature extractor software. It might be out of date, but it should help you keep going.

Hope this helps,

Francois

ADD REPLY • link 16.8 years ago Francois Pepin ★ 1.3k
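As a rough sketch of the file-name convention Francois describes, the design ID and release date can be pulled out of a FeatureExtractor_DesignFileName value in base R. `parse_design_file()` is a hypothetical helper, not part of limma or any Agilent tool, and it assumes the design ID is the first underscore-separated field and the date (YYYYMMDD) is the last. The Windows path in the second call is made up.

```r
# Hypothetical helper: split a design file name into design ID and date.
parse_design_file <- function(path) {
  # Normalize Windows separators, then keep only the file name.
  file <- basename(gsub("\\", "/", path, fixed = TRUE))
  # Drop the .xml extension and split on underscores.
  stem <- sub("\\.xml$", "", file, ignore.case = TRUE)
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  list(
    design_id    = parts[1],
    release_date = as.Date(parts[length(parts)], format = "%Y%m%d")
  )
}

info <- parse_design_file("014868_D_F_20060807.xml")
print(info$design_id)
print(info$release_date)

# Also works when the header stores a full (made-up) Windows path.
info2 <- parse_design_file("C:\\data\\014868_D_F_20060807.xml")
```

The design ID obtained this way is what one would then look up in Agilent's array list.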


Hi Francois,

First, thanks for the detailed reply.

The matching is done, and only ~7,700 probes out of ~10,100 are matched (I assume the matched ones start with A_).

However, some probe IDs look like this:

> tail(x0, 10)
[1] "A_24_P913609" "Hs345093.1" "A_23_P144999" "A_23_P399001"
[5] "A_23_P340617" "A_32_P104088" "A_32_P34372" "A_23_P62764"
[9] "Hs132898.3" "A_32_P370539"

Since it is a customized array, I think they might use UniGene IDs(?), but what is the ".3"? Should it be Hs.132898? Confused!

FeatureExtractor_DesignFileName gives

D:\Array_Data\Kinder-Onko\Design Files KinderOnko\Custom_Final_280904\012714_d_20040819.xml

Is that right?

To be honest, I hate it when people provide data without good annotation :( It is kind of asking us to play a guessing game.

Best,

Weiwei

ADD REPLY • link 16.8 years ago Weiwei Shi ★ 1.2k
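One way to see which IDs fall outside the standard naming is to sort them by pattern. The sketch below uses the ten IDs from Weiwei's post; the two regular expressions are assumptions inferred only from those IDs (A_<digits>_P<digits> and Hs<digits>.<digits>), and it makes no claim about what the "Hs" IDs or their numeric suffix actually mean.

```r
# The ten probe IDs shown in the post above.
x0 <- c("A_24_P913609", "Hs345093.1", "A_23_P144999", "A_23_P399001",
        "A_23_P340617", "A_32_P104088", "A_32_P34372", "A_23_P62764",
        "Hs132898.3", "A_32_P370539")

# Assumed pattern for Agilent-style IDs, inferred from the IDs above.
is_agilent <- grepl("^A_[0-9]+_P[0-9]+$", x0)

# Assumed pattern for the other IDs; the meaning of the suffix is unknown.
is_hs <- grepl("^Hs[0-9]+\\.[0-9]+$", x0)

# Label each ID and count the groups.
id_class <- ifelse(is_agilent, "agilent",
                   ifelse(is_hs, "Hs-style", "other"))
print(table(id_class))

# IDs that would need annotation from the data provider.
unmatched <- x0[!is_agilent]
print(unmatched)
```

Applied to the full ID vector, the same counts would show how much of the unmatched remainder follows the "Hs" form.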


Hi Weiwei,

Look at the $genes element of the object returned by read.maimages. Hopefully it will have what you need. Otherwise, you'll have to depend on the data provider: it's a custom chip, and only they will have the annotations for it.

Francois

ADD REPLY • link 16.8 years ago Francois Pepin ★ 1.3k
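A minimal sketch of inspecting that annotation table follows. To keep it runnable without data files, a small made-up data frame stands in for the `$genes` element; with real data it would come from limma as shown in the comments. The column names and the convention that `ControlType == 0` marks non-control probes are assumptions about typical Agilent output, and every value is invented.

```r
# With real files, the table would come from limma, for example:
#   RG <- limma::read.maimages(files, source = "agilent")
#   genes <- RG$genes
# A made-up data frame stands in for RG$genes here (assumed columns,
# invented values).
genes <- data.frame(
  ProbeName      = c("A_23_P144999", "Hs132898.3", "hypothetical_control"),
  SystematicName = c("NM_000000", "Hs132898.3", "hypothetical_control"),
  ControlType    = c(0L, 0L, 1L),
  stringsAsFactors = FALSE
)

# See which annotation columns are available.
print(names(genes))

# Keep the non-control probes and the columns useful for annotation.
annotated <- genes[genes$ControlType == 0L,
                   c("ProbeName", "SystematicName")]
print(annotated)
```

Whether the custom probes carry useful names in these columns depends on what the array designer put in the design file.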


John Zhang ★ 2.9k

@john-zhang-6

Last seen 9.7 years ago

Weiwei Shi wrote:
> I assume I can use the last one for my arrays but w/o knowing the difference b/w them, I am not quite sure.

They are different platforms. You need to use the one that matches the array you are using.

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084

ADD COMMENT • link 16.8 years ago John Zhang ★ 2.9k


Gaj Stan BIGCAT ▴ 100

@gaj-stan-bigcat-1591

Last seen 9.7 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070718/fc593311/attachment.pl

ADD COMMENT • link 16.8 years ago Gaj Stan BIGCAT ▴ 100


Thanks to all of you. I approached this by using the sequence information. Anyway, it works. I will try the other suggestions later on.

Best,

Weiwei

On 7/18/07, Gaj Stan (BIGCAT) <stan.gaj at bigcat.unimaas.nl> wrote:
> Hi there,
>
> Another tip would be to use the barcode present on each Agilent array (or, if you used an Agilent scanner, it's the part of the filename between the first and second underscore, if I recall correctly). Visiting the Agilent website (chem.agilent.com) and downloading the annotation/gene file results in a window where it asks for this barcode. After submission, the correct annotation files will be displayed for you to download!
>
> For more details, I recommend visiting the Agilent website.
>
> Best of luck,
>
> -- Stan

ADD REPLY • link 16.8 years ago Weiwei Shi ★ 1.2k
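Stan's tip about the file name can be sketched in base R as below. `extract_barcode()` is a hypothetical helper and the file name is made up; the sketch simply takes the text between the first and second underscore, which Stan himself recalls only tentatively, so the result should be checked against the barcode printed on the slide.

```r
# Hypothetical helper: return the field between the first and second
# underscore of a scanner file name, or NA if there are fewer than two
# underscores.
extract_barcode <- function(filename) {
  parts <- strsplit(basename(filename), "_", fixed = TRUE)[[1]]
  if (length(parts) < 3) {
    return(NA_character_)
  }
  parts[2]
}

# Made-up file name for illustration only.
barcode <- extract_barcode("US00000000_251486800001_S01_GE2.txt")
print(barcode)
```

The extracted value is what would be entered in the barcode prompt on the Agilent download page that Stan describes.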
