CGH analysis without genome positions


adam_pgsql ▴ 70


Hi,

I am trying to do some CGH analysis with Agilent arrays, but all the analysis methods seem to require genome position information. Does anyone know of any packages that will call genes as present/absent without the genome position?

Thanks for any help,
adam

CGH CGH • 1.9k views

ADD COMMENT • link updated 15.9 years ago by Sean Davis 21k • written 15.9 years ago by adam_pgsql ▴ 70


Sean Davis 21k


On Tue, Apr 6, 2010 at 1:47 PM, adam_pgsql <adam_pgsql at witneyweb.org> wrote:

> I am trying to do some CGH analysis with Agilent arrays, but all the analysis methods seem to require genome position information. Does anyone know of any packages that will call genes as present/absent without the genome position?

I don't think of CGH analysis as "present/absent", but perhaps I am not clear on what you mean by CGH analysis. For Agilent arrays, presumably you have two colors, one representing the sample and the other the reference. Simply make a ratio and then rank the probes based on that.

Sean
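A minimal sketch of the ratio-and-rank idea, assuming each probe has already been reduced to one background-corrected sample signal and one reference signal. The function name, the example values, and the `floor` parameter are hypothetical; the floor is only there so that zero or negative signals do not break the logarithm.

```python
import math

def rank_by_log_ratio(probes, floor=1.0):
    """Return (probe_id, log2 ratio) pairs sorted from lowest to highest ratio.

    probes: iterable of (probe_id, sample_signal, reference_signal).
    floor:  hypothetical lower bound applied to both channels so that
            zero or negative values cannot reach the logarithm.
    """
    scored = []
    for probe_id, sample, reference in probes:
        ratio = max(sample, floor) / max(reference, floor)
        scored.append((probe_id, math.log2(ratio)))
    return sorted(scored, key=lambda item: item[1])

# Made-up signals for three probes: (id, sample, reference).
example = [
    ("probe_A", 1200.0, 1100.0),
    ("probe_B", 40.0, 1500.0),
    ("probe_C", 900.0, 30.0),
]
ranked = rank_by_log_ratio(example)
for probe_id, log_ratio in ranked:
    print(f"{probe_id}\t{log_ratio:+.2f}")
```

Probes at the bottom of the ranking are candidates for sequence that is missing or divergent in the sample, and probes at the top for sequence missing from the reference. The ranking alone does not say where to draw the line between the two.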

ADD COMMENT • link 15.9 years ago Sean Davis 21k


I'm making an assumption here that you are using some custom array based on an organism with no assembled genome. If there is an assembled genome, then you should map your probes to the genome using an alignment tool (blast, blat, etc.) and use those alignments for more standard CGH analysis.

Sean
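As a rough illustration of the probe-mapping step, the sketch below assumes the probe sequences have already been aligned and the hits saved in the standard 12-column BLAST tabular layout (`-outfmt 6` in BLAST+). It keeps the highest-scoring hit per probe. The function name and the example hits are made up for illustration.

```python
import csv
import io

def best_hits(blast_tabular_text):
    """Map each probe ID to (subject, start, end, bitscore) of its top-scoring hit.

    Expects the 12 tab-separated columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    reader = csv.reader(io.StringIO(blast_tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        probe, subject = row[0], row[1]
        sstart, send = int(row[8]), int(row[9])
        bitscore = float(row[11])
        # sstart > send on minus-strand hits, so normalise to start <= end.
        start, end = min(sstart, send), max(sstart, send)
        if probe not in best or bitscore > best[probe][3]:
            best[probe] = (subject, start, end, bitscore)
    return best

# Made-up hits: probe_A has two hits, probe_B one minus-strand hit.
example = (
    "probe_A\tcontig_1\t100.00\t60\t0\t0\t1\t60\t5001\t5060\t1e-28\t111\n"
    "probe_A\tcontig_7\t91.67\t60\t5\t0\t1\t60\t300\t359\t1e-15\t75.0\n"
    "probe_B\tcontig_2\t98.33\t60\t1\t0\t1\t60\t880\t821\t3e-26\t106\n"
)
positions = best_hits(example)
print(positions)
```

Probes with several equally good hits, or with no hit at all, would need a separate decision before the positions are passed to a position-based CGH method.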

ADD REPLY • link 15.9 years ago Sean Davis 21k


Thanks Sean for your reply.

This is a custom bacterial pan-genome array. The problem is that many of the oligos target genes found in unfinished genome sequences (not the reference strain), and as such I don't really have a genome position. Also, due to the nature of bacterial genomes, when I hybridise DNA from unsequenced strains there is no guarantee that the gene arrangement would be exactly the same as in the sequenced reference strain.

In terms of "present/absent", I would like to score each gene sequence represented on the array as present or absent in the test strain. I guess this could be done by ranking the ratios and determining some cutoff for presence or absence, but the question is: are there any tools that provide a more statistically sound approach to suggesting a good cutoff value to use?

Thanks again for your help,
adam

ADD REPLY • link 15.9 years ago adam_pgsql ▴ 70


Hi, Adam.

There are many ways to go here, but one would really need to know the experimental design in more detail. If you have replicates, then there are MANY statistical methodologies that could be applied to find differences between the reference and the test. Any gene expression hypothesis testing packages could probably be applied.

Sean
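One simple instance of such a test, sketched under the assumption that each gene has several replicate log2 (test/reference) ratios: a one-sample t-statistic against a mean of zero. The function name and the example values are hypothetical, and this is a plain t-statistic rather than the method of any particular package.

```python
import math
import statistics

def one_sample_t(values, mu=0.0):
    """One-sample t-statistic of replicate log ratios against the mean mu."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicate values")
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("replicates are identical; t-statistic is undefined")
    return (statistics.mean(values) - mu) / (sd / math.sqrt(n))

# Made-up replicate log2 ratios for two genes.
replicates = {
    "gene_low": [-4.0, -5.0, -6.0],   # consistently far below zero
    "gene_flat": [0.1, -0.1, 0.0],    # centred on zero
}
t_stats = {gene: one_sample_t(vals) for gene, vals in replicates.items()}
for gene, t in t_stats.items():
    print(f"{gene}\tt = {t:.3f}")
```

Turning the statistic into a p-value requires the t-distribution with n - 1 degrees of freedom, and calls across thousands of genes would also need a multiple-testing correction. With only a handful of replicates per gene, the variance estimates are noisy, which is the situation moderated tests in expression-analysis packages are designed for.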

ADD REPLY • link 15.9 years ago Sean Davis 21k


Thanks again for your reply, Sean.

For the arrays that have been performed so far, the design is simply test against reference strain (no biological replicates), with 3 or more different oligos per gene, printed in duplicate. The problem with the reference design is that, as I mentioned before, many of the oligos map to genes that are not present in the reference strain, so there will be lots of features with little or no signal in the reference channel. We would in fact like to be able to do this with single colour data if possible. Are there any packages that could help with this?

Thanks again,
adam

ADD REPLY • link 15.9 years ago adam_pgsql ▴ 70


Agilent generates several statistics that might be relevant. You might look at the Feature Extraction manual to determine which columns of output will help you determine if a single channel is thought to be above background. In any case, I don't think there are any Bioconductor packages that will do exactly what you want without some creativity.

Sean
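One way the "some creativity" could look, as a sketch only: assume each spot has been reduced to a gene ID and a true/false above-background flag taken from whichever Feature Extraction column the manual identifies (`gIsWellAboveBG` is an assumed candidate here and should be checked against the manual). A gene is then called present when more than a chosen fraction of its spots are flagged. The function name, the `min_fraction` threshold, and the example data are hypothetical.

```python
from collections import defaultdict

def call_genes(features, min_fraction=0.5):
    """Call each gene present/absent from per-spot above-background flags.

    features:     iterable of (gene_id, above_background) pairs, one per spot.
    min_fraction: hypothetical threshold; a gene is "present" when strictly
                  more than this fraction of its spots are above background.
    """
    counts = defaultdict(lambda: [0, 0])  # gene -> [flagged spots, total spots]
    for gene_id, above_background in features:
        counts[gene_id][1] += 1
        if above_background:
            counts[gene_id][0] += 1
    return {
        gene: ("present" if hits / total > min_fraction else "absent")
        for gene, (hits, total) in counts.items()
    }

# Made-up example: 3 oligos per gene, each printed in duplicate (6 spots per gene).
spots = (
    [("gene_X", True)] * 5 + [("gene_X", False)]
    + [("gene_Y", True)] + [("gene_Y", False)] * 5
)
calls = call_genes(spots)
print(calls)
```

A majority vote treats every oligo as equally reliable. Oligos that cross-hybridise, or that target divergent alleles, would argue for weighting or for inspecting per-oligo results before trusting the gene-level call.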

ADD REPLY • link 15.9 years ago Sean Davis 21k


Hi,

You really need the chromosome and position where the probes map in the genome. You can use the readPositionalInfo function in the snapCGH package to get that information from previously parsed Agilent txt files. Check the package vignette.

Best,
Daniel

Daniel Rico Rodriguez, PhD
Structural Computational Biology Group
Spanish National Cancer Research Center (CNIO)

ADD REPLY • link 15.9 years ago Daniel Rico ▴ 110
