Analysing expression with tiling arrays
@january-weiner-3999
Dear all,

I have two tiling arrays of a bacterial genome. Unfortunately, I do not have the original files (such as the bpmap / CEL files for Affy tiling chips), just lists of spot intensities in two conditions for each probe (i.e. two values per probe), and a list of gene positions on the genome. Several probes map to each gene. The genome is not publicly available yet.

What would be the best way to tackle this? I thought that I might just calculate the logFC for each probe and then, for each gene, run a one-sample t-test on the corresponding probe logFC values, then correct for multiple testing.

Would that make sense? I looked up the approach described in Toedling and Huber, 2008, PLoS Comput Biol (doi:10.1371/journal.pcbi.1000227), but this is not exactly what I had in mind: rather than looking for enriched regions, I'm more interested in focusing on the genes directly, as a bacterial genome is densely packed with probes and genes (I have 10-30 probes per gene).

Best regards,
January

--
Dr. January Weiner 3
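For readers who want to try the per-gene test proposed in the question, here is a minimal sketch in Python using only the standard library. The function names, the `(position, intensity_a, intensity_b)` probe layout, and the `{name: (start, end)}` gene dictionary are assumptions made for illustration; the original data format is not described in that detail. The p-value is obtained by numerically integrating the Student t tail, and the multiple-testing step uses the Benjamini-Hochberg procedure, which is one common choice and not necessarily the one the poster used.

```python
import math
from statistics import mean, stdev


def probe_log_fc(intensity_a, intensity_b):
    """log2 fold change (condition B over A); intensities must be > 0."""
    return math.log2(intensity_b / intensity_a)


def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student t p-value via Simpson's rule.

    Substituting x = sqrt(df) * tan(theta) turns the tail integral into
    a finite integral of cos(theta) ** (df - 1) over [theta0, pi/2].
    """
    theta0 = math.atan(abs(t) / math.sqrt(df))
    h = (math.pi / 2 - theta0) / steps  # steps must be even
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * max(0.0, math.cos(theta0 + i * h)) ** (df - 1)
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    tail = math.exp(log_norm) / math.sqrt(math.pi) * total * h / 3
    return min(1.0, 2 * tail)


def gene_tests(probes, genes, min_probes=3):
    """One-sample t-test (H0: mean logFC = 0) per gene.

    probes: list of (position, intensity_a, intensity_b)
    genes:  dict name -> (start, end), inclusive coordinates
    Returns dict name -> (mean_logfc, t, p).
    """
    results = {}
    for name, (start, end) in genes.items():
        lfc = [probe_log_fc(a, b) for pos, a, b in probes
               if start <= pos <= end]
        if len(lfc) < min_probes:
            continue  # too few probes for a meaningful test
        sd = stdev(lfc)
        if sd == 0:
            continue  # t statistic is undefined without variation
        m = mean(lfc)
        t = m / (sd / math.sqrt(len(lfc)))
        results[name] = (m, t, t_two_sided_p(t, len(lfc) - 1))
    return results


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values; pvalues: dict name -> p."""
    ranked = sorted(pvalues.items(), key=lambda kv: kv[1], reverse=True)
    m = len(ranked)
    adjusted, running_min = {}, 1.0
    for offset, (name, p) in enumerate(ranked):
        rank = m - offset  # rank 1 is the smallest p-value
        running_min = min(running_min, p * m / rank)
        adjusted[name] = running_min
    return adjusted
```

A typical call would be `res = gene_tests(probes, genes)` followed by `benjamini_hochberg({g: r[2] for g, r in res.items()})`. Genes with fewer than `min_probes` probes are skipped, which is a design choice of this sketch.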
January,

On Sep/10/10 9:21 AM, January Weiner wrote:
> Dear all,
>
> I have two tiling arrays of a bacterial genome. Unfortunately, I do not have the original files (such as the bpmap / CEL files for Affy tiling chips), just lists of spot intensities in two conditions for each probe (i.e. two values per probe), and a list of gene positions on the genome. Several probes map to each gene. The genome is not publicly available yet.
>
> What would be the best way to tackle this? I thought that I might just calculate the logFC for each probe and then, for each gene, run a one-sample t-test on the corresponding probe logFC values, then correct for multiple testing.

This sounds reasonable. Just be aware that the noise in the data from neighbouring probes is likely correlated, so the t-distribution with the 'naive' degrees of freedom will give you optimistic (too small) p-values. You can still use them for ranking / prioritizing genes, and perhaps set the cutoff from known positive and negative control genes.

> Would that make sense? I looked up the approach described in Toedling and Huber, 2008, PLoS Comput Biol (doi:10.1371/journal.pcbi.1000227), but this is not exactly what I had in mind: rather than looking for enriched regions, I'm more interested in focusing on the genes directly, as a bacterial genome is densely packed with probes and genes (I have 10-30 probes per gene).
>
> Best regards,
>
> January

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
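The suggestion in this reply, ranking genes by their test statistic and choosing a cutoff from control genes, could be sketched as follows. This is only an illustration in Python: the function names and the rule of placing the cutoff midway between the strongest negative control and the weakest positive control are assumptions of this sketch, not something specified in the thread.

```python
def rank_genes(t_stats):
    """Return gene names sorted by decreasing absolute t statistic.

    t_stats: dict gene name -> t statistic (e.g. from a per-gene t-test).
    """
    return sorted(t_stats, key=lambda g: abs(t_stats[g]), reverse=True)


def control_cutoff(t_stats, negative_controls, positive_controls):
    """Pick an |t| cutoff from known control genes.

    Hypothetical rule: if the controls separate cleanly, use the midpoint
    between the largest negative-control |t| and the smallest
    positive-control |t|; otherwise fall back to the largest
    negative-control |t| (a conservative choice).
    """
    neg = max(abs(t_stats[g]) for g in negative_controls)
    pos = min(abs(t_stats[g]) for g in positive_controls)
    return (neg + pos) / 2 if pos > neg else neg


def call_genes(t_stats, cutoff):
    """Genes whose |t| exceeds the cutoff, strongest first."""
    return [g for g in rank_genes(t_stats) if abs(t_stats[g]) > cutoff]
```

Because the cutoff is set empirically from controls, it does not rely on the nominal p-values, which is the point of the caveat about correlated neighbouring probes.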
> this sounds reasonable, just be aware that the noise in the data from neighbouring probes is likely correlated, so that the t-distribution with the 'naive' degrees of freedom will give you optimistic (too small) p-values. You can still use them for ranking / prioritizing genes, and perhaps set the cutoff from known positive and negative control genes.

Thanks for the answer, Wolfgang. I did the simple / naive t-test, and it still gave "reasonable" results (i.e. the same as with a different approach).

Regards,
j.

--
Dr. January Weiner 3
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web: www.mpiib-berlin.mpg.de
Tel: +49-30-28460514
Edwin Groot ▴ 230
@edwin-groot-3606
On Fri, 10 Sep 2010 09:30:11 +0200 January Weiner <january.weiner at mpiib-berlin.mpg.de> wrote:
> Dear all,
>
> I have two tiling arrays of a bacterial genome. Unfortunately, I do not have the original files (such as the bpmap / CEL files for Affy tiling chips), just lists of spot intensities in two conditions for each probe (i.e. two values per probe), and a list of gene positions on the genome. Several probes map to each gene. The genome is not publicly available yet.

Hello January,

If the tiling array is from Affymetrix, the bpmap files exist. To start with, you should track them down, because they give the necessary annotation and position information.

I am assuming you want to measure RNA translation using this tiling array platform. That should be a fairly trivial analysis once you get the data into an ExpressionSet object. Is the data from GEO?

Edwin

> What would be the best way to tackle this? I thought that I might just calculate the logFC for each probe and then, for each gene, run a one-sample t-test on the corresponding probe logFC values, then correct for multiple testing.
>
> Would that make sense? I looked up the approach described in Toedling and Huber, 2008, PLoS Comput Biol (doi:10.1371/journal.pcbi.1000227), but this is not exactly what I had in mind: rather than looking for enriched regions, I'm more interested in focusing on the genes directly, as a bacterial genome is densely packed with probes and genes (I have 10-30 probes per gene).
>
> Best regards,
>
> January
>
> --
> Dr. January Weiner 3

Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945