GOstats gene set size selection
alex lam RI ▴ 30
Hi Sean and other BioC users,

Thanks for the replies a couple of weeks ago. Now I am trying to use Category as suggested, and I think the underlying principles are better than GOstats for what I want to do, especially because I don't have to apply an arbitrary threshold to my test statistics to select a subset of genes.

I followed the code in the Category vignette up to the point where the matrix Z is divided by sqrt(rowSums). Because what I am doing is an eQTL genome scan, at any one position I have the likelihood ratio test statistics for all probesets rather than two-sample t-statistics. I read in the vignette that X should be approximately normal, so I figure that maybe I should standardize the likelihood ratio statistics to z-scores before multiplying by the adjacency matrix. Is that the correct thing to do?

    for(cM in 1:lengthOfGenome) {
      lrt <- LRT[expressedAffyIds, cM]
      # ... filter out duplicate entrezGenes and create adjacency matrix ...
      z.score <- (lrt - mean(lrt)) / sd(lrt)
      tA <- AmER2 %*% z.score
      tA <- tA / sqrt(rs2)
      names(tA) <- row.names(AmER2)
      qqnorm(tA)
    }

Cheers,
Alex

-----Original Message-----
From: Sean MacEachern [mailto:sean.maceach@gmail.com]
Sent: 17 April 2008 17:07
To: alex lam (RI); bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GOstats gene set size selection

Hi Alex,

I'm not too sure if this helps with your question, but I'll put my two cents in...

I am working with chickens and trying to create a large list of genes for an eQTL study from an initial simple microarray design that compares resistant vs susceptible birds. Because of the small number of genes that I have found with differential expression, I have attempted to increase the size of my list by examining significant GO terms. Most of the GO terms I have pulled out using hyperGTest are not very helpful due to their breadth. I have found the Category package a little more helpful.
KEGG pathways are a little more specific, and you can create an adjacency matrix and use the rowSums() command to filter your dataset. I think you can also treat GO terms as categories if you need to. It might be a little off topic, but it could be worth looking at.

Cheers,
Sean

On 4/17/08 7:28 AM, "alex lam (RI)" <alex.lam at roslin.ed.ac.uk> wrote:

> Dear colleagues,
>
> I have been following the GOstats vignette to test GO term association.
> I would like to know whether it is possible to set limits on the
> number of selected genes in a GO term and the size of that term on my affy chip?
>
> For example, can I tell hyperGTest to skip testing a GO term if the
> number of significant genes in that term is under, say, 3, or if there
> are more than 400 genes of that GO term on the chip?
>
> Currently I find many of my significant GO terms not very specific.
> As I am trying to incorporate GOstats into an expression QTL (eQTL)
> genome scan, I get a lot of output. Therefore, ideally I would like to
> filter out these terms before the test rather than screening the results
> after the test. Is there such an option with hyperGTest?
>
> Many thanks for your advice,
> Alex
>
> > sessionInfo()
> R version 2.6.2 Patched (2008-03-24 r44882)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines tools stats graphics grDevices utils datasets
> [8] methods base
>
> other attached packages:
> [1] GOstats_2.4.0 Category_2.4.0 genefilter_1.16.0
> [4] survival_2.34 RBGL_1.14.0 annotate_1.16.1
> [7] xtable_1.5-2 GO.db_2.0.2 AnnotationDbi_1.0.6
> [10] RSQLite_0.6-8 DBI_0.2-4 Biobase_1.16.3
> [13] graph_1.16.1
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.10
>
> --------------------------------------------
> Alex C. Lam
> Roslin Institute (Edinburgh)
> Midlothian
> EH25 9PS
> United Kingdom
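
For readers following the thread, the two ideas discussed above (filtering categories by size with rowSums() before testing, then computing a per-category statistic from standardized scores) can be sketched in base R as follows. This is only a minimal sketch under stated assumptions, not the Category vignette's code: the toy adjacency matrix `Am`, the simulated `lrt` values, and the thresholds `min_size` and `max_size` are all hypothetical stand-ins for the poster's `AmER2`, `LRT`, and chosen limits.

```r
set.seed(1)

# Hypothetical toy data: 50 genes and 4 categories of known size.
genes <- paste0("g", 1:50)
sizes_wanted <- c(small = 2, medium = 5, large = 10, huge = 50)

# Adjacency matrix (categories x genes): 1 if the gene belongs to the category.
Am <- t(sapply(sizes_wanted, function(k) as.numeric(seq_along(genes) <= k)))
colnames(Am) <- genes

# Simulated likelihood ratio statistics, one per gene (assumption: chi-square like).
lrt <- setNames(rchisq(length(genes), df = 1), genes)

# Drop categories that are too small or too large BEFORE computing any statistic.
min_size <- 3    # hypothetical lower limit
max_size <- 40   # hypothetical upper limit
sizes <- rowSums(Am)
keep <- sizes >= min_size & sizes <= max_size
Am_f <- Am[keep, , drop = FALSE]
rs <- sizes[keep]

# Standardize across genes, then sum within each category and scale by sqrt(size).
z <- (lrt - mean(lrt)) / sd(lrt)
tA <- drop(Am_f %*% z) / sqrt(rs)

print(tA)
```

Note that subtracting the mean and dividing by the standard deviation only changes location and scale, so a skewed set of statistics stays skewed after this step. Whether the result is close enough to normal for the approach is the open question in the post, and a qqnorm() plot of `z` itself (not only of `tA`) may help judge that.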
Microarray GO GOstats Category • 769 views