Question: How to retrieve 'conservation score' sequence?
Guido Hooiveld (Wageningen University, Wageningen, the Netherlands) wrote, 9.9 years ago:
Hi,

I have a list of putative transcription factor binding sites, and to continue with the most relevant ones for further analyses I would like to filter on 'conservation score' (the assumption being that conserved sequences are more likely to be functional than less conserved or non-conserved sequences). I have read up on this and found that both Ensembl (GERP score) and the UCSC Genome Browser (multiz alignment) provide this information, although calculated using different algorithms. Moreover, in both genome browsers I can view the score. However, I don't know how to retrieve the score for a list of sequences...

I was hoping that e.g. BioMart could be used for this, but I could not find the appropriate filter. I am not yet familiar enough with UCSC to find a suitable way of doing this. Therefore, any pointer on how best to tackle this issue would be appreciated!

TIA,
Guido

------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
internet: http://nutrigene.4t.com
email: guido.hooiveld@wur.nl
Sean Davis (United States) wrote, 9.9 years ago:
On Tue, Dec 16, 2008 at 2:02 PM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> [...] However, i don't know how to retrieve the score for a list of sequences...
> [...] Therefore, any pointer on how to best tackle this issue would be appreciated!

See here:

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phastCons28way/

If you are working with another species, you can go to that URL and find the corresponding phastCons data. There is information in the directory and on the UCSC site about how these scores are calculated.

Hope that helps.

Sean
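Once a score file has been downloaded, the per-base values still have to be matched to genomic positions. The following is only a rough sketch in base R, assuming the data are in fixedStep wiggle format with one score per base and that the input starts with a "fixedStep" header line. The function names parse_fixed_step and region_mean and the toy values are made up for illustration; they are not part of any package.

```r
# Parse fixedStep wiggle lines into a data frame of (chrom, pos, score).
# Assumes one score per line and that the first non-empty line is a header.
parse_fixed_step <- function(lines) {
  lines <- lines[nzchar(trimws(lines))]
  is_header <- grepl("^fixedStep", lines)
  block <- cumsum(is_header)                 # block id for every line
  headers <- lines[is_header]
  # Pull the value of "key=value" out of each header line.
  field <- function(x, key) sub(paste0(".*", key, "=([^ ]+).*"), "\\1", x)
  chrom <- field(headers, "chrom")
  start <- as.integer(field(headers, "start"))
  step  <- as.integer(field(headers, "step"))
  data_block <- block[!is_header]
  # Offset of each score within its block: 0, 1, 2, ...
  offset <- ave(seq_along(data_block), data_block, FUN = seq_along) - 1L
  data.frame(
    chrom = chrom[data_block],
    pos   = start[data_block] + offset * step[data_block],
    score = as.numeric(lines[!is_header]),
    stringsAsFactors = FALSE
  )
}

# Mean score over an inclusive region; NA if no base in the region has a score.
region_mean <- function(scores, chrom, start, end) {
  hit <- scores$chrom == chrom & scores$pos >= start & scores$pos <= end
  if (!any(hit)) return(NA_real_)
  mean(scores$score[hit])
}

# Toy input (not real conservation data): two blocks on chr1.
wig <- c(
  "fixedStep chrom=chr1 start=101 step=1",
  "0.10", "0.20", "0.90",
  "fixedStep chrom=chr1 start=201 step=1",
  "0.50", "0.70"
)
scores <- parse_fixed_step(wig)
site_score <- region_mean(scores, "chr1", 102, 103)  # averages 0.20 and 0.90
```

For a real file, the lines could come from something like readLines(gzfile("scores.wig.gz")), where the file name is hypothetical. Whole-chromosome files are large, so holding one row per base in a data frame may be too memory-hungry; treat this as an illustration of the coordinate bookkeeping rather than a production approach.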
Michael Lawrence wrote, 9.9 years ago:
The rtracklayer package can download the conservation scores from UCSC. This is particularly easy with the devel version of rtracklayer:

> library(rtracklayer)
Loading required package: RCurl
> session <- browserSession()
> track(session, "multiz28way", GenomicRanges(10000, 20000, "chr1", "hg18"))
A UCSCData object with 1 cols on 9979 ranges in 1 sequences
trackLine: track name=Conservation description="Vertebrate Multiz Alignment & PhastCons Conservation (28 Species)" type=wiggle_0

This retrieves the phastCons values from 10000 to 20000 on chr1 in the human genome (hg18). I'm working on a way to make it fast to retrieve these values for a large number of ranges (e.g. genes).
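For the filtering step in the original question, one possible sketch is shown here in base R. It assumes the per-base scores for a contiguous stretch of one chromosome have already been pulled into a plain numeric vector, and that every site lies inside that stretch. The site table, the toy scores and the 0.5 cutoff are all made up for illustration, and this is not how rtracklayer itself does it.

```r
# Per-base scores for positions 10000..10009 on one chromosome (toy values).
region_start <- 10000L
score <- c(0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.0, 0.3, 0.95, 0.9)

# Hypothetical binding sites with inclusive start/end coordinates.
sites <- data.frame(
  name  = c("siteA", "siteB", "siteC"),
  start = c(10002L, 10005L, 10008L),
  end   = c(10004L, 10007L, 10009L),
  stringsAsFactors = FALSE
)

# Prefix sums: prefix[k + 1] holds the sum of the first k scores, so the sum
# over any site is a difference of two entries (no loop over bases).
prefix <- c(0, cumsum(score))
i_start <- sites$start - region_start + 1L   # index of first base in 'score'
i_end   <- sites$end   - region_start + 1L   # index of last base in 'score'
sites$mean_score <- (prefix[i_end + 1L] - prefix[i_start]) /
  (i_end - i_start + 1L)

# Keep only sites whose mean score reaches an arbitrary, illustrative cutoff.
min_score <- 0.5
conserved <- sites[sites$mean_score >= min_score, ]
```

With these toy numbers, siteA and siteC pass the cutoff and siteB does not. Bases without a score (gaps in the track) are not handled here; they would need to be represented as NA and dealt with explicitly before taking prefix sums.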