Human genomic sequences
@sheena-scroggins-4305
Hi,

I'm working on a project that is reviewing genes in the human genome project. I'm hoping you can point me to some Bioconductor packages that will help me in my quest. The goal is:

- Search through hg19 at UCSC (or other genome database) for particular genes of interest.
- Increase the search by looking both upstream and downstream until another gene is hit.
- Be able to sort by the conservation of the upstream and downstream data.

We are essentially looking for unknown promoters and other key pieces of DNA that are not in the gene itself but are conserved across the different mammal genomes available at the different browsers. I found a program that does this (kind of) with BLAST data, so I'm hoping you can point me to any useful package or other places I can search for a way to do this.

Thanks for your time,
Sheena
@vincent-j-carey-jr-4
On Fri, Oct 15, 2010 at 2:48 PM, Sheena Scroggins <sheena.scroggins at gmail.com> wrote:

> - Search through hg19 at UCSC (or other genome database) for particular
>   genes of interest.

org.Hs.eg.db can be used. If you are interested in, say, FLT1, you can determine aspects of its location via

    > library(org.Hs.eg.db)
    > get("FLT1", revmap(org.Hs.egSYMBOL))
    [1] "2321"
    > get("2321", org.Hs.egCHRLOC)
           13        13        13        13
    -28973181 -28959687 -28942233 -28874482
    > get("2321", org.Hs.egCHRLOCEND)
           13        13        13        13
    -29069265 -29069265 -29069265 -29069265

Multiple addresses per gene are common, and you will need rules to resolve them. If you wish to work at the transcript level, the GenomicFeatures package is relevant; makeTranscriptDbFromUCSC is a relevant function.

> - Increase the search by looking both upstream and downstream until
>   another gene is hit

This is a programming task.

> - Be able to sort by the conservation of the upstream and downstream
>   data.

You can use rtracklayer to import conservation scores into R. Example:

    > library(rtracklayer)
    > s2 = browserSession("UCSC")
    > ct = track(s2, "cons46way")
    > ct
    UCSC track 'Primate Cons'
    UCSCData with 9974 rows and 1 value column across 1 space
               space               ranges |     score
         <character>            <iranges> | <numeric>
    1          chr21 [33031597, 33031597] |  0.560087
    2          chr21 [33031598, 33031598] |  0.560087
    3          chr21 [33031599, 33031599] |  0.435717
    4          chr21 [33031600, 33031600] |  0.435717
    5          chr21 [33031601, 33031601] |  0.560087
    6          chr21 [33031602, 33031602] |  0.560087
    7          chr21 [33031603, 33031603] |  0.435717
    8          chr21 [33031604, 33031604] |  0.435717
    9          chr21 [33031605, 33031605] |  0.560087
    ...          ...                  ... |       ...
    9966       chr21 [33041562, 33041562] | -0.266457
    9967       chr21 [33041563, 33041563] |  0.433850
    9968       chr21 [33041564, 33041564] |  0.507567
    9969       chr21 [33041565, 33041565] |  0.655000
    9970       chr21 [33041566, 33041566] | -0.155882
    9971       chr21 [33041567, 33041567] | -0.340173
    9972       chr21 [33041568, 33041568] |  0.507567
    9973       chr21 [33041569, 33041569] | -1.740790
    9974       chr21 [33041570, 33041570] | -1.925080
    > length(ct$score)
    [1] 9974
    > summary(ct$score)
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    -4.62200 -0.33040  0.36470  0.03753  0.50520  0.65500

What is returned at this point depends on the state of the browser, which you can set manually or programmatically; see the rtracklayer vignette. The biomaRt package is also relevant.
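For the "programming task" step and the sorting step, here is a minimal base-R sketch, not a definitive implementation. It assumes you have already collected gene coordinates into a plain data frame (for example from org.Hs.eg.db, where the negative CHRLOC values above mark a gene on the minus strand) and per-base conservation scores into another (for example from the rtracklayer track). The function names `flanking_regions` and `rank_by_conservation`, the column names, and all coordinates and scores are hypothetical and chosen only for illustration. Genes that overlap the target are ignored, which is a simplification.

```r
# Toy gene table; symbols and coordinates are invented for illustration.
genes <- data.frame(
  symbol = c("geneA", "geneB", "geneC"),
  chr    = c("chr1", "chr1", "chr1"),
  start  = c(100, 500, 900),
  end    = c(200, 600, 1000),
  strand = c("+", "+", "-"),
  stringsAsFactors = FALSE
)

# Toy per-base conservation scores (same shape as chr / position / score).
cons <- data.frame(
  chr   = "chr1",
  pos   = c(300, 301, 700, 701),
  score = c(0.1, 0.3, 0.6, 0.4),
  stringsAsFactors = FALSE
)

# Extend from the target gene in both directions until the nearest
# neighbouring gene on the same chromosome (or the chromosome end).
# Genes overlapping the target are ignored (simplifying assumption).
flanking_regions <- function(genes, target, chr_len = Inf) {
  g <- genes[genes$symbol == target, ]
  stopifnot(nrow(g) == 1)
  others <- genes[genes$chr == g$chr & genes$symbol != target, ]
  left_ends    <- others$end[others$end < g$start]
  right_starts <- others$start[others$start > g$end]
  left_from <- if (length(left_ends)) max(left_ends) + 1 else 1
  right_to  <- if (length(right_starts)) min(right_starts) - 1 else chr_len
  # On the minus strand, "upstream" lies at higher coordinates.
  side <- if (g$strand == "-") c("downstream", "upstream")
          else c("upstream", "downstream")
  data.frame(
    side  = side,
    chr   = g$chr,
    start = c(left_from, g$end + 1),
    end   = c(g$start - 1, right_to),
    stringsAsFactors = FALSE
  )
}

# Mean conservation score of the scored bases inside one region
# (NA if the region has no scored bases).
mean_score <- function(chr, start, end, cons) {
  hit <- cons$chr == chr & cons$pos >= start & cons$pos <= end
  if (any(hit)) mean(cons$score[hit]) else NA_real_
}

# Attach the mean score to each region and sort, most conserved first.
rank_by_conservation <- function(regions, cons) {
  regions$mean_score <- mapply(
    mean_score, regions$chr, regions$start, regions$end,
    MoreArgs = list(cons = cons), USE.NAMES = FALSE
  )
  regions[order(regions$mean_score, decreasing = TRUE, na.last = TRUE), ]
}

flanks <- flanking_regions(genes, "geneB")
ranked <- rank_by_conservation(flanks, cons)
print(ranked)
```

With real data you would probably replace the data frames with range objects and use the overlap tools in the Bioconductor range packages, and you would need a rule for genes that report several locations, as noted above.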