all human gene coordinates

1

Wim Kreinen ▴ 100

@wim-kreinen-5642

Last seen 9.6 years ago

Dear list,

I am completely new to Bioconductor and R, and I am looking for a tool (library) that provides the coordinates for all human genes. Does it exist?

Thanks

Wim

ADD COMMENT • link 11.4 years ago Wim Kreinen ▴ 100

1

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 11 hours ago

United States

Hi Wim,

See the org.Hs.eg.db package.

Best,

Jim

ADD COMMENT • link 11.4 years ago James W. MacDonald 65k

0

WATSON Mick ▴ 50

@watson-mick-5575

Last seen 9.3 years ago

United Kingdom

Yes, biomaRt.
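A minimal sketch of what the biomaRt route might look like, assuming network access to the Ensembl BioMart service and the `hsapiens_gene_ensembl` dataset. The attribute selection is an illustrative choice, not something specified in this thread, and the coordinates returned are on whatever assembly the current Ensembl release uses rather than hg19.

```r
library(biomaRt)

# Connect to the Ensembl human gene dataset (requires network access).
mart <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

# Fetch identifier, name, location and biotype for every gene.
gene_coords <- getBM(
  attributes = c("ensembl_gene_id", "external_gene_name", "chromosome_name",
                 "start_position", "end_position", "strand", "gene_biotype"),
  mart = mart
)

head(gene_coords)

# Optional: keep only the genes Ensembl annotates as protein coding.
coding_genes <- gene_coords[gene_coords$gene_biotype == "protein_coding", ]
```

Subsetting on `gene_biotype` in R after the download keeps the query itself simple; the same restriction could probably also be applied server-side through a filter.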

ADD COMMENT • link 11.4 years ago WATSON Mick ▴ 50

0

Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 7.7 years ago

United States

Hi Wim,

How about this:

    library(Homo.sapiens)
    ## This loads both of the packages just mentioned.

    ## This lists all the things you can retrieve
    ## (older AnnotationDbi releases spelled this cols()):
    columns(Homo.sapiens)

    ## Then you can do something like this
    k <- keys(Homo.sapiens, keytype="ENTREZID")
    res <- select(Homo.sapiens, keys = k, columns = c("TXSTART","TXEND"), keytype="ENTREZID")
    head(res)

That would get you the starts and ends of all known transcripts for each gene in a data.frame.

Marc

ADD COMMENT • link 11.4 years ago Marc Carlson ★ 7.2k

0

Steve Lianoglou ★ 13k

@steve-lianoglou-2771

Last seen 14 months ago

United States

And the third option not yet mentioned is the TxDb.Hsapiens.* packages.

-steve
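As a rough sketch of the TxDb route for gene-level (rather than transcript-level) coordinates, assuming the `TxDb.Hsapiens.UCSC.hg19.knownGene` package is installed: the `genes()` accessor from GenomicFeatures returns one range per gene. The variable names below are illustrative only.

```r
# Install once if needed (BiocManager is the current Bioconductor installer):
# BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# genes() returns a GRanges with one range per gene, named by Entrez gene ID.
# Each range spans all transcripts annotated for that gene.
gene_ranges <- genes(txdb)

# Flatten to a data.frame with seqnames, start, end, width, strand and gene_id.
gene_df <- as.data.frame(gene_ranges)
head(gene_df)
```

By default `genes()` leaves out genes whose transcripts sit on more than one chromosome or strand, so the result may contain slightly fewer genes than the database lists.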

ADD COMMENT • link 11.4 years ago Steve Lianoglou ★ 13k

0

Wim Kreinen ▴ 100

@wim-kreinen-5642

Last seen 9.6 years ago

Thanks,

is there a method to get only the protein-coding transcripts? With your method I get microRNAs as well.

Thanks

Wim

2012/12/5 Steve Lianoglou <mailinglist.honeypot@gmail.com>

> Hi Wim,
>
> Please keep emails on the bioc list by hitting "reply all" -- this way
> you can get more (and better help) by having more eyes on your
> question, and also others can benefit as well.
>
> So:
>
> On Wed, Dec 5, 2012 at 11:29 AM, Wim Kreinen <wkreinen@gmail.com> wrote:
> > This sounds promising.
> > And principally I understand how it works but ... How do I define keys if I
> > want all transcripts?
> > I defined via isActiveSeq the chr1...chr22, chrX, chrY as active
> > chromosomes.
> >
> > I tried
> > library ("TxDb.Hsapiens. UCSC.hg19.knownGenes")
> > txdb->TxDb.Hsapiens. UCSC.hg19.knownGenes
> > cols->c("TXCHROM", "TXSTRAND", "TXSTART", "TXEND")
> > keys -> ? #How do I define keys if I want all transcripts?
> > alltranscripts->select (txdb, keys=keys, cols=cols, keytype="TXID")
>
> First: what's up w/ the spaces in your "TxDb.Hsapiens.[SPACE]UCSC..."
>
> It's also ...knownGene -- not ...knownGeneS
>
> Also, a suggestion: use `<-` for assignment, and not `->` ... although
> the latter works, if anybody else is meant to read your code, they're
> likely going to be confused for a bit until they get used to your
> "odd" (but correct) choice of assignment direction.
>
> Anyhow -- how about:
>
> R> library(BiocInstaller)
> R> biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
> R> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
> R> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> R> txs <- transcripts(txdb)
> R> head(txs)
> GRanges with 6 ranges and 2 metadata columns:
>       seqnames         ranges strand |     tx_id     tx_name
>          <Rle>      <IRanges>  <Rle> | <integer> <character>
>   [1]     chr1 [11874, 14409]      + |         1  uc001aaa.3
>   [2]     chr1 [11874, 14409]      + |         2  uc010nxq.1
>   ...
>
> the ucsc id's are in the tx_name column.
>
> HTH,
> -steve

ADD COMMENT • link 11.4 years ago Wim Kreinen ▴ 100

0

Hi,

On Thu, Dec 13, 2012 at 11:20 AM, Wim Kreinen <wkreinen at gmail.com> wrote:
> Thanks,
>
> is there a method to get all protein coding transcripts. With your method I
> get microRNAs as well.

Here's one non-sophisticated way. The idea is to get the info for all coding exons grouped by tx_id, then filter the transcript list by ids that appear in the coding-exon list names:

R> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
R> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
R> txs <- transcripts(txdb)
R> cds <- cdsBy(txdb)
R> txs.coding <- txs[mcols(txs)$tx_id %in% names(cds)]

HTH,
-steve
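The filtering step in this reply can be seen in isolation on a toy example that needs no annotation packages. Everything below is made up for illustration: the data.frame stands in for the transcript metadata and the named list stands in for the result of `cdsBy()`, whose names are the ids of transcripts that have coding sequence.

```r
# Toy transcript table: tx_id is an integer, as in the transcripts() metadata.
txs <- data.frame(
  tx_id   = 1:4,
  tx_name = c("txA", "txB", "txC", "txD"),  # hypothetical names
  stringsAsFactors = FALSE
)

# Toy stand-in for cdsBy(): a list named by the tx_id (as character) of each
# transcript with coding sequence. Transcripts 2 and 4 play the non-coding ones.
cds <- list("1" = "coding exons of tx 1", "3" = "coding exons of tx 3")

# %in% coerces the integer ids to character before matching names(cds),
# so only rows whose id appears among the list names are kept.
txs_coding <- txs[txs$tx_id %in% names(cds), ]
print(txs_coding$tx_name)
```

Here only `txA` and `txC` survive the filter, which mirrors how transcripts without any coding exons (such as microRNAs) drop out of `txs.coding`.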

ADD REPLY • link 11.4 years ago Steve Lianoglou ★ 13k

Login before adding your answer.