Fwd: all human gene coordinates
@steve-lianoglou-2771
Last seen 22 months ago
United States
Hah -- forgot to CC bioc-list, even though I suggested you not forget that you should do the same ;-)

---------- Forwarded message ----------
From: Steve Lianoglou <mailinglist.honeypot@gmail.com>
Date: Wed, Dec 5, 2012 at 12:20 PM
Subject: Re: [BioC] all human gene coordinates
To: Wim Kreinen <wkreinen at gmail.com>

Hi Wim,

Please keep emails on the bioc list by hitting "reply all" -- this way you can get more (and better) help by having more eyes on your question, and others can benefit as well.

So:

On Wed, Dec 5, 2012 at 11:29 AM, Wim Kreinen <wkreinen at gmail.com> wrote:
> This sounds promising.
> And principally I understand how it works but ... How do I define keys if I
> want all transcripts?
> I defined via isActiveSeq the chr1...chr22, chrX, chrY as active
> chromosomes.
>
> I tried
> library ("TxDb.Hsapiens. UCSC.hg19.knownGenes")
> txdb->TxDb.Hsapiens. UCSC.hg19.knownGenes
> cols->c("TXCHROM", "TXSTRAND", "TXSTART", "TXEND")
> keys -> ? #How do I define keys if I want all transcripts?
> alltranscripts->select (txdb, keys=keys, cols=cols, keytype="TXID")

First: what's up w/ the spaces in your "TxDb.Hsapiens.[SPACE]UCSC..."

It's also ...knownGene -- not ...knownGeneS

Also, a suggestion: use `<-` for assignment, and not `->` ... although the latter works, if anybody else is meant to read your code, they're likely going to be confused for a bit until they get used to your "odd" (but correct) choice of assignment direction.

Anyhow -- how about:

R> library(BiocInstaller)
R> biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
R> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
R> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
R> txs <- transcripts(txdb)
R> head(txs)
GRanges with 6 ranges and 2 metadata columns:
      seqnames         ranges strand |     tx_id     tx_name
         <Rle>      <IRanges>  <Rle> | <integer> <character>
  [1]     chr1 [11874, 14409]      + |         1  uc001aaa.3
  [2]     chr1 [11874, 14409]      + |         2  uc010nxq.1
  ...

The UCSC IDs are in the tx_name column.
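For readers who want the result restricted to chr1...chr22, chrX and chrY and laid out as a plain table, the reply above can be extended roughly as follows. This is only a sketch: it assumes the TxDb package is already installed, the names std_chroms, txs_std, tx_df, tx_ids and tx_tab are illustrative, and the select() call uses the `columns` argument of more recent AnnotationDbi releases (the 2012-era call quoted in this thread used `cols`).

```r
# Sketch only: assumes TxDb.Hsapiens.UCSC.hg19.knownGene is already installed.
library("GenomicFeatures")
library("TxDb.Hsapiens.UCSC.hg19.knownGene")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# All transcripts as a GRanges object, as in the reply above.
txs <- transcripts(txdb)

# Keep only chr1..chr22, chrX and chrY (std_chroms is an illustrative name).
std_chroms <- paste0("chr", c(1:22, "X", "Y"))
txs_std <- txs[as.character(seqnames(txs)) %in% std_chroms]

# Flatten to a data.frame with seqnames, start, end, width, strand,
# tx_id and tx_name columns.
tx_df <- as.data.frame(txs_std)
head(tx_df)

# Alternative that answers the "how do I define keys" question directly:
# every transcript ID known to the TxDb is a valid key.
tx_ids <- keys(txdb, keytype = "TXID")
tx_tab <- select(txdb, keys = tx_ids,
                 columns = c("TXCHROM", "TXSTRAND", "TXSTART", "TXEND"),
                 keytype = "TXID")
head(tx_tab)
```

The GRanges route is usually the more convenient of the two when the coordinates are going to be used for overlaps later on; the select() route gives a plain data.frame keyed by transcript ID.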
HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Natasha ▴ 440
@natasha-4640
Last seen 10.3 years ago
Dear All,

I am trying to annotate my DE gene list using biomaRt, but I keep getting a warning and an empty output. I can't figure out where my code has gone wrong (I suspect it might be something really silly). Help much appreciated. Code below.

##########
library("biomaRt")
listMarts()
listMarts(host='jul2012.archive.ensembl.org')
ensembl68 = useMart(host='jul2012.archive.ensembl.org', biomart='ENSEMBL_MART_ENSEMBL')
listDatasets(ensembl68)
ensembl68 = useDataset("hsapiens_gene_ensembl", mart=ensembl68)
listFilters(ensembl68)
listAttributes(ensembl68)
annot.tot = getBM(attributes=c('ensembl_gene_id','external_gene_id','hgnc_symbol','description','entrezgene','chromosome_name','start_position','end_position','strand'),
                  filters='ensembl_gene_id',
                  values=rownames(p12.ip$table),
                  mart=ensembl68)
#####
Warning message:
In getBM(attributes = c("ensembl_gene_id", "external_gene_id", "hgnc_symbol", :
  Unable to match column names of BioMart output
##########
> head(rownames(p12.ip$table))
[1] "ENSG00000111335" "ENSG00000165949" "ENSG00000187608" "ENSG00000157601"
[5] "ENSG00000119922" "ENSG00000126709"
#####

Many Thanks,
Natasha

SessionInfo:
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] biomaRt_2.14.0 gdata_2.12.0 WriteXLS_2.2.0 edgeR_2.6.10 limma_3.14.1

loaded via a namespace (and not attached):
[1] gtools_2.7.0 RCurl_1.95-3 XML_3.95-0.1