how to do it with biomaRt
Alex Sanchez ▴ 90
@alex-sanchez-3227
Last seen 8.1 years ago
Hello,

I am trying to use biomaRt for what seems to be a simple query. I have a list of transcript IDs from Affymetrix Rat Exon arrays, and I would like to get some associated identifiers, such as the Entrez Gene ID or the gene symbol.

I have done the following:

######################
library("biomaRt")
### Select the database and the dataset (the latter is defined by the organism)
ensemblMart <- useMart("ensembl")
# listDatasets(ensemblMart)  # omitted in the message
ensemblMart <- useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")
# ratFilters <- listFilters(ensemblMart)  # omitted in the message
filters1 <- "affy_raex_1_0_st_v1"
transcriptIDs1 <- c("7241279", "7332324", "7241281", "7199205", "7112511")
# listAttributes(ensemblMart)  # omitted in the message
attributes1 <- c("affy_raex_1_0_st_v1", "entrezgene", "rgd_symbol")
getBM(attributes = attributes1, filters = filters1, values = transcriptIDs1, mart = ensemblMart)
#######################

but I obtain an empty result:

[1] affy_raex_1_0_st_v1 entrezgene rgd_symbol
<0 rows> (or 0-length row.names)
#######################

I have used R 2.9 on Ubuntu and Windows and obtained the same results. I presume I am doing something wrong, because these IDs do have Entrez Gene and symbol IDs (verified in NetAffx).

Any help will be appreciated.

Thanks,
Alex Sánchez

----------------------------------------------------------------------
Dr. Alex Sánchez
Statistics Department, University of Barcelona
Facultat de Biologia UB, Avda Diagonal 645, 08028 Barcelona, Spain
asanchez_at_ub.edu

Statistics and Bioinformatics Unit
Institut de Recerca, Hospital Universitari Vall d'Hebron
Passeig Vall d'Hebron 112-119, 08034 Barcelona
asanchez_at_ir.vhebron.net
----------------------------------------------------------------------
@michal-okoniewski-2676
Last seen 8.1 years ago
Hello Alex,

The trick is that the IDs in your biomaRt filter are neither transcripts nor compatible with "affy_raex_1_0_st_v1". Your IDs are Affy transcript clusters, a somewhat old way of defining features on the Affy Exon chips that, as far as I know, has been abandoned in most software, perhaps except GeneSpring 10 and NetAffx.

One transcript cluster consists of several probesets; e.g. your first transcript cluster has 5 probesets (check with NetAffx). Those probeset IDs are compatible with your "affy_raex_1_0_st_v1" filter and will give you results.

Btw, the job of translating rat exon probesets into genes and transcripts is done most quickly with exonmap, assuming that you install a local copy of the X:MAP database...

Regards!
Michal
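As a rough illustration of the advice above, the original query could be rerun with probeset IDs instead of transcript cluster IDs. This is only a sketch under assumptions: `annotate_probesets` is a hypothetical helper name, the probeset IDs have to be looked up first (for example in NetAffx), and the attribute names are simply those from the original post. Newer Ensembl releases may name attributes differently (for instance `entrezgene_id`), so it is worth checking `listAttributes()` before relying on them.

```r
# Sketch only: query Ensembl by exon-array probeset ID rather than by
# transcript cluster ID. The helper name is hypothetical; the dataset,
# filter and attribute names are taken from the original post.
annotate_probesets <- function(probeset_ids,
                               attributes = c("affy_raex_1_0_st_v1",
                                              "entrezgene",
                                              "rgd_symbol")) {
  # Fail early on unusable input, before any network access happens.
  stopifnot(is.character(probeset_ids), length(probeset_ids) > 0)

  # Connect to the rat gene dataset, as in the original query.
  mart <- biomaRt::useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")

  # Same getBM() call as before, but the values are probeset IDs.
  biomaRt::getBM(attributes = attributes,
                 filters    = "affy_raex_1_0_st_v1",
                 values     = probeset_ids,
                 mart       = mart)
}
```

Calling `annotate_probesets(my_probeset_ids)` needs the biomaRt package and a network connection; `my_probeset_ids` stands for a character vector of probeset IDs that you supply yourself.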
Alex Sanchez ▴ 90
@alex-sanchez-3227
Last seen 8.1 years ago
Hello Michal,

> The trick is that your IDs in the biomaRt filter are neither transcripts
> nor compatible with "affy_raex_1_0_st_v1".

That explains the empty result.

> Your IDs are Affy transcript clusters - a bit old way of defining features
> on the Affy Exon chips, abandoned as far as I know in most of the software,
> perhaps except GeneSpring 10 and NetAffx.

It is also used by the "lavish" Partek Genomics Suite.

> One transcript cluster consists of several probesets, eg your first
> transcript cluster has 5 probesets (check with NetAffx).
> Then - those probeset IDs are compatible with your "affy_raex_1_0_st_v1"
> filter and will give you results.
> Btw, the job of translating rat exon probesets into genes and transcripts
> is done most quickly with exonmap,
> assuming that you install a local copy of X:MAP database...

The point is that we often use Gene Array or Exon chips to study gene expression, not alternative splicing, so what I am looking for is a flexible way to get the annotations for these chips at the transcript cluster level.

Thanks for the help,
Alex
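One possible way to get from probeset-level annotation to the transcript cluster level is sketched below. It assumes you already have two tables: a probeset-to-cluster mapping (for example exported from NetAffx) and a probeset-level annotation (for example a `getBM()` result). All IDs, symbols, column names and the `annotate_clusters` helper are invented placeholders for illustration, not real array content.

```r
# Invented placeholder tables, for illustration only.
cluster_map <- data.frame(
  transcript_cluster = c("TC1", "TC1", "TC2"),
  probeset           = c("PS1", "PS2", "PS3"),
  stringsAsFactors   = FALSE
)
probeset_annot <- data.frame(
  probeset         = c("PS1", "PS2", "PS3"),
  rgd_symbol       = c("GeneX", "GeneX", "GeneY"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the two tables on probeset ID and keep one
# row per distinct (transcript cluster, symbol) pair.
annotate_clusters <- function(cluster_map, probeset_annot) {
  merged <- merge(cluster_map, probeset_annot, by = "probeset")
  out <- unique(merged[, c("transcript_cluster", "rgd_symbol")])
  out <- out[order(out$transcript_cluster, out$rgd_symbol), ]
  rownames(out) <- NULL
  out
}

cluster_annot <- annotate_clusters(cluster_map, probeset_annot)
print(cluster_annot)
```

A cluster whose probesets map to more than one symbol would appear on several rows, which may be a useful flag for ambiguous clusters.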
>> Btw, the job of translating rat exon probesets into genes and
>> transcripts is done most quickly with exonmap,
>> assuming that you install a local copy of X:MAP database...
>
> The point is that, what we often do, is to use Gene Array or Exon
> chips to study gene expression -not alternative splicing- so what I am
> looking for is a flexible way to get the annotations for these chips at
> the transcript cluster level.
>
> Thanks for the help
>
> Alex

In that case, as an approximation I use Brainarray CDFs at the Entrez or Ensembl gene level. However, this comes at the price of losing many genes as false negatives (the same goes for transcript clusters, I suppose). Brainarray is updated quite often, so it should be more precise than transcript clusters.

For a more precise version, take all the significant probesets in the full set (1M for rat) and check with exonmap which gene they belong to. If several exons show the same direction and a roughly similar magnitude of fold change, then the gene can reasonably be called differentially expressed, even though you might have missed it with the Brainarray mapping or the transcript cluster approach in Partek.

Cheers,
Michal
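The "same direction, similar magnitude" check described above could be sketched roughly as follows in base R. The data, the `consistent_genes` helper and the two thresholds (`min_probesets`, `max_spread`) are assumptions made up for illustration; they are not part of exonmap or any other package.

```r
# Toy data, invented purely for illustration: one row per significant
# probeset, already mapped to a gene (e.g. via exonmap).
probesets <- data.frame(
  gene             = c("geneA", "geneA", "geneA", "geneB", "geneB", "geneC"),
  logFC            = c(1.2, 1.0, 1.4, 0.9, -1.1, 2.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return genes supported by at least `min_probesets`
# probesets whose fold changes all share one sign and whose absolute
# values differ by at most `max_spread`.
consistent_genes <- function(df, min_probesets = 2, max_spread = 1) {
  by_gene <- split(df$logFC, df$gene)
  keep <- vapply(by_gene, function(fc) {
    same_direction <- all(fc > 0) || all(fc < 0)
    similar_size   <- diff(range(abs(fc))) <= max_spread
    length(fc) >= min_probesets && same_direction && similar_size
  }, logical(1))
  names(keep)[keep]
}

consistent_genes(probesets)
```

With this toy input only `geneA` passes: `geneB` has probesets pointing in opposite directions and `geneC` is supported by a single probeset. Sensible threshold values would depend on the experiment.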