Understanding the randomness of Biomart
@nathan-harmston-2904
Last seen 10.2 years ago
Hi everyone,

I have been playing with the biomaRt package a bit more and I am trying to work out what is going on here:

    ensembl = useMart("ensembl_mart_47", dataset = "hsapiens_gene_ensembl", archive = TRUE)

    fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
                    filters = c("chromosome_name", "start", "end"),
                    values = list(as.numeric("9"), 19198907, 19357826),
                    mart = ensembl)

      affy_hg_u133_plus_2 hgnc_symbol
    1           226867_at
    2         205684_s_at
    3           226867_at     DENND4C
    4         205684_s_at     DENND4C
    5           234968_at
    6           234968_at     DENND4C

    fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
                    filters = c("chromosome_name", "start", "end"),
                    values = list(as.numeric("9"), 33925736, 34088257),
                    mart = ensembl)

      affy_hg_u133_plus_2 hgnc_symbol
    1
    2           224789_at
    3           224789_at      WDR40A

I cannot understand why I am getting two rows for some probesets, one containing a HUGO identifier and the other not. Is there any relevance to this result (for example, probeset 234968_at)? And why do some rows show no probeset at all? Is there a specific reason for this, or is it just something that needs to be post-filtered?

Many thanks in advance.

Nathan
@nathan-harmston-2904
Last seen 10.2 years ago
Hi,

I basically want to return a list of probe, HUGO identifier (or ""). I have just tried the line of code you suggested and I get the following error:

    Error in values[[i]] : subscript out of bounds

So I'm afraid it doesn't work. Any ideas what the change to it should be? I tried:

    getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
          filters = c("chromosome_name", "start", "end", "with_affy_hg_u133_plus_2"),
          values = list(9, 19198907, 19357826),
          mart = ensembl)

    getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
          filters = c("chromosome_name", "start", "end", "affy_hg_u133_plus_2"),
          values = list(9, 19198907, 19357826),
          mart = ensembl)

Any ideas? I couldn't find a description of how to change this behaviour in the vignette.

Nathan

2008/8/17 Stephen Henderson <to.stephen.henderson at googlemail.com>:
> Hi
>
> getBM is returning all genes (or transcripts??) within that area (i.e.
> filters =...), including some that do not have affy probes for them. If you
> wanted to see them then you would have to put more 'attributes' into the
> getBM function, e.g.
>
>     getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene"),...
>
> Alternatively, if you wanted only those transcripts with affy ids then you
> would need to specify this in the filters, e.g.
>
>     fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"), filters =
>     c("chromosome_name", "start", "end", "affy_hg_u133_plus_2"), values =
>     list(as.numeric("9"), 19198907, 19357826), mart = ensembl)
>
> The repeats are inherent to the ensembl database and arise for many reasons.
>
> Stephen
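The "subscript out of bounds" error is consistent with both attempted calls passing four filters but only three values: the message itself points at `values[[i]]`, and there is no fourth entry to index. The sketch below illustrates that mismatch without contacting BioMart. `check_filter_values` is a hypothetical helper, not part of biomaRt, and passing `TRUE` to `with_affy_hg_u133_plus_2` is an assumption that it behaves like the boolean `with_hgnc_symbol` filter used elsewhere in this thread.

```r
# Hypothetical helper (not part of biomaRt): every filter needs one entry in `values`.
check_filter_values <- function(filters, values) {
  if (length(filters) != length(values)) {
    stop(sprintf("%d filters but %d values", length(filters), length(values)))
  }
  invisible(TRUE)
}

filters <- c("chromosome_name", "start", "end", "with_affy_hg_u133_plus_2")

# Three values for four filters: the shape of the failing calls.
bad_values <- list(9, 19198907, 19357826)

# Four values; TRUE for the boolean filter is an assumption (see the note above).
good_values <- list(9, 19198907, 19357826, TRUE)

bad_ok  <- tryCatch(check_filter_values(filters, bad_values),
                    error = function(e) FALSE)
good_ok <- check_filter_values(filters, good_values)

# With matching lengths, the query itself would look like this
# (requires a network connection and the `ensembl` mart object from the question):
# fetched <- getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
#                  filters = filters, values = good_values, mart = ensembl)
```

The second attempted call uses the ID filter `affy_hg_u133_plus_2` instead; that filter would presumably expect a vector of probeset IDs as its fourth value rather than `TRUE`.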
@steffenstatberkeleyedu-2907
Last seen 10.2 years ago
Hi Nathan,

This is how these BioMart systems work. If you do:

    fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"),
                    filters = c("chromosome_name", "start", "end", "with_hgnc_symbol"),
                    values = list(as.numeric("9"), 19198907, 19357826, TRUE),
                    mart = ensembl)

you'll only retrieve the affy ids once, and each one of them will have an HGNC symbol.

Cheers,
Steffen
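If the unfiltered result is needed anyway, the post-filtering Nathan asks about can be done in base R. The following is a minimal sketch, assuming the goal is one row per probeset with its HGNC symbol where one exists and "" otherwise; the data frame is typed in by hand from the first query's output in the question, and the variable names are illustrative only.

```r
# Output of the first query in the question, entered by hand.
fetched <- data.frame(
  affy_hg_u133_plus_2 = c("226867_at", "205684_s_at", "226867_at",
                          "205684_s_at", "234968_at", "234968_at"),
  hgnc_symbol = c("", "", "DENND4C", "DENND4C", "", "DENND4C"),
  stringsAsFactors = FALSE
)

# Drop rows that carry no probeset at all.
with_probe <- fetched[fetched$affy_hg_u133_plus_2 != "", , drop = FALSE]

# Within each probeset, put rows that have a symbol ahead of rows that do not.
ord <- order(with_probe$affy_hg_u133_plus_2, with_probe$hgnc_symbol == "")
sorted <- with_probe[ord, , drop = FALSE]

# Keep the first row per probeset: the symbol if there is one, "" otherwise.
result <- sorted[!duplicated(sorted$affy_hg_u133_plus_2), , drop = FALSE]
rownames(result) <- NULL
```

This keeps a single symbol per probeset, so a probeset that maps to several genes would lose all but one of them; whether that is acceptable depends on the downstream analysis.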