Question: biomaRt: using a list as values. confused...
8.5 years ago, J.delasHeras@ed.ac.uk (United Kingdom) wrote:
I'm trying to obtain information about genes within a number of regions defined by a chromosome name, start and end coordinates.

I understand that the way to specify multiple filters to be used together (a set of chr+start+end) is to use a list for 'values'.

This seems to work ok when I have more than one region (I tested it using two regions first, before doing the proper search for >1000), but if I were to specify just one region, it does not work... and I'm wondering how I would do it in that case.

Example:

    library("biomaRt")
    ensembl = useMart("ENSEMBL_MART_ENSEMBL",
      dataset="hsapiens_gene_ensembl",
      host="www.ensembl.org")

    chrom<-c("1", "2")
    chr.start<-c(11401198, 86460656)
    chr.stop<-c(11694590, 86663869)

    attributes<-c("hgnc_symbol", "entrezgene", "chromosome_name",
      "start_position", "end_position", "strand", "band")

    # extract both regions at once:
    getBM(attributes=attributes,
          filters=c("chromosome_name","start","end"),
          values=list(chrom,chr.start,chr.stop),mart=ensembl)
    # this works, returning 1939 rows of data, the first 1198 with chr1
    # corresponding to the first region, and the rest with chr2 to the second.
    # Good.

    # but how does one retrieve the data for just ONE region?
    # try this:
    getBM(attributes=attributes,
          filters=c("chromosome_name","start","end"),
          values=list(chrom[1],chr.start[1],chr.stop[1]),mart=ensembl)
    # it only returns one gene!!! (in two rows)

So, when I just want to do a single search with multiple filters, how would I specify the values?

Jose

--
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Modified 8.5 years ago by Steffen Durinck • written 8.5 years ago by J.delasHeras@ed.ac.uk
Answer: biomaRt: using a list as values. confused...
8.5 years ago, Steffen Durinck wrote:
Hi Jose,

the combo filter chr + start + end is a special case and is interpreted as "give me everything in between". It is probably not well documented, but this filter combo works only for a single region at a time, so your second example is correct: there are only a few genes in your region on chr1.

An alternative which does work for multiple regions is to use the chromosomal_region filter, like:

    regions<-c("1:11401198:11694590", "2:86460656:86663869")
    attributes<-c("hgnc_symbol", "entrezgene",
      "chromosome_name","start_position", "end_position", "strand", "band")
    getBM(attributes=attributes,filters="chromosomal_region",values=regions,mart=ensembl)

Cheers,
Steffen

On Thu, Jun 9, 2011 at 8:49 AM, <J.delasHeras at ed.ac.uk> wrote:
> so, when I just want to do a single search with multiple filters, how would
> I specify the values?
Written 8.5 years ago by Steffen Durinck
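For anyone starting from separate chromosome, start and end vectors as in the question, the chromosomal_region strings suggested in the answer can be assembled with sprintf(). This is only a minimal sketch: make_regions() and query_regions() are hypothetical helper names, not part of biomaRt, and the attribute names are copied from the thread (newer Ensembl releases may name some of them differently, so it is worth checking listAttributes() first).

```r
# Hypothetical helper (not part of biomaRt): build "chr:start:end" strings
# from three parallel vectors. sprintf() with %.0f avoids scientific
# notation such as "1e+05" that plain paste() can produce for round numbers.
make_regions <- function(chrom, start, end) {
  stopifnot(length(chrom) == length(start), length(start) == length(end))
  sprintf("%s:%.0f:%.0f", as.character(chrom), as.numeric(start), as.numeric(end))
}

chrom     <- c("1", "2")
chr.start <- c(11401198, 86460656)
chr.stop  <- c(11694590, 86663869)

regions <- make_regions(chrom, chr.start, chr.stop)
print(regions)

# Hypothetical wrapper around the query from the answer. It is defined but
# not called here, because it needs the biomaRt package and network access.
query_regions <- function(regions, mart,
                          attributes = c("hgnc_symbol", "entrezgene",
                                         "chromosome_name", "start_position",
                                         "end_position", "strand", "band")) {
  biomaRt::getBM(attributes = attributes,
                 filters    = "chromosomal_region",
                 values     = regions,
                 mart       = mart)
}
```

With a mart object such as the `ensembl` one created in the question, the call would be `query_regions(regions, ensembl)`.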
Hi Steffen,

many thanks for that. I was looking at the previous results that I said were ok and realised the ranges were wrong, and that confused me even more!

Thanks for the tip about the chromosomal region, that's just what I needed!

Jose

Quoting Steffen Durinck <durinck.steffen at gene.com> on Thu, 9 Jun 2011 09:12:59 -0700:
> An alternative which does work for multiple regions is to use the
> chromosomal_region filter
Written 8.4 years ago by J.delasHeras@ed.ac.uk
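Since the wrong ranges in the two-region result were easy to miss, a quick local check of the returned coordinates can help. The sketch below is an assumption-laden illustration: in_requested_region() is a hypothetical helper, it treats any overlap with a requested region as a hit (the thread does not say whether BioMart requires overlap or full containment), and the data frame is made-up toy data standing in for a getBM() result.

```r
# Hypothetical helper: for each row of a getBM()-style result, report whether
# the gene overlaps at least one of the requested regions.
in_requested_region <- function(result, chrom, start, end) {
  vapply(seq_len(nrow(result)), function(i) {
    any(as.character(result$chromosome_name[i]) == chrom &
        result$end_position[i]   >= start &
        result$start_position[i] <= end)
  }, logical(1))
}

# Toy data (made-up coordinates), not real query output.
toy <- data.frame(
  chromosome_name  = c("1", "1", "2"),
  start_position   = c(11500000, 20000000, 86500000),
  end_position     = c(11510000, 20010000, 86510000),
  stringsAsFactors = FALSE
)

ok <- in_requested_region(toy,
                          chrom = c("1", "2"),
                          start = c(11401198, 86460656),
                          end   = c(11694590, 86663869))
print(ok)  # TRUE FALSE TRUE: the second toy row lies outside both regions
```

Rows flagged FALSE would be the ones to look at more closely before trusting a multi-region result.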
Hi Jose,

I'll make biomaRt throw an error when someone tries the query you attempted.

Cheers,
Steffen

On Thu, Jun 9, 2011 at 9:15 AM, <J.delasHeras at ed.ac.uk> wrote:
> I was looking at the previous results that I said were
> ok and realised the ranges were wrong and that confused me even more!
Written 8.5 years ago by Steffen Durinck
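Until such a check is in place, a guard along these lines could be run in user code before calling getBM(). This is a sketch under stated assumptions: check_single_region() is a hypothetical function written for illustration, and the thread does not show whether or how biomaRt itself implemented the error.

```r
# Hypothetical guard (not biomaRt's own code): refuse the
# chromosome_name + start + end combination when any of the three
# filters is given more than one value.
check_single_region <- function(filters, values) {
  combo <- c("chromosome_name", "start", "end")
  if (all(combo %in% filters)) {
    # 'values' is assumed to be a list in the same order as 'filters'
    lens <- lengths(values[match(combo, filters)])
    if (any(lens > 1)) {
      stop("chromosome_name + start + end accepts one region per query; ",
           "use the chromosomal_region filter for several regions",
           call. = FALSE)
    }
  }
  invisible(TRUE)
}

filters <- c("chromosome_name", "start", "end")

# A single region passes silently.
single_ok <- check_single_region(filters, list("1", 11401198, 11694590))

# Two regions at once trigger the error; it is caught here for demonstration.
msg <- tryCatch(
  check_single_region(filters,
                      list(c("1", "2"),
                           c(11401198, 86460656),
                           c(11694590, 86663869))),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Queries that use other filters, or only some of the three, pass through unchanged.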
That's probably a good idea. Most people would realise the result is not the expected one, but it will be better to get an error and be safe.

Thank you!

Jose

Quoting Steffen Durinck <durinck.steffen at gene.com> on Thu, 9 Jun 2011 09:40:41 -0700:
> I'll make biomaRt throw an error when someone tries the query you attempted.
Written 8.4 years ago by J.delasHeras@ed.ac.uk