Hi Steffen,
Hi Jim,
Thanks for your suggestions!
To avoid hard coding, I will indeed retrieve the end position of the last
transcript on each chromosome. In relative terms, this is pretty close to
the real length of the chromosome.
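As a rough sketch of that plan: assuming the transcript coordinates have already been pulled into a data frame, the per-chromosome maximum can be taken with tapply(). The column names chromosome_name and transcript_end and the toy values below are assumptions for illustration only, not real biomaRt output.

```r
# Toy stand-in for a table of transcript end positions; values are made up.
transcripts <- data.frame(
  chromosome_name = c("1", "1", "2", "2", "X"),
  transcript_end  = c(1500, 9200, 800, 4300, 2700),
  stringsAsFactors = FALSE
)

# End position of the 'last' transcript on each chromosome,
# used here as an approximation (a lower bound) of the chromosome length.
approx_length <- tapply(transcripts$transcript_end,
                        transcripts$chromosome_name,
                        max)
print(approx_length)
```

Because the result is derived from the annotation itself, nothing has to be hard coded per species, but it can never exceed the true chromosome length.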
An
-----Original Message-----
From: Steffen Durinck [mailto:durincks@mail.nih.gov]
Sent: Monday, 30 October 2006 21:17
To: James W. MacDonald
Cc: De Bondt, An-7114 [PRDBE]; 'bioconductor at stat.math.ethz.ch'
Subject: Re: [BioC] biomaRt: retrieve total chromosome lengths
Hi An,
There is no way to retrieve the chromosome lengths with biomaRt when
used with Ensembl.
The closest you'll get with biomaRt is to subtract the position of the
'first' transcript from the position of the 'last' transcript.
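A minimal sketch of that subtraction, assuming the transcript start and end positions are already available in a data frame. The column names and the values are hypothetical and only illustrate the arithmetic; they are not biomaRt output.

```r
# Hypothetical transcript coordinates; the values are invented.
coords <- data.frame(
  chromosome_name  = c("1", "1", "1", "2", "2"),
  transcript_start = c(1200, 5000, 8800, 300, 3900),
  transcript_end   = c(1500, 6100, 9200, 800, 4300),
  stringsAsFactors = FALSE
)

# Split by chromosome, then take (largest end) - (smallest start),
# i.e. the distance from the 'first' to the 'last' transcript.
span <- sapply(split(coords, coords$chromosome_name), function(d) {
  max(d$transcript_end) - min(d$transcript_start)
})
print(span)
```

This span leaves out any sequence before the first transcript and after the last one, so it is only an approximation of the chromosome length.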
If you want to use the Ensembl data to get this information (you'll need
to do some browser clicking), you can select your species of interest at
http://www.ensembl.org/
for hsapiens:
http://www.ensembl.org/Homo_sapiens/index.html
then select a chromosome e.g.:
http://www.ensembl.org/Homo_sapiens/mapview?chr=1
and here you'll get the length.
Cheers,
Steffen
James W. MacDonald wrote:
> Hi An,
>
> De Bondt, An-7114 [PRDBE] wrote:
>
>> Hi,
>>
>> How can I retrieve, for a certain organism (e.g. human), the total length of
>> each of its chromosomes using biomaRt?
>> library(biomaRt)
>> mart <- useMart("ensembl")
>> mart <- useDataset("hsapiens_gene_ensembl", mart)
>> chr.lengths <- ???
>>
>
> Well, this doesn't agree exactly with what I see on this webpage:
>
>
> http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/faqs.shtml
>
> But it is pretty close. Of course I am finding the end of the 'last'
> transcript on a given chromosome rather than the end of the chromosome
> itself, so there will likely be differences. However, I don't see an
> attribute that looks like it gives chromosomal information without first
> being mapped through a gene, so I don't know if you can get exactly what
> you want.
>
> If there is a way, Steffen Durinck will undoubtedly know what it is, but
> I haven't seen a response from him as yet.
>
> Anyway, here is what I did.
>
> > mart <- useMart("ensembl", "hsapiens_gene_ensembl")
> Checking attributes and filters ... ok
> > a <- getBM("hsapiens_gene_ensembl_structure.transcript_chrom_end",
> "chromosome_name", c(1:21, "x","y"), mart, output="list")
> > sapply(a[[1]], max)
>         1         2         3         4         5
> 247197891 242713278 199439629 191246650 180727832
>         6         7         8         9        10
> 170735623 158630410 146252219 140191642 135347681
>        11        12        13        14        15
> 134361903 132289533 114110907 106354309 100334282
>        16        17        18        19        20
>  88771793  78646005  76106388  63802660  62429769
>        21         x         y
>  46935585 154908521  57767721
>
> Best,
>
> Jim
>
>
>
>> Thanks in advance!
>> An
>>
>
>
>
--
Steffen Durinck, Ph.D.
Oncogenomics Section
Pediatric Oncology Branch
National Cancer Institute, National Institutes of Health
URL: http://home.ccr.cancer.gov/oncology/oncogenomics/
Phone: 301-402-8103
Address:
Advanced Technology Center,
8717 Grovemont Circle
Gaithersburg, MD 20877