GRanges list and reduce function
Asma rabe ▴ 290
@asma-rabe-4697
Last seen 6.2 years ago
Japan
Hi,

I need a GRanges object with exon data for a few chromosomes. I got a GRangesList of transcripts and their exons as follows:

    library("TxDb.Hsapiens.UCSC.hg19.knownGene")
    txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
    tx_Exons <- exonsBy(txdb)

1. How do I use reduce on a GRangesList? How do I get only the unique exons and exclude redundant ones?

2. How do I select the exons of certain chromosomes only, e.g. chr10? I tried the following, but I wonder why I got a GRangesList of empty GRanges:

    chr10 <- tx_Exons[seqnames(tx_Exons) == "chr10", ]

    > chr10
    GRangesList of length 80922:
    $1
    GRanges with 0 ranges and 3 metadata columns:
       seqnames    ranges strand |   exon_id   exon_name exon_rank
          <Rle> <IRanges>  <Rle> | <integer> <character> <integer>

    $2
    GRanges with 0 ranges and 3 metadata columns:
       seqnames ranges strand | exon_id exon_name exon_rank

    $3
    GRanges with 0 ranges and 3 metadata columns:
       seqnames ranges strand | exon_id exon_name exon_rank

    ...
    <80919 more elements>
    ---
    seqlengths:
          chr1      chr2 ... chrUn_gl000249
     249250621 243199373 ...          38502

    > length(chr10)
    [1] 80922
    > length(tx_Exons)
    [1] 80922

Thank you
@vincent-j-carey-jr-4
Last seen 5 weeks ago
United States
On Fri, Aug 15, 2014 at 6:20 AM, Asma rabe wrote:

> 1. How to use reduce on a GRangesList? How to get the unique exons only and
> exclude redundant exons?

    r = reduce(tx_Exons)

> 2. How to select exons of certain chromosomes only, e.g. chr10? I tried the
> following but I wonder why I got a GRangesList of empty GRanges?

One way:

    ur = unlist(r)
    r10 = ur[which(seqnames(ur) == "chr10")]
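The reduce-then-unlist approach in this reply can be tried on a small example without loading the full TxDb. The following is a minimal sketch, assuming only that GenomicRanges is installed; the transcript names and coordinates are made up for illustration, and the final subsetting uses the logical Rle directly rather than `which()`.

```r
library(GenomicRanges)

# Hypothetical toy data standing in for exonsBy(txdb): five exons in two transcripts.
gr <- GRanges(
  seqnames = c("chr10", "chr10", "chr10", "chr1", "chr1"),
  ranges   = IRanges(start = c(100, 150, 400, 10, 300),
                     end   = c(200, 250, 500, 50, 350)),
  strand   = c("+", "+", "+", "-", "-")
)
tx_toy <- split(gr, c("tx1", "tx1", "tx1", "tx2", "tx2"))

# reduce() on a GRangesList merges overlapping ranges within each list element.
r <- reduce(tx_toy)

# Flatten to a single GRanges, then keep the ranges on chr10.
ur  <- unlist(r)
r10 <- ur[seqnames(ur) == "chr10"]
r10
```

In this toy case the two overlapping exons of `tx1` collapse into one range, so `r10` holds two ranges. Note that `reduce()` works per list element; ranges shared between different transcripts are only merged if `reduce()` is applied again to the flattened object.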
@martin-morgan-1513
Last seen 17 hours ago
United States
On 08/15/2014 03:20 AM, Asma rabe wrote:

> 1. How to use reduce on a GRangesList? How to get the unique exons only and
> exclude redundant exons?

I'm not sure what this means. You've asked for exons grouped by transcript, and there are no 'extra' exons in each transcript. Did you want exonsBy(txdb, "gene")?

reduce(tx_Exons) reduces within each transcript (list element); I'm not sure what you'd really like to do.

> 2. How to select exons of certain chromosomes only, e.g. chr10? I tried the
> following but I wonder why I got a GRangesList of empty GRanges?

If you want to select transcripts where all exons are on certain chromosomes, note that

    seqnames(tx_Exons) %in% "chr10"

returns an RleList, and

    all(seqnames(tx_Exons) %in% "chr10")

asks element-wise whether all elements of each Rle are TRUE, returning a logical vector of the same length as tx_Exons. So

    tx_Exons[all(seqnames(tx_Exons) %in% "chr10")]

returns the transcripts with all exons on chr10.

For exons grouped by _gene_, it's possible that genes are annotated to contain exons from different chromosomes:

    > exByGn = exonsBy(txdb, "gene")
    > table(elementLengths(runLength(seqnames(exByGn))))

        1     2     3     4     5     6     7     8
    23182    77     4     3    19    38    76    60

To keep only the exons on chr10 while preserving the grouping by gene:

    > chr10 <- exByGn[seqnames(exByGn) %in% "chr10"]

This is what you did in your own attempt. The result is not empty: it still has one element per group, with the exons that are not on chr10 removed. Most elements are therefore empty, but those deeper in the list that have exons on chr10 are still there. Here I remove the elements with 0 ranges, which drops the genes without any exons on chr10:

    > chr10[elementLengths(chr10) != 0]

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
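The difference between the filtering steps in this reply is easier to see on a tiny object. This is a hedged sketch on made-up data (the group names `a`, `b`, `c` and all coordinates are hypothetical). It uses `==` as in the original question, and `elementNROWS()`, the name that replaced `elementLengths()` in later Bioconductor releases.

```r
library(GenomicRanges)

# Hypothetical toy data: three groups with exons on chr10 and/or chr2.
gr <- GRanges(
  seqnames = c("chr10", "chr10", "chr10", "chr2", "chr2"),
  ranges   = IRanges(start = c(1, 50, 5, 80, 20), width = 10)
)
grl <- split(gr, c("a", "a", "b", "b", "c"))

# One logical run-length vector per list element.
on10 <- seqnames(grl) == "chr10"

# Groups whose ranges are ALL on chr10 (here: only "a").
all_on10 <- grl[all(on10)]

# Keep only chr10 ranges within each group; the list keeps its length,
# so groups with no chr10 ranges become empty elements.
only10 <- grl[on10]

# Drop the empty elements (elementNROWS() is the current name of elementLengths()).
only10_nonempty <- only10[elementNROWS(only10) != 0]
```

Subsetting by a list-like logical index filters inside each element, which is why the original attempt returned a list of the same length with many empty elements.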
Hi Vincent, Martin,

Thank you very much for your kind explanation.

For Martin:

> For exons grouped by _gene_, it's possible that genes are annotated to
> contain exons from different chromosomes

How can genes be annotated to contain exons from several different chromosomes?

Best Regards,
Asma
On 08/25/2014 05:31 AM, Asma rabe wrote:

> How can genes be annotated to contain exons from several different
> chromosomes?

I don't know, but they are! You can see the reason for some of these; there are more interesting examples.

    > exByGn[elementLengths(unique(seqnames(exByGn))) > 1]
    GRangesList of length 277:
    $100126314
    GRanges with 7 ranges and 2 metadata columns:
                seqnames               ranges strand |   exon_id   exon_name
                   <Rle>            <IRanges>  <Rle> | <integer> <character>
      [1]           chr6 [30552109, 30552194]      + |     87067        <NA>
      [2]  chr6_cox_hap2 [ 2064162,  2064247]      + |    278963        <NA>
      [3]  chr6_dbb_hap3 [ 1845750,  1845835]      + |    280931        <NA>
      [4] chr6_mann_hap4 [ 1900181,  1900266]      + |    282770        <NA>
      [5]  chr6_mcf_hap5 [ 1933963,  1934048]      + |    284213        <NA>
      [6]  chr6_qbl_hap6 [ 1845017,  1845102]      + |    286075        <NA>
      [7] chr6_ssto_hap7 [ 1884391,  1884476]      + |    287961        <NA>

    $100128977
    GRanges with 4 ranges and 2 metadata columns:
                 seqnames               ranges strand | exon_id exon_name
      [1]           chr17 [43920722, 43921527]      - |  227980      <NA>
      [2]           chr17 [43972846, 43972879]      - |  227981      <NA>
      [3] chr17_ctg5_hap1 [  894694,   894727]      + |  289539      <NA>
      [4] chr17_ctg5_hap1 [  946013,   946818]      + |  289540      <NA>

    ...
    <275 more elements>
    ---
    seqlengths:
          chr1      chr2 ... chrUn_gl000249
     249250621 243199373 ...          38502

Martin
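The check for genes whose exons sit on more than one sequence can be reproduced on toy data. The sketch below is an assumption-laden illustration, not the method from the reply: gene names and coordinates are invented, and the per-gene count is computed with a plain `lapply()` over the seqnames instead of the `unique()` idiom shown in the output above.

```r
library(GenomicRanges)

# Hypothetical toy data: gene g1 has exons on chr6 and on a haplotype
# sequence (as in the output above); gene g2 is entirely on chr17.
gr <- GRanges(
  seqnames = c("chr6", "chr6_cox_hap2", "chr17", "chr17"),
  ranges   = IRanges(start = c(100, 200, 300, 400), width = 50)
)
by_gene <- split(gr, c("g1", "g1", "g2", "g2"))

# Number of distinct sequence names per gene.
n_seq <- unlist(lapply(seqnames(by_gene),
                       function(s) length(unique(as.character(s)))))

# Genes annotated on more than one sequence.
multi <- by_gene[n_seq > 1]
names(multi)
```

Here only `g1` is reported, because its second exon lies on `chr6_cox_hap2` rather than on `chr6` itself.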