Question: Getting Introns Expression at a Per Gene Level
6.1 years ago, Carl Baribault wrote:
Dear Valerie,

I have a bed file of specific isoforms of interest. Can you please suggest a best approach for obtaining the intron extents? Thanks.

Best,
Carl Baribault
modified 6.1 years ago by Valerie Obenchain • written 6.1 years ago by Carl Baribault
Answer: Getting Introns Expression at a Per Gene Level
6.1 years ago, Valerie Obenchain (United States) wrote:
Hi Carl,

You can use import() from rtracklayer to read a bed in as a GRanges.

    gr <- import('myfile.bed', asRangedData=FALSE)

I'm not sure what you've got in your file but let's say they are gene isoforms. Presumably there is an identifier in the file that would let you group the ranges by gene (or whatever grouping you are after). This will likely end up as one of the metadata columns in the GRanges after import. Create a GRangesList by grouping the GRanges by gene.

    grl <- split(gr, bySomeFactor)

The introns are the gaps between the ranges in each list element of the GRangesList. To get at these we want the difference between the full range of the gene and the multiple elements (exons or transcripts etc.) of the gene.

Create the gene ranges:

    geneRanges <- range(grl)

Extract the differences:

    introns <- psetdiff(geneRanges, grl)

If this doesn't help, I'll need to know more detail about the data in the isoform file.

Valerie

On 09/09/2013 07:31 PM, Carl Baribault wrote:
> Dear Valerie,
> I have a bed file of specific isoforms of interest. Can you please suggest
> a best approach for obtaining the intron extents? Thanks.
>
> Best,
> Carl Baribault
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
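The steps in the answer above can be tried without a BED file on a small in-memory example. This is only a sketch under stated assumptions: the `gene` metadata column and the geneB coordinates are hypothetical, while the geneA exons are the three blocks quoted later in this thread. It passes `unlist(range(grl))`, a plain GRanges with one range per gene, as the first argument to psetdiff(), which assumes every gene sits on a single chromosome and strand. In recent rtracklayer releases import() appears to return a GRanges by default, so the asRangedData argument may no longer be needed or accepted.

```r
library(GenomicRanges)

# Toy exon ranges standing in for an imported BED file.
# Assumption: the "gene" column and the geneB coordinates are made up.
gr <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(
    start = c(11874, 12613, 13221, 20000, 20500),
    end   = c(12227, 12721, 14408, 20199, 20899)
  ),
  strand = "+",
  gene = c("geneA", "geneA", "geneA", "geneB", "geneB")
)

# Group the exons by gene: one list element per gene.
grl <- split(gr, gr$gene)

# Full extent of each gene. range() returns a GRangesList; unlist() gives
# one GRanges element per gene as long as each gene is on one
# chromosome and one strand.
geneRanges <- unlist(range(grl))
stopifnot(length(geneRanges) == length(grl))

# Element-wise difference: gene extent minus its exons = introns.
introns <- psetdiff(geneRanges, grl)
introns
```

With this input the first list element should hold the two gaps 12228-12612 and 12722-13220, and the second the single gap 20200-20499.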
Hi,

Bringing the conversation back to the list.

There should be no need to lapply. psetdiff(x, y) is vectorized. It computes element-wise (parallel) asymmetric differences between 'x' and 'y'. 'x' should be all gene ranges and 'y' a GRangesList (same length as 'x') containing the components of each gene. The result will be a GRangesList the same length as the number of genes.

Valerie

On 09/10/2013 07:34 PM, Carl Baribault wrote:
> Valerie,
>
> Thanks for your input. FYI, my bed file has only 1 preferred isoform
> per gene (subset from refSeq 05/01/2012 if I recall). I already have
> the import working, thank you. The following is just one element of
> what I want to obtain. I just need to lapply/vectorize the right way.
> Your thoughts?
>
> Best,
> Carl
>
> > psetdiff(range(ref1), blocks(ref1))
> GRangesList of length 1:
> $1
> GRanges with 2 ranges and 0 metadata columns:
>       seqnames         ranges strand
>          <Rle>      <IRanges>  <Rle>
>   [1]     chr1 [12228, 12612]      +
>   [2]     chr1 [12722, 13220]      +
>
> ---
> seqlengths:
>   chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22
>     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
>
> > blocks(ref1)
> GRangesList of length 1:
> $1
> GRanges with 3 ranges and 0 metadata columns:
>       seqnames         ranges strand
>          <Rle>      <IRanges>  <Rle>
>   [1]     chr1 [11874, 12227]      +
>   [2]     chr1 [12613, 12721]      +
>   [3]     chr1 [13221, 14408]      +
>
> ---
> seqlengths:
>   chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22
>     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

On 09/10/2013 09:31 AM, Valerie Obenchain wrote:
> Hi Carl,
>
> You can use import() from rtracklayer to read a bed in as a GRanges.
> [...]
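To illustrate the point about vectorization, the sketch below compares a single psetdiff() call against an explicit loop over genes. The objects `tx` and `exons` are made up for the example (the first gene reuses the coordinates from the quoted output, the second gene is hypothetical), and the loop applies setdiff() to one gene at a time. It is meant as a check of the idea, not as the only way to do it.

```r
library(GenomicRanges)

# Assumption: 'tx' holds one extent per gene, 'exons' the matching
# components, in the same order. The second gene is hypothetical.
tx <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(11874, 20000), end = c(14408, 20899)),
  strand = "+"
)
exons <- GRangesList(
  GRanges("chr1",
          IRanges(start = c(11874, 12613, 13221),
                  end   = c(12227, 12721, 14408)),
          strand = "+"),
  GRanges("chr1",
          IRanges(start = c(20000, 20500), end = c(20199, 20899)),
          strand = "+")
)

# One vectorized call: element i of the result is tx[i] minus exons[[i]].
vectorized <- psetdiff(tx, exons)

# The same computation written as an explicit loop, one gene at a time.
looped <- lapply(seq_along(tx), function(i) setdiff(tx[i], exons[[i]]))

# Both approaches should give the same intron coordinates.
same_starts <- identical(start(unlist(vectorized)),
                         unlist(lapply(looped, start)))
same_ends <- identical(end(unlist(vectorized)),
                       unlist(lapply(looped, end)))
same_starts && same_ends
```

The vectorized form also keeps the result as a single GRangesList, which is usually easier to carry into later steps than a plain list of GRanges.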
