Getting Introns Expression at a Per Gene Level
1
0
Entering edit mode
@fong-chun-chan-5706
Last seen 9.7 years ago
Hi, I am using the R package, Genomic Features, to get Intron Expression and the function that I am using is the intronsByTranscript(). While this function is useful, it returns the number of raw reads that align to each intron grouped at the transcript level. Is there an easy way to get it to group it by a gene such that I am grabbing all the introns that fall in all the transcripts belong to a gene (and without double counting the intronic space). Similar, is there a way to easily get the raw read count at an individual intron level rather than having it grouped? The introns() function seems to be defunct now as it recommends that we use the intronsByTranscript() function. Thanks, Fong [[alternative HTML version deleted]]
• 777 views
ADD COMMENT
0
Entering edit mode
@valerie-obenchain-4275
Last seen 2.3 years ago
United States
Hi Fong, We don't have a single function call to get introns by gene but you can put one together in a few steps. library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene Specifying use.names = TRUE puts the transcript names on the GRangesList. introns <- intronsByTranscript(txdb, use.names=TRUE) Unlist and remove duplicate introns. ulst <- unlist(introns) intronsNoDups <- ulst[!duplicated(ulst)] The select() function can map txid to geneid (and others). See ?select. txnames <- names(intronsNoDups) map <- select(txdb, keys=txnames, keytype='TXNAME', cols='GENEID') Not all transcripts are associated with a gene id and some are associated with more than one. > head(map, 10) TXNAME GENEID 1 uc001aaa.3 <na> 2 uc001aaa.3 <na> 3 uc010nxq.1 <na> 4 uc010nxq.1 <na> 5 uc010nxr.1 <na> 6 uc010nxr.1 <na> 7 uc009vjk.2 100133331 8 uc009vjk.2 100133331 9 uc001aau.3 100132287 10 uc021oeh.1 100133331 To get a GRangesList of introns by gene, split the introns on the associated gene id. idx <- map$GENEID[!is.na(map$GENEID)] intronsByGene <- split(intronsNoDups[!is.na(map$GENEID)], idx) names(intronsByGene) <- unique(idx) The list names are geneid and the element rownames are txid. > intronsByGene GRangesList of length 19784: $100133331 GRanges with 13 ranges and 0 metadata columns: seqnames ranges strand <rle> <iranges> <rle> uc002qsd.4 chr19 [58858396, 58858718] - uc002qsd.4 chr19 [58859007, 58861735] - uc002qsd.4 chr19 [58862018, 58862756] - ... $100132287 GRanges with 1 range and 0 metadata columns: seqnames ranges strand uc003wyw.1 chr8 [18248856, 18257507] + To count reads against this annotation you can use summarizeOverlaps(). The 'reads' can be Bam files, GappedAlignments or GappedAlignmentPairs. The 'features' argument will be the GRangesList of introns by gene. The result will be counts per gene. To count just introns with no grouping you can still use summarizeOverlaps() but use the 'intronsNoDups' GRanges object as 'features'. Valerie On 01/29/2013 11:38 AM, Fong Chun Chan wrote: > Hi, > > I am using the R package, Genomic Features, to get Intron Expression and > the function that I am using is the intronsByTranscript(). While this > function is useful, it returns the number of raw reads that align to each > intron grouped at the transcript level. Is there an easy way to get it to > group it by a gene such that I am grabbing all the introns that fall in all > the transcripts belong to a gene (and without double counting the intronic > space). > > Similar, is there a way to easily get the raw read count at an individual > intron level rather than having it grouped? The introns() function seems to > be defunct now as it recommends that we use the intronsByTranscript() > function. > > Thanks, > > Fong > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT

Login before adding your answer.

Traffic: 669 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6