News: Changes to matchGenes in bumphunter
rafa (United States) wrote, 3.0 years ago:

The function matchGenes in the bumphunter package finds the nearest gene/transcript to each entry in a GRanges and annotates it with information such as whether it is in an intron or an exon, which exon, and whether it overlaps, covers or is inside, as well as with gene symbols and RefSeq IDs. For historical reasons, matchGenes was hardwired to use the hg19 TxDb. This has changed in the latest devel version (1.7.3), which is now general. It is available right now from GitHub (devtools::install_github("ririzarr/bumphunter")). Backward compatibility is not supported. Here is how to use it going forward:

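## load the packages first; GenomicRanges provides makeGRangesFromDataFrame()
library(bumphunter)
library(GenomicRanges)
library("TxDb.Hsapiens.UCSC.hg19.knownGene")

## islands is an example of a GRanges
islands <- read.delim("http://rafalab.jhsph.edu/CGI/model-based-cpg-islands-hg19.txt")
islands <- makeGRangesFromDataFrame(islands[1:100, ])

## annotate the transcripts once, then match the regions against them
genes <- annotateTranscripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
tab <- matchGenes(islands, genes)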

Here is the first row:

   name                       annotation description region distance
1 TUBB8 NM_001164154 NM_177987 NP_817124 inside exon inside     1360
    subregion insideDistance exonnumber nexons                         UTR
1 inside exon              0          4      4 inside transcription region
  strand geneL codingL Entrez subjectHits
1      -  2350    2181 347688       30958

Note that annotateTranscripts tries to infer the annotation package by calling the species method on the TxDb, but you can also supply it explicitly:

genes <- annotateTranscripts(TxDb.Hsapiens.UCSC.hg19.knownGene,"org.Hs.eg.db")
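Since the TxDb is now an argument, the same two calls should carry over to other organisms. Below is a minimal sketch for mouse. The packages TxDb.Mmusculus.UCSC.mm10.knownGene and org.Mm.eg.db and the query region are assumptions chosen for illustration; they are not from the original example, and the code only runs if those packages are installed.

```r
# Assumption: these Bioconductor packages are installed; names are illustrative
pkgs <- c("bumphunter", "TxDb.Mmusculus.UCSC.mm10.knownGene", "org.Mm.eg.db")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(bumphunter)
  library(GenomicRanges)
  library("TxDb.Mmusculus.UCSC.mm10.knownGene")

  # Supply the annotation package explicitly as the second argument,
  # mirroring the hg19 call above
  genes_mm <- annotateTranscripts(TxDb.Mmusculus.UCSC.mm10.knownGene,
                                  "org.Mm.eg.db")

  # A made-up region on chr1, only to exercise the call
  query <- GRanges("chr1", IRanges(start = 4800000, end = 4801000))
  tab_mm <- matchGenes(query, genes_mm)
  print(head(tab_mm))
}
```

Passing the annotation package by position keeps the call identical in shape to the hg19 one, so switching organisms only means swapping the two package names.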

We also edited the annotateNearest function, which runs nearest and then adds some information. It works on GRanges or on data.frames with the right column names (chr, start, end).
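As a rough illustration of the two input types, here is a small sketch that calls annotateNearest on toy data.frames and on the equivalent GRanges. The coordinates and the object names (regions, targets) are made up for the example, and the bumphunter calls are skipped if the package is not installed.

```r
# Toy regions; coordinates are invented purely for illustration
regions <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                      start = c(100, 5000, 200),
                      end   = c(200, 5500, 300))
targets <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                      start = c(150, 9000, 1000),
                      end   = c(400, 9500, 1200))

if (requireNamespace("bumphunter", quietly = TRUE)) {
  library(bumphunter)
  library(GenomicRanges)  # provides makeGRangesFromDataFrame()

  # data.frame input: the columns must be named chr, start, end
  near_df <- annotateNearest(regions, targets)

  # The same regions supplied as GRanges
  near_gr <- annotateNearest(makeGRangesFromDataFrame(regions),
                             makeGRangesFromDataFrame(targets))

  print(near_df)
  print(near_gr)
}
```

Printing both results side by side is a cheap way to check that the data.frame and GRanges paths agree on your own data.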

None of this has been tested thoroughly, so comments and bug reports are welcome.

modified 3.0 years ago by Michael Love • written 3.0 years ago by rafa

Fixed some minor bugs at https://github.com/ririzarr/bumphunter/pull/3. The main one is that if you used a data.frame for 'x' in matchGenes() or annotateNearest() you would get incorrect results.

written 2.9 years ago by Leonardo Collado Torres