Xmapcore package
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 5 months ago
United States
Tim,

While annotating a list of probesets to exons, transcripts and genes, I noticed that there are more probesets (e.g., 4448480) mapped to genes than those mapped to transcripts, and the least number of probesets mapped to the exons. Is this expected? I suppose if one probe is aligned to multiple exons in a gene, then the exon mapping was removed while the gene mapping was kept. Could you please elaborate?

Thanks so much for your help!

Best regards,

Julie

library(xmapcore)
xmap.connect("mouse")
> probeset.to.transcript("4448480", as.vector=FALSE)
NULL
> probeset.to.exon("4448480", as.vector=FALSE)
NULL
> probeset.to.gene("4448480", as.vector=FALSE)
RangedData with 1 row and 9 value columns across 1 space
        space               ranges |         IN1          stable_id    strand
  <character>            <IRanges> | <character>        <character> <integer>
1          13 [92020005, 92901611] |     4448480 ENSMUSG00000021708        -1
         biotype      status
     <character> <character>
1 protein_coding       KNOWN
  description
  <character>
1 RAS protein-specific guanine nucleotide-releasing factor 2 Gene [Source:MGI (curated);Acc:MGI:109137]
  db_display_name      symbol
      <character> <character>
1   MGI (curated)     Rasgrf2
  symbol_description
  <character>
1 RAS protein-specific guanine nucleotide-releasing factor 2 Gene
> temp = transcript.to.probeset(gene.to.transcript(probeset.to.gene("4448480", as.vector=TRUE), as.vector=TRUE), as.vector=FALSE)
> temp[temp$stable_id == "4448480",]
 [1] IN1                         stable_id
 [3] array_name                  probe_count
 [5] hit_score                   gene_score
 [7] transcript_score            exon_score
 [9] est_gene_score              est_transcript_score
[11] est_exon_score              prediction_transcript_score
[13] prediction_exon_score       protein_score
[15] domain_score
<0 rows> (or 0-length row.names)

sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] mouseexonpmcdf_1.1 xmapcore_1.2.8     digest_0.4.2
[4] IRanges_1.6.11     RMySQL_0.7-5       DBI_0.2-5

loaded via a namespace (and not attached):
[1] tools_2.11.1
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 5 months ago
United States
Hi Tim,

Thanks so much for such a quick response!

Here is a probeset ID that maps to a gene but not to any exon or transcript.

library(xmapcore)
xmap.connect("mouse")
probeset.to.transcript("4448480", as.vector=FALSE)
#NULL
probeset.to.exon("4448480", as.vector=FALSE)
#NULL
probeset.to.gene("4448480", as.vector=TRUE)
#[1] "ENSMUSG00000021708"

I looked up the detailed information of this probeset as follows.

probeset.details("4448480")
  stable_id array_name probe_count hit_score gene_score transcript_score
1   4448480   MoEx-1_0           4         1          1                0
  exon_score est_gene_score est_transcript_score est_exon_score
1          0              0                    0              0
  prediction_transcript_score prediction_exon_score protein_score
1                           0                     0             0
  domain_score
1            0

It looks like this probeset has one or more of its probes missing the transcript/exon target but uniquely aligned to a gene. Is it correct that this probeset is mapped to the un-transcribed region of the gene?

Here is an example of a probeset that is mapped to both a gene and a transcript but not to any exon.

probeset.details("4305509")
  stable_id array_name probe_count hit_score gene_score transcript_score
1   4305509   MoEx-1_0           4         1          1                2
  exon_score est_gene_score est_transcript_score est_exon_score
1          0              1                    2              0
  prediction_transcript_score prediction_exon_score protein_score
1                           1                     0             0
  domain_score
1            0

Is it correct that this probeset is aligned to the intron region of the transcript?

Thanks so much for your help!

Best regards,

Julie

On 12/13/10 12:29 PM, "Tim Yates" <tyates@picr.man.ac.uk> wrote:

> Hi there!
>
> How are you doing the mapping from probeset to gene, exon, transcript, etc?
>
> Do you have an example where you believe something is wrong?
>
> Cheers :-)
>
> Tim
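The two probeset.details() results above can be compared side by side with a small base-R sketch. It does not call xmapcore or the database; the scores are copied by hand from the output quoted in this thread. The helper deepest_level() is hypothetical (not an xmapcore function), and it assumes that a score greater than zero means at least one mapping exists at that level, which the thread does not state explicitly.

```r
# Scores copied from the probeset.details() output quoted in this thread.
details <- data.frame(
  stable_id        = c("4448480", "4305509"),
  gene_score       = c(1, 1),
  transcript_score = c(0, 2),
  exon_score       = c(0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of xmapcore): report the most specific
# annotation level that has a non-zero score for each probeset.
# ASSUMPTION: a score > 0 means at least one mapping exists at that level.
deepest_level <- function(d) {
  level <- rep("none", nrow(d))
  level[d$gene_score > 0]       <- "gene"
  level[d$transcript_score > 0] <- "transcript"
  level[d$exon_score > 0]       <- "exon"
  setNames(level, d$stable_id)
}

levels_reached <- deepest_level(details)
print(levels_reached)

# Number of probesets with a non-zero score at each level.
score_cols   <- c("gene_score", "transcript_score", "exon_score")
level_counts <- colSums(details[, score_cols] > 0)
print(level_counts)
```

Under that assumption, the tally for these two probesets shrinks from gene to transcript to exon, which matches the pattern described in the original question.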
Hi Julie,

I just had a quick look at that probeset on X:Map, here: (http://xmap.picr.man.ac.uk). There's quite a bit of info there, including the hit location of each individual probe in the probeset. What comes back is that those probesets land in an intron.

There are help pages on the website, but if you search for the probeset (you might need to set the species first), it will appear in the 'Selection Details' window in the middle, below the browser. Clicking on the '[+]' by the probeset name expands the annotation tree to reveal each probe, and if you then expand the tree under each probe, you see the places where it matches the genome. To the right of this window, you'll see some green arrows. If you click on these, the browser will jump to the appropriate position...

Sorry, that's a lot easier to do than to write, I think!

...anyway, it seems that the probesets are annotated as 'intronic'. This means that one or more of the probes don't hit an exon, as defined by ENSEMBL...

In R, the function call:

> is.intronic(c('4448480','4305509'))
4448480 4305509
   TRUE    TRUE

confirms this. (If this is the first time you've run this command, it might take a few seconds while it builds a local cache to (ultimately) speed things up. The second time you call it, it should be a lot quicker.)

Crispin
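To illustrate why a probe can fall inside a gene's range without hitting any exon, here is a rough base-R sketch of the interval logic. It is not how xmapcore or is.intronic() is implemented. The gene range is the one reported by probeset.to.gene() earlier in the thread; the exon and probe coordinates are invented purely for illustration, as is the helper within_any().

```r
# Gene range taken from the probeset.to.gene() output quoted in this thread.
gene <- c(start = 92020005, end = 92901611)

# HYPOTHETICAL exon and probe coordinates, invented for illustration only;
# they are not the real Ensembl exons or the real probe hit locations.
exons <- data.frame(start = c(92020005, 92400000, 92900000),
                    end   = c(92021000, 92400500, 92901611))
probes <- data.frame(start = c(92100000, 92100030, 92100060, 92100090))
probes$end <- probes$start + 24  # assumes 25-base probes

# Hypothetical helper: TRUE when the interval [start, end] lies entirely
# inside at least one of the target intervals.
within_any <- function(start, end, target_start, target_end) {
  vapply(seq_along(start), function(i) {
    any(start[i] >= target_start & end[i] <= target_end)
  }, logical(1))
}

in_gene <- within_any(probes$start, probes$end, gene["start"], gene["end"])
in_exon <- within_any(probes$start, probes$end, exons$start, exons$end)

# Probes inside the gene range but outside every exon sit in an intron.
intronic <- in_gene & !in_exon
print(intronic)
```

With these made-up coordinates every probe lies within the gene range but between exons, so a gene-level lookup would succeed while an exon-level lookup would return nothing.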