duplicate entries in org.Hs.egCHRLOC?
Paul Shannon ★ 1.1k
@paul-shannon-578
Last seen 9.6 years ago
If I ask for the chromosome location of the start of geneID 5004, I get:

    > get ('5004', org.Hs.egCHRLOC)
            9
    116125123

But geneID 20 gets me two identical values:

    > get ('20', org.Hs.egCHRLOC)
             9          9
    -139021506 -139021506

I noticed this after using toTable to create a data.frame from this environment: I get two identical rows for geneID 20:

    > subset (toTable (org.Hs.egCHRLOC), gene_id=='20')
       gene_id start_location Chromosome
    14      20     -139021506          9
    15      20     -139021506          9

Is this the expected behavior?

Thanks!

- Paul
@sean-davis-490
Last seen 3 months ago
United States
On Mon, Mar 9, 2009 at 8:26 AM, Paul Shannon <pshannon@systemsbiology.org> wrote:

> If I ask for the chromosome location of the start of geneID 5004, I get:
>
> > get ('5004', org.Hs.egCHRLOC)
>         9
> 116125123
>
> But geneID 20 gets me two identical values:
>
> > get ('20', org.Hs.egCHRLOC)
>          9          9
> -139021506 -139021506
>
> I noticed this after using toTable to create a data.frame from this
> environment: I get two identical rows for geneID 20:
>
> > subset (toTable (org.Hs.egCHRLOC), gene_id=='20')
>    gene_id start_location Chromosome
> 14      20     -139021506          9
> 15      20     -139021506          9
>
> Is this the expected behavior?

Yep. Gene ID "20" has two transcripts and both start at the same location. Gene ID "5004" has only one transcript. So, the CHRLOCs are really the transcript starts associated with the gene.

Sean
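If one row per distinct start position is what you want, the duplicates can be collapsed after the fact. The following is only a minimal sketch, not something from the thread: it uses a small hand-built data.frame (named chrloc here purely for illustration) that copies the three rows shown in the question, so it runs without the annotation package installed.

```r
# Toy copy of the rows shown in the question (illustrative stand-in for
# toTable(org.Hs.egCHRLOC); the name 'chrloc' is an assumption).
chrloc <- data.frame(
  gene_id          = c("5004", "20", "20"),
  start_location   = c(116125123, -139021506, -139021506),
  Chromosome       = c("9", "9", "9"),
  stringsAsFactors = FALSE
)

# Rows per gene: a count above 1 means more than one transcript start is listed.
starts_per_gene <- table(chrloc$gene_id)

# Flag rows that exactly repeat an earlier row, then drop them.
is_repeat     <- duplicated(chrloc)
chrloc_unique <- chrloc[!is_repeat, ]

print(starts_per_gene)
print(chrloc_unique)
```

With the real annotation data, the same idea would presumably be unique(toTable(org.Hs.egCHRLOC)). Note that this discards the information that a gene has several transcripts sharing one start, so keep the full table if the transcript count matters.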