Bimap Subsetting
Dario Strbenac ★ 1.5k
@dario-strbenac-5916
Last seen 14 hours ago
Australia
Hi,

Why does Rkeys give all of the gene symbols, not just the first 6?

> head(names(geneTranscripts)) # Entrez IDs.
[1] "1" "10" "100" "1000" "10000" "100008586"
> length(org.Hs.egSYMBOL[head(names(geneTranscripts))])
[1] 6
> length(Rkeys(org.Hs.egSYMBOL[head(names(geneTranscripts))]))
[1] 42075

GenomicFeatures_1.10.0 AnnotationDbi_1.20.2

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia
@herve-pages-1542
Last seen 2 days ago
Seattle, WA, United States
Hi Dario,

org.Hs.egSYMBOL is a "direct" map, i.e. it maps from left to right. This means that keys() is equivalent to Lkeys() (the "keys" are actually the "left keys").

Subsetting a Bimap by a given set of keys only reduces its set of "keys" ("left keys" if a direct map, "right keys" otherwise). In the case of a "direct" map, it means that the resulting map is now mapping the reduced set of "left keys" to the original set of "right keys". The set of "right keys" remains untouched but the number of right keys that are actually mapped to something on the left is of course smaller:

> Rlength(org.Hs.egSYMBOL)
[1] 43051
> count.mappedRkeys(org.Hs.egSYMBOL)
[1] 43051

> Rlength(org.Hs.egSYMBOL[mykeys])
[1] 43051
> count.mappedRkeys(org.Hs.egSYMBOL[mykeys])
[1] 6

If you want to reduce both the set of left keys and the set of right keys, consider using subset(). See ?`subset,Bimap-method` for the details.

Hope this helps,
H.

On 10/21/2012 11:00 PM, Dario Strbenac wrote:
> Hi,
>
> Why does Rkeys give all of the gene symbols, not just the first 6?
>
>> head(names(geneTranscripts)) # Entrez IDs.
> [1] "1" "10" "100" "1000" "10000" "100008586"
>> length(org.Hs.egSYMBOL[head(names(geneTranscripts))])
> [1] 6
>> length(Rkeys(org.Hs.egSYMBOL[head(names(geneTranscripts))]))
> [1] 42075
>
> GenomicFeatures_1.10.0 AnnotationDbi_1.20.2

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
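The behaviour described in this answer can be reproduced with a short script. This is only a sketch, assuming the org.Hs.eg.db package is installed. The vector `mykeys` is an assumption here: it takes the first six mapped Entrez IDs as a stand-in for head(names(geneTranscripts)) from the question. Counts vary between annotation releases, so no output is shown.

```r
library(org.Hs.eg.db)  # attaches AnnotationDbi and provides org.Hs.egSYMBOL

# Assumption: a stand-in for head(names(geneTranscripts)) in the question.
mykeys <- head(mappedkeys(org.Hs.egSYMBOL))

x6 <- org.Hs.egSYMBOL[mykeys]

# "[" only narrows the left keys (Entrez IDs) of a direct map ...
n_left <- length(Lkeys(x6))
# ... while the full set of right keys (gene symbols) is retained.
n_right_all <- length(Rkeys(x6))

# The right keys that are actually mapped to one of the six left keys.
symbols <- mappedRkeys(x6)

# subset() can narrow both sides in a single call.
both <- subset(org.Hs.egSYMBOL, Lkeys = mykeys, Rkeys = symbols)
n_right_both <- Rlength(both)
```

In other words, mappedRkeys() rather than Rkeys() is probably what the original question was looking for when only the symbols for the six selected genes are wanted.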
On 10/22/2012 09:48 AM, Hervé Pagès wrote:
> Hi Dario,
>
> org.Hs.egSYMBOL is a "direct" map, i.e. it maps from left
> to right. This means that keys() is equivalent to Lkeys()
> (the "keys" are actually the "left keys").
>
> Subsetting a Bimap by a given set of keys only reduces its
> set of "keys" ("left keys" if a direct map, "right keys"
> otherwise). In the case of a "direct" map, it means that
> the resulting map is now mapping the reduced set of "left keys"
> to the original set of "right keys". The set of "right keys"
> remains untouched but the number of right keys that are
> actually mapped to something on the left is of course smaller:
>
> > Rlength(org.Hs.egSYMBOL)
> [1] 43051
> > count.mappedRkeys(org.Hs.egSYMBOL)
> [1] 43051
>
> > Rlength(org.Hs.egSYMBOL[mykeys])
> [1] 43051
> > count.mappedRkeys(org.Hs.egSYMBOL[mykeys])
> [1] 6

FWIW, an analogy can be made with subsetting a factor where, by default, the unused levels are not dropped:

> x <- factor(letters[1:10])
> x
[1] a b c d e f g h i j
Levels: a b c d e f g h i j
> x[1:6]
[1] a b c d e f
Levels: a b c d e f g h i j

unless you use drop=TRUE:

> x[1:6, drop=TRUE]
[1] a b c d e f
Levels: a b c d e f

So we could support a similar feature on Bimap objects through the 'drop' argument, which is ignored at the moment:

> Rlength(org.Hs.egSYMBOL[mykeys, drop=FALSE])
[1] 43051
> Rlength(org.Hs.egSYMBOL[mykeys, drop=TRUE])
[1] 43051

However, an easy way to drop unused (i.e. unmapped) keys is:

Lkeys(x) <- mappedLkeys(x) # drop unused left keys
Rkeys(x) <- mappedRkeys(x) # drop unused right keys

or (as mentioned earlier):

subset(x, Lkeys=mappedLkeys(x), Rkeys=mappedRkeys(x))

Cheers,
H.
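The two assignments suggested in this reply can be wrapped in a small function to get something close to drop=TRUE for Bimap objects. This is a rough sketch, assuming org.Hs.eg.db is installed. `dropUnmappedKeys` is a hypothetical helper name, not part of AnnotationDbi, and `mykeys` is again an assumed stand-in for the six Entrez IDs in the question.

```r
library(org.Hs.eg.db)  # attaches AnnotationDbi and provides org.Hs.egSYMBOL

# Hypothetical helper (not an AnnotationDbi function): returns a copy of the
# Bimap whose left and right key sets contain only keys that are mapped.
dropUnmappedKeys <- function(x) {
  Lkeys(x) <- mappedLkeys(x)  # drop unused left keys
  Rkeys(x) <- mappedRkeys(x)  # drop unused right keys
  x
}

# Assumption: a stand-in for head(names(geneTranscripts)) in the question.
mykeys <- head(mappedkeys(org.Hs.egSYMBOL))

trimmed <- dropUnmappedKeys(org.Hs.egSYMBOL[mykeys])

# Rkeys() now lists only the symbols reachable from the six Entrez IDs.
trimmed_symbols <- Rkeys(trimmed)
mapping <- toTable(trimmed)  # data frame of the remaining mappings
```

For plain factors, base R's droplevels() plays the same role as the drop=TRUE subsetting shown in the analogy.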