strange behavior with hsahomology
Francois Pepin ★ 1.3k
Hi everyone,

I've been playing with the hsahomology package and gone through the vignette and there is something I don't understand. Sometimes it seems the scores are in homoURL instead of homoPS. It seems like a bug to me, or is it a feature I don't understand?

> homo <- unlist(as.list(hsahomologyDATA))
#one arbitrary case where this is the case
> x<-grep("396050", homo)
> as.matrix(homo[(x-2):(x+2)])
                   [,1]
6337.9031.homoOrg  "gga"
6337.9031.homoType "m"
6337.9031.homoHGID "396050"
6337.9031.homoPS   NA
6337.9031.homoURL  "66.84"

There seems to be a lot of cases where this happens, and it would seem to me that most of the code using those scores would be rather confused by this.

> sessionInfo()
R version 2.4.0 Patched (2006-10-03 r39576)
i386-apple-darwin8.8.1

locale:
C

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

other attached packages:
hsahomology
   "1.14.0"

Francois
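The positional lookup above shows one affected record. To list every affected entry, one option is to select the homoURL fields by name and flag those that parse as plain numbers. This is only a sketch: the vector below is a mock stand-in for the flattened `homo` vector (the first record mirrors the output in the post, the second is invented for illustration), and the "looks like a number" test is an assumed heuristic, not something the package provides.

```r
# Mock stand-in for the flattened vector `homo` from the post.
# The first record mirrors the quoted output; the second is invented.
homo <- c(
  "6337.9031.homoOrg"   = "gga",
  "6337.9031.homoType"  = "m",
  "6337.9031.homoHGID"  = "396050",
  "6337.9031.homoPS"    = NA,
  "6337.9031.homoURL"   = "66.84",
  "6337.10090.homoOrg"  = "mmu",
  "6337.10090.homoType" = "B",
  "6337.10090.homoHGID" = "12345",
  "6337.10090.homoPS"   = "85.1",
  "6337.10090.homoURL"  = NA
)

# Select the homoURL entries by name rather than by position.
url <- homo[grepl("\\.homoURL$", names(homo))]

# Assumed heuristic: a URL field holding only digits (with an optional
# decimal part) is most likely a misplaced score.
looks_numeric <- !is.na(url) & grepl("^[0-9]+(\\.[0-9]+)?$", url)

misplaced <- names(url)[looks_numeric]
misplaced
```

Selecting by name avoids relying on the field order within each record, which the `(x-2):(x+2)` window depends on.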
Nianhua Li ▴ 870
Hi, Francois,

> Sometimes it seems the scores are in homoURL instead of homoPS. It seems
> like a bug to me, or is it a feature I don't understand?

Yes, it is a bug. Thanks for the report.

> > homo <- unlist(as.list(hsahomologyDATA))
> #one arbitrary case where this is the case
> > x<-grep("396050", homo)
> > as.matrix(homo[(x-2):(x+2)])
>                    [,1]
> 6337.9031.homoOrg  "gga"
> 6337.9031.homoType "m"
> 6337.9031.homoHGID "396050"
> 6337.9031.homoPS   NA
> 6337.9031.homoURL  "66.84"
>
> There seems to be a lot of cases where this happens,

I can bet that you see the problem when and only when homoType=="m". Anyway, I have updated all the homology data packages (30 of them) in bioc 2.0 repository. The source packages and windows binaries in bioc 1.9 should be available before the end of the day.

You will get something in "homoURL" only when homoType == "c" and in that case homoPS will be NA. In all other cases, "homoURL" will always be NA. In hsahomology, homoType can only be "m", "B" or "b", i.e. homoURL is always NA.

> and it would seem
> to me that most of the code using those scores would be rather confused
> by this.

Hope those codes are not confused any more. Now, seriously, do you know any packages that use those scores? I am curious about this because the source data that we used to generate hsahomology will be deprecated in Jan, 2007. It is not easy to get those scores from the new data format. So, I am curious how heavily this data is being used.

Thanks.
nianhua
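The rule described in this reply (homoURL is set only when homoType == "c", and then homoPS is NA) can be turned into a small consistency check. The following is a minimal sketch, not code from the package: the nested list is a mock shaped after the names in the quoted output (gene ID, then organism ID, then fields), all values except the first record are invented, and `is_consistent` is a hypothetical helper name.

```r
# Mock data only: gene ID -> organism ID -> fields, shaped like the names
# in the thread's output. The first record mirrors the buggy record quoted
# above; the other two are invented for illustration.
mock <- list(
  "6337" = list(
    "9031"  = list(homoOrg = "gga", homoType = "m", homoHGID = 396050,
                   homoPS = NA, homoURL = "66.84"),
    "10090" = list(homoOrg = "mmu", homoType = "B", homoHGID = 12345,
                   homoPS = 85.1, homoURL = NA)
  ),
  "9999" = list(
    "10116" = list(homoOrg = "rno", homoType = "c", homoHGID = 67890,
                   homoPS = NA, homoURL = "http://example.org/record")
  )
)

# Flatten one level: one element per (gene, organism) pair,
# named "<gene>.<organism>".
records <- unlist(mock, recursive = FALSE)

# Hypothetical helper encoding the rule from the reply:
# type "c" must have homoPS == NA; every other type must have homoURL == NA.
is_consistent <- function(rec) {
  if (rec$homoType == "c") is.na(rec$homoPS) else is.na(rec$homoURL)
}

ok <- vapply(records, is_consistent, logical(1))

# Names of the records that break the rule.
names(ok)[!ok]
```

Run against the real data (where available), an empty result would suggest the updated packages follow the rule; the mock above deliberately includes the pre-fix record so that it is flagged.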
Hi Nianhua,

> Hope those codes are not confused any more. Now, seriously, do you know any packages that use those
> scores? I am curious about this because the source data that we used to generate hsahomology will be
> deprecated in Jan, 2007. It is not easy to get those scores from the new data format. So, I am
> curious how heavily this data is being used.

I don't know of any package that uses it. I just wanted a quick and dirty way to convert gene lists across species and started playing around with those packages. The vignette worked properly, but when I deviated slightly from it I got those strange results.

If I'm the first person to notice this, I would think that not many people are using them.

Francois
Seth Falcon ★ 7.4k
Nianhua Li <nli at fhcrc.org> writes:

> I can bet that you see the problem when and only when
> homoType=="m". Anyway, I have updated all the homology data packages
> (30 of them) in bioc 2.0 repository. The source packages and windows
> binaries in bioc 1.9 should be available before the end of the
> day.

Thanks, Nianhua, for the prompt fixes :-)

> Hope those codes are not confused any more. Now, seriously, do you
> know any packages that use those scores? I am curious about this
> because the source data that we used to generate hsahomology will be
> deprecated in Jan, 2007.

As I understand it, the data is already deprecated and in Jan 2007 it will be _defunct_, that is, no longer available.

> It is not easy to get those scores from the new data format. So, I
> am curious how heavily this data is being used. Thanks.

Yes, it would be nice to assess how useful these are before putting a lot of resources towards building them :-)

+ seth