similarity between two gene lists with varied length
Weiwei Shi ★ 1.2k
@weiwei-shi-1407
Dear listers,

a little off-topic:

I am looking for algorithms that calculate a "distance" or "similarity" between two gene lists of different lengths, and I would like to compare them.

Any paper, any implementation in R, and any suggestion is welcome!

Thanks,

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
@shannon-william-2930
A first thought: a similarity can be defined as the number of genes in the intersection of the two lists divided by the number of genes in the union of the two lists. If the two lists are identical the similarity is 1, and if they have no genes in common the similarity is 0. Of course, this does not take the lengths of the gene lists into account.

You would have to think through how the similarity behaves when only some genes are in both lists.

Bill Shannon
Associate Professor of Biostatistics in Medicine
Washington University School of Medicine
President-Elect, Classification Society

________________________________________
From: bioconductor-bounces@stat.math.ethz.ch On Behalf Of Weiwei Shi [helprhelp@gmail.com]
Sent: Saturday, August 23, 2008 7:55 PM
To: r-help at stat.math.ethz.ch
Cc: Bioconductor
Subject: [BioC] similarity between two gene lists with varied length

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
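The intersection-over-union ratio described in this reply is commonly known as the Jaccard index. A minimal sketch follows, written in Python rather than R; the function name and the gene symbols are illustrative assumptions and do not come from the thread.

```python
def jaccard_similarity(list_a, list_b):
    """Return |A ∩ B| / |A ∪ B| for two gene lists (duplicate entries are ignored)."""
    set_a, set_b = set(list_a), set(list_b)
    union = set_a | set_b
    if not union:
        # Both lists are empty: the ratio is undefined; returning 0.0 is a
        # convention chosen for this sketch.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical gene symbols, used only for illustration.
short_list = ["TP53", "BRCA1", "EGFR"]
long_list = ["TP53", "BRCA1", "EGFR", "MYC", "KRAS", "PTEN", "AKT1", "RB1", "CDK4"]

print(jaccard_similarity(short_list, short_list))        # identical lists
print(jaccard_similarity(short_list, ["MYC", "KRAS"]))   # disjoint lists
print(jaccard_similarity(short_list, long_list))         # short list contained in long list
```

The last call shows the length caveat raised in the reply: a 3-gene list fully contained in a 9-gene list scores only 3/9, even though every gene of the shorter list is shared.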
Actually, two things make this question less trivial:

1. the lengths of the two gene lists are very different;
2. I could add another list that assigns a weight to each gene, for example to every gene in the union of the two gene lists.

On Sun, Aug 24, 2008 at 9:15 AM, Shannon, William <wshannon@dom.wustl.edu> wrote:
> First thought is a similarity can be based on the ratio of the number of
> genes in the intersection of the two lists divided by the number of genes in
> the union of the two lists. [...]

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
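The two complications raised in this follow-up (very unequal list lengths and per-gene weights) can each be addressed with a small variant of the intersection-over-union idea. The sketch below is one possible reading, not a method proposed in the thread: the overlap coefficient divides by the size of the smaller list, and a weighted Jaccard index sums gene weights instead of counting genes. The function names, the `default_weight` parameter, and the example data are assumptions; weights are assumed to be non-negative.

```python
def overlap_coefficient(list_a, list_b):
    """Return |A ∩ B| / min(|A|, |B|), which is less sensitive to unequal lengths."""
    set_a, set_b = set(list_a), set(list_b)
    smaller = min(len(set_a), len(set_b))
    if smaller == 0:
        # An empty list shares nothing; 0.0 is a convention chosen for this sketch.
        return 0.0
    return len(set_a & set_b) / smaller


def weighted_jaccard(list_a, list_b, weights, default_weight=1.0):
    """Return (sum of weights in A ∩ B) / (sum of weights in A ∪ B).

    `weights` maps gene -> non-negative weight; genes missing from the
    mapping receive `default_weight` (an assumption of this sketch).
    """
    set_a, set_b = set(list_a), set(list_b)

    def total(genes):
        return sum(weights.get(gene, default_weight) for gene in genes)

    union_weight = total(set_a | set_b)
    if union_weight == 0:
        return 0.0
    return total(set_a & set_b) / union_weight


# Hypothetical data, used only for illustration.
short_list = ["TP53", "BRCA1"]
long_list = ["TP53", "BRCA1", "MYC", "KRAS", "PTEN", "AKT1"]
gene_weights = {"TP53": 3.0, "BRCA1": 2.0}  # all other genes default to 1.0

print(overlap_coefficient(short_list, long_list))
print(weighted_jaccard(short_list, long_list, gene_weights))
```

With these inputs the overlap coefficient is 2/2, because the shorter list is fully contained in the longer one, and the weighted Jaccard index is 5/9, compared with 2/6 when every gene counts equally.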
Seth Falcon ▴ 150
@seth-falcon-2443
* On 2008-08-24 at 08:55 +0800 Weiwei Shi wrote:
> Dear listers,

[please don't cross-post]

> a little off-topic:
>
> I am looking for and compare algorithms which can calculate "distance" or
> "similarity" between two gene lists with different lengths.
>
> Any paper, any implementation in R and any suggestion is welcome!

You might find the following paper of interest:

Combining Results of Microarray Experiments: A Rank Aggregation Approach
http://www.bepress.com/sagmb/vol5/iss1/art15/

+ seth

--
Seth Falcon | http://userprimary.net/user/
>> I am looking for and compare algorithms which can calculate "distance" or
>> "similarity" between two gene lists with different lengths.
>>
>> Any paper, any implementation in R and any suggestion is welcome!

Also, Section 22.4.1 of this book might be of interest:

Gentleman R., Carey V., Huber W., Irizarry R. and Dudoit S. (2005) Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Published by Springer.

Best wishes
Wolfgang

--
Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber
@cesare-furlanello-2999
Dear Weiwei,

you may be interested in the approach in:

G. Jurman, S. Merler, A. Barla, S. Paoli, A. Galea, and C. Furlanello. Algebraic stability indicators for ranked lists in molecular profiling. Bioinformatics, 24(2):258-264, 2008.

with standalone software at: https://biodcv.fbk.eu/static/listspy.html
and a Python package at: https://mlpy.fbk.eu/

best regards
// cesare
xie weibo ▴ 60
@xie-weibo-3000
Hi Weiwei,

I think you can simply try Fisher's exact test. Take all genes of your organism as the sample pool. The question then becomes: when two gene lists have been selected from that pool, how do you judge whether the two lists are independent? Fisher's exact test works for this type of question.

Best wishes,
Weibo

--
Weibo Xie
National Center of Plant Gene Research (Wuhan)
National Key Laboratory of Crop Genetic Improvement
Huazhong Agricultural University
Wuhan 430070, China
Phone: 86-27-61324632
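One way to read this suggestion is as a one-sided Fisher's exact test on the 2x2 table "in list A or not" by "in list B or not", which is equivalent to the upper tail of a hypergeometric distribution. The sketch below uses only the Python standard library (`math.comb`, Python 3.8+); the function name, the `universe_size` parameter, and the example data are assumptions, and both lists are assumed to be drawn from the same gene universe. In practice, R's `fisher.test` or SciPy's `scipy.stats.fisher_exact` would be the usual tools.

```python
from math import comb


def overlap_p_value(list_a, list_b, universe_size):
    """One-sided P-value for the overlap of two gene lists.

    Treats list B as a random draw of |B| genes from a universe of
    `universe_size` genes, of which |A| are "marked", and returns
    P(overlap >= observed overlap) under the hypergeometric model.
    """
    set_a, set_b = set(list_a), set(list_b)
    if len(set_a | set_b) > universe_size:
        raise ValueError("the two lists contain more genes than the universe")
    n_a, n_b = len(set_a), len(set_b)
    observed = len(set_a & set_b)

    # Number of ways to draw |B| genes with exactly i of them in A,
    # summed over i = observed .. min(|A|, |B|).
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / comb(universe_size, n_b)


# Hypothetical data: two small lists and an assumed universe of 20000 genes.
list_a = ["TP53", "BRCA1", "EGFR", "MYC"]
list_b = ["TP53", "BRCA1", "KRAS", "PTEN", "AKT1", "RB1"]
print(overlap_p_value(list_a, list_b, universe_size=20000))
```

A small P-value indicates that the overlap is larger than expected if the two lists were independent. Unlike a similarity ratio, the result depends strongly on the choice of gene universe, so the universe should be limited to genes that could actually have appeared in either list.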