venn diagram
Anthony Bosco ▴ 500
@anthony-bosco-517
Last seen 9.6 years ago
Hi,

I want to compare lists of gene names (not expression data) using Venn diagram tools.

For example, if I have a cell line and stimulate it with several different treatments, I want to know which genes are differentially expressed in all treatments and which in only some of them.

I would also like to look at this graphically to get an overview of which treatments are more similar.

I realise that heatmap functions etc. would show similarities/differences between treatments, but in this particular case I want to use Venn diagrams.

Regards,
Anthony

Anthony Bosco, PhD Student
Institute for Child Health Research
Subiaco, Western Australia, 6008
@arnemulleraventiscom-466
Last seen 9.6 years ago
Hi,

The problem with Venn diagrams is that for more than 3 sets the visualization gets messy. Maybe you can just go for a tabular representation instead of graphics.

With a dendrogram you could visualize your gene list similarity using vector similarity. Take the union of all m sets. This is a superset with n elements. Create an n x m matrix (n rows, m columns) where the row names are the gene names. The value of each cell is either 0 or 1, depending on whether gene x is present in set y. You can then create a distance matrix from this by calculating the length-normalized cosine for all pairwise combinations of the column vectors, where "length" means the Euclidean norm of a vector (not the number of elements that R's length() returns):

> a <- c(1,1,0,0,1,0)
> b <- c(0,1,1,0,1,1)
> x <- a%*%b / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
> x
          [,1]
[1,] 0.5773503

x is a measure of the similarity between vectors a and b. This is a standard procedure in text/document comparison. Since one wants to create a distance matrix, one still needs to "invert" the similarities so that high similarity gets small values, for example by using 1 - x.

Once you have your matrix M of cosines (a symmetric m x m matrix), convert the inverted matrix via as.dist(1 - M) and pass the result to the hclust routine.

I'd be interested in the outcome (does it make sense?), if you're interested. You should only try it if you've got *many* sets to test, so that a real Venn approach gets too complex.

Good luck and let me know how it goes,

regards,
Arne

Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com
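The steps described in this answer could be sketched in R roughly as follows. This is only a minimal sketch: the gene lists, the treatment names, and the choice of 1 - cosine as the distance and "average" as the linkage are assumptions made for illustration, not something specified in the thread.

```r
# Hypothetical gene lists, one per treatment (names and contents are made up).
gene_lists <- list(
  treatA = c("g1", "g2", "g5"),
  treatB = c("g2", "g3", "g5", "g6"),
  treatC = c("g1", "g2", "g5", "g6")
)

# Union of all sets: the n genes that become the rows of the matrix.
all_genes <- sort(unique(unlist(gene_lists)))

# n x m 0/1 membership matrix: rows are genes, columns are treatments.
membership <- sapply(gene_lists, function(g) as.integer(all_genes %in% g))
rownames(membership) <- all_genes

# Cosine similarity between every pair of columns:
# dot products divided by the product of the Euclidean norms.
norms <- sqrt(colSums(membership^2))
cosine <- crossprod(membership) / outer(norms, norms)

# Assumed "inversion": 1 - cosine, so that similar lists get small distances.
d <- as.dist(1 - cosine)

# Hierarchical clustering of the treatments (linkage method is an assumption).
hc <- hclust(d, method = "average")
```

Calling plot(hc) would then draw the dendrogram of treatments. With only three lists, as here, a Venn diagram is probably still easier to read; the clustering is more useful when there are many lists.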
Hi,

If you are comparing categorical data such as this, maybe try Multiple Correspondence Analysis (available in the ade4 package).

Aedin
@gordon-smyth
Last seen 9 hours ago
WEHI, Melbourne, Australia
At 06:41 PM 28/04/2004, Anthony Bosco wrote:
> Hi.
>
> I actually want to compare lists of gene names (not expression data) using
> venn diagram tools.
>
> For example if I have a cell line and stimulate with several different
> treatments I want to know which genes are differentially expressed in all
> treatments or only some of the treatments.

Mmm, judging which genes are differentially expressed without considering expression data would be a good trick! :)

There is an example of using Venn diagrams for the purpose you describe in the User's Guide of the most recent versions of limma. If you're using R 1.8.1, get limma from http://bioinf.wehi.edu.au/limma; if you're using R 1.9.0, get it from the BioC development area.

Gordon
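For lists of gene names, the counts behind a Venn diagram can be worked out with base R set operations, as in the rough sketch below. The gene lists and treatment names are hypothetical. The last step assumes the limma package is installed and uses its vennCounts() and vennDiagram() functions on a 0/1 matrix; this is not necessarily the same example as the one in the limma User's Guide.

```r
# Hypothetical lists of differentially expressed genes, one per treatment.
gene_lists <- list(
  treatA = c("g1", "g2", "g5"),
  treatB = c("g2", "g3", "g5", "g6"),
  treatC = c("g1", "g2", "g5", "g6")
)

all_genes <- sort(unique(unlist(gene_lists)))

# 0/1 membership matrix: rows are genes, columns are treatments.
membership <- sapply(gene_lists, function(g) as.integer(all_genes %in% g))
rownames(membership) <- all_genes

# Genes present in every list, and genes present in exactly one list.
in_all <- Reduce(intersect, gene_lists)
in_one <- all_genes[rowSums(membership) == 1]

# Number of genes for each in/out pattern: the counts a Venn diagram displays.
pattern_counts <- table(apply(membership, 1, paste, collapse = ""))

# If limma is installed, its Venn functions accept the same 0/1 matrix.
if (requireNamespace("limma", quietly = TRUE)) {
  limma::vennDiagram(limma::vennCounts(membership))
}
```

The pattern strings in pattern_counts follow the column order of the matrix, so "101" here means "in treatA and treatC but not treatB".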