How to combine VCF-class and data.frame with annotations
@stefan-dentro-5471
Hello,

I'm trying to read in a VCF file containing mutation information from one sample, annotate it with Ensembl gene and GO information, and then plot it using ggbio. But I keep running into the problem of how to combine all the information in a single GRanges object.

So I've got a VCF-class object and a data.frame containing, for each mutation, whether it is exonic, intronic or intergenic, a gene identifier (possibly NA) and a GO identifier (possibly NA). ggbio accepts GRanges-class objects, so I would like to merge the VCF-class object and the data.frame into one GRanges object containing all the information.

I can think of multiple ways of doing this, but none really work or are satisfactory:

1) Read in the VCF and convert it into a GRanges object. cbind elementMetadata with the data.frame and create a new GRanges object.

   Problem: elementMetadata cannot be merged with a data.frame:
   Error in FUN(X[[3L]], ...) :
     conversion of list columns to a data.frame is not supported

2) Directly annotate the VCF file through cbind. Again:
   Error in FUN(X[[3L]], ...) :
     conversion of list columns to a data.frame is not supported

3) Convert the VCF to GRanges and add each column in the data.frame separately:
   gr$external_gene_id=df$external_gene_id
   There must be a simpler way to do this.

4) Convert the VCF into a tab-delimited file using vcf-to-tab in vcftools and read it in as a data.frame. Merge both data.frames and create a GRanges object. I would think it is all possible within R, without converting the VCF file first. This one comes really close though.

It boils down to the following question: What is the proper way of doing this using the available R genomics packages?

Best wishes,
Stefan
GO annotate convert ggbio • 2.2k views
@vincent-j-carey-jr-4
On Wed, Aug 29, 2012 at 10:16 AM, Stefan Dentro wrote:

> [...]
> Problem: elementMetadata cannot be merged with a data.frame:
> Error in FUN(X[[3L]], ...) :
>   conversion of list columns to a data.frame is not supported

Try a DataFrame instance.
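The DataFrame suggestion can be illustrated with a minimal sketch. It assumes the S4Vectors and IRanges packages are installed; the variant metadata and annotation values are made up for illustration and do not come from the thread. The point is that a DataFrame can store a list-like column (such as the ALT alleles of a VCF) directly, so no conversion of list columns to a plain data.frame is needed.

```r
library(S4Vectors)  # provides DataFrame
library(IRanges)    # provides CharacterList

# Hypothetical stand-in for VCF metadata: REF is atomic, ALT is list-like
vcf_meta <- DataFrame(REF = c("G", "A", "T"))
vcf_meta$ALT <- CharacterList("A", c("C", "T"), "G")

# Hypothetical annotation table: one row per variant, in the same order
anno <- data.frame(
  location = c("exonic", "intronic", "intergenic"),
  external_gene_id = c("GENE1", "GENE2", NA),
  stringsAsFactors = FALSE
)

# Coerce the annotation to a DataFrame, then bind the two column-wise
combined <- cbind(vcf_meta, as(anno, "DataFrame"))
```

Here the list-like ALT column is carried through the cbind unchanged, next to the ordinary annotation columns.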
That works well!

    vcf.df = as(elementMetadata(vcf), "DataFrame")
    anno.df = as(anno, "DataFrame")
    elementMetadata(vcf) = cbind(vcf.df, anno.df)

Cheers!

Stefan
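Since the goal was a single GRanges object for ggbio, the same column-binding idea can be sketched on a GRanges. This is only an illustration under stated assumptions: the coordinates, gene names and GO identifier below are hypothetical, the GRanges is a hand-built stand-in for the ranges of a real VCF, and mcols() is used as the metadata accessor instead of elementMetadata().

```r
library(S4Vectors)
library(GenomicRanges)

# Hypothetical stand-in for the variant positions of a VCF
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 250, 400), width = 1),
  REF = c("G", "A", "T")
)

# Hypothetical annotation: one row per variant, in the same order as gr
anno <- data.frame(
  location = c("exonic", "intronic", "intergenic"),
  external_gene_id = c("GENE1", NA, NA),
  go_id = c("GO:0006915", NA, NA),
  stringsAsFactors = FALSE
)

# cbind is positional, so check that the row counts agree first
stopifnot(nrow(anno) == length(gr))

# Append the annotation columns to the existing metadata columns
mcols(gr) <- cbind(mcols(gr), as(anno, "DataFrame"))
```

Because the binding is by position rather than by key, the annotation rows have to be in the same order as the variants; if they might not be, matching on a shared identifier first would be the safer design.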
On 08/29/2012 07:34 AM, Stefan Dentro wrote:

> That works well!
>
> vcf.df = as(elementMetadata(vcf), "DataFrame")

For what it's worth, elementMetadata(vcf) is already a DataFrame, so there is no need for the coercion in the line above.

> anno.df = as(anno, "DataFrame")
> elementMetadata(vcf) = cbind(vcf.df, anno.df)

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
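Martin's remark can be checked on a real VCF object. The sketch below assumes the VariantAnnotation package is installed and uses the example file that ships with it; rowRanges() and mcols() are the accessors assumed here (the thread itself used elementMetadata()), and the annotation table is a hypothetical placeholder filled with NA rather than real Ensembl or GO results.

```r
library(VariantAnnotation)

# Example VCF bundled with VariantAnnotation (not the file from the thread)
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Variant positions as a GRanges, with REF, ALT, etc. as metadata columns
gr <- rowRanges(vcf)

# The metadata is already a DataFrame, so it needs no coercion
is_df <- is(mcols(gr), "DataFrame")

# Hypothetical placeholder annotation: one row per variant, all NA
anno <- data.frame(
  location = rep(NA_character_, length(gr)),
  external_gene_id = rep(NA_character_, length(gr)),
  stringsAsFactors = FALSE
)

# Only the plain data.frame has to be converted before binding
mcols(gr) <- cbind(mcols(gr, use.names = FALSE), as(anno, "DataFrame"))
```

The resulting gr keeps the original VCF columns and gains the annotation columns, which is the single annotated GRanges the question asked for.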