Question: How to combine VCF-class and data.frame with annotations
4.7 years ago, Stefan Dentro wrote:
Hello,

I'm trying to read in a VCF file containing mutation information from one sample, annotate it with Ensembl gene and GO information, and then plot it using ggbio. But I keep running into the problem of how to combine all the information in a single GRanges object.

So I've got a VCF-class object and a data.frame containing, for each mutation, whether it is exonic, intronic or intergenic, a gene identifier (possibly NA) and a GO identifier (possibly NA). ggbio accepts GRanges-class objects, so I would like to merge the VCF-class object and the data.frame into one GRanges object containing all the information.

I can think of multiple ways of doing this, but none really work or are satisfactory:

1) Read in the VCF and convert it into a GRanges object. cbind elementMetadata with the data.frame and create a new GRanges object.
   Problem: elementMetadata cannot be merged with a data.frame:
   Error in FUN(X[[3L]], ...) :
     conversion of list columns to a data.frame is not supported

2) Directly annotate the VCF through cbind. Again:
   Error in FUN(X[[3L]], ...) :
     conversion of list columns to a data.frame is not supported

3) Convert the VCF to GRanges and add each column of the data.frame separately:
   gr$external_gene_id=df$external_gene_id
   There must be a simpler way to do this.

4) Convert the VCF into a tab-delimited file using vcf-to-tab in vcftools and read it in as a data.frame. Merge both data.frames and create a GRanges object. I would think it is all possible within R, without converting the VCF file first. This one comes really close though.

It boils down to the following question: what is the proper way of doing this using the available R genomics packages?

Best wishes,
Stefan
modified 4.7 years ago by Vincent J. Carey, Jr. • written 4.7 years ago by Stefan Dentro
4.7 years ago, Vincent J. Carey, Jr. (United States) wrote:
On Wed, Aug 29, 2012 at 10:16 AM, Stefan Dentro wrote:
> 1) read in the VCF, convert it into a GRanges object. cbind elementMetadata
> with the data.frame and create a new GRanges object.
>
> Problem: elementMetadata cannot be merged with a data.frame:
> Error in FUN(X[[3L]], ...) :
> conversion of list columns to a data.frame is not supported

try a DataFrame instance
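To illustrate the suggestion above, here is a minimal sketch, not taken from the thread itself. It assumes the S4Vectors and IRanges packages are installed, and it uses a toy DataFrame in place of the real VCF metadata; the column names and all values (including the gene and GO identifiers) are placeholders. The point is that a DataFrame can hold list-like columns, such as several ALT alleles per variant, which is what a plain data.frame conversion trips over.

```r
library(S4Vectors)
library(IRanges)

# Toy stand-in for the metadata of a VCF (assumption: three variants).
# ALT is list-like: the second variant has two alternate alleles.
vcf.meta <- DataFrame(
  REF = c("A", "G", "T"),
  ALT = CharacterList("C", c("A", "T"), "G")
)

# Hypothetical annotation table, one row per variant, in the same order.
anno <- data.frame(
  location         = c("exonic", "intronic", "intergenic"),
  external_gene_id = c("GENE1", "GENE2", NA),
  go_id            = c("GO:0000001", NA, NA),
  stringsAsFactors = FALSE
)

# Convert the data.frame to a DataFrame and bind the columns side by side.
combined <- cbind(vcf.meta, DataFrame(anno))
```

Note that cbind matches rows by position, so the annotation rows need to be in the same order as the variants.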
written 4.7 years ago by Vincent J. Carey, Jr.
That works well!

    vcf.df = as(elementMetadata(vcf), "DataFrame")
    anno.df = as(anno, "DataFrame")
    elementMetadata(vcf) = cbind(vcf.df, anno.df)

Cheers!

Stefan

On Wed, Aug 29, 2012 at 3:24 PM, Vincent Carey wrote:
> try a DataFrame instance
written 4.7 years ago by Stefan Dentro
On 08/29/2012 07:34 AM, Stefan Dentro wrote:
> That works well!
>
> vcf.df = as(elementMetadata(vcf), "DataFrame")

For what it's worth, elementMetadata(vcf) is already a DataFrame, so there is no need for the coercion in the line above.

Martin

> anno.df = as(anno, "DataFrame")
> elementMetadata(vcf) = cbind(vcf.df, anno.df)
>
> Cheers!
>
> Stefan

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
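Following Martin's remark, the coercion step can be dropped. The sketch below is an illustration under assumptions rather than the code used in the thread: it works on a small hand-built GRanges object instead of a real VCF, uses mcols() as the metadata accessor (an assumption about the installed GenomicRanges version), and all coordinates and identifiers are made up.

```r
library(GenomicRanges)

# Toy GRanges standing in for the ranges of a VCF (assumption: three SNVs).
# Extra named arguments to GRanges() become metadata columns.
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(100, 200, 300), width = 1),
  REF      = c("A", "G", "T")
)

# Hypothetical annotation data.frame, one row per range, in the same order.
anno <- data.frame(
  location         = c("exonic", "intronic", "intergenic"),
  external_gene_id = c("GENE1", "GENE2", NA),
  stringsAsFactors = FALSE
)

# cbind is positional, so check the sizes agree before combining.
stopifnot(nrow(anno) == length(gr))

# The metadata is already a DataFrame; only the annotation needs converting.
mcols(gr) <- cbind(mcols(gr), DataFrame(anno))
```

After this, every annotation column travels with the ranges, so a single object can be handed to plotting code that expects a GRanges.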
written 4.7 years ago by Martin Morgan