Stefan Dentro
Hello,
I'm trying to read in a VCF file containing mutation information from
one sample, annotate it with Ensembl gene and GO information, and then
plot it using ggbio. But I keep running into the problem of how to
combine all the information in a single GRanges object.
So I've got a VCF-class object and a data.frame containing, for each
mutation, whether it is exonic, intronic or intergenic, a gene
identifier (possibly NA) and a GO identifier (possibly NA). ggbio
accepts GRanges-class objects, so I would like to merge the VCF-class
object and the data.frame into one GRanges object containing all the
information.
I can think of multiple ways of doing this, but none really work or are
satisfactory:
1) Read in the VCF and convert it into a GRanges object. cbind its
elementMetadata with the data.frame and create a new GRanges object.
Problem: elementMetadata cannot be merged with a data.frame:
Error in FUN(X[[3L]], ...) :
conversion of list columns to a data.frame is not supported
2) Directly annotate the VCF through cbind, which gives the same error:
Error in FUN(X[[3L]], ...) :
conversion of list columns to a data.frame is not supported
3) Convert the VCF to GRanges and add each column in the data.frame
separately:
gr$external_gene_id=df$external_gene_id
There must be a simpler way to do this.
4) Convert the VCF into a tab-delimited file using vcf-to-tab in
vcftools and read it in as a data.frame. Merge both data.frames and
create a GRanges object.
I would think it is all possible within R, without converting the VCF
file first. This one comes really close though.
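To make approach 3 concrete without assigning columns one by one, here
is a minimal sketch of what I have in mind. It assumes the data.frame
has exactly one row per mutation, in the same order as the ranges. The
toy GRanges below is only a stand-in for what rowRanges() would return
on an object read with VariantAnnotation::readVcf(), and the gene and
GO values are made-up placeholders. The idea is to bind on the
DataFrame side (S4Vectors) instead of converting the metadata to a
data.frame, which is the conversion that fails on list columns.

```r
library(GenomicRanges)  # also attaches S4Vectors (DataFrame, mcols)

# Toy stand-in for rowRanges(vcf); not read from a real VCF file.
gr <- GRanges(
  seqnames = c("1", "1", "2"),
  ranges   = IRanges(start = c(100, 250, 75), width = 1),
  REF      = c("A", "C", "G")
)

# Hypothetical annotation table: one row per mutation, same order as gr.
df <- data.frame(
  location         = c("exonic", "intronic", "intergenic"),
  external_gene_id = c("GENE1", "GENE2", NA),
  go_id            = c("GO:0000001", NA, NA),
  stringsAsFactors = FALSE
)

# Columns are bound by position, so the row counts must agree.
stopifnot(length(gr) == nrow(df))

# Convert the data.frame to a DataFrame and bind it to the existing
# metadata columns; nothing is coerced to a base data.frame here.
mcols(gr) <- cbind(mcols(gr), DataFrame(df))

gr
```

Binding by position is only safe if both objects were derived from the
same VCF in the same order; otherwise the rows would need to be matched
on a shared key first.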
It boils down to the following question: What is the proper way of doing
this using the available R genomics packages?
Best wishes,
Stefan