Search
Question: Import of vcf with VariantAnnotation
0
gravatar for Anna N.
4.1 years ago by
Anna N. 0
France
Anna N. 0 wrote:

Hello again,

 

after installing succesfully the packages and reproducing all examples, I would like to analyze my own mutation calls.

I have a problem importing my vcf file - sorry if these are naive questions but I am really new here.

 

> fl=readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
Error in FUN(X[[2L]], ...) : subscript out of bounds
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
> traceback()
29: lapply(X = X, FUN = FUN, ...)
28: FUN(X[[1L]], ...)
27: FUN(X[[1L]], ...)
26: lapply(elt, sapply, "[[", 2)
25: lapply(elt, sapply, "[[", 2)
24: FUN(X[[3L]], ...)
23: lapply(X = split(X, group), FUN = FUN, ...)
22: tapply(keyval, tags, function(elt) {
        keys <- lapply(elt, sapply, "[[", 1)
        vals0 <- lapply(elt, sapply, "[[", 2)
        vals <- Map("names<-", vals0, keys)
        cols <- unique(unlist(keys))
        entries <- Map(function(k) as.vector(sapply(vals, "[", k)), 
            cols)
        desc <- which("DESCRIPTION" == toupper(names(entries)))
        if (1L == length(desc)) 
            entries[[desc]] <- gsub("\"", "", entries[[desc]])
        id <- which("ID" == toupper(names(entries)))
        if (length(id) > 0L) {
            if (any(duplicated(entries[[id]]))) 
                warning("duplicate ID's in header will be forced to unique ", 
                    "rownames")
            df <- DataFrame(entries[-id], row.names = make.unique(entries[[id]]))
        }
        else {
            df <- DataFrame(entries)
        }
        df
    })
21: tapply(keyval, tags, function(elt) {
        keys <- lapply(elt, sapply, "[[", 1)
        vals0 <- lapply(elt, sapply, "[[", 2)
        vals <- Map("names<-", vals0, keys)
        cols <- unique(unlist(keys))
        entries <- Map(function(k) as.vector(sapply(vals, "[", k)), 
            cols)
        desc <- which("DESCRIPTION" == toupper(names(entries)))
        if (1L == length(desc)) 
            entries[[desc]] <- gsub("\"", "", entries[[desc]])
        id <- which("ID" == toupper(names(entries)))
        if (length(id) > 0L) {
            if (any(duplicated(entries[[id]]))) 
                warning("duplicate ID's in header will be forced to unique ", 
                    "rownames")
            df <- DataFrame(entries[-id], row.names = make.unique(entries[[id]]))
        }
        else {
            df <- DataFrame(entries)
        }
        df
    })
20: .bcfHeaderAsSimpleList(header)
19: scanBcfHeader(bf)
18: scanBcfHeader(bf)
17: (function (file, mode) 
    {
        bf <- open(BcfFile(file, character(0), ...))
        on.exit(close(bf))
        scanBcfHeader(bf)
    })(dots[[1L]][[1L]])
16: mapply(FUN = f, ..., SIMPLIFY = FALSE)
15: .Method(..., f = f)
14: eval(expr, envir, enclos)
13: eval(.dotsCall, env)
12: eval(.dotsCall, env)
11: standardGeneric("Map")
10: Map(function(file, mode) {
        bf <- open(BcfFile(file, character(0), ...))
        on.exit(close(bf))
        scanBcfHeader(bf)
    }, file, ...)
9: scanBcfHeader(file, ...)
8: scanBcfHeader(file, ...)
7: scanVcfHeader(file)
6: scanVcfHeader(file)
5: .scanVcfToVCF(scanVcf(file, param = param, row.names = row.names), 
       file, genome, param)
4: .readVcf(file, genome, param = ScanVcfParam(), row.names = row.names, 
       ...)
3: .local(file, genome, param, ...)
2: readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
1: readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")


 

ADD COMMENTlink modified 4.1 years ago • written 4.1 years ago by Anna N. 0
1
gravatar for Anna N.
4.1 years ago by
Anna N. 0
France
Anna N. 0 wrote:

Hello Valerie,

 

The file is in my working directory, even if I include the full pathway I have the same problems...

I am sending you the file in your email

Thank you very much

Anna

ADD COMMENTlink written 4.1 years ago by Anna N. 0
1

Thanks for sending the file.

scanVcfHeader() was not recognizing the (relatively) new header fields 'SAMPLE' and 'PEDIGREE'. 

The problem originated in Rsamtools so for the fix you'll need both Rsamtools 1.19.4 and VariantAnnotation 1.13.4.

Thanks for reporting this bug!

Valerie

ADD REPLYlink written 4.1 years ago by Valerie Obenchain ♦♦ 6.6k
0
gravatar for Valerie Obenchain
4.1 years ago by
Valerie Obenchain ♦♦ 6.6k
United States
Valerie Obenchain ♦♦ 6.6k wrote:

Hi Anna,

What version of VariantAnnotation are you using? Please provide the output of

sessionInfo()

It might be helpful to inspect the header information and look for duplicate variable names in info and geno fields. For example,

> fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
> hdr <- scanVcfHeader(fl)
> hdr
class: VCFHeader
samples(3): NA00001 NA00002 NA00003
meta(5): fileformat fileDate source reference phasing
fixed(1): FILTER
info(6): NS DP ... DB H2
geno(4): GT GQ DP HQ
> info(hdr)
DataFrame with 6 rows and 3 columns
        Number        Type                 Description
   <character> <character>                 <character>
NS           1     Integer Number of Samples With Data
DP           1     Integer                 Total Depth
AF           A       Float            Allele Frequency
AA           1      String            Ancestral Allele
DB           0        Flag dbSNP membership, build 129
H2           0        Flag          HapMap2 membership

Are you able to read the file with scanVcf()? Assuming the file isn't too big for available memory.

scanVcf(fl)

 

Valerie

 

ADD COMMENTlink written 4.1 years ago by Valerie Obenchain ♦♦ 6.6k
0
gravatar for Anna N.
4.1 years ago by
Anna N. 0
France
Anna N. 0 wrote:

 Hi Valerie,

no I am not able to read the file with scanVcf(fl) and it is not a big file (325 kb)
I am pasting the sessionInfo()


> fl=readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
Error in FUN(X[[2L]], ...) : subscript out of bounds
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
> scanVcf(fl)
Error in scanVcf(fl) : 
  erreur d'évaluation de l'argument 'file' lors de la sélection d'une méthode pour la fonction 'scanVcf' : Error: object 'fl' not found
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] VariantAnnotation_1.12.2 Rsamtools_1.18.1         Biostrings_2.34.0        XVector_0.6.0            GenomicRanges_1.18.1    
[6] GenomeInfoDb_1.2.2       IRanges_2.0.0            S4Vectors_0.4.0          BiocGenerics_0.12.0     

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.0    base64enc_0.1-2         BatchJobs_1.4           BBmisc_1.7              Biobase_2.26.0         
 [6] BiocParallel_1.0.0      biomaRt_2.22.0          bitops_1.0-6            brew_1.0-6              BSgenome_1.34.0        
[11] checkmate_1.5.0         codetools_0.2-9         DBI_0.3.1               digest_0.6.4            fail_1.2               
[16] foreach_1.4.2           GenomicAlignments_1.2.0 GenomicFeatures_1.18.1  iterators_1.0.7         RCurl_1.95-4.3         
[21] RSQLite_1.0.0           rtracklayer_1.26.1      sendmailR_1.2-1         stringr_0.6.2           tools_3.1.1            
[26] XML_3.98-1.1            zlibbioc_1.12.0        
 

>

ADD COMMENTlink written 4.1 years ago by Anna N. 0

'fl' should be the path to your file.

fl <- "Ac216_H69_1_A549_mutect_confident_eff.vcf" ## include the full path

Did you try looking for duplicate variable names in the header?

hdr <- scanVcfHeader(fl)

names(info(hdr))

If these hints don't help I can take a look at the file. Can you share it off-line (vobencha@fhcrc.org) or maybe I can download it from somewhere?

Valerie

ADD REPLYlink written 4.1 years ago by Valerie Obenchain ♦♦ 6.6k
0
gravatar for Anna N.
4.1 years ago by
Anna N. 0
France
Anna N. 0 wrote:

Thank you for all the help!

I am waiting to update my packages and start over

Anna

 

ADD COMMENTlink written 4.1 years ago by Anna N. 0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 354 users visited in the last hour