Search
Question: VariantAnnotation Errors and void outputs
0
gravatar for pifferdavide
3.4 years ago by
pifferdavide0 wrote:

The readVcf function apparently works fine on phase 3 1000 Genomes data (see output at the bottom). However, when I try to use some functions mentioned in the manual, I get various error messages. 

head(rowRanges(vcf), 3)
Error in head(rowRanges(vcf), 3) : 
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in rowRanges(vcf) : Argument 'x' must be a matrix or a vector.

or even empty outputs, such as: 

geno(vcf)
List of length 0
names(0): 
> sapply(geno(vcf), class)
named list()
> info(vcf)[1:4, 1:5]
DataFrame with 4 rows and 5 columns
          CIEND         CIPOS          CS       END IMPRECISE
  <IntegerList> <IntegerList> <character> <integer> <logical>
1         NA,NA         NA,NA          NA        NA     FALSE
2         NA,NA         NA,NA          NA        NA     FALSE
3         NA,NA         NA,NA          NA        NA     FALSE
4         NA,NA         NA,NA          NA        NA     FALSE

 

Appendix:

 fl <-  ("ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz")
> vcf <- readVcf(fl, "hg38")
> vcf
class: CollapsedVCF 
dim: 1103547 0 
rowData(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 27 columns: CIEND, CIPOS, CS, END, IMPRECISE, MC, MEINFO, MEND, MLEN, MSTART, SVLEN,...

ADD COMMENTlink modified 3.4 years ago • written 3.4 years ago by pifferdavide0
0
gravatar for Valerie Obenchain
3.4 years ago by
Valerie Obenchain ♦♦ 6.6k
United States
Valerie Obenchain ♦♦ 6.6k wrote:

Hi,

The 'rowData' name (slot, argument and accessor) has recently been replaced with 'rowRanges' in the SummarizedExperiment class which the VCF class inherits from. If you're using devel you should be using rowRanges(); for release use rowData().

Please make sure you're packages are up to date with biocLite() and try again. If you still get an error provide the output of sessionInfo() so we can see what versions you're using.

Valerie

ADD COMMENTlink written 3.4 years ago by Valerie Obenchain ♦♦ 6.6k

rowData works fine. However, other functions provide empty outputs. For example,see output below. I downloaded the package this morning  using: source("http://bioconductor.org/biocLite.R")

biocLite("VariantAnnotation")

so it should be up to date.

 geno(vcf)
List of length 0
names(0): 

info(vcf)[1:4, 1:5]
DataFrame with 4 rows and 5 columns
          CIEND         CIPOS          CS       END IMPRECISE
  <IntegerList> <IntegerList> <character> <integer> <logical>
1         NA,NA         NA,NA          NA        NA     FALSE
2         NA,NA         NA,NA          NA        NA     FALSE
3         NA,NA         NA,NA          NA        NA     FALSE
4         NA,NA         NA,NA          NA        NA     FALSE

 

ADD REPLYlink written 3.4 years ago by pifferdavide0

biocLite() will download the package version appropriate for your version of R / Bioconductor. If you're using release it will get VariantAnnotation 1.14.1 (Bioconductor 3.1). If you are using devel it will get VariantAnnotation 1.15.9 (Bioconductor 3.2). The output of sessionInfo() let's us see not only the version of the package you're having trouble with but other dependencies as well. Please provide this when you ask a question on the support site.

The dimensions of 'vcf' are

dim: 1103547 0 

There are  1103547 rows (positions) and no samples. When there are no samples there is no geno() so the empty list is expected. It is also expected that an sapply over the classes in geno() would return an empty list.

The call info(vcf)[1:4, 1:5] returns a DataFrame of the appropriate variables and it looks like your VCF file doesn't have data for these fields for the first 4 variants. That doesn't mean all the info() fields are empty. Have you looked at your raw file to confirm?

I misspoke and it looks like the replacement of 'rowData' with 'rowRanges' was back-ported to release. Version 1.14.1 does expect the use of rowRanges() as an accessor and you should get a warning if you try to use rowData():

> fl <- system.file("extdata", "structural.vcf", package="VariantAnnotation")
> vcf <- readVcf(fl, genome="hg19")
> rd = rowData(vcf)
Warning message:
'rowData' is deprecated.
Use 'rowRanges' instead.
See help("Deprecated") 

 

If you are using 1.15.9 you should see a defunct message:

> rd = rowData(vcf)
Error: 'rowData' is defunct.
Use 'rowRanges' instead.
See help("Defunct")

 

You're code doesn't show either warning. I need the output of sessionInfo() to help you further. It would also be useful to see the error from just rowRanges(vcf) without wrapping it in head().

Valerie

ADD REPLYlink written 3.4 years ago by Valerie Obenchain ♦♦ 6.6k

Here is the error I get from rowRanges and sessionInfo() output:

rowRanges(vcf)

Error in rowRanges(vcf) : Argument 'x' must be a matrix or a vector.

sessionInfo()

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252    LC_MONETARY=Italian_Italy.1252
[4] LC_NUMERIC=C                   LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] matrixStats_0.14.0       VariantAnnotation_1.12.9 Rsamtools_1.18.3         Biostrings_2.34.1       
 [5] XVector_0.6.0            GenomicRanges_1.18.4     GenomeInfoDb_1.2.5       IRanges_2.0.1           
 [9] S4Vectors_0.4.0          BiocGenerics_0.12.1     

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.2    base64enc_0.1-2         BatchJobs_1.6           BBmisc_1.9             
 [5] Biobase_2.26.0          BiocParallel_1.0.3      biomaRt_2.22.0          bitops_1.0-6           
 [9] brew_1.0-6              BSgenome_1.34.1         checkmate_1.5.3         codetools_0.2-8        
[13] DBI_0.3.1               digest_0.6.8            fail_1.2                foreach_1.4.2          
[17] GenomicAlignments_1.2.2 GenomicFeatures_1.18.7  iterators_1.0.7         magrittr_1.5           
[21] RCurl_1.95-4.6          RSQLite_1.0.0           rtracklayer_1.26.3      sendmailR_1.2-1        
[25] stringi_0.4-1           stringr_1.0.0           tools_3.1.1             XML_3.98-1.1           
[29] zlibbioc_1.12.0        

 

 

 

ADD REPLYlink written 3.4 years ago by pifferdavide0

Thanks for providing sessionInfo(). Your R/Bioconductor is out of date. The current Bioconductor release and devel use R 3.2. VariantAnnotation 1.12.9 is from 2 release cycles ago and does not support rowRanges() as an accessor. If you look at the man page

?VCF

there is no mention of rowRanges(), only rowData(). So it makes sense that rowRanges() on a VCF object throws an error.

If you need help updating R/Bioconductor instructions can be found here:

http://www.bioconductor.org/install/

Valerie

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by Valerie Obenchain ♦♦ 6.6k

Ok thanks. I installed the new version of R and updated Bioconductor. I am wondering if VariantAnnotation is the package I need. It was suggested to me by a user on Biostars. I work mainly with population genetics concepts. I need to calculate Fst values from 1000 genomes VCF files or LD. Are there packages that I can use to do this?

ADD REPLYlink written 3.4 years ago by pifferdavide0

After updating R and VariantAnnotation, I get the following error message:

library(VariantAnnotation)
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  there is no package called ‘XML’
Error: package or namespace load failed for ‘VariantAnnotation’

 

sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252    LC_MONETARY=Italian_Italy.1252
[4] LC_NUMERIC=C                   LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] matrixStats_0.14.0   Rsamtools_1.20.2     Biostrings_2.36.1    XVector_0.8.0       
 [5] GenomicRanges_1.20.3 GenomeInfoDb_1.4.0   IRanges_2.2.1        S4Vectors_0.6.0     
 [9] BiocGenerics_0.14.0  BiocInstaller_1.18.2

loaded via a namespace (and not attached):
[1] bitops_1.0-6    DBI_0.3.1       RSQLite_1.0.0   zlibbioc_1.14.0 tools_3.2.0     Biobase_2.28.0 
[7] RCurl_1.95-4.6 

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by pifferdavide0

For built in fst() and ld() functions you might try  snpStats or SNPrelate.

  biocLite(c("snpStats", "SNPrelate"))
  browseVignettes("snpStats")

For your case, VariantAnnotation does not have built in fst() and ld() but does have a flexible interface for reading/parsing data and manipulation. Data in a VCF object can be used in downstream analysis in snpRelate and snpStats. If you're interested in a certain region (chromosome or position) or specific INFO fields use ScanVcfParam() when calling readVcf():

  readVcf(myfile, mygenome, param = ScanVcfParam(...))

Other functions in VariantAnnotation you may find useful:

  ?isSVN
  ?genotypeToSnpMatrix

FYI, a useful way to search for topic-related software and annotation packages is through the BiocViews interface.

  http://www.bioconductor.org/packages/release/BiocViews.html#___Software

As for the XML, did you get an error with biocLite("XML")? 

Valerie

ADD REPLYlink written 3.4 years ago by Valerie Obenchain ♦♦ 6.6k

The function readVcf now works fine. However, I still have problems with 

 head(rowRange(vcf), 3)
Error in head(rowRange(vcf), 3) : 
  error in evaluating the argument 'x' in selecting a method for function 'head': Error: could not find function "rowRange"

ADD REPLYlink written 3.4 years ago by pifferdavide0

The error message tells you what's wrong - there is no rowRange() function. You're missing an 's'.

ADD REPLYlink written 3.4 years ago by Valerie Obenchain ♦♦ 6.6k

We're back to square one. I used rowRanges() and still get the same error message I had at the beginning and which prompted me to open this thread:

head(rowRanges(vcf), 3)
Error in head(rowRanges(vcf), 3) : 
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in rowRanges(vcf) : Argument 'x' must be a matrix or a vector.

Sorry if this is tiring your patience I understand, but it's tiring mine too. I really appreciate your help though.

Thanks

 

 

ADD REPLYlink written 3.4 years ago by pifferdavide0

hi, notice that this operation works fine with one of the example VCF files in VariantAnnotation:

library(VariantAnnotation)
example(readVcf)

head(rowRanges(vcf), 3)
GRanges object with 3 ranges and 5 metadata columns:
               seqnames             ranges strand | paramRangeID            REF
                  <Rle>          <IRanges>  <Rle> |     <factor> <DNAStringSet>
     rs6054257       20 [  14370,   14370]      * |        geneA              G
  20:17330_T/A       20 [  17330,   17330]      * |        geneA              T
     rs6040355       20 [1110696, 1110696]      * |        geneB              A
                              ALT      QUAL      FILTER
               <DNAStringSetList> <numeric> <character>
     rs6054257                  A        29        PASS
  20:17330_T/A                  A         3         q10
     rs6040355                G,T        67        PASS
  -------
  seqinfo: 1 sequence from hg19 genome

so, there might be something going on on your particular VCF object. Could you paste the output of the function 'traceback()' calling it right after the error happens? Could you also paste the output of how your VCF object is shown after typing its name ('vcf') on the R shell?

ADD REPLYlink written 3.4 years ago by Robert Castelo2.2k

I am using Chr.22 from phase 3 1000 Genomes: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/

> traceback()
1: head(rowRanges(vcf), 3)

> vcf
class: CollapsedVCF 
dim: 1103547 0 
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 27 columns: CIEND, CIPOS, CS, END, IMPRECISE, MC, MEINFO, MEND, MLEN, M...
info(header(vcf)):
                 Number Type    Description                                                
   CIEND         2      Integer Confidence interval around END for imprecise variants      
   CIPOS         2      Integer Confidence interval around POS for imprecise variants      
   CS            1      String  Source call set.                                           
   END           1      Integer End coordinate of this variant                             
   IMPRECISE     0      Flag    Imprecise structural variation                             
   MC            .      String  Merged calls.                                              
   MEINFO        4      String  Mobile element info of the form NAME,START,END<POLARITY;...
   MEND          1      Integer Mitochondrial end coordinate of inserted sequence          
   MLEN          1      Integer Estimated length of mitochondrial insert                   
   MSTART        1      Integer Mitochondrial start coordinate of inserted sequence        
   SVLEN         .      Integer SV length. It is only calculated for structural variatio...
   SVTYPE        1      String  Type of structural variant                                 
   TSD           1      String  Precise Target Site Duplication for bases, if unknown, v...
   AC            A      Integer Total number of alternate alleles in called genotypes      
   AF            A      Float   Estimated allele frequency in the range (0,1)              
   NS            1      Integer Number of samples with data                                
   AN            1      Integer Total number of alleles in called genotypes                
   EAS_AF        A      Float   Allele frequency in the EAS populations calculated from ...
   EUR_AF        A      Float   Allele frequency in the EUR populations calculated from ...
   AFR_AF        A      Float   Allele frequency in the AFR populations calculated from ...
   AMR_AF        A      Float   Allele frequency in the AMR populations calculated from ...
   SAS_AF        A      Float   Allele frequency in the SAS populations calculated from ...
   DP            1      Integer Total read depth; only low coverage data were counted to...
   AA            1      String  Ancestral Allele. Format: AA|REF|ALT|IndelType. AA: Ance...
   VT            .      String  indicates what type of variant the line represents         
   EX_TARGET     0      Flag    indicates whether a variant is within the exon pull down...
   MULTI_ALLELIC 0      Flag    indicates whether a site is multi-

 

ADD REPLYlink written 3.4 years ago by pifferdavide0

hi, since the space for comment replies gets smaller and smaller, I'm gonna jump to an answer below to facilitate reading output.
 

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by Robert Castelo2.2k
0
gravatar for Robert Castelo
3.4 years ago by
Robert Castelo2.2k
Spain/Barcelona/Universitat Pompeu Fabra
Robert Castelo2.2k wrote:

hi, I cannot reproduce the problem, your code works fine with my R/BioC installation and my local copy of the 1000 Genomes Project genotype VFC files:

library(VariantAnnotation)

vcf <- readVcf("ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz", genome="hg19")
vcf

class: CollapsedVCF
dim: 1103547 0
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 27 columns: CIEND, CIPOS, CS, END, IMPRECISE, MC, MEINFO, MEND, MLEN, MSTART, SVLEN, SVTYPE, TSD, AC, AF, N...
info(header(vcf)):
                 Number Type    Description                                                                                    
   CIEND         2      Integer Confidence interval around END for imprecise variants                                          
   CIPOS         2      Integer Confidence interval around POS for imprecise variants                                          
   CS            1      String  Source call set.                                                                               
   END           1      Integer End coordinate of this variant                                                                 
   IMPRECISE     0      Flag    Imprecise structural variation                                                                 
   MC            .      String  Merged calls.                                                                                  
   MEINFO        4      String  Mobile element info of the form NAME,START,END<POLARITY; If there is only 5' OR 3' support f...
   MEND          1      Integer Mitochondrial end coordinate of inserted sequence                                              
   MLEN          1      Integer Estimated length of mitochondrial insert                                                       
   MSTART        1      Integer Mitochondrial start coordinate of inserted sequence                                            
   SVLEN         .      Integer SV length. It is only calculated for structural variation MEIs. For other types of SVs; one ...
   SVTYPE        1      String  Type of structural variant                                                                     
   TSD           1      String  Precise Target Site Duplication for bases, if unknown, value will be NULL                      
   AC            A      Integer Total number of alternate alleles in called genotypes                                          
   AF            A      Float   Estimated allele frequency in the range (0,1)                                                  
   NS            1      Integer Number of samples with data                                                                    
   AN            1      Integer Total number of alleles in called genotypes                                                    
   EAS_AF        A      Float   Allele frequency in the EAS populations calculated from AC and AN, in the range (0,1)          
   EUR_AF        A      Float   Allele frequency in the EUR populations calculated from AC and AN, in the range (0,1)          
   AFR_AF        A      Float   Allele frequency in the AFR populations calculated from AC and AN, in the range (0,1)          
   AMR_AF        A      Float   Allele frequency in the AMR populations calculated from AC and AN, in the range (0,1)          
   SAS_AF        A      Float   Allele frequency in the SAS populations calculated from AC and AN, in the range (0,1)          
   DP            1      Integer Total read depth; only low coverage data were counted towards the DP, exome data were not used
   AA            1      String  Ancestral Allele. Format: AA|REF|ALT|IndelType. AA: Ancestral allele, REF:Reference Allele, ...
   VT            .      String  indicates what type of variant the line represents                                             
   EX_TARGET     0      Flag    indicates whether a variant is within the exon pull down target boundaries                     
   MULTI_ALLELIC 0      Flag    indicates whether a site is multi-allelic                                                      
geno(vcf):
  SimpleList of length 0:

head(rowRanges(vcf), 3)
GRanges object with 3 ranges and 5 metadata columns:
              seqnames               ranges strand | paramRangeID            REF             ALT      QUAL      FILTER
                 <Rle>            <IRanges>  <Rle> |     <factor> <DNAStringSet> <CharacterList> <numeric> <character>
  rs587697622       22 [16050075, 16050075]      * |         <NA>              A               G       100        PASS
  rs587755077       22 [16050115, 16050115]      * |         <NA>              G               A       100        PASS
  rs587654921       22 [16050213, 16050213]      * |         <NA>              C               T       100        PASS
  -------
  seqinfo: 86 sequences from hg19 genome

sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Fedora release 12 (Constantine)

locale:
 [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C              LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8    
 [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8    LC_PAPER=en_US.UTF8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] VariantAnnotation_1.14.1 Rsamtools_1.20.2         Biostrings_2.36.1        XVector_0.8.0           
 [5] GenomicRanges_1.20.3     GenomeInfoDb_1.4.0       IRanges_2.2.1            S4Vectors_0.6.0         
 [9] BiocGenerics_0.14.0      vimcom_1.2-3             setwidth_1.0-3           colorout_1.1-0          

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.30.1    zlibbioc_1.14.0         GenomicAlignments_1.4.1 BiocParallel_1.2.1      BSgenome_1.36.0        
 [6] tools_3.2.0             Biobase_2.28.0          DBI_0.3.1               lambda.r_1.1.7          futile.logger_1.4.1    
[11] rtracklayer_1.28.2      futile.options_1.0.0    bitops_1.0-6            RCurl_1.95-4.6          biomaRt_2.24.0         
[16] RSQLite_1.0.0           GenomicFeatures_1.20.1  XML_3.98-1.1           


the output of your call to traceback() seems incomplete to me. It should show all the intermediate function calls that lead to the error. See the following toy example:

g <- function(x) 2*x
f <- function(x) g(x)
f("hello world\n")
Error in 2 * x : non-numeric argument to binary operator
traceback()
2: g(x) at #1
1: f("hello world\n")

although if the software actually works for me, it is likely that the problem may be related to your local copy of the VCF file. Could you try to download it again and see whether it works then?

cheers,

robert.

ADD COMMENTlink written 3.4 years ago by Robert Castelo2.2k

hi, just realized you were using the argument genome="hg38" instead of genome="hg19" (1000 Genomes VCF files are based on "hg19"). I though that could have been the source of the problem but it still runs fine in my system:

vcf <- readVcf("ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz", genome="hg38")
> head(rowRanges(vcf), 3)
GRanges object with 3 ranges and 5 metadata columns:
              seqnames               ranges strand | paramRangeID            REF             ALT      QUAL      FILTER
                 <Rle>            <IRanges>  <Rle> |     <factor> <DNAStringSet> <CharacterList> <numeric> <character>
  rs587697622       22 [16050075, 16050075]      * |         <NA>              A               G       100        PASS
  rs587755077       22 [16050115, 16050115]      * |         <NA>              G               A       100        PASS
  rs587654921       22 [16050213, 16050213]      * |         <NA>              C               T       100        PASS
  -------
  seqinfo: 86 sequences from hg38 genome

ADD REPLYlink written 3.4 years ago by Robert Castelo2.2k

It's unlikely that the problem has to do with the file, as I downloaded it yesterday. However I am downloading it and I'll retry it.In the meantime, I attach the output of sessionInfo(). Could it be that there are problems related to Windows 7? I noticed yours is running on Linux.

R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252   
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C                  
[5] LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] shiny_0.12.0             matrixStats_0.14.0       VariantAnnotation_1.14.1
 [4] Rsamtools_1.20.2         Biostrings_2.36.1        XVector_0.8.0           
 [7] GenomicRanges_1.20.3     GenomeInfoDb_1.4.0       IRanges_2.2.1           
[10] S4Vectors_0.6.0          BiocGenerics_0.14.0     

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6             rstudioapi_0.3.1        AnnotationDbi_1.30.1   
 [4] zlibbioc_1.14.0         GenomicAlignments_1.4.1 BiocParallel_1.2.1     
 [7] xtable_1.7-4            BSgenome_1.36.0         R6_2.0.1               
[10] tools_3.2.0             Biobase_2.28.0          DBI_0.3.1              
[13] lambda.r_1.1.7          futile.logger_1.4.1     htmltools_0.2.6        
[16] digest_0.6.8            rtracklayer_1.28.2      futile.options_1.0.0   
[19] bitops_1.0-6            RCurl_1.95-4.6          biomaRt_2.24.0         
[22] mime_0.3                RSQLite_1.0.0           GenomicFeatures_1.20.1 
[25] XML_3.98-1.1            jsonlite_0.9.16         httpuv_1.3.2           
[28] rstudio_0.98.1062      

 

ADD REPLYlink written 3.4 years ago by pifferdavide0

I downloaded the file again and got the same error message: head(rowRanges(vcf), 3)
Error in head(rowRanges(vcf), 3) : 
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in rowRanges(vcf) : Argument 'x' must be a matrix or a vector.

ADD REPLYlink written 3.4 years ago by pifferdavide0
> vcf <- readVcf("ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz", genome="hg38")
> head(rowRanges(vcf), 3)
GRanges object with 3 ranges and 5 metadata columns:
              seqnames               ranges strand | paramRangeID            REF             ALT      QUAL      FILTER
                 <Rle>            <IRanges>  <Rle> |     <factor> <DNAStringSet> <CharacterList> <numeric> <character>
  rs587697622       22 [16050075, 16050075]      * |         <NA>              A               G       100        PASS
  rs587755077       22 [16050115, 16050115]      * |         <NA>              G               A       100        PASS
  rs587654921       22 [16050213, 16050213]      * |         <NA>              C               T       100        PASS
  -------
  seqinfo: 86 sequences from hg38 genome
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] VariantAnnotation_1.14.1 Rsamtools_1.20.2         Biostrings_2.36.1        XVector_0.8.0            GenomicRanges_1.20.3     GenomeInfoDb_1.4.0      
 [7] IRanges_2.2.1            S4Vectors_0.6.0          BiocGenerics_0.14.0      BiocInstaller_1.18.2    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.30.1    zlibbioc_1.14.0         GenomicAlignments_1.4.1 BiocParallel_1.2.1      BSgenome_1.36.0         tools_3.2.0            
 [7] Biobase_2.28.0          DBI_0.3.1               lambda.r_1.1.7          futile.logger_1.4.1     rtracklayer_1.28.2      futile.options_1.0.0   
[13] bitops_1.0-6            RCurl_1.95-4.6          biomaRt_2.24.0          RSQLite_1.0.0           GenomicFeatures_1.20.1  XML_3.98-1.1   
ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by James W. MacDonald47k

Alright the problem apparently was in Rstudio's version I was using. I have downloaded a new version of Rstudio and now it works fine: head(rowRanges(vcf), 3)
GRanges object with 3 ranges and 5 metadata columns:
              seqnames               ranges strand | paramRangeID            REF
                 <Rle>            <IRanges>  <Rle> |     <factor> <DNAStringSet>
  rs587697622       22 [16050075, 16050075]      * |         <NA>              A
  rs587755077       22 [16050115, 16050115]      * |         <NA>              G
  rs587654921       22 [16050213, 16050213]      * |         <NA>              C
                          ALT      QUAL      FILTER
              <CharacterList> <numeric> <character>
  rs587697622               G       100        PASS
  rs587755077               A       100        PASS
  rs587654921               T       100        PASS
  -------
  seqinfo: 86 sequences from hg19 genome

ADD REPLYlink written 3.4 years ago by pifferdavide0
0
gravatar for pifferdavide
3.4 years ago by
pifferdavide0 wrote:

Many thanks for your help!

ADD COMMENTlink written 3.4 years ago by pifferdavide0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 183 users visited in the last hour