customProDB InputVcf() - Issue getting the INDEL column in the GRanges object from a VCF file
Elena.M • 0
@elenam-21324
Last seen 5.4 years ago
Switzerland

Hello,

I am trying to use the customProDB package to retrieve mutated peptides from a VCF file. When I call InputVcf(), the resulting GRanges object does not include the INDEL column, which I need to build the index and then run Varlocation(). The VCF file was generated with GATK HaplotypeCaller. Any idea why?

Thank you very much in advance, E.

> vcf <- InputVcf("/Users/elena/Desktop/Paulino/4T07_DNA.vcf")
Warning message:
In rbind(...) :
  number of columns of result is not a multiple of vector length (arg 1)
> length(vcf)
[1] 1
> table(values(vcf[[1]])[['INDEL']])
< table of extent 0 >
> vcf[[1]][1:3]
GRanges object with 3 ranges and 32 metadata columns:
                seqnames    ranges strand |         REF         ALT      QUAL      FILTER     AC     AF        AN BaseQRankSum        DP
                   <Rle> <IRanges>  <Rle> | <character> <character> <numeric> <character> <list> <list> <integer>    <numeric> <integer>
  1:3026194_C/A        1   3026194      * |           C           A     72.28           .      2      1         2         <NA>         2
  1:3053912_G/T        1   3053912      * |           G           T     111.8           .      2      1         2         <NA>         3
  1:3070694_C/A        1   3070694      * |           C           A      55.6           .      1    0.5         2            0         6
                       DS ExcessHet        FS InbreedingCoeff  MLEAC  MLEAF        MQ MQRankSum        QD ReadPosRankSum       SOR          GT
                <logical> <numeric> <numeric>       <numeric> <list> <list> <numeric> <numeric> <numeric>      <numeric> <numeric> <character>
  1:3026194_C/A     FALSE    3.0103         0            <NA>      1    0.5        60      <NA>     25.36           <NA>     2.303         1/1
  1:3053912_G/T     FALSE    3.0103         0            <NA>      1    0.5        60      <NA>     28.73           <NA>     2.833         1/1
  1:3070694_C/A     FALSE    3.0103         0            <NA>      1    0.5      49.2    -0.842      9.27          1.834     0.307         0/1
                         AD        AD.1        AD.2        DP.1          GQ          PL        PL.1        PL.2        PL.3        PL.4
                <character> <character> <character> <character> <character> <character> <character> <character> <character> <character>
  1:3026194_C/A           0           2           0           2           6          84           6           0          84           6
  1:3053912_G/T           0           3           0           3           9         125           9           0         125           9
  1:3070694_C/A           4           2           4           6          63          63           0         149          63           0
                       PL.5
                <character>
  1:3026194_C/A           0
  1:3053912_G/T           0
  1:3070694_C/A         149
  -------
  seqinfo: 22 sequences from an unspecified genome; no seqlengths
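
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the INDEL flag is an INFO field written by samtools/bcftools mpileup, and GATK HaplotypeCaller does not emit it, which would be consistent with the 32 metadata columns containing no INDEL column. A minimal sketch of deriving such a flag from the REF and ALT allele lengths follows. It uses a plain data frame as a stand-in for the GRanges metadata; the `meta` object is hypothetical, its first row mirrors the output above, and the two indel rows are invented for illustration.

```r
# Hypothetical stand-in for the metadata columns of vcf[[1]].
# Row 1 mirrors the first record shown above; rows 2 and 3 are invented indels.
meta <- data.frame(
  REF = c("C", "GA", "T"),
  ALT = c("A", "G", "TCC"),
  stringsAsFactors = FALSE
)

# Treat a record as an indel when REF and ALT differ in length.
# Assumes a single ALT allele per record; multi-allelic sites would
# need to be split before applying this comparison.
meta$INDEL <- nchar(meta$REF) != nchar(meta$ALT)

# Same check as in the session above, now on the derived column.
table(meta$INDEL)
```

On the real object, the same comparison could presumably be assigned with `values(vcf[[1]])[['INDEL']] <- nchar(values(vcf[[1]])[['REF']]) != nchar(values(vcf[[1]])[['ALT']])`, since REF and ALT are character columns there. This has not been tested against the VCF file in question.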
vcf snv gatk customprodb inputvcf • 1.0k views

```
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] rtracklayer_1.44.0                       TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.7 VariantAnnotation_1.30.1
 [4] Rsamtools_2.0.0                          Biostrings_2.52.0                        XVector_0.24.0
 [7] SummarizedExperiment_1.14.0              DelayedArray_0.10.0                      BiocParallel_1.18.0
[10] matrixStats_0.54.0                       customProDB_1.24.0                       biomaRt_2.40.1
[13] GenomicFeatures_1.36.3                   AnnotationDbi_1.46.0                     Biobase_2.44.0
[16] GenomicRanges_1.36.0                     GenomeInfoDb_1.20.0                      IRanges_2.18.1
[19] S4Vectors_0.22.0                         BiocGenerics_0.30.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1               BiocManager_1.30.4       plyr_1.8.4               compiler_3.6.0           prettyunits_1.0.2
 [6] bitops_1.0-6             tools_3.6.0              progress_1.2.2           zlibbioc_1.30.0          digest_0.6.20
[11] bit_1.1-14               AhoCorasickTrie_0.1.0    BSgenome_1.52.0          lattice_0.20-38          RSQLite_2.1.1
[16] memoise_1.1.0            pkgconfig_2.0.2          rlang_0.4.0              Matrix_1.2-17            DBI_1.0.0
[21] curl_3.3                 GenomeInfoDbData_1.2.1   stringr_1.4.0            httr_1.4.0               hms_0.4.2
[26] grid_3.6.0               bit64_0.9-7              R6_2.4.0                 XML_3.98-1.20            blob_1.1.1
[31] magrittr_1.5             GenomicAlignments_1.20.1 assertthat_0.2.1         stringi_1.4.3            RCurl_1.95-4.12
[36] crayon_1.3.4
```
