(sessionInfo() output is at the bottom of the question)
I'm getting this error whenever I run the function createSNPBlacklist(vcf.files, n = min(10, length(vcf.files)), low.af = 0.025, high.af = 0.1, genome = "hg19"):
Error in `$<-.data.frame`(`*tmp*`, "Count.G", value = numeric(0)) :
replacement has 0 rows, data has 113897
In addition: Warning message:
In is.na(xx$Count.G) :
is.na() applied to non-(list or vector) of type 'NULL'
I found the code for the createSNPBlacklist() function in PureCN: http://search.bioconductor.jp/codes/8833
I was going through the code for this function, and found the error came from this line:
xx$Count.G[is.na(xx$Count.G)] <- 0
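For reference, the same error message can be reproduced on a toy data frame that simply lacks the column being indexed. This is only a minimal sketch with made-up data (the names toy and Missing are hypothetical, not from PureCN), but it suggests the message means "the column is not there" rather than "the column has bad values":

```r
# Hypothetical toy data frame with no column called "Missing"
toy <- data.frame(a = 1:3)

# toy$Missing is NULL, so is.na() returns logical(0) (with a warning),
# and the replacement value becomes numeric(0), which cannot fill 3 rows.
msg <- tryCatch(
  {
    suppressWarnings(toy$Missing[is.na(toy$Missing)] <- 0)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```

The captured message has the same form as the one above, with the row count of the toy data frame in place of 113897.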
The problem seemed to stem from these statements:
xx <- data.frame(Count=xx)
xx <- cbind(xx, Count.G=xx.s[rownames(xx)])
xx is initially a table with the SNP IDs as the rownames and the frequencies as the values. The first statement creates a data.frame with the rownames 1:length(xx[,1]) and the columns Count.Var1 and Count.Freq. Count.Var1 contains the SNP IDs (initially the rownames of the table), and Count.Freq contains the frequencies. The second statement then adds two more columns, Count.G.Var1 and Count.G.Freq.
Therefore, when I ran xx$Count.G[is.na(xx$Count.G)] <- 0 I received the error because xx$Count.G does not exist. I managed to get the function working for me by changing the xx and xx.s data frames so that the rownames are the SNP IDs and the columns Count and Count.G contain the frequencies for xx and xx.s, respectively.
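To illustrate what I mean, here is a rough sketch of the behavior and of the kind of workaround I used, on made-up SNP IDs. The objects bad, counts, counts.s, and ok are hypothetical names for this example only; this is not the actual PureCN code, just an assumption about one way to get the layout the later line expects:

```r
# Made-up SNP IDs standing in for the real tables xx and xx.s
xx   <- table(c("rs1", "rs1", "rs2", "rs3", "rs3", "rs3"))
xx.s <- table(c("rs1", "rs3"))

# Wrapping a table in data.frame() expands it into an ID column and a
# frequency column, so there is no single column named "Count".
bad <- data.frame(Count = xx)
print(colnames(bad))

# Workaround sketch: turn each table into a named vector first, so the
# SNP IDs become rownames and Count / Count.G are plain columns.
counts   <- setNames(as.vector(xx), names(xx))
counts.s <- setNames(as.vector(xx.s), names(xx.s))

ok <- data.frame(Count = counts, row.names = names(counts))
ok$Count.G <- unname(counts.s[rownames(ok)])  # NA where the ID is absent
ok$Count.G[is.na(ok$Count.G)] <- 0            # now runs without error

print(ok)
```

With the data in this shape, the line that failed originally runs without error, and SNPs that are missing from xx.s end up with Count.G set to 0.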
The error may be caused by something in my input files, but I couldn't determine what it would be. The rest of the script worked well up to this point, and it seems to be reading my VCF files fine. I also ran the runAbsoluteCN() function with my VCF files without problems. However, if something about my input VCFs is causing the issue with the original code, any help on what I might be able to change would be appreciated.
Thank you for the help,
Kaitlyn Minchella
sessionInfo()
R version 3.3.0 beta (2016-04-24 r70543)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rtracklayer_1.32.1 PureCN_1.1.24
[3] VariantAnnotation_1.18.1 Rsamtools_1.24.0
[5] Biostrings_2.40.2 XVector_0.12.0
[7] SummarizedExperiment_1.2.3 Biobase_2.32.0
[9] GenomicRanges_1.24.2 GenomeInfoDb_1.8.1
[11] IRanges_2.6.1 S4Vectors_0.10.1
[13] BiocGenerics_0.18.0 DNAcopy_1.46.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.34.3 zlibbioc_1.18.0 GenomicAlignments_1.8.3
[4] BiocParallel_1.6.2 BSgenome_1.40.1 tools_3.3.0
[7] data.table_1.9.6 DBI_0.4-1 RColorBrewer_1.1-2
[10] bitops_1.0-6 RCurl_1.95-4.8 biomaRt_2.28.0
[13] RSQLite_1.0.0 GenomicFeatures_1.24.3 XML_3.98-1.4
[16] chron_2.3-47