Bug Report/Question for PureCN: createSNPBlacklist() Error
klminch ▴ 10

(The sessionInfo() output is at the bottom of the question.)

I'm getting this error whenever I run the function createSNPBlacklist(vcf.files, n = min(10, length(vcf.files)), low.af = 0.025, high.af = 0.1, genome = "hg19"):  

Error in `$<-.data.frame`(`*tmp*`, "Count.G", value = numeric(0)) : 
  replacement has 0 rows, data has 113897
In addition: Warning message:
In is.na(xx$Count.G) :
  is.na() applied to non-(list or vector) of type 'NULL'


I found the code for the createSNPBlacklist() function in PureCN: http://search.bioconductor.jp/codes/8833

I was going through the code for this function, and found the error came from this line:

xx$Count.G[is.na(xx$Count.G)] <- 0 


The problem seemed to stem from these statements:

xx <- data.frame(Count=xx)

xx <- cbind(xx, Count.G=xx.s[rownames(xx)])    


xx is initially a table with the SNP IDs as names and the frequencies as values. The first statement creates a data.frame with row names 1:length(xx[,1]) and columns Count.Var1 and Count.Freq. Count.Var1 contains the SNP IDs (originally the names of the table), and Count.Freq contains the frequencies. The second statement adds two more columns, Count.G.Var1 and Count.G.Freq.
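The column naming described above can be reproduced without PureCN. The following is a minimal sketch in base R; the SNP IDs are made-up toy values, and xx here only stands in for the table built inside createSNPBlacklist().

```r
# Toy table of SNP IDs (hypothetical values, not taken from any real VCF).
xx <- table(c("rs1", "rs2", "rs2", "rs3", "rs3", "rs3"))

# Wrapping a table in data.frame() expands it into two columns
# (the IDs and the frequencies), each prefixed with the argument name.
df <- data.frame(Count = xx)

print(colnames(df))        # "Count.Var1" "Count.Freq"
print(rownames(df))        # "1" "2" "3", not the SNP IDs
print(is.null(df$Count.G)) # TRUE: there is no Count.G column
```

Because the row names are plain indices rather than SNP IDs, any later lookup by rownames(xx) cannot match the SNP names.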

Therefore, when I ran xx$Count.G[is.na(xx$Count.G)] <- 0, I received the error because xx$Count.G does not exist. I got the function working by changing the xx and xx.s data frames so that the row names are the SNP IDs and the columns Count and Count.G contain the frequencies for xx and xx.s, respectively.
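To illustrate, here is a rough sketch of both the failure and the kind of workaround I mean. It uses toy data and base R only; the names counts, counts.s, broken, and fixed are mine and do not appear in PureCN, and this is not necessarily how the package itself should be fixed. Since xx$Count.G is NULL, is.na() returns a zero-length result, the indexed assignment produces a zero-length vector, and assigning that as a column fails.

```r
# Toy tables standing in for xx and xx.s (hypothetical SNP IDs).
xx   <- table(c("rs1", "rs2", "rs2", "rs3", "rs3", "rs3"))
xx.s <- table(c("rs2", "rs3"))

# 1) The failure: Count.G does not exist, so the replacement has length 0.
broken <- data.frame(Count = xx)
msg <- tryCatch({
    suppressWarnings(broken$Count.G[is.na(broken$Count.G)] <- 0)
    "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# 2) The workaround: convert each table to a plain named vector first,
#    so that data.frame() keeps the SNP IDs as row names.
counts   <- setNames(as.vector(xx),   names(xx))
counts.s <- setNames(as.vector(xx.s), names(xx.s))

fixed <- data.frame(Count = counts)
# Look up each SNP in the second set; SNPs missing there come back as NA.
fixed$Count.G <- unname(counts.s[rownames(fixed)])
fixed$Count.G[is.na(fixed$Count.G)] <- 0
print(fixed)
```

With plain named vectors, the lookup by rownames() matches on SNP ID, and SNPs absent from the second set end up with a count of 0.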

The error may be caused by something in my input files, but I couldn't determine what it would be. The rest of the script worked well up to this point, and it seems to be reading my VCF files fine. I also ran the runAbsoluteCN() function with my VCF files without problems. However, if something about my input VCFs is causing the issue with the original code, any help on what I might be able to change would be appreciated.

Thank you for the help,

Kaitlyn Minchella


R version 3.3.0 beta (2016-04-24 r70543)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] rtracklayer_1.32.1         PureCN_1.1.24             
 [3] VariantAnnotation_1.18.1   Rsamtools_1.24.0          
 [5] Biostrings_2.40.2          XVector_0.12.0            
 [7] SummarizedExperiment_1.2.3 Biobase_2.32.0            
 [9] GenomicRanges_1.24.2       GenomeInfoDb_1.8.1        
[11] IRanges_2.6.1              S4Vectors_0.10.1          
[13] BiocGenerics_0.18.0        DNAcopy_1.46.0            

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.34.3    zlibbioc_1.18.0         GenomicAlignments_1.8.3
 [4] BiocParallel_1.6.2      BSgenome_1.40.1         tools_3.3.0            
 [7] data.table_1.9.6        DBI_0.4-1               RColorBrewer_1.1-2     
[10] bitops_1.0-6            RCurl_1.95-4.8          biomaRt_2.28.0         
[13] RSQLite_1.0.0           GenomicFeatures_1.24.3  XML_3.98-1.4           
[16] chron_2.3-47           




Thanks, Kaitlyn. I could reproduce this with R 3.3 (it worked in 3.2, so it looks like this needs a test case).

It should be fixed in the current devel version on GitHub.

