getPromoterSeq error "Error in value[[3L]](cond) : record 1 (1:-XXXX-XXX) was truncated"

Hi, I am Hao. I have an issue with the getPromoterSeq function from the GenomicFeatures package.

I was trying to obtain potential promoter sequences for all genes on maize chromosome 1. I have the sequence file "Zea_mays.AGPv4.dna.chromosome.1.fa" and the whole-genome annotation file "Zea_mays.AGPv4.40.gff3".

I arbitrarily defined the promoter region as 2,000 bp upstream of the TSS and 500 bp downstream of it. The code I wrote is as follows.


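library(GenomicFeatures)  # getPromoterSeq()
library(Rsamtools)        # FaFile()
library(ape)              # read.gff()

# Subject: the chromosome 1 sequence
FaFile <- FaFile("Zea_mays.AGPv4.dna.chromosome.1.fa")

# Query: every record in the whole-genome annotation, converted to GRanges
gffRangedData <- read.gff("Zea_mays.AGPv4.40.gff3", na.strings = c(".", "?"))
myGranges <- as(gffRangedData, "GRanges")

Promoter <- getPromoterSeq(myGranges, FaFile, upstream = 2000, downstream = 500)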

However, the call to getPromoterSeq failed with the error below.

Error in value[[3L]](cond) :  record 1 (1:-1999-500) was truncated
  file: Zea_mays.AGPv4.dna.chromosome.1.fa
In addition: Warning message:
In .local(x, upstream, downstream, ...) : '*' ranges were treated as '+'

Below is the sessionInfo() output

> sessionInfo() 
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.30.3 AnnotationDbi_1.40.0   Biobase_2.38.0         ape_5.1               
 [5] Rsamtools_1.30.0       Biostrings_2.46.0      XVector_0.18.0         GenomicRanges_1.30.3  
 [9] GenomeInfoDb_1.14.0    IRanges_2.12.0         S4Vectors_0.16.0       BiocGenerics_0.24.0   
[13] BiocInstaller_1.28.0  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18               compiler_3.4.4             prettyunits_1.0.2         
 [4] bitops_1.0-6               tools_3.4.4                zlibbioc_1.24.0           
 [7] progress_1.2.0             biomaRt_2.34.2             digest_0.6.16             
[10] bit_1.1-14                 RSQLite_2.1.1              memoise_1.1.0             
[13] nlme_3.1-131.1             lattice_0.20-35            pkgconfig_2.0.2           
[16] rlang_0.2.2                Matrix_1.2-12              DelayedArray_0.4.1        
[19] DBI_1.0.0                  rstudioapi_0.7             GenomeInfoDbData_1.0.0    
[22] rtracklayer_1.38.3         httr_1.3.1                 stringr_1.3.1             
[25] hms_0.4.2                  bit64_0.9-7                grid_3.4.4                
[28] R6_2.2.2                   XML_3.98-1.16              RMySQL_0.10.15            
[31] BiocParallel_1.12.0        magrittr_1.5               blob_1.1.1                
[34] matrixStats_0.54.0         GenomicAlignments_1.14.2   SummarizedExperiment_1.8.1
[37] assertthat_0.2.0           stringi_1.1.7              RCurl_1.95-4.11           
[40] crayon_1.3.4              

I was wondering if anyone could help me out? Many thanks!





getPromoterSeq GRanges GenomicFeatures • 843 views

Please update to a current version of R/Bioconductor (release is R 3.5 / Bioc 3.7). See this page for help:

If you still see the error after updating, post back and show your sessionInfo() again. Likely the error is triggered by specific ranges in myGranges. It would be good to identify them if possible.
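One way to look for the offending ranges is sketched below. It assumes that getPromoterSeq() requests the same flanking windows that promoters() from GenomicRanges computes, and it uses a small made-up GRanges (`toy`, hypothetical) in place of myGranges so that it runs without the maize files. The range in the error message, 1:-1999-500, is what a 2,000/500 window gives for a feature that starts at position 1 on sequence "1", for example a whole-chromosome record in the GFF3.

```r
library(GenomicRanges)

# Toy stand-in for myGranges (hypothetical data): a whole-chromosome record
# with unknown strand, plus one gene far from the chromosome start.
toy <- GRanges(
  seqnames = "1",
  ranges   = IRanges(start = c(1, 50000), end = c(300000, 52000)),
  strand   = c("*", "+"),
  type     = c("chromosome", "gene")
)

# Compute the promoter windows; '*' ranges are treated as '+', which is
# the source of the warning, so it is silenced here.
prom <- suppressWarnings(promoters(toy, upstream = 2000, downstream = 500))

# A window that starts before position 1 cannot be read from the FASTA file.
offending <- start(prom) < 1

# Show the input ranges responsible for such windows.
toy[offending]
```

Running the same check on myGranges (replace `toy` with `myGranges`) should show which records are involved. If they turn out to be non-gene records or features within 2,000 bp of the chromosome start, then subsetting to gene records on chromosome 1, and trimming or dropping windows that run off the sequence, may avoid the error. That is a guess until the ranges are inspected.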


