Question: getPromoterSeq error "Error in value[[3L]](cond) : record 1 (1:-XXXX-XXX) was truncated"
15 months ago, jerrywu1987 wrote:

Hi, I am Hao. I have an issue with the "getPromoterSeq" function from the "GenomicFeatures" package.

I am trying to obtain potential promoter sequences for all genes on maize chromosome 1. I have the chromosome 1 sequence file "Zea_mays.AGPv4.dna.chromosome.1.fa" and the whole-genome annotation file "Zea_mays.AGPv4.40.gff3".

I arbitrarily defined the promoter region as 2,000 bp upstream and 500 bp downstream of the TSS. The code I wrote is as follows.


FaFile=FaFile("Zea_mays.AGPv4.dna.chromosome.1.fa") #subject


gffRangedData=read.gff("Zea_mays.AGPv4.40.gff3", na.strings = c(".", "?")) 
myGranges<-as(gffRangedData, "GRanges")   #query

Promoter=getPromoterSeq(myGranges,FaFile,upstream=2000, downstream=500)  

However, the call to "getPromoterSeq" returned an error (see below).

Error in value[[3L]](cond) :  record 1 (1:-1999-500) was truncated
  file: Zea_mays.AGPv4.dna.chromosome.1.fa
In addition: Warning message:
In .local(x, upstream, downstream, ...) : '*' ranges were treated as '+'

Below is the sessionInfo() output

> sessionInfo() 
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.30.3 AnnotationDbi_1.40.0   Biobase_2.38.0         ape_5.1               
 [5] Rsamtools_1.30.0       Biostrings_2.46.0      XVector_0.18.0         GenomicRanges_1.30.3  
 [9] GenomeInfoDb_1.14.0    IRanges_2.12.0         S4Vectors_0.16.0       BiocGenerics_0.24.0   
[13] BiocInstaller_1.28.0  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18               compiler_3.4.4             prettyunits_1.0.2         
 [4] bitops_1.0-6               tools_3.4.4                zlibbioc_1.24.0           
 [7] progress_1.2.0             biomaRt_2.34.2             digest_0.6.16             
[10] bit_1.1-14                 RSQLite_2.1.1              memoise_1.1.0             
[13] nlme_3.1-131.1             lattice_0.20-35            pkgconfig_2.0.2           
[16] rlang_0.2.2                Matrix_1.2-12              DelayedArray_0.4.1        
[19] DBI_1.0.0                  rstudioapi_0.7             GenomeInfoDbData_1.0.0    
[22] rtracklayer_1.38.3         httr_1.3.1                 stringr_1.3.1             
[25] hms_0.4.2                  bit64_0.9-7                grid_3.4.4                
[28] R6_2.2.2                   XML_3.98-1.16              RMySQL_0.10.15            
[31] BiocParallel_1.12.0        magrittr_1.5               blob_1.1.1                
[34] matrixStats_0.54.0         GenomicAlignments_1.14.2   SummarizedExperiment_1.8.1
[37] assertthat_0.2.0           stringi_1.1.7              RCurl_1.95-4.11           
[40] crayon_1.3.4              

I was wondering if anyone could help me out? Many thanks!






Please update to a current version of R/Bioconductor (release is R 3.5 / Bioc 3.7). See this page for help:

If you still see the error after updating, post back and show your sessionInfo() again. Likely the error is triggered by specific ranges in myGranges. It would be good to identify them if possible.
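One way to look for such ranges is to compute the promoter coordinates first and check them against the sequence lengths. The record label in the error, 1:-1999-500, has a negative start, which with upstream=2000 is what a "+" or "*" feature starting at position 1 of chromosome 1 would produce. The sketch below is only an illustration on a toy GRanges: the object names toy, chr_len, on_chr and in_bounds are made up, and the chromosome length is a placeholder rather than the real length of maize chromosome 1.

```r
library(GenomicRanges)

# Toy stand-in for myGranges (hypothetical data): a chromosome-level feature
# starting at position 1, a gene well inside chromosome 1, and a gene on a
# chromosome that is not present in the FASTA file.
toy <- GRanges(
  seqnames = c("1", "1", "2"),
  ranges   = IRanges(start = c(1, 50000, 3000), end = c(300000, 52000, 4000)),
  strand   = c("*", "+", "-"),
  type     = c("chromosome", "gene", "gene")
)

# Placeholder length for the only sequence in the FASTA file. For a real,
# indexed file this could be read with seqlengths(Rsamtools::FaFile(path)).
chr_len <- c("1" = 300000)

# Promoter coordinates only, no sequence lookup yet. "*" is treated as "+".
prom <- suppressWarnings(promoters(toy, upstream = 2000, downstream = 500))

# A promoter is usable if its sequence exists in the FASTA file and the
# range lies entirely within that sequence.
chr       <- as.character(seqnames(prom))
on_chr    <- chr %in% names(chr_len)
in_bounds <- on_chr
in_bounds[on_chr] <- start(prom)[on_chr] >= 1 &
  end(prom)[on_chr] <= chr_len[chr[on_chr]]

toy[!in_bounds]   # features whose promoters cannot be extracted
toy[in_bounds]    # features that are safe to pass on
```

Applied to the real data, the same check would use myGranges in place of toy. Restricting the input to gene features on chromosome 1 whose promoters are in bounds before calling getPromoterSeq may avoid the truncation error, although that is not confirmed in this thread.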


Reply written 15 months ago by Valerie Obenchain