Question: Assembly report error for hg19 in GenomeInfoDb
7 weeks ago, roshniroy16 wrote:

I am working with the HTGTS Transloc-pipeline. While working with the example dataset, I managed to run the first two steps (TranslocPreprocess.pl and TranslocWrapper.pl) and got the desired output files (tlx files):

TranslocPreprocess.pl tutorialmetadata.txt preprocess/ --read1 pooledR1.fq.gz --read2 pooledR2.fq.gz
TranslocWrapper.pl tutorialmetadata.txt preprocess/ results/ --threads 2

In the next step, I ran TranslocHotSpots.R with the command

$ TranslocHotSpots.R /scratch/royr6/results/RAG1CSRep1/RAG1CSRep1result.tlx /scratch/royr6/results/RAG1CSRep1/output

and kept getting this error message:

Error in .make_assembly_report_URL(assembly_accession) :
  don't know where to find assembly report for GCF_000001405.13
Calls: Seqinfo ... FUN -> fetch_assembly_report -> .make_assembly_report_URL

Looking around, I found that this error comes from GenomeInfoDb (https://rdrr.io/bioc/GenomeInfoDb/src/R/assembly-utils.R), since I get the same message when I type this in R:

> BiocManager::install(c("GenomeInfoDb","BSgenome"))
> options(download.file.method="libcurl")
> library("GenomeInfoDb")
> library("BSgenome")
> GenomeInfoDb::Seqinfo(genome = "hg19")

Error in .make_assembly_report_URL(assembly_accession) :
  don't know where to find assembly report for GCF_000001405.13

I am stuck and would really appreciate any help in this regard.


Hi,

2 problems with your post:

  1. The tag you used (software error) is too general. Please use a package-specific tag (GenomeInfoDb in this case). This will help other users of the support site find questions/answers about the package and will notify the GenomeInfoDb maintainers that a question was asked about this package.

  2. Please show the code you used that generates the error you got. Ideally you should try to provide a minimal self-contained working example. And also don't forget to provide your sessionInfo().

More details about these things in our Posting Guide (make sure you read it).

Thanks,

H.

Reply written 7 weeks ago by Hervé Pagès
Answer: Assembly report error for hg19 in GenomeInfoDb
11 days ago, Martin Morgan wrote:

This followup was posted through a different channel:

    > sessionInfo()
    R version 3.6.0 (2019-04-26)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: CentOS Linux 7 (Core)

    Matrix products: default
    BLAS/LAPACK: /usr/local/intel/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64_lin/libmkl_rt.so

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats4    parallel  stats     graphics  grDevices utils     datasets
    [8] methods   base

    other attached packages:
    [1] GenomeInfoDb_1.21.2 IRanges_2.18.3      S4Vectors_0.22.1
    [4] BiocGenerics_0.31.6

    loaded via a namespace (and not attached):
    [1] compiler_3.6.0         GenomeInfoDbData_1.2.1 RCurl_1.95-4.12
    [4] bitops_1.0-6

The code tries to read

url = "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/"
xx = RCurl::getURL(url)

and for me I get a directory listing

> cat(xx)
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.10_NCBI34
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.11_NCBI35
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.12_NCBI36
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.13_GRCh37
...

What do you get? (please use the 'ADD COMMENT' button below this post to reply...)
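
To make the comparison concrete, here is a minimal sketch of how one might check a listing like the one above for the directory that belongs to the accession in the error message. The helper name `find_assembly_dirs` is hypothetical and is not part of GenomeInfoDb; it only assumes that the entry name is the last whitespace-separated field of each listing line, as in the output shown above.

```r
# Hypothetical helper (not a GenomeInfoDb function): extract entry names from
# an FTP-style directory listing and keep those for the given accession.
find_assembly_dirs <- function(listing, accession) {
  lines <- strsplit(listing, "\r?\n")[[1]]
  lines <- lines[nzchar(lines)]
  # Assumption: the entry name is the last whitespace-separated field.
  entries <- sub("^.*\\s", "", lines)
  entries[startsWith(entries, paste0(accession, "_"))]
}

# Two lines copied from the directory listing shown above.
listing <- paste(
  "dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.12_NCBI36",
  "dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.13_GRCh37",
  sep = "\n"
)

find_assembly_dirs(listing, "GCF_000001405.13")
# [1] "GCF_000001405.13_GRCh37"
```

If the same call on your own `xx` returns an empty character vector, the response you received does not contain the expected directory entry.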


Thank you for your reply Martin. When I type the commands-

url = "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/"
xx = RCurl::getURL(url)
cat(xx)

I get this huge block of HTML (trimmed to a few lines to save space):

http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta type="copyright" content="Copyright (C) 1996-2016 The Squid Software Foundation and contributors">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Directory: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/
<style type="text/css">

</body></html>
Reply written 11 days ago by roshniroy16

Well, I know what's going on but don't know how to solve it. Your institution is using 'squid', which is a proxy that caches web pages. For some reason, it has cached the ftp request as an html document, or is trying to say that it can't display the document, perhaps because a port is blocked -- you could try to cut and paste the url into a browser and see what happens...

I think my advice is to reach out to your local help desk with the minimal example

url = "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/"
xx = RCurl::getURL(url)

indicating that you are trying to use curl to access an FTP site; the command-line equivalent and its expected output are simply

$ curl ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.10_NCBI34
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.11_NCBI35
dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.12_NCBI36
...
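
A rough way to tell the two kinds of response apart from within R is sketched below. The helper name `looks_like_html` is hypothetical, and the check is only a heuristic: it assumes that a proxy-generated page contains an `<html` tag or a `<!DOCTYPE` declaration, while a plain FTP listing contains neither.

```r
# Hypothetical helper: TRUE when a response body looks like an HTML page
# (for example one generated by a proxy) rather than a plain FTP listing.
looks_like_html <- function(body) {
  grepl("<html|<!DOCTYPE", body, ignore.case = TRUE)
}

# Sample inputs modelled on the two outputs shown in this thread.
ftp_listing <- "dr-xr-xr-x   2 ftp      anonymous     4096 Oct 13  2016 GCF_000001405.13_GRCh37"
proxy_page <- "<html><head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n</body></html>"

looks_like_html(ftp_listing)  # FALSE
looks_like_html(proxy_page)   # TRUE

# Usage against the live URL (needs network access, so left commented out):
# xx <- RCurl::getURL("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/")
# if (looks_like_html(xx)) message("Got an HTML page instead of an FTP listing")
```

A `TRUE` result for the live URL would be a useful detail to include in the report to the help desk.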
Reply written 11 days ago by Martin Morgan

I will definitely get in touch with the help desk. Thank you for your feedback.

Reply written 11 days ago by roshniroy16