Package needs internet access
firestar ▴ 20
@rmf-13755
Sweden

I am using the BSgenome.Hsapiens.UCSC.hg38 package in a container on a compute cluster without internet access. When I run my code, the package attempts to download data and fails:

Error in download.file(url, destfile, quiet = TRUE) :
  cannot open URL 'https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/chromInfo.txt.gz'
Calls: seqlevelsStyle<- ... .fetch_chrom_sizes_from_UCSC_database -> fetch_table_dump_from_UCSC -> fetch_table_from_url
Execution halted

Is there a way to point to a local path? What is the best way to deal with such issues in an offline environment?

Update: Added more complete code.

This is my code for running Cicero on Seurat objects based on this Signac vignette.

library(Seurat)
library(Signac)
library(cicero)
library(BSgenome.Hsapiens.UCSC.hg38)
library(dplyr)

seqlevelsStyle(BSgenome.Hsapiens.UCSC.hg38) <- "NCBI"
seqnames(BSgenome.Hsapiens.UCSC.hg38) <- BSgenome.Hsapiens.UCSC.hg38@seqinfo@seqnames

sf <- readRDS(file.path(path,"seurat.rds"))
mo <- SeuratWrappers::as.cell_data_set(sf)
co <- make_cicero_cds(mo, reduced_coordinates = reducedDims(mo)$UMAP)

# get the chromosome sizes from the BSgenome object (first 25 sequences)
genome <- as.data.frame(seqinfo(BSgenome.Hsapiens.UCSC.hg38)) %>%
  tibble::rownames_to_column("chr") %>%
  select(chr,seqlengths) %>%
  slice(1:25)

conns <- run_cicero(co, genomic_coords = genome, sample_num = 100)
ccans <- generate_ccans(conns)
links <- ConnectionsToLinks(conns = conns, ccans = ccans)
Links(sf) <- links

It's not clear to me why (or that) seqlevelsStyle should be called on a BSgenome object. You will need to provide more code that precedes the error so we can understand what you are trying to do.

Updated with more code.

@james-w-macdonald-5106
United States

It's going to need internet access. When you switch to NCBI seqlevels, you are calling this function:

.fetch_chrom_sizes_from_UCSC_database <- function(genome,
    goldenPath.url=getOption("UCSC.goldenPath.url"))
{
    col2class <- c(chrom="character", size="integer", fileName="NULL")
    ans <- fetch_table_dump_from_UCSC(genome, "chromInfo",
                                      col2class=col2class,
                                      goldenPath.url=goldenPath.url)
    ## Some sanity checks that should never fail.
    in_what <- paste0("\"chromInfo\" table for UCSC genome ", genome)
    .check_chrom_sizes(ans, in_what)
    ans
}

That function is in the GenomeInfoDb package, and it takes its URL from the UCSC.goldenPath.url option. If you had a local UCSC genome browser DB running, you could use that, but it would be easier to just modify the seqlevels on that BSgenome object on a laptop (or any machine with internet access), save the result as an RDS object, and then put it on a thumb drive and plug that into your air-gapped computer.
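
A minimal sketch of that round trip, under some assumptions: run_cicero() in the question only needs a table of chromosome names and lengths, so the sketch saves that table rather than the whole BSgenome object. The file name is hypothetical, and a three-row stand-in data frame replaces the real seqinfo so the example runs without Bioconductor installed. The commented lines show roughly where the real calls would go on the connected machine.

```r
# --- On a machine WITH internet access ---------------------------------
# With the real package, the table would come from something like:
#   library(BSgenome.Hsapiens.UCSC.hg38)
#   si <- seqinfo(BSgenome.Hsapiens.UCSC.hg38)
#   seqlevelsStyle(si) <- "NCBI"   # the step that needs the network
#   genome <- as.data.frame(si)    # then keep the name and length columns
# Stand-in (assumption): three hg38 sequences, already in NCBI style.
genome <- data.frame(
  chr = c("1", "X", "MT"),
  seqlengths = c(248956422L, 156040895L, 16569L),
  stringsAsFactors = FALSE
)

# Hypothetical file name; tempdir() stands in for the thumb drive.
rds_path <- file.path(tempdir(), "hg38_ncbi_chrom_sizes.rds")
saveRDS(genome, rds_path)

# --- On the offline cluster, after copying the file over ---------------
# No call to seqlevelsStyle<- is needed here, so nothing is downloaded.
genome_offline <- readRDS(rds_path)
```

The restored genome_offline table can then be passed as genomic_coords to run_cicero() in place of the table built from the BSgenome object.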
