Question

'path' must be string

cpierce2 • 0

@18626c0b

Last seen 4 months ago

United States

Not sure how to remedy this error

datadir <- file.path("~/Desktop/DSP DATA")
DCCFiles <- dir(file.path(datadir, "DCC"), pattern = ".dcc$", full.names = TRUE, recursive = TRUE) 
PKCFiles <- dir(file.path(datadir, "PKC"), pattern = ".pkc$",full.names = TRUE, recursive = TRUE) 
DCCFiles <- as.character(DCCFiles)
PKCFiles <- as.character(PKCFiles)
SampleAnnotationFile <- dir(file.path(datadir, "annotation"), pattern = ".xlsx$", full.names = TRUE, recursive = TRUE)
demoData <- readNanoStringGeoMxSet(dccFiles = DCCFiles, pkcFiles = PKCFiles, phenoDataFile = SampleAnnotationFile, phenoDataSheet = "Sheet1", phenoDataDccColName = "ROI_ID", protocolDataColNames = c("Segment Tag", "Scan Name"))

Output: Error: path must be a string

please also include the results of running the following in an R session

sessionInfo( )

```
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Monterey 12.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] GeoMxWorkflows_1.10.0 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
 [6] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 tidyverse_2.0.0
[11] BiocManager_1.30.23 GeomxTools_3.8.0 NanoStringNCTools_1.12.0 ggplot2_3.5.1 S4Vectors_0.42.1
[16] Biobase_2.64.0 BiocGenerics_0.50.0

loaded via a namespace (and not attached):
  [1] readxl_1.4.3 rlang_1.1.4 magrittr_2.0.3 compiler_4.4.1 png_0.1-8
  [6] systemfonts_1.1.0 vctrs_0.6.5 reshape2_1.4.4 pkgconfig_2.0.3 crayon_1.5.3
 [11] fastmap_1.2.0 XVector_0.44.0 utf8_1.2.4 rmarkdown_2.27 tzdb_0.4.0
 [16] UCSC.utils_1.0.0 nloptr_2.1.1 ggbeeswarm_0.7.2 xfun_0.46 zlibbioc_1.50.0
 [21] GenomeInfoDb_1.40.1 jsonlite_1.8.8 EnvStats_2.8.1 tweenr_2.0.3 uuid_1.2-0
 [26] parallel_4.4.1 R6_2.5.1 stringi_1.8.4 RColorBrewer_1.1-3 reticulate_1.38.0
 [31] GGally_2.2.1 parallelly_1.37.1 boot_1.3-30 cellranger_1.1.0 numDeriv_2016.8-1.1
 [36] knitr_1.48 Rcpp_1.0.13 future.apply_1.11.2 IRanges_2.38.1 igraph_2.0.3
 [41] timechange_0.3.0 Matrix_1.7-0 splines_4.4.1 tidyselect_1.2.1 yaml_2.3.9
 [46] rstudioapi_0.16.0 codetools_0.2-20 listenv_0.9.1 lattice_0.22-6 lmerTest_3.1-3
 [51] plyr_1.8.9 withr_3.0.0 askpass_1.2.0 evaluate_0.24.0 Rtsne_0.17
 [56] future_1.33.2 polyclip_1.10-7 ggstats_0.6.0 Biostrings_2.72.1 pillar_1.9.0
 [61] generics_0.1.3 sp_2.1-4 hms_1.1.3 munsell_0.5.1 scales_1.3.0
 [66] minqa_1.2.7 BiocStyle_2.32.1 globals_0.16.3 glue_1.7.0 pheatmap_1.0.12
 [71] tools_4.4.1 data.table_1.15.4 lme4_1.1-35.5 RSpectra_0.16-2 ggiraph_0.8.10
 [76] dotCall64_1.1-1 cowplot_1.1.3 grid_4.4.1 umap_0.2.10.0 colorspace_2.1-0
 [81] networkD3_0.4 nlme_3.1-165 GenomeInfoDbData_1.2.12 ggforce_0.4.2 beeswarm_0.4.0
 [86] vipor_0.4.7 cli_3.6.3 spam_2.10-0 fansi_1.0.6 ggthemes_5.1.0
 [91] gtable_0.3.5 outliers_0.15 digest_0.6.36 progressr_0.14.0 ggrepel_0.9.5
 [96] farver_2.1.2 rjson_0.2.21 htmlwidgets_1.6.4 SeuratObject_5.0.2 htmltools_0.5.8.1
[101] lifecycle_1.0.4 httr_1.4.7 openssl_2.2.0 MASS_7.3-61
```

HELP path • 1.2k views

ADD COMMENT • link updated 4 months ago by James W. MacDonald 67k • written 4 months ago by cpierce2 • 0

What is the output from traceback() right after you get the error?

ADD REPLY • link 4 months ago James W. MacDonald 67k

4: stop("`path` must be a string", call. = FALSE)
3: check_file(path)
2: readxl::read_xlsx(phenoDataFile, col_names = TRUE, sheet = phenoDataSheet, 
       ...)
1: readNanoStringGeoMxSet(dccFiles = DCCFiles, pkcFiles = PKCFiles, 
       phenoDataFile = SampleAnnotationFile, phenoDataSheet = "Sheet1", 
       phenoDataDccColName = "ROI_ID", protocolDataColNames = c("Segment Tag", 
           "Scan Name"))

Thanks for helping!

ADD REPLY • link updated 4 months ago by James W. MacDonald 67k • written 4 months ago by cpierce2 • 0

And these are the file paths I get from print(SampleAnnotationFile):

[1] "/users/colinpierce/Desktop/DSP DATA/annotation/~$annotation.xlsx"

[2] "/users/colinpierce/Desktop/DSP DATA/annotation/annotation.xlsx"

ADD REPLY • link updated 4 months ago by James W. MacDonald 67k • written 4 months ago by cpierce2 • 0

In addition, I just tried to use an absolute path to the Excel sheet:

SampleAnnotationFile <- file.path("/users/colinpierce/Desktop/DSP DATA/annotation")

Then I ran the same readNanoStringGeoMxSet line and got this output:

Error in utils::unzip(zip_path, list = TRUE) : 
  zip file '/Users/colinpierce/Desktop/DSP DATA/annotation' cannot be opened

Then I ran `traceback()` again and got this:

8: utils::unzip(zip_path, list = TRUE)
7: (function (zip_path, file_path) 
   {
       files <- utils::unzip(zip_path, list = TRUE)
       indx <- match(file_path, files$Name)
       if (is.na(indx)) {
           stop("Couldn't find '", file_path, "' in '", zip_path, 
               "'", call. = FALSE)
       }
       size <- files$Length[indx]
       con <- unz(zip_path, file_path, open = "rb")
       on.exit(close(con), add = TRUE)
       readBin(con, raw(), n = size)
   })("/Users/colinpierce/Desktop/DSP DATA/annotation", "_rels/.rels")
6: sheets_fun(path)
5: sheet %in% sheet_names
4: standardise_sheet(sheet, range, sheets_fun(path))
3: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names, 
       col_types = col_types, na = na, trim_ws = trim_ws, skip = skip, 
       n_max = n_max, guess_max = guess_max, progress = progress, 
       .name_repair = .name_repair, format = "xlsx")
2: readxl::read_xlsx(phenoDataFile, col_names = TRUE, sheet = phenoDataSheet, 
       ...)
1: readNanoStringGeoMxSet(dccFiles = DCCFiles, pkcFiles = PKCFiles, 
       phenoDataFile = SampleAnnotationFile, phenoDataSheet = "Sheet1", 
       phenoDataDccColName = "ROI_ID", protocolDataColNames = c("Segment Tag", 
           "Scan Name"))

Thanks again, I hope this helps

ADD REPLY • link updated 4 months ago by James W. MacDonald 67k • written 4 months ago by cpierce2 • 0
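
The second error above is consistent with phenoDataFile pointing at the annotation directory rather than at the workbook inside it: an .xlsx file is a zip archive, and the traceback shows utils::unzip being handed the directory path. Below is a minimal sketch of a check that could be run before calling readNanoStringGeoMxSet. The temporary directory, the empty placeholder file, and the helper name is_regular_file are all hypothetical stand-ins, not part of the original post.

```r
# Hypothetical stand-ins for the real annotation directory and workbook
annotation_dir <- file.path(tempdir(), "annotation")
dir.create(annotation_dir, showWarnings = FALSE)
xlsx_path <- file.path(annotation_dir, "annotation.xlsx")
invisible(file.create(xlsx_path))  # empty placeholder, not a real workbook

# Hypothetical helper: TRUE only for a single path that exists
# and is not a directory
is_regular_file <- function(path) {
  length(path) == 1L && file.exists(path) && !dir.exists(path)
}

dir_ok  <- is_regular_file(annotation_dir)  # the directory itself
file_ok <- is_regular_file(xlsx_path)       # the file inside it
```

If the check fails for the value about to be passed as phenoDataFile, the path probably needs the workbook's file name appended.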

score 0 · Answer 1 · 2024-07-25

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 15 hours ago

United States

When you are posting code, please add a triple backtick (the top left key on a QWERTY keyboard) on the line preceding the code and on the line below the code. There is always a box below the submission box that shows what your post will look like, and if it's a garbled mess (like all of your posts thus far, which I have already fixed), then you need to adjust your post so it isn't garbled.

You can also select a code section and then hit the CODE button above the submission box.

There are two possible problems. First, one of your files (the ~$annotation.xlsx one) isn't really a workbook. It's a temporary lock file that Excel creates to signal that the workbook is already open, and I don't think read_xlsx is going to be able to open it. So you should make sure that you don't have that workbook open in Excel, and if the tilde file is still in your directory, just delete it.

Second, spaces in paths are generally not a great idea on Linux variants. You might rename that whole directory to have an underscore instead of a space and see if that helps.

ADD COMMENT • link 4 months ago James W. MacDonald 67k
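
One reading of the original error, consistent with the print output earlier in the thread: dir() returned two paths, so phenoDataFile was a length-two character vector, and the check in the traceback (check_file) appears to require a single string. Below is a minimal sketch of dropping Excel's "~$" file before passing the path on. The candidates vector is copied from the output above as an illustration; in practice it would come from dir().

```r
# Stand-in for the dir() result printed earlier in the thread
candidates <- c(
  "/users/colinpierce/Desktop/DSP DATA/annotation/~$annotation.xlsx",
  "/users/colinpierce/Desktop/DSP DATA/annotation/annotation.xlsx"
)

# Excel lock files have a base name that starts with the literal "~$"
is_lock <- startsWith(basename(candidates), "~$")
SampleAnnotationFile <- candidates[!is_lock]

# Fail early unless exactly one annotation workbook remains
stopifnot(is.character(SampleAnnotationFile),
          length(SampleAnnotationFile) == 1L)
```

Using startsWith on the base name avoids regular-expression escaping, since both "~" and "$" are matched literally.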

Sorry for the garbled code. One more thing, again I appreciate it.

Error in startsWith(probeAssay[["RTS_ID"]][1L], "RNA") : 
  non-character object(s)

this error again comes from this:

demoData <-
  readNanoStringGeoMxSet(dccFiles = renamed_files,
                         pkcFiles = PKCFiles,
                         phenoDataFile = SampleAnnotationFile, 
                         phenoDataSheet="Sheet1", 
                         phenoDataDccColName = "Sample_ID", 
                         protocolDataColNames = c("ROI_ID" , "Cores"))

ADD REPLY • link 4 months ago cpierce2 • 0

That's a tough one. The probeAssay object is made internally, and that step is meant to convert old names. The error arises because that particular column of the probeAssay object doesn't contain character values, so startsWith can't handle them. The only way to debug is to run readNanoStringGeoMxSet under the debugger, step through to that point, and look to see what is actually in probeAssay.

debug(readNanoStringGeoMxSet)
demoData <-
  readNanoStringGeoMxSet(dccFiles = renamed_files,
                         pkcFiles = PKCFiles,
                         phenoDataFile = SampleAnnotationFile, 
                         phenoDataSheet="Sheet1", 
                         phenoDataDccColName = "Sample_ID", 
                         protocolDataColNames = c("ROI_ID" , "Cores"))

## now step through the code until you generate the probeAssay object and check it out

ADD REPLY • link 4 months ago James W. MacDonald 67k

I ran the debugger and this is the line that causes the error:

if (any(startsWith(pkcData[["RTS_ID"]][1L], "RTS")) & any(startsWith(probeAssay[["RTS_ID"]][1L], 
    "RNA"))) {
    probeAssay[["RTS_ID"]] <- gsub("^RNA", "RTS00", probeAssay[["RTS_ID"]])
  }
  probeAssay <- probeAssay[probeAssay[["RTS_ID"]] %in% pkcData[["RTS_ID"]],

ADD REPLY • link 4 months ago cpierce2 • 0

Yes. We already knew that from the error message, and I told you that in my previous email. The point wasn't to figure out where the error occurred, but why.

The error arises because startsWith expects probeAssay[["RTS_ID"]][1L] to return a character vector, and it's returning a non-character object. When checking out an object it's often useful to go slow, in case it's a big object and not constrained (for example if you have a data.frame with 1M rows, you wouldn't want to just type its name because it's gonna keep printing that on the console until you get to the R-imposed limit).

class(probeAssay)

## if that is a DFrame object it's constrained, and you can then just call the show method
probeAssay

## which will print the first and last ten (or so) lines
## or you could do
class(probeAssay[["RTS_ID"]]) 

## etc

ADD REPLY • link 4 months ago James W. MacDonald 67k
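
To illustrate the failure mode described above, here is a small sketch with a made-up data.frame standing in for probeAssay. The real object is built inside readNanoStringGeoMxSet and may look quite different. The sketch assumes, purely for illustration, that the RTS_ID column ended up numeric.

```r
# Made-up stand-in for the internal probeAssay object (an assumption)
probeAssay <- data.frame(RTS_ID = c(1001, 1002))

# Inspect the column type first, as suggested above
col_class <- class(probeAssay[["RTS_ID"]])

# startsWith() only accepts character input, so capture the error message
msg <- tryCatch(
  startsWith(probeAssay[["RTS_ID"]][1L], "RNA"),
  error = function(e) conditionMessage(e)
)

# On character input the same call runs and returns a logical value
ok <- startsWith(as.character(probeAssay[["RTS_ID"]][1L]), "RNA")
```

The as.character() call only shows that the same call succeeds on character input. It is not a suggested fix, since the real question is why the column is not character in the first place.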