Question

Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent

biomiha ▴ 20

Hi,

I'm trying to use the spillover function from the flowCore package.

I'm not sure what's wrong but any set of compensation samples (i.e. unstained and single stained) I use it on, I always get the same error:

Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent

For example:

library(ggcyto) # Also attaches ggplot2, flowCore, ncdfFlow, RcppArmadillo, BH and flowWorkspace.
frames <- lapply(dir(system.file("extdata", "compdata", "data", package="flowCore"), full.names=TRUE),
                 read.FCS) # This is the example set from the package.
names(frames) <- sapply(frames, keyword, "SAMPLE ID")
fs <- as(frames, "flowSet")
spillover(fs, unstained = "NA", fsc = "FSC-Height", ssc = "SSC-Height", stain_match = "ordered")

> Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent

Thanks.

---------------

sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggcyto_1.4.1 flowWorkspace_3.24.4
[3] ncdfFlow_2.22.2 BH_1.65.0-1
[5] RcppArmadillo_0.8.300.1.0 ggplot2_2.2.1
[7] BiocInstaller_1.26.1 flowCore_1.42.3

loaded via a namespace (and not attached):
[1] pcaPP_1.9-72 Rcpp_0.12.14 bindr_0.1
[4] DEoptimR_1.0-8 compiler_3.4.2 RColorBrewer_1.1-2
[7] plyr_1.8.4 tools_3.4.2 zlibbioc_1.22.0
[10] tibble_1.3.4 gtable_0.2.0 lattice_0.20-35
[13] pkgconfig_2.0.1 rlang_0.1.4 graph_1.54.0
[16] Rgraphviz_2.20.0 yaml_2.1.16 parallel_3.4.2
[19] mvtnorm_1.0-6 hexbin_1.27.1 bindrcpp_0.2
[22] gridExtra_2.3 stringr_1.2.0 dplyr_0.7.4
[25] cluster_2.0.6 IDPmisc_1.1.17 stats4_3.4.2
[28] grid_3.4.2 glue_1.2.0 data.table_1.10.4-3
[31] robustbase_0.92-8 Biobase_2.36.2 R6_2.2.2
[34] rrcov_1.4-3 XML_3.98-1.9 latticeExtra_0.6-28
[37] magrittr_1.5 corpcor_1.6.9 scales_0.5.0
[40] matrixStats_0.52.2 BiocGenerics_0.22.1 MASS_7.3-47
[43] assertthat_0.2.0 colorspace_1.3-2 KernSmooth_2.23-15
[46] stringi_1.1.6 flowViz_1.40.0 lazyeval_0.2.1
[49] munsell_0.4.3

flowcore spillover • 13k views

ADD COMMENT • link updated 6.3 years ago by Jiang, Mike ★ 1.3k • written 6.4 years ago by biomiha ▴ 20

SamGG ▴ 350

Hi,

If I use your code, I get the same error. I think the right way to read a set of FCS files is the following:

fs <- read.flowSet(files = dir(system.file("extdata", "compdata", "data", package="flowCore"), full.names=TRUE))

Moreover, these example files do not seem to be intended for use with the spillover function.

HTH

ADD COMMENT • link 6.4 years ago SamGG ▴ 350

Thanks SamGG,

The thing is that I get the same error with other flowSets, irrespective of how I read them in. So far, I've got the same error with each and every set of FCS files I've tried. The one I used for my reprex was just the code from the flowCore vignette. Do you not get the error with your code?

Can you perhaps elaborate on why they couldn't be used with the spillover function, please? As far as I was able to determine, they're single-stain controls.

M

ADD REPLY • link 6.4 years ago biomiha ▴ 20

Here is the message I get once the set of FCS files is read with read.flowSet:

> library(flowCore)
> fs <- read.flowSet(files = dir(system.file("extdata", "compdata", "data", package="flowCore"), full.names=TRUE))
> spillover(fs, unstained = "NA", fsc = "FSC-Height", ssc = "SSC-Height", stain_match = "ordered")
Error: Baseline not in this set.

The spillover function aims to compute the compensation matrix that is applied when samples are analysed. This is usually done by a cytometry expert using commercial software. The FCS files I get from experimenters already have a compensation matrix encoded in the file header, so I use the compensate function to apply that matrix to the data. The matrix is usually stored under the SPILLOVER keyword of the description (aka header).

IMHO, you should read the FCS files as shown above and check whether a SPILLOVER matrix is present in your FCS files. If it is, it should be ready to pass to the compensate function.

fsApply(fs, keyword, "SPILLOVER")
# NULL answer = not found
# fsApply(fs, keyword, "$P1N")
# Here answer = FSC-H, ie the name of the first parameter

fsApply(fs, function(ff) grep("SPILL", names(keyword(ff)), ignore.case = TRUE))
# Alternative search

HTH

ADD REPLY • link 6.4 years ago SamGG ▴ 350

Hi,

I think you get the `Error: Baseline not in this set.` because the unstained control is not specified correctly. If you change that to the index of the unstained control, you'll get the dimnames error again. There is a bug in the spillover function that I'm not able to locate.

I know there are other ways of extracting the spillover / compensation matrix, but the whole point of doing this is that I can generate one myself in case the compensation on the instrument (or in FlowJo, FACSDiva, etc...) wasn't done or wasn't done properly. What I want to achieve is a reproducible FACS workflow that does not rely on the experimenter's expert eye because that is very difficult to write down or share or put on a server. We have noticed that sometimes even the same operator gets slightly different results on different days, which is less than ideal. I know this is an accepted quirk of flow cytometry but it doesn't necessarily need to be this way.

M

ADD REPLY • link 6.4 years ago biomiha ▴ 20

I was not sure that you really wanted to compute the compensation matrix. I looked into the code but I didn't understand the rationale behind it. I found an interesting explanation on Roederer's website that sounds easy, although I doubt the background noise is taken into account (but maybe there is no need). Roederer's explanation uses only single staining (aka unstained) whereas the flowCore code also uses stained samples. If you understand the rationale, let me know.

Cheers.

ADD REPLY • link 6.4 years ago SamGG ▴ 350

So, the reasoning is the same every time you run a FACS compensation panel. You take your controls (cells or beads) that will bind an antibody and you stain them with single colours to generate your single-stained samples. Then you run them on the machine and look at all of the detectors to see how much spillover of each single colour (not the same as unstained) you get in each of the detectors. The unstained sample is the baseline, where you've only got autofluorescence. Then you create a linear model for each detector to account for the spillover, which is why it's called a spillover matrix. The inverse of that matrix is the compensation matrix, which you apply to all cell signals to remove the contributions of the other colours from each specific detector.
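
To make that concrete, here is a minimal base R sketch on simulated data. It is only an illustration of the idea (baseline-subtracted medians, scaled per dye, then inverted); the channel names, the "true" spillover values and the simulate() helper are all made up for the example, and this is not meant to reproduce flowCore's internal code.

```r
# Toy illustration with simulated data; NOT flowCore's implementation.
set.seed(1)
channels <- c("FL1", "FL2", "FL3")

# Assumed "true" spillover: row = dye, column = detector.
true_spill <- matrix(c(1.00, 0.20, 0.02,
                       0.05, 1.00, 0.15,
                       0.00, 0.03, 1.00),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(channels, channels))
autofluor <- c(FL1 = 5, FL2 = 4, FL3 = 6)

# Hypothetical helper: simulate one control tube. `dye` is the stained
# channel, or NA for the unstained control.
simulate <- function(dye, n = 2000) {
  amount <- matrix(0, n, 3, dimnames = list(NULL, channels))
  if (!is.na(dye)) amount[, dye] <- rlnorm(n, meanlog = 6, sdlog = 0.3)
  signal <- amount %*% true_spill
  sweep(signal, 2, autofluor, "+") + matrix(rnorm(n * 3, sd = 0.5), n, 3)
}

unstained <- simulate(NA)
controls  <- lapply(setNames(channels, channels), simulate)

# Baseline = median autofluorescence per detector.
baseline <- apply(unstained, 2, median)

# One row per single-stained control: baseline-subtracted medians,
# scaled so that the dye's own detector equals 1.
est_spill <- t(sapply(channels, function(ch) {
  m <- apply(controls[[ch]], 2, median) - baseline
  m / m[ch]
}))

# The compensation matrix is the inverse of the spillover matrix.
comp_matrix <- solve(est_spill)

# Compensating the FL1 control should leave (almost) no signal in FL2/FL3.
compensated <- sweep(controls$FL1, 2, baseline) %*% comp_matrix
round(est_spill, 3)
round(apply(compensated, 2, median), 2)
```

With real data the same logic only works if each row really comes from the control stained with that dye, which is exactly where the sample-to-channel matching matters.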

Hope this helps - but there's still a bug in the code :)

ADD REPLY • link 6.4 years ago biomiha ▴ 20

Thanks a lot for your explanation. Previously I played around with the FCS files from the FlowJo compensation tutorial, but there is no unstained file, if I understand correctly. Do you know of a small dataset that includes all the files required to try the spillover function?

ADD REPLY • link 6.4 years ago SamGG ▴ 350

FlowRepository is a good source. I've been using this dataset (https://flowrepository.org/id/FR-FCM-ZZ36) to play with.

ADD REPLY • link 6.3 years ago biomiha ▴ 20

From a practical point of view, if I don't have an unstained sample, does using a positive and a negative population for each marker help in computing the spillover matrix? If so, do you know of a piece of code that does this?

ADD REPLY • link 6.3 years ago SamGG ▴ 350

score 2 · Accepted Answer · 2017-12-18

First of all, you should use the channel (the name column) rather than the marker (the desc column) to identify fsc/ssc reliably:

> fs[[1]]
flowFrame object '060909.001'
with 10000 cells and 7 observables:
     name       desc range minRange maxRange
$P1 FSC-H FSC-Height  1024        0     1023
$P2 SSC-H SSC-Height  1024        0     1023
$P3 FL1-H       <NA>  1024        1    10000
$P4 FL2-H       <NA>  1024        1    10000
$P5 FL3-H       <NA>  1024        1    10000
$P6 FL1-A       <NA>  1024        0     1023
$P7 FL4-H       <NA>  1024        1    10000
141 keywords are stored in the 'description' slot
> sampleNames(fs)
[1] "060909.001" "060909.002" "060909.003" "060909.004" "060909.005"

Secondly, you need to tell it which sample is the baseline (the unstained sample) through the 'unstained' argument. NA or NULL, as in your example code, does not help.

Thirdly, you need to make sure each sample is matched to its single-stained marker properly. Setting stain_match to 'ordered', as in your case, means the samples follow the order of the channels listed above. However, after excluding the one unstained sample your fs has only 4 samples but 5 channels (excluding fsc/ssc) to compute, which is why you see the error. (I've pushed a patch to the GitHub repo for a more descriptive error message.)
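
As a rough illustration of that mismatch, the base R sketch below first reproduces the same error message by assigning 5 row names to a 4-row matrix, and then does a pre-flight count before calling spillover. check_ordered_controls() is a hypothetical helper written for this example, not part of flowCore, and it assumes 'unstained' is given as a numeric index.

```r
# The base R error itself: 5 row names assigned to a matrix with 4 rows.
m <- matrix(0, nrow = 4, ncol = 5)
res <- tryCatch(
  { rownames(m) <- paste0("ch", 1:5); "no error" },
  error = function(e) conditionMessage(e)
)
res

# Hypothetical helper (not a flowCore function): with 'ordered' matching,
# the number of single-stained controls must equal the number of
# fluorescence channels. `unstained` is assumed to be a numeric index.
check_ordered_controls <- function(sample_names, channel_names, unstained,
                                   fsc = "FSC-H", ssc = "SSC-H") {
  stains   <- setdiff(channel_names, c(fsc, ssc))
  controls <- sample_names[-unstained]
  if (length(controls) != length(stains)) {
    stop(sprintf("%d single-stained controls but %d channels to compensate (%s)",
                 length(controls), length(stains),
                 paste(stains, collapse = ", ")),
         call. = FALSE)
  }
  setNames(stains, controls)  # control -> channel pairing implied by order
}

# Sample and channel names as printed above for the example flowSet.
samples  <- sprintf("060909.%03d", 1:5)
channels <- c("FSC-H", "SSC-H", "FL1-H", "FL2-H", "FL3-H", "FL1-A", "FL4-H")

# All 7 channels: 4 controls vs 5 stains, so the check fails early.
all_channels <- tryCatch(check_ordered_controls(samples, channels, unstained = 1),
                         error = function(e) conditionMessage(e))
all_channels

# Dropping the redundant FL1-A channel makes the counts agree.
pairing <- check_ordered_controls(samples, channels[-6], unstained = 1)
pairing
```

With a real flowSet the inputs would presumably be sampleNames(fs) and colnames(fs).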

So to get a successful run, you want something like this

> fs <- fs[, -6] # excluding the redundant channel
> spillover(fs, unstained = 1, fsc = "FSC-H", ssc = "SSC-H", stain_match = "ordered")
             FL1-H        FL2-H       FL3-H       FL4-H
FL1-H 1.0000000000 0.2420222776 0.032083706 0.001127816
FL2-H 0.0077220477 1.0000000000 0.140788232 0.002632689
FL3-H 0.0007590319 0.0009620459 0.003218614 1.000000000
FL4-H 0.0150806322 0.1755899032 1.000000000 0.229593860

That said, this result may not be correct; I am just giving you an example of proper API usage. It is up to you to ensure that the order of the single-stained samples is consistent with these stains (or to name the samples properly and use stain_match = "regexpr" to get an accurate one-to-one match between sample and stain).
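
To illustrate what a name-based one-to-one match could look like, here is a small base R sketch. The file names are invented for the example and match_controls_by_name() is a hypothetical helper; it only shows the idea of pairing each stain with exactly one sample by name, and flowCore's own "regexpr" mode may well behave differently in detail.

```r
# Hypothetical helper (not flowCore code): pair every stain with the one
# sample whose name contains the channel name, and fail loudly otherwise.
match_controls_by_name <- function(sample_names, stains) {
  hits <- lapply(stains, function(s) grep(s, sample_names, fixed = TRUE))
  names(hits) <- stains
  n <- lengths(hits)
  if (any(n != 1)) {
    stop("Need exactly one sample per stain; problem with: ",
         paste(stains[n != 1], collapse = ", "), call. = FALSE)
  }
  setNames(sample_names[unlist(hits)], stains)
}

# Invented sample names, deliberately not in channel order.
sample_names <- c("unstained.fcs", "FL2-H_PE.fcs", "FL1-H_FITC.fcs",
                  "FL4-H_APC.fcs", "FL3-H_PerCP.fcs")
stains <- c("FL1-H", "FL2-H", "FL3-H", "FL4-H")

mapping <- match_controls_by_name(sample_names, stains)
mapping

# A stain without a matching control is reported instead of silently
# being paired with the wrong tube.
missing_ctrl <- tryCatch(match_controls_by_name(sample_names[-3], stains),
                         error = function(e) conditionMessage(e))
missing_ctrl
```

Naming the controls this way removes the dependence on file order, which is what the 'ordered' mode relies on.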