DropletUtils barcodeRanks bug (?)
jmanning • 0
@jmanning-18917

Hi,

I'm running the barcodeRanks() function from DropletUtils (1.2.1) and noticing some inconsistencies in the output.

I run the command like this:

br.out <- barcodeRanks(counts(single_cell_experiment))

single_cell_experiment is an object parsed from the pbmc3k dataset, with 2700 barcodes.

From the docs, I expect br.out$rank, br.out$total and br.out$fitted to each be of length 2700, corresponding to the columns of single_cell_experiment. The vectors are the right length, but the labels of the output vectors don't match the input SCE (or each other):

> head(colData(single_cell_experiment))

DataFrame with 6 rows and 4 columns
                      Sample          Barcode barcodeRank barcodeTotal
                 <character>      <character>   <numeric>    <numeric>
AAACATACAACCAC-1  /test_data AAACATACAACCAC-1        2599           NA
AAACATTGAGCTAC-1  /test_data AAACATTGAGCTAC-1      2322.5         2149
AAACATTGATCAGC-1   test_data AAACATTGATCAGC-1         109         1566
AAACCGTGCTTCCG-1   test_data AAACCGTGCTTCCG-1         836           NA
AAACCGTGTATGCG-1  /test_data AAACCGTGTATGCG-1       966.5           NA
AAACGCACTGGTAC-1  /test_data AAACGCACTGGTAC-1       177.5         2211

> head(br.out$rank)

TTACTCGAACGTTG-1 AGAGGTCTACAGCT-1 GGCACGTGTGAGAA-1 GCGAAGGAGAGCTT-1 ACGAACTGGCTATG-1 GGGCCAACCTTGGA-1 
          1042.0             77.0            429.0            784.5           2555.5           1415.0 

> head(br.out$total)

CCAGTCTGCGGAGA-1 TTACTCGAACGTTG-1 AGAGGTCTACAGCT-1 GGCACGTGTGAGAA-1 GCGAAGGAGAGCTT-1 ACGAACTGGCTATG-1 
            2421             4903             3149             2639              981             2164 

When I try to match on the vector names, I find multiple entries for some single_cell_experiment barcodes and none for others.
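
A base-R sketch of one way labels can end up duplicated or missing like this: run-length encoding a sorted, named vector keeps one name per run of ties, and expanding the runs again repeats that name. This is only an illustration on made-up totals; it is not taken from the DropletUtils source, and the mislabelling seen above may have a different cause.

```r
# Toy totals with a tie between B and C (hypothetical values).
totals <- c(A = 5, B = 3, C = 3, D = 1)

# rle() keeps only the name of the last element in each run of equal values.
runs <- rle(totals)
names(runs$values)

# Expanding the runs with rep() repeats that single name per run,
# so "C" appears twice and "B" not at all.
expanded <- rep(runs$values, times = unname(runs$lengths))
names(expanded)

# Counting each original name shows the duplicates and the gaps.
table(factor(names(expanded), levels = names(totals)))
```

The values themselves survive the round trip; only the names are wrong.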

All this leads me to suspect that the vector labels in br.out are meaningless relics. If so, the labels should probably be unset to prevent confusion ;-).
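
One possible workaround, sketched below, is to ignore the names and treat the values positionally, on the assumption (taken from the documentation's description) that each vector is ordered like the columns of the input. The matrix `m`, its barcodes and the helper `relabel` are made up for illustration; the sketch also recomputes average ranks with base R's rank() as an independent check.

```r
# Hypothetical count matrix: 3 genes x 4 barcodes (not the pbmc3k data).
m <- matrix(c(2, 0, 3,
              1, 1, 1,
              0, 2, 1,
              0, 0, 1),
            nrow = 3,
            dimnames = list(paste0("gene", 1:3),
                            c("AAAC-1", "AAAG-1", "AAAT-1", "AACA-1")))

# Total counts per barcode; colSums() keeps the column names.
totals <- colSums(m)

# Average ranks with the largest total ranked first; rank() also keeps names.
ranks <- rank(-totals, ties.method = "average")

# Both vectors are labelled by, and ordered like, the columns of m.
stopifnot(identical(names(ranks), colnames(m)))

# For an output whose labels cannot be trusted, drop them and relabel
# positionally (assumes the values really are in column order).
relabel <- function(x, m) stats::setNames(unname(x), colnames(m))
```

Calling `relabel(br.out$rank, counts(single_cell_experiment))` would then give a vector named by the input columns, provided the column-order assumption holds.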

Thanks,

Jon

 

 

Aaron Lun ★ 28k
@alun

Your diagnosis is correct; the rle function does some weird things with the names. The next version of the package (i.e., in BioC 3.9) will remove the names from each vector and instead return a DataFrame whose row names are the column names of the input matrix. (The knee and inflection point estimates will be stored in the metadata, so any downstream code that uses them will need some changes.)
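
For downstream code that has to work on both sides of this change, a small accessor can hide the difference. This is a sketch only: `get_knee_inflection` is a hypothetical helper, and it assumes that the older output is a plain list exposing `knee` and `inflection` as elements, while the newer DataFrame keeps them under the same names in its metadata.

```r
# Hypothetical helper: return knee/inflection from either output style.
get_knee_inflection <- function(br.out) {
  if (methods::is(br.out, "DataFrame")) {
    # Newer style: a DataFrame with one row per barcode; the point
    # estimates are assumed to sit in the metadata (needs S4Vectors).
    md <- S4Vectors::metadata(br.out)
    list(knee = md$knee, inflection = md$inflection)
  } else {
    # Older style: assumed to be a plain list with these elements.
    list(knee = br.out$knee, inflection = br.out$inflection)
  }
}

# Usage with a stand-in for the older list output (made-up numbers).
old.style <- list(rank = c(1, 2), total = c(10, 5), knee = 8, inflection = 6)
get_knee_inflection(old.style)
```

Only the list branch is exercised here; the DataFrame branch depends on the metadata names matching the assumption above.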

Great, thanks.
