() followed by [] within an argument in singlecell OSCA workflow

Hi Community,

I've been working through the OSCA workflow for single cell Lun 416B cell line (http://bioconductor.org/books/3.14/OSCA.workflows/lun-416b-cell-line-smart-seq2.html).

I'm trying to understand the logic of what is happening in the argument, "col", in the plot below as I haven't come across parentheses followed by brackets used in this way and couldn't insert an operator in between that returned the same result.

The individual components make sense; c("black", "red") is specifying two colours. grepl() is a search through "phenotype" looking for the string "induced" and will return 1 or 2 representing T or F. But how are those two parts interacting in a 'programming' sense? Is there an operator that can be placed in between that might give me some insight?

plot(librarySizeFactors(sce.416b), sizeFactors(sce.416b), pch=16,
    xlab="Library size factors", ylab="Deconvolution factors", 
    col=c("black", "red")[grepl("induced", sce.416b$phenotype)+1])

jeroen.gilis ▴ 90

When looking at such code, I like to start from the center and work outwards (if that makes sense).

First we have the grepl statement, which searches each element of the vector sce.416b$phenotype for the pattern "induced". For each element, grepl returns TRUE or FALSE, i.e., a match or no match to "induced".
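To see this step on its own, here is a small sketch. The phenotype vector below is a made-up stand-in for sce.416b$phenotype (the values are invented for illustration), so the real data will differ:

```r
# Hypothetical stand-in for sce.416b$phenotype; these values are invented.
phenotype <- c("wild type", "wild type", "induced oncogene", "induced oncogene")

# grepl() tests each element for the pattern and returns one logical per element.
is_induced <- grepl("induced", phenotype)
print(is_induced)
```

The result has the same length as the input vector, with TRUE wherever the string contains "induced".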

Second, you need to know that R (like many other languages) allows TRUE and FALSE to be interpreted as 1 and 0. Try TRUE*5 or FALSE-1 in the console and you will see.
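A quick way to convince yourself of the coercion is something like the following. The variable names are just for illustration:

```r
# Logicals are coerced to numbers as soon as they take part in arithmetic.
a <- TRUE * 5                    # TRUE is treated as 1
b <- FALSE - 1                   # FALSE is treated as 0
n <- sum(c(TRUE, FALSE, TRUE))   # counts the TRUE values

print(c(a, b, n))
print(as.integer(c(FALSE, TRUE)))  # explicit version of the same coercion
```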

So, the +1 after grepl first coerces FALSE and TRUE to 0 and 1, and then shifts them to 1 and 2, respectively. The result of grepl("induced", sce.416b$phenotype)+1 is thus a vector of 1's and 2's. Let's say that vector is c(1,1,1,2,2,2).
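As a sketch of that step in isolation, assume grepl had returned the logical vector below (an invented example matching the c(1,1,1,2,2,2) case):

```r
# Assumed grepl() output, invented to match the example above.
matches <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# Adding 1 coerces the logicals to numbers: FALSE -> 0 + 1, TRUE -> 1 + 1.
idx <- matches + 1
print(idx)
```

The shift is needed because R vectors are indexed from 1, so 0 would not select any element.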

c("black", "red") is simply a vector of length 2. The brackets are just selecting, based on position, either the value at position 1 or the one at position 2. If inside the brackets we have c(1,1,1,2,2,2), then the end result will be c("black","black","black","red","red","red").
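On the question of whether an operator sits between the parentheses and the brackets: the bracket is itself the operator. In R, [ is a function, so it can be called in prefix form with backticks, which may make the interaction easier to see. A minimal sketch, with variable names chosen for illustration:

```r
two_cols <- c("black", "red")
idx <- c(1, 1, 1, 2, 2, 2)

# Usual form: index the vector by position; positions may repeat.
cols <- two_cols[idx]

# Same operation written as an ordinary function call.
cols_prefix <- `[`(two_cols, idx)

print(cols)
print(identical(cols, cols_prefix))
```

If the indexing trick feels opaque, ifelse(matches, "red", "black") on the logical vector should give the same colours and is arguably easier to read.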

As a final note, these concepts are not specific to Bioconductor, but rather general to R. So typically, questions like these are raised and answered on platforms like Stack Overflow.

