scRNA-seq: a question about the names of the columns in a Seurat object
Bogdan ▴ 620
Palo Alto, CA, USA

Dear all,

Could you please advise on the following? I am running the Seurat package on the dataset that was published in:

In the article, six datasets on bipolar cells are combined in a single matrix, and the columns are labelled:

Bipolar1_barcode1, ...., Bipolar1_barcodeXYZ,

Bipolar2_barcode1, ...., Bipolar2_barcodeXYZ,

Bipolar3_barcode1, ...., Bipolar3_barcodeXYZ,

Bipolar4_barcode1, ...., Bipolar4_barcodeXYZ,

Bipolar5_barcode1, ...., Bipolar5_barcodeXYZ,

Bipolar6_barcode1, ...., Bipolar6_barcodeXYZ,

Should I understand that, when we want to include multiple experiments in the same matrix for analysis with Seurat, we just need to label the columns according to the scheme ExperimentA_Barcode, ...., ExperimentX_Barcode?

Thanks a lot,

-- bogdan

scRNA-seq SEURAT • 1.1k views

For better or for worse, Seurat isn't a Bioconductor package, so this board is technically not the right/best place to get help on using it.

That having been said:

  1. You are likely seeing the <ExperimentN>_<barcodeY> column names because the same barcodes are reused across samples/experiments (where the barcode is the cell barcode from a 10x-like experiment, or perhaps the ID of a well in some other type of experiment). So, if you have count matrices from different experiments, their columns may be named with just <barcodeY>, in which case you will have to prefix each column name with something unique to its experiment.

  2. Simply concatenating count matrices from different experiments can be problematic due to batch effects.
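
To make point 1 concrete, here is a minimal sketch of the prefixing step, written in plain Python rather than R so that it does not depend on any Seurat call. Everything in it is an assumption for illustration: the function names, the toy barcodes and gene names, and the choice to fill genes missing from one experiment with zero counts. As far as I know, Seurat's CreateSeuratObject can then recover the experiment label from such column names through its names.field and names.delim arguments, but check the documentation for your version.

```python
def prefix_barcodes(barcodes, experiment, sep="_"):
    """Return column names of the form <experiment><sep><barcode>."""
    return [f"{experiment}{sep}{bc}" for bc in barcodes]


def experiment_of(column, sep="_"):
    """Recover the experiment label from a prefixed column name."""
    return column.split(sep, 1)[0]


def combine_experiments(experiments, sep="_"):
    """Combine per-experiment count matrices into one genes x cells matrix.

    `experiments` maps an experiment name to a dict with keys
    "genes", "barcodes" and "counts" (rows = genes, columns = cells).
    Genes absent from an experiment are filled with 0 (an assumption).
    """
    # Union of all genes, in first-seen order.
    genes, seen = [], set()
    for exp in experiments.values():
        for gene in exp["genes"]:
            if gene not in seen:
                seen.add(gene)
                genes.append(gene)

    columns = []
    rows = {gene: [] for gene in genes}
    for name, exp in experiments.items():
        n_cells = len(exp["barcodes"])
        columns.extend(prefix_barcodes(exp["barcodes"], name, sep))
        by_gene = dict(zip(exp["genes"], exp["counts"]))
        for gene in genes:
            rows[gene].extend(by_gene.get(gene, [0] * n_cells))

    # After prefixing, every column name must be unique.
    if len(set(columns)) != len(columns):
        raise ValueError("duplicate column names after prefixing")
    return genes, columns, [rows[gene] for gene in genes]


# Toy data: the barcode "AAAC" occurs in both experiments.
experiments = {
    "Bipolar1": {"genes": ["GeneA", "GeneB"],
                 "barcodes": ["AAAC", "AAAG"],
                 "counts": [[1, 0], [2, 3]]},
    "Bipolar2": {"genes": ["GeneA", "GeneC"],
                 "barcodes": ["AAAC", "TTTG"],
                 "counts": [[5, 0], [0, 7]]},
}
genes, columns, counts = combine_experiments(experiments)
print(columns)
print([experiment_of(c) for c in columns])
```

Without the prefix, the two "AAAC" columns would collide. Note that this only makes the column names unique; it does nothing about the batch effects raised in point 2.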

Thank you, Steve! Yes, I am using both pipelines: 1) Seurat and 2) the workflow based on simpleSingleCell.

Speaking of batch effects, may I add a question? I have noted two or three strategies:

a. a strategy where the samples from multiple experiments are concatenated into one large matrix (as I described above). For the batch correction, one may then apply the ComBat function from the sva package to the matrix.

b. another strategy is to use CCA (canonical correlation analysis), as recently published :

c. MNN-based correction, as presented at the link you've provided :

Would one strategy work better than the others? What would you advise? Thanks!

I would advise you to reference the relevant literature ;-)

The MNN paper includes a comparison against ComBat and shows the MNN method to be superior, and the Seurat preprint claims that its method is superior to MNN.

If it were me, I'd likely ignore ComBat and take my time with MNN, LIGER, and Seurat v3 to see how they compare to each other. Each has its own set of parameters that you should spend some time playing with to understand how they affect the results.

(Note that I've updated the original answer to add reference to LIGER as a 3rd approach to tackle dataset integration)

Thanks a lot, Steve! That is very helpful and very informative!
