Question: PCA outliers cleaning

Hi,

I have been trying to analyse an available scRNA-seq data set with the script its authors provided. As a beginner I did not find the script very intuitive, so I decided to combine it with the Seurat and SingleCellExperiment pipelines on my own and try to reach a similar final outcome.

The script is available on GitHub (https://github.com/saeyslab/brainimmuneatlas/blob/master/script_scRNAseq.R) for reference.

In my case, I converted the data.matrix into a Seurat object, visualized the QC metrics and did a first round of cleaning based on them, then calculated cell-cycle scores for a second round of cleaning. After that I converted the object to a SingleCellExperiment and used the isOutlier function to remove outliers (cells, genes and mitochondrial DNA %) (mad=4). Finally, I examined gene expression levels and applied a cutoff at 10^-3.5.

At this point I have two problems:

1) To generate the PCA and remove outliers, I used the runPCA function and detected the outliers, but I am a bit lost on how to delete them from my original sce object. Do I need to create a metadata matrix as in the original script, or is there an easier way?

2) For normalization, I used computeSumFactors to calculate the size factors, then normalized and extracted the expression matrix with the exprs function. If I now want to create a new Seurat object from this expression matrix, do I need to normalize again (with the NormalizeData function), or is that no longer necessary?

Hope it was not too much!

modified 10 days ago • written 11 days ago by albadrs.9310

Reading the book may give you a better background when starting scRNA-seq data analysis.

11 days ago by James W. MacDonald (51k), United States, wrote:

Edit: I missed that you meant SingleCellExperiment by SCE.

1) Generating principal components is generally used to reduce dimensionality, not to find and exclude outliers (what would an outlier be, in this context?). If you did decide you have an outlier, the SingleCellExperiment class acts just like a matrix when using [ for subsetting, so you can just drop columns you don't like.
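As a minimal sketch of that column subsetting, assuming a toy count matrix and a hypothetical logical vector is_outlier (one entry per cell, TRUE for cells to discard; neither comes from the original script), something like this should work:

```r
library(SingleCellExperiment)

# Toy data for illustration only: 20 genes x 10 cells of Poisson counts.
set.seed(1)
counts <- matrix(
  rpois(200, lambda = 5),
  nrow = 20, ncol = 10,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10))
)
sce <- SingleCellExperiment(assays = list(counts = counts))

# Hypothetical outlier calls: a logical vector with one entry per cell.
# In practice this would come from whatever outlier detection you ran.
is_outlier <- rep(FALSE, ncol(sce))
is_outlier[c(3, 7)] <- TRUE

# Columns are cells, so dropping outliers is ordinary matrix-style subsetting.
# Assays and per-cell metadata (colData) are subset together.
sce_clean <- sce[, !is_outlier]
dim(sce_clean)
```

No separate metadata matrix is needed for this step; the logical vector only has to be in the same order as the columns of the object being subset.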

2) Seurat isn't a Bioconductor package, and if you want to mix'n'match things like that, you are by definition deciding that you have the wherewithal to do so. So I guess you have to figure that out for yourself.