Question: Normalization and Quality Control for Multiple scRNAseq Data Sets
3 months ago, lmm278 wrote:


I am trying to analyze scRNAseq data from multiple 10X Genomics Chromium samples that I sequenced in different runs. My data come from injured tissue that contains many different cell types, and I'd like to compare relative gene expression both within and between cell types over time (I have one sample from each of 5 time points, all sequenced separately). I am new to working with large data sets, and I am trying to implement proper quality control and normalization procedures before clustering. I have a rough understanding of the potential issues with comparing data sets sequenced separately.

I am wondering: are batch effects and dropout the only issues I need to control for with data sets from different sequencing runs? If so, does anyone have suggestions on the best way to go about this (preferably in R)? Also, is it acceptable to normalize across runs using housekeeping gene expression, as is done in many qPCR analyses?

Thank you! (and sorry for the many questions)

Answer: Normalization and Quality Control for Multiple scRNAseq Data Sets
3 months ago, Steve Lianoglou wrote:

There is a great series of articles written by Aaron Lun, Davis McCarthy, and John Marioni that outlines various aspects of working with single-cell data in the Bioconductor-verse. I will link to both the "release" and "development" versions of these articles.

If I were you, I'd focus on the devel versions. These will soon become the "release" version of Bioconductor (in about a month or so, I think). Although there is a bit more work involved in getting a "devel" environment working, for your particular problem (and given how soon devel will be released), I think it will be well worth your time.

  1. release workflow
  2. devel workflow

Although all of these articles will be relevant to you as you come up to speed with analyzing single cell data, the "Correcting batch effects" one is of particular importance to your direct question.
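On the normalization part of your question, here is a minimal sketch of the simplest form of per-cell scaling (library-size normalization followed by a log transform), just to show the idea. It is written in plain Python for illustration only; in practice you would use the R/Bioconductor packages covered in the workflows, and real pipelines typically use more robust size-factor estimates than raw totals. The function name `library_size_normalize` and the toy counts are hypothetical.

```python
import math

def library_size_normalize(counts):
    """Scale each cell by its library size, then log-transform.

    counts: list of cells, each a list of raw UMI counts per gene.
    Returns log2(count / size_factor + 1) per gene per cell, where a cell's
    size factor is its total count divided by the mean total across cells.
    """
    totals = [sum(cell) for cell in counts]
    if any(t == 0 for t in totals):
        raise ValueError("filter out cells with zero total counts first")
    mean_total = sum(totals) / len(totals)
    size_factors = [t / mean_total for t in totals]
    return [
        [math.log2(c / sf + 1.0) for c in cell]
        for cell, sf in zip(counts, size_factors)
    ]

# Toy data (hypothetical): same composition, but cell_b was sequenced twice as deeply.
cell_a = [10, 0, 5, 85]
cell_b = [20, 0, 10, 170]
normalized = library_size_normalize([cell_a, cell_b])
```

After scaling, the two toy cells have matching values, because they differ only in sequencing depth. Note that this removes depth differences between cells only; it does not by itself correct batch effects between separately sequenced samples.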

Good luck!



