Question

sc-RNA data analysis using scater

Bioinformatician_R ▴ 20

@hrishi27n-11821

United States

Hello All,

I am trying to analyze data from a single-cell RNA sequencing experiment, and I am considering the scater package for QC and normalization. There are a few things I would like to know before I start analyzing this dataset. All help and suggestions are much appreciated. This is my first attempt at analyzing scRNA-seq data, so I apologize in advance if my questions are confusing.

Questions:

1) The sequencing lab is using an unpublished protocol, and they have provided the read counts and the spike-ins as separate files. Do I need to combine these two files? I am considering this for the "feature_controls" option of the calculateQCMetrics method.

2) After the initial QC, I see that the total counts for almost all of my wells (cells) are above 50k, and the number of genes detected is above 10k. I am removing genes that have zero expression, and I am also filtering out genes with very low (but nonzero) average expression across all cells, using the mean of counts across all cells. Do I need to do any other filtering of either cells or genes?
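For concreteness, the gene filtering described above might look roughly like this in base R. This is only a sketch: the toy matrix `counts` and the cutoff `min_mean` are made-up placeholders, not values from the actual dataset.

```r
# Toy count matrix: rows are genes, columns are cells (hypothetical values).
counts <- matrix(
  c( 0,  0,  0,  0,
     1,  0,  0,  0,
    20, 35, 15, 40,
     8,  0, 12,  6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:4), paste0("cell", 1:4))
)

# Hypothetical cutoff on the mean count per gene; pick one suited to the data.
min_mean <- 1

# Mean count of each gene across all cells.
ave_counts <- rowMeans(counts)

# Genes with zero counts in every cell.
never_expressed <- rowSums(counts) == 0

# Keep genes that are expressed somewhere and reach the mean-count cutoff.
keep <- !never_expressed & ave_counts >= min_mean
filtered <- counts[keep, , drop = FALSE]
```

With these toy numbers, Gene1 is dropped for having no counts at all and Gene2 for its low mean, so `filtered` retains Gene3 and Gene4.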

scater scrna • 1.8k views

updated 8.4 years ago by Aaron Lun ★ 28k • written 8.4 years ago by Bioinformatician_R ▴ 20

score 1 · Answer 1 · 2017-02-02

Aaron Lun ★ 28k

@alun

The city by the bay

For your first question: does the "read counts" file already contain counts for the spike-in transcripts? If yes, then that's all you need to make a SCESet object in scater; just supply the count matrix as countData= in the constructor. Otherwise, you'll first have to rbind the matrix of gene counts with that of the spike-in counts. Note that you don't need to know the concentrations of the spike-ins to use most scater functions.
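A minimal base R sketch of the rbind step, assuming both files have been read into matrices with genes (or spike-ins) as rows and cells as columns. The toy matrices and all object names here are hypothetical, and the spike-in rows are assumed to carry the usual "ERCC-" prefix.

```r
# Toy gene-count matrix: rows are genes, columns are cells (hypothetical names).
gene_counts <- matrix(
  c(10, 0, 3,
     5, 2, 0,
     0, 0, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("cell1", "cell2", "cell3"))
)

# Toy spike-in matrix from the separate file, with columns in a different order.
spike_counts <- matrix(
  c(100, 120, 90,
     40,  35, 50),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ERCC-00002", "ERCC-00003"), c("cell3", "cell1", "cell2"))
)

# Both files must describe the same set of cells before they can be stacked.
stopifnot(setequal(colnames(gene_counts), colnames(spike_counts)))

# Reorder the spike-in columns to match the gene matrix, then stack the rows.
combined <- rbind(gene_counts,
                  spike_counts[, colnames(gene_counts), drop = FALSE])

# Logical vector marking the spike-in rows by their "ERCC-" prefix.
is_spike <- grepl("^ERCC-", rownames(combined))

# `combined` is the matrix that would be supplied as countData=, and
# `is_spike` identifies the rows to use as spike-in feature controls.
```

Matching the columns by name rather than by position guards against the two files listing the cells in different orders, since rbind itself does not check this for matrices.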

As for your second question, have a look at https://www.bioconductor.org/help/workflows/simpleSingleCell/.

8.4 years ago by Aaron Lun ★ 28k

Aaron,

Thanks for the reply. My "read count" file doesn't include the ERCCs. A grep for "^ERCC-" doesn't return anything; however, the ERCCs are provided in a separate file. I guess I will have to rbind the spike-in data onto the gene count file.

8.4 years ago by Bioinformatician_R ▴ 20