Question: Regarding scRNA-seq vs bulk RNA-seq
naman.sep wrote 15 months ago:


I have scRNA-seq and bulk RNA-seq data from the same cells/tissues. I want to know why certain transcripts that show up in single cells do not show up in the bulk RNA-seq data. Is this normal? I would assume that all transcripts detected in single cells should be represented in the bulk data, while the reverse would not hold because many more transcripts are detected in bulk data. So it was strange to see that the single-cell RNA-seq data had a few transcripts that never showed up in the bulk RNA-seq data. Of course, I would expect single cells to differ from one another in the number of transcripts detected, but why do we see this variation between single-cell and bulk data? Is this technical noise that we should ignore?



Aaron Lun (Cambridge, United Kingdom) wrote 15 months ago:

Because they involve different protocols? Bulk RNA-seq tends to use (ribosome-depleted) total RNA protocols nowadays, while most single-cell RNA-seq uses poly-A'd approaches. I can imagine that this would result in different biases and preferences for particular transcripts.

There are also considerations with cell dissociation and size. For example, if a tissue contains some fragile cell types, these would lyse and not show up in the single-cell data. In comparison, the fragile cell types would still be present in the bulk data where no dissociation is required, only lysis of the entire tissue. The resulting bulk-only transcripts would compete with and suppress the coverage of transcripts unique to other cell types, resulting in counts that are only observed in single-cell data. A similar effect occurs with large and small cells in the same bulk population, where transcripts unique to small cells get suppressed in bulk data.
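To put rough numbers on the large-vs-small cell effect, here is a minimal sketch. All values are hypothetical (the gene name `geneX`, the transcript totals, the 49:1 cell mix and the read depth are assumptions for illustration, not from the poster's data), and it assumes reads are drawn in proportion to transcript abundance.

```python
def library_fraction(cells, gene):
    """Fraction of all transcripts in a pooled library that come from `gene`.

    Each cell is a dict with 'total' (transcripts in that cell) and
    'copies' (gene name -> copies of that gene in that cell).
    """
    total = sum(c["total"] for c in cells)
    gene_copies = sum(c["copies"].get(gene, 0) for c in cells)
    return gene_copies / total

# Hypothetical cells: one small cell expressing geneX, 49 large cells without it.
small_cell = {"total": 50_000, "copies": {"geneX": 5}}
large_cell = {"total": 500_000, "copies": {}}

single = [small_cell]
pool = [small_cell] + [large_cell] * 49

# Same sequencing depth for the single-cell library and the 50-cell library.
reads = 1_000_000
expected_single = reads * library_fraction(single, "geneX")
expected_pool = reads * library_fraction(pool, "geneX")

print(f"expected geneX reads, single small cell: {expected_single:.1f}")
print(f"expected geneX reads, 50-cell pool:      {expected_pool:.2f}")
```

Under these assumptions the gene is expected to get about 100 reads in the single-cell library but well under one read in the pooled library, so seeing it only in the single cell would not be surprising.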

Finally, there is always sampling noise, which means that transcripts for lowly expressed genes may be sampled in a few cells on a plate but not in the bulk sample. You would have to have equal total sequencing depth between the bulk sample and all single-cell samples for the counts to be fully comparable. Obviously you will miss transcripts if the bulk sample is not sequenced to the same depth.
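The depth argument can be sketched with a simple calculation. This assumes each read is an independent draw and that the gene makes up the same fraction of transcripts in every library, which is a simplification; the fraction, read depths and number of cells below are made-up values.

```python
import math

def detection_probability(fraction, n_reads):
    """P(at least one read) for a gene that makes up `fraction` of transcripts.

    Computes 1 - (1 - fraction) ** n_reads; log1p/expm1 keep precision
    when `fraction` is very small.
    """
    return -math.expm1(n_reads * math.log1p(-fraction))

# Hypothetical numbers for illustration only.
fraction = 1e-7             # gene's share of all transcripts
bulk_reads = 2_000_000      # depth of the one bulk library
cells = 96                  # single cells on a plate
reads_per_cell = 1_000_000  # depth of each single-cell library

p_bulk = detection_probability(fraction, bulk_reads)
# Seeing the gene in at least one cell depends on the total single-cell depth.
p_any_cell = detection_probability(fraction, cells * reads_per_cell)

print(f"P(detected in bulk)              = {p_bulk:.3f}")
print(f"P(detected in at least one cell) = {p_any_cell:.5f}")
```

With these numbers the combined single-cell libraries are sequenced 48 times deeper than the bulk library, so the gene is almost certain to appear in some cell while it is missed in bulk most of the time.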

P.S. This question is more suited for a general forum like SEQAnswers; it doesn't seem to involve any Bioconductor packages.


Thank you, Aaron Lun. Maybe I did not explain this clearly. The bulk and single-cell RNA-seq were performed in exactly the same way, using the same Smart-seq2 protocol. We sorted single cells and 50 cells, so this is essentially Smart-seq2 of a single cell vs. Smart-seq2 of 50 cells (bulk). That's why I was wondering whether this variation is due to technical noise or something else.


Thanks and Regards

written 15 months ago by naman.sep

See the edits to my answer.

written 15 months ago by Aaron Lun

Thank you for the reply. I was not aware that we are only supposed to ask questions related to Bioconductor packages.

written 15 months ago by naman.sep
galib36 (United Kingdom) wrote 15 months ago:

Are these lowly expressed genes? Because a fixed number of reads is shared across all the mRNAs, the highly expressed genes may be taking the majority of the reads, leaving the lowly expressed genes with no reads at all.



Thank you, galib, for the reply. But how do we explain lowly expressed genes that are present in single cells and not in the bulk data? My problem is that there are a few genes that show up in single cells but not in the bulk data. I do understand your point, though. Logically, genes detected in single cells should all be present in the bulk (50 cells in my case) data, right?

written 15 months ago by naman.sep

The last paragraph of Aaron Lun's modified reply answers the question. In the bulk population, a few very highly expressed genes dominate and take up most of the reads. That dominance would be weaker in some single cells, where those genes are only moderately expressed, so the lowly expressed genes get a share of the reads and show up. One way to check is to look at the expression distribution or variance of the highly and lowly expressed genes.
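A rough sketch of that check, using only the Python standard library. The count tables are toy values invented for illustration (gene names and counts are hypothetical), and the helper names `top_share` and `per_gene_stats` are mine, not from any package. All three toy libraries have the same total, so raw counts are compared directly; real data would need normalizing first.

```python
from statistics import mean, pvariance

def top_share(counts, k=1):
    """Share of a library's reads taken by its k most abundant genes."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:k]
    return sum(top) / total

def per_gene_stats(libraries):
    """(mean, population variance) of each gene's count across libraries."""
    genes = sorted({g for lib in libraries for g in lib})
    stats = {}
    for g in genes:
        values = [lib.get(g, 0) for lib in libraries]
        stats[g] = (mean(values), pvariance(values))
    return stats

# Toy count tables (hypothetical), 10,000 reads per library.
bulk = {"geneA": 9000, "geneB": 990, "geneC": 10, "geneD": 0}
cells = [
    {"geneA": 6000, "geneB": 3900, "geneC": 95, "geneD": 5},
    {"geneA": 9500, "geneB": 480, "geneC": 20, "geneD": 0},
]

print("top-gene share, bulk:", top_share(bulk))
for i, cell in enumerate(cells, start=1):
    print(f"top-gene share, cell {i}:", top_share(cell))

for gene, (m, v) in per_gene_stats(cells).items():
    print(f"{gene}: mean={m}, variance={v}")
```

In this toy example the top gene takes a smaller share of the reads in the first cell than in bulk, and that is the cell where the lowly expressed `geneD` picks up reads.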




written 15 months ago by galib36