Question: Using DESeq2 for T-cell receptor clonotypes
12 weeks ago, jirkanov wrote:


I need to do a differential expression analysis of T-cell receptor (TCR) clonotypes. Simplified, TCRs are specialized proteins of the immune system that can recognize a wide variety of other proteins. To support this recognition ability, the gene encoding the TCR has variable regions that differ between T-cells; T-cells that differ in these regions are called clonotypes.

I have RNA-Seq data from a 300 bp paired-end MiSeq run: four different sample types with three biological replicates each, 12 samples in total. Every read contains the variable region of the TCR and a UMI barcode at the 5' end. Running the standard pipeline for TCR analysis (MIGEC), I got a count matrix where columns are samples and rows are clonotypes. In my opinion, this is very similar to a normal RNA-Seq count matrix where rows are genes.

Unfortunately, the sequencing didn't go very well, so there are large differences in depth between samples. In addition, a few dominant clonotypes are highly abundant across all samples, while other clonotypes are very rare, with zero counts in almost all samples. Overall, I have 615 clonotypes. To get rid of those "zero" clonotypes, I applied a standard rowSums threshold: dds[rowSums(counts(dds)) >= 10, ]

but only 60 clonotypes were left! With a threshold of 5, 134 clonotypes were left.
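
To make the filtering step concrete, here is a minimal base R sketch on a small made-up count matrix (not the data from this thread). The toy counts, the clonotype names, and the alternative rule "a count of at least 5 in at least 3 samples" are illustrative assumptions only; on a real DESeqDataSet the matrix would come from counts(dds).

```r
# Toy count matrix: 6 hypothetical clonotypes x 12 samples (made-up numbers)
set.seed(1)
counts_mat <- rbind(
  matrix(rpois(2 * 12, lambda = 50), nrow = 2),  # dominant clonotypes
  matrix(rpois(2 * 12, lambda = 1),  nrow = 2),  # rare clonotypes
  matrix(0L, nrow = 2, ncol = 12)                # clonotypes never observed
)
rownames(counts_mat) <- paste0("clonotype", 1:6)

# Filter used in the post: total count across all samples >= 10
keep_total <- rowSums(counts_mat) >= 10

# Alternative (assumed thresholds): count >= 5 in at least 3 samples
keep_n <- rowSums(counts_mat >= 5) >= 3

# Compare what each rule keeps
print(data.frame(total = rowSums(counts_mat), keep_total, keep_n))

# Subset the matrix; drop = FALSE keeps it a matrix even if one row remains
filtered <- counts_mat[keep_total, , drop = FALSE]
```

Comparing the two logical vectors side by side shows which clonotypes a given threshold removes before committing to it.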

My question is whether this type of data is suitable for analysis with DESeq2.

To see my existing results, you can download RMarkdown HTML report:

Thank you in advance for any help!

12 weeks ago, Michael Love wrote:

I think the dispersion shrinkage can be useful even with, e.g., 100 rows. The depth differences alone are not a problem, unless depth is strongly confounded with condition. Can you plot sizeFactors(dds) over dds$condition? With so few genes, you may want to use fitType="mean".
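
A minimal sketch of these two suggestions, assuming a DESeqDataSet with a condition column. Since the actual data are not shown in this thread, it uses a simulated dataset from makeExampleDESeqDataSet() (100 rows, 12 samples, and two conditions rather than four); the object names sf and res are arbitrary.

```r
library(DESeq2)

# Simulated stand-in for the real data: 100 rows, 12 samples, design ~ condition
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 100, m = 12)

# Estimate size factors and plot them by condition to check whether
# sequencing depth is confounded with condition
dds <- estimateSizeFactors(dds)
sf <- data.frame(size_factor = sizeFactors(dds), condition = dds$condition)
stripchart(size_factor ~ condition, data = sf, vertical = TRUE,
           method = "jitter", pch = 16, ylab = "size factor")

# Run the standard workflow, using the mean of the gene-wise dispersion
# estimates as the fitted trend instead of the default parametric fit
dds <- DESeq(dds, fitType = "mean")
res <- results(dds)
summary(res)
```

If the size factors for one condition sit clearly apart from the others, depth and condition are hard to separate; a similar spread in every group is the reassuring case.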


Thanks for the quick answer. Here it is:

Reply written 12 weeks ago by jirkanov (modified 12 weeks ago)

That looks fine. The overall range really isn’t so bad.

Reply written 12 weeks ago by Michael Love

OK. And should I use fitType="mean"?

Reply written 12 weeks ago by jirkanov

That's what I recommended above.

Reply written 12 weeks ago by Michael Love

Sure, I was just a little confused by the wording "you may want" :-) Many thanks, Michael!

And by the way, thumbs up for your talks at CSAMA2018, they were fantastic :-) Unfortunately, I didn't have these data at the time, so I couldn't ask you in person...

Reply written 12 weeks ago by jirkanov