Question: Using DESeq2 for T-cell receptor clonotypes
8 months ago by
jirkanov0
jirkanov0 wrote:

Hello,

I need to do a differential expression analysis of T-cell receptor (TCR) clonotypes. Put simply, TCRs are specialized proteins of the immune system that can recognize a wide variety of other proteins. To support this recognition ability, the gene encoding the TCR has variable regions that differ between T-cells; T-cells with differing variable regions are called clonotypes.

I have RNA-Seq data from a 300 bp paired-end MiSeq run: four different sample groups with three biological replicates each, 12 samples in total. Every read contains the TCR variable region and a UMI barcode at the 5' end. Using the standard pipeline for TCR analysis (MIGEC), I got a count matrix where columns are samples and rows are clonotypes. In my opinion, this is very similar to a normal RNA-Seq count matrix where rows are genes.

Unfortunately, the sequencing didn't go very well, so there are large differences in depth between samples. In addition, a few dominant clonotypes are highly abundant across all samples, while others are very rare, with zero counts in almost all samples. Overall, I have 615 clonotypes. To get rid of the "zero" clonotypes, I applied the standard rowSums threshold: dds[rowSums(counts(dds)) >= 10, ]

but only 60 clonotypes were left! With a threshold of 5, 134 clonotypes were left.
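
For readers who want to see how the number of retained clonotypes depends on the threshold, here is a minimal sketch in base R. The matrix `count_mat` is simulated and purely hypothetical (it is not the poster's data), so the numbers it produces will not match the 60 and 134 reported above. With a real DESeqDataSet, `counts(dds)` would take the place of `count_mat`.

```r
set.seed(42)

# Hypothetical stand-in for the clonotype count matrix:
# 615 clonotypes (rows) x 12 samples (columns), mostly sparse counts.
count_mat <- matrix(rnbinom(615 * 12, mu = 0.5, size = 0.1), nrow = 615)

# Total count per clonotype across all samples.
# With a DESeqDataSet this would be: totals <- rowSums(counts(dds))
totals <- rowSums(count_mat)

# Number of clonotypes retained at each candidate threshold.
thresholds <- c(5, 10)
n_kept <- sapply(thresholds, function(th) sum(totals >= th))
names(n_kept) <- paste0("rowSums>=", thresholds)
print(n_kept)

# Apply the filter from the post (total count of at least 10).
keep <- totals >= 10
filtered <- count_mat[keep, , drop = FALSE]

# A different criterion, not the one used in the post: require a
# count of at least 5 in at least 3 samples.
keep_alt <- rowSums(count_mat >= 5) >= 3
```

Comparing `sum(keep)` and `sum(keep_alt)` shows how strongly the choice of filter affects the number of rows left for dispersion estimation.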

My question is whether this type of data is suitable for analysis with DESeq2.

To see my existing results, you can download the RMarkdown HTML report: https://owncloud.cesnet.cz/index.php/s/UtWukFacNR6kD3Y

Thank you in advance for any help!

modified 8 months ago by Michael Love23k • written 8 months ago by jirkanov0
Answer: Using DESeq2 for T-cell receptor clonotypes
8 months ago by
Michael Love23k
United States
Michael Love23k wrote:

I think the dispersion shrinkage can be useful even with, e.g., 100 rows. The depth differences are not a problem on their own, unless they are strongly confounded with condition. Can you plot sizeFactors(dds) over dds$condition? With so few genes you may want to use fitType="mean".
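
The two suggestions above could be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's actual analysis: the counts are simulated, and the four condition labels ("A" to "D"), the sample names, and the design `~ condition` are hypothetical stand-ins for the real experiment.

```r
library(DESeq2)

set.seed(1)

# Hypothetical layout: 4 conditions x 3 replicates = 12 samples.
condition <- factor(rep(c("A", "B", "C", "D"), each = 3))
n_clonotypes <- 100

# Simulated negative binomial counts, purely illustrative.
count_mat <- matrix(
  rnbinom(n_clonotypes * 12, mu = 50, size = 2),
  nrow = n_clonotypes,
  dimnames = list(paste0("clonotype", seq_len(n_clonotypes)),
                  paste0("sample", 1:12))
)

dds <- DESeqDataSetFromMatrix(
  countData = count_mat,
  colData = data.frame(condition = condition,
                       row.names = colnames(count_mat)),
  design = ~ condition
)

# Size factors per sample, plotted by condition to check whether
# sequencing depth is confounded with the experimental groups.
dds <- estimateSizeFactors(dds)
sf <- data.frame(sizeFactor = sizeFactors(dds),
                 condition = dds$condition)
stripchart(sizeFactor ~ condition, data = sf, vertical = TRUE,
           pch = 19, xlab = "condition", ylab = "size factor")

# With few rows, use the mean of the gene-wise dispersion estimates
# instead of fitting a parametric dispersion trend.
dds <- DESeq(dds, fitType = "mean")
res <- results(dds)
```

If the size factors for one condition sit clearly apart from the others in the plot, depth is confounded with condition and the results deserve extra caution.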

Thanks for the quick answer. Here it is:

[image: sizeFactors(dds) plotted over dds$condition]

modified 8 months ago • written 8 months ago by jirkanov0

That looks fine. The overall range really isn’t so bad.

OK. And should I use fitType="mean"?

That's what I recommended above.

Sure, I was just a little bit confused by the wording "you may want" :-) Many thanks Michael!

And by the way, thumbs up for your talks at CSAMA2018, they were fantastic :-) Unfortunately, at that time, I didn't have those data to ask you personally...