Combine RNAseq datasets vst + Combat
User34591 • 0
@user34591-8699
Last seen 2.9 years ago
France

Hi,
I have a homemade RNA-seq dataset, and I would like to compare the expression of some genes to TCGA samples (public data). I am not talking about differential analysis here, but rather descriptive analysis.
What I would like to do is first VST-transform all the data together, then apply ComBat to the output.
Is this a sound way to perform this kind of analysis?

Thank you

combat vst deseq2

My goal is to describe my dataset relative to the TCGA data: do my samples have the same expression levels as those from TCGA for a set of genes (boxplots, and PCA/clustering if possible)? The absolute expression level is not really important; I am more interested in the trend.


I moved your post to a comment, because you had posted it as an "Answer" to your original question.

You can see what the ComBat authors say, but if batch is perfectly confounded with the comparison of interest (here, all in-house samples in one batch and all TCGA samples in another), I don't think batch-correction software can help at all.

You can try removing GC dependence trends using Bioconductor software like cqn and EDASeq (you would provide these tools with the counts, not the VST values).
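
As a rough illustration of what "removing GC dependence" means, here is a minimal Python sketch on made-up numbers. It is not the algorithm used by cqn or EDASeq (those packages use more flexible fits and work on counts); it assumes a simple straight-line dependence of log expression on GC content within each sample. The function names and the toy data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def remove_gc_trend(log_expr, gc):
    """Subtract one sample's fitted GC slope, centred on the mean GC
    so that the sample's average level is unchanged."""
    _, slope = fit_line(gc, log_expr)
    mean_gc = sum(gc) / len(gc)
    return [y - slope * (g - mean_gc) for y, g in zip(log_expr, gc)]


# Toy data (assumed): 6 genes with the same true log expression in both samples.
gc = [0.35, 0.40, 0.45, 0.50, 0.55, 0.60]
true_expr = [5.0, 7.0, 6.0, 8.0, 5.5, 7.5]
mean_gc = sum(gc) / len(gc)

# Two protocols distort the same biology with different GC slopes.
sample_a = [e + 2.0 * (g - mean_gc) for e, g in zip(true_expr, gc)]
sample_b = [e - 3.0 * (g - mean_gc) for e, g in zip(true_expr, gc)]

corrected_a = remove_gc_trend(sample_a, gc)
corrected_b = remove_gc_trend(sample_b, gc)

max_diff_before = max(abs(a - b) for a, b in zip(sample_a, sample_b))
max_diff_after = max(abs(a - b) for a, b in zip(corrected_a, corrected_b))
print(max_diff_before, max_diff_after)
```

In this toy case the two samples disagree before correction and agree afterwards, because the only technical difference between them was the GC slope. A per-sample linear fit like this would also flatten any genuine GC-correlated biology, which is one reason to prefer the dedicated packages on real data.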


I am not sure I understand: how will GC correction correct for protocol and batch effects?


Some of the technical differences in counts across batches can be removed by modeling the dependence of counts on the GC content of features (and on their length as well), as performed by the two software packages I mentioned (see their citations for details of the methods). But I don't know of any method that claims to be able to remove all protocol and/or batch effects when those are perfectly confounded with the comparison of interest.
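
To make the confounding problem concrete, here is a small Python sketch on one gene with made-up log-scale values. It uses per-batch mean centering as a deliberately simplified stand-in for batch correction; it is not ComBat's empirical Bayes model. The group and batch labels, the values, and the function names are all hypothetical. The toy values assume a true in-house vs TCGA difference of +1.0 and a batch effect of +2.0 for the second lab.

```python
def mean(values):
    return sum(values) / len(values)


def center_by_batch(values, batches):
    """Subtract each batch's mean and add back the grand mean."""
    grand = mean(values)
    batch_means = {
        b: mean([v for v, vb in zip(values, batches) if vb == b])
        for b in set(batches)
    }
    return [v - batch_means[b] + grand for v, b in zip(values, batches)]


def group_difference(values, groups, first, second):
    """Mean of group `first` minus mean of group `second`."""
    in_first = [v for v, g in zip(values, groups) if g == first]
    in_second = [v for v, g in zip(values, groups) if g == second]
    return mean(in_first) - mean(in_second)


groups = ["inhouse", "inhouse", "tcga", "tcga"]

# Balanced design: each group was sequenced in both labs.
balanced_batches = ["lab1", "lab2", "lab1", "lab2"]
balanced_values = [6.0, 8.0, 5.0, 7.0]

# Perfectly confounded design: group and lab are the same variable.
confounded_batches = ["lab1", "lab1", "lab2", "lab2"]
confounded_values = [6.0, 6.0, 7.0, 7.0]

balanced_diff = group_difference(
    center_by_batch(balanced_values, balanced_batches), groups, "inhouse", "tcga"
)
confounded_diff = group_difference(
    center_by_batch(confounded_values, confounded_batches), groups, "inhouse", "tcga"
)
print(balanced_diff, confounded_diff)
```

With the balanced design the correction recovers the assumed +1.0 difference. With the confounded design the corrected difference is zero: removing the batch effect also removes the biological difference, because the data alone cannot tell the two apart.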


I see your point, thank you! I know this is really tricky. I will try what you have suggested. Again, thank you for your help!

@mikelove
Last seen 21 hours ago
United States

You can use the vst() function to quickly variance-stabilize a large set of samples. Whether it makes sense to do different things downstream of that depends on the data and what you plan to do. You haven't really said enough about what you are going to do for the authors of ComBat to say whether it makes sense or not.