Question

Combine RNAseq datasets vst + Combat

User34591 • 0

@user34591-8699

Last seen 7.0 years ago

France

Hi,
I have an in-house RNA-seq dataset, and I would like to compare the expression of some genes with TCGA samples (public data). I am not talking about differential analysis here, but rather descriptive analysis.
What I would like to do is first apply the VST to all the data together, then run ComBat on the output.
Is this a sound way to perform this kind of analysis?

Thank you

combat vst deseq2 • 2.4k views

ADD COMMENT • link 9.1 years ago User34591 • 0

Many thanks for your reply.

My goal is to describe my dataset relative to the TCGA data: do my samples have the same expression levels as the TCGA samples for a set of genes (boxplots, and PCA/clustering if possible)? The absolute expression level is not really important; I am more interested in the trend.

ADD REPLY • link 9.1 years ago User34591 • 0

I moved your post to a comment, because you had posted it as an "Answer" to your original question.

You can see what the ComBat authors say, but if the data is perfectly confounded, I don't think these batch-effect removal tools can help at all.

You can try removing GC dependence trends using Bioconductor software like cqn and EDASeq (you would provide these tools with the counts, not the VST values).
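On the confounding point above, it can help to check the sample sheet before trying any batch correction. The following is a minimal sketch in plain Python, not something taken from ComBat or any Bioconductor package; the function names and the sample labels are hypothetical. It treats a design as perfectly confounded when there are several comparison groups but no batch contains more than one of them.

```python
from collections import defaultdict


def groups_per_batch(batches, groups):
    """Map each batch label to the set of comparison groups seen in it."""
    if len(batches) != len(groups):
        raise ValueError("batches and groups must have one entry per sample")
    seen = defaultdict(set)
    for batch, group in zip(batches, groups):
        seen[batch].add(group)
    return dict(seen)


def is_perfectly_confounded(batches, groups):
    """True when there are several groups but no batch contains more than one."""
    per_batch = groups_per_batch(batches, groups)
    several_groups = len(set(groups)) > 1
    return several_groups and all(len(g) == 1 for g in per_batch.values())


# Hypothetical sample sheet: every in-house sample in one batch, every TCGA
# sample in another, so batch and group carry the same information.
batches = ["lab", "lab", "lab", "tcga", "tcga", "tcga"]
groups = ["in_house", "in_house", "in_house", "TCGA", "TCGA", "TCGA"]
print(is_perfectly_confounded(batches, groups))  # True

# Hypothetical balanced design: each batch contains both groups.
balanced_groups = ["case", "control", "case", "control", "case", "control"]
print(is_perfectly_confounded(batches, balanced_groups))  # False
```

When the first case applies, any difference between the groups could equally be a difference between the batches, which is why a correction tool has nothing to separate them with.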

ADD REPLY • link 9.1 years ago Michael Love 43k

I am not sure I understand how GC correction would correct for protocol and batch effects.

ADD REPLY • link 9.1 years ago User34591 • 0

Some of the technical differences in counts across batches can be removed by modeling the dependence of counts on the GC content of features (and on their length as well), as performed by the two software packages I mentioned (see their citations for details of the methods). But I don't know of any method that claims to be able to remove all protocol and/or batch effects when those are perfectly confounded with the comparison of interest.
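To make the idea of "modeling the dependence of counts on GC content" concrete, here is a deliberately simplified sketch in plain Python. It is not what cqn or EDASeq implement (those packages fit more flexible models and also handle feature length); it only fits a straight line of log2(count + 1) against GC fraction within one sample and subtracts that trend. The function name and the toy numbers are hypothetical.

```python
import math


def remove_linear_gc_trend(counts, gc_fraction):
    """Return log2(count + 1) values for one sample with a straight-line
    GC trend subtracted; the sample's mean log value is preserved."""
    if len(counts) != len(gc_fraction):
        raise ValueError("counts and gc_fraction must have one entry per gene")
    n = len(counts)
    if n == 0:
        return []
    logs = [math.log2(c + 1) for c in counts]
    mean_gc = sum(gc_fraction) / n
    mean_log = sum(logs) / n
    # Ordinary least-squares slope of log count on GC fraction.
    sxx = sum((g - mean_gc) ** 2 for g in gc_fraction)
    if sxx == 0:
        return logs  # all genes share one GC value, so there is no trend to fit
    sxy = sum((g - mean_gc) * (y - mean_log) for g, y in zip(gc_fraction, logs))
    slope = sxy / sxx
    # Subtract the fitted trend, centred so the overall level is unchanged.
    return [y - slope * (g - mean_gc) for g, y in zip(gc_fraction, logs)]


# Toy data for one sample: counts rise with GC content.
counts = [10, 40, 160, 640, 90]
gc_fraction = [0.35, 0.45, 0.55, 0.65, 0.50]
print(remove_linear_gc_trend(counts, gc_fraction))
```

If two batches show different GC trends, removing the trend per sample takes out that part of the technical difference, but, as noted above, it cannot remove a batch effect that is perfectly confounded with the comparison of interest.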

ADD REPLY • link 9.1 years ago Michael Love 43k

I see your point, thank you! I know this is really tricky. I will try what you suggested. Again, thank you for your help!

ADD REPLY • link 9.1 years ago User34591 • 0

score 0 · Answer 1 · 2016-10-25

Michael Love 43k

@mikelove

Last seen 7 hours ago

United States

You can use the vst() function to quickly variance-stabilize a large set of samples. Whether the steps downstream of that make sense depends on the data and on what you plan to do. You haven't really said enough about what you are going to do for the authors of ComBat to say whether it makes sense or not.

ADD COMMENT • link 9.1 years ago Michael Love 43k