(How) can I make heatmaps/PCA plots in DESeq2 using normalized counts from cummeRbund?
Jon Bråte ▴ 260
@jon-brate-6263
Norway

Hi,

I like to use DESeq2 for making PCA-plots and heat maps, but on a current dataset we only have count values from cufflinks/cummeRbund (exported using count() in cummeRbund). I know DESeq2 needs raw counts, but can I use these counts only for plotting/visualization? And can I perform the rlog-transformation on cufflinks normalized counts?

Thanks

deseq2 cummerbund cufflinks • 3.0k views
@andrewjskelton73-7074
United Kingdom

'Raw counts' from Tuxedo are not really raw counts; they are "raw pseudo counts". So you won't get the type of data that DESeq2 expects (short of rounding the values you get out of count in cummeRbund).

Assuming your output of count is called foo

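library(DESeq2)

# foo is assumed to be a numeric genes x samples table; round up to whole numbers
counts_in <- ceiling(as.matrix(foo))
# Minimal sample table: one row per column of the count matrix
dds       <- DESeqDataSetFromMatrix(counts_in,
                                    colData = data.frame(names = factor(seq_len(ncol(counts_in)))),
                                    design = ~ 1)
rld       <- rlog(dds)
# plotPCA() colours by a "condition" column by default, so point it at "names"
plotPCA(rld, intgroup = "names")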

cummeRbund offers a method to perform a PCA of FPKM values. However, if you want to use the DESeq2 methods, I'd recommend you follow the DESeq2 workflow (htseq-count from alignments -> DESeq2) rather than trying to manipulate the output of cummeRbund.
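For the heatmap and PCA part of the question, here is a minimal base-R sketch of what can be done with a transformed matrix such as `assay(rld)`. The helper names `sample_pca` and `sample_dist_matrix`, the `ntop` default and the toy data are assumptions for illustration. Picking the most variable rows before `prcomp` is roughly what `plotPCA` does, but this is not DESeq2's own code.

```r
# Hypothetical helpers; `mat` is a genes x samples matrix of transformed
# values, e.g. assay(rld) from the rlog step.
sample_pca <- function(mat, ntop = 500) {
  # Keep the rows with the highest variance across samples
  row_var <- apply(mat, 1, var)
  keep <- order(row_var, decreasing = TRUE)[seq_len(min(ntop, nrow(mat)))]
  # prcomp() expects samples in rows, so transpose
  pca <- prcomp(t(mat[keep, , drop = FALSE]))
  list(scores = pca$x,
       percent_var = 100 * pca$sdev^2 / sum(pca$sdev^2))
}

sample_dist_matrix <- function(mat) {
  # Euclidean distances between samples (columns of mat)
  as.matrix(dist(t(mat)))
}

# Made-up toy data standing in for a real transformed matrix
set.seed(1)
toy <- matrix(rnorm(200 * 6, mean = 8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))

pca <- sample_pca(toy, ntop = 100)
d   <- sample_dist_matrix(toy)

plot(pca$scores[, 1], pca$scores[, 2], xlab = "PC1", ylab = "PC2")
heatmap(d, symm = TRUE)
```

With real data you would pass `assay(rld)` instead of `toy`. Whether the resulting plots are trustworthy still depends on what went into `rld` in the first place.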

@mikelove
United States

We require integers as input to protect against users accidentally inputting FPKM or normalized counts (counts corrected for library size). In both of these cases, the precision has been altered from what the statistical model expects, so this really breaks the assumptions of our software. I will say that I've used the EDA and DE routines of DESeq2 on rounded estimated counts before, but only when I made sure that the value is an estimate of the count of fragments assigned to a gene (not a transcript) and has not been divided by a library size correction. One concern with this approach, though: if you use software that distributes fragments that could be assigned to many homologous genes, then DE in one of those genes could be attributed to all of them.
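A minimal sketch of the rounding step described above, in base R. The helper name `round_estimated_counts` and the example values are made up for illustration. The checks only catch obviously unsuitable input (negative or non-finite values); they cannot tell whether the values were divided by a library size correction, so that still has to be verified by hand.

```r
# Hypothetical helper: turn estimated gene-level counts into an integer matrix
round_estimated_counts <- function(est) {
  est <- as.matrix(est)
  # Estimated fragment counts should be finite and non-negative
  stopifnot(is.numeric(est), all(is.finite(est)), all(est >= 0))
  out <- round(est)
  storage.mode(out) <- "integer"  # keeps dimensions and dimnames
  out
}

# Made-up estimated counts: 3 genes x 2 samples
est <- matrix(c(10.4, 0.6, 250.2, 3.7, 0, 99.9), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
counts <- round_estimated_counts(est)
```

The resulting `counts` matrix has the shape that `DESeqDataSetFromMatrix` takes as its count input.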


Hello Michael:

Why is it not ok to round the counts assigned to transcripts?

Thanks,

Nik

 

I think rounded estimated *gene* counts are fine for DESeq2, but estimated transcript counts are negatively correlated within a gene -- there is a lot of additional variance from estimation uncertainty. DESeq2 is not built for transcript level analysis.
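To illustrate the negative correlation, here is a toy simulation in base R. It is only a sketch under strong assumptions: one gene with two transcripts, a Poisson gene-level count, and estimation uncertainty modelled as Beta-distributed noise in the share assigned to the first transcript. None of this reflects how any particular quantification tool works, and all variable names are made up.

```r
set.seed(42)
n_samples <- 200

# Fragments assigned to the gene as a whole (assumed Poisson)
gene_count <- rpois(n_samples, lambda = 1000)

# Noisy estimate of the share belonging to transcript 1 (assumed Beta noise)
share_tx1 <- rbeta(n_samples, 12, 12)

# Estimated transcript counts always add up to the gene count
tx1 <- gene_count * share_tx1
tx2 <- gene_count * (1 - share_tx1)

# Correlation between the two transcript estimates across samples
tx_cor <- cor(tx1, tx2)
print(tx_cor)
```

Because the two estimates must sum to the gene count, any fragment wrongly given to one transcript is taken from the other. In this toy setup that makes `tx_cor` negative even though nothing is differentially expressed.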

What methods take that into account? Cuffdiff? ALDEx2?
