Question: DESeq2 Following RSEM
15 months ago by
tungphannv0 wrote:

I am using RNA-seq data processed by RSEM given by someone else. The data has this format:

         sample1 sample2 sample3 sample4 sample5 sample6
A1BG       94.64  278.03   94.07   96.00   55.00   64.03
A1BG-AS1   64.28  114.08   98.88  109.05   95.32   73.11
A1CF        0.00    2.00    1.00    1.00    0.00    1.00
A2M        24.00    2.00   18.00    4.00   35.00    8.00
A2M-AS1     1.00    1.00    1.00    0.00    0.00    0.00
A2ML1       0.84    2.89    0.00    1.00    2.00    3.11

But it seems DESeq2 can't handle non-integer values. I tried tximport but haven't been able to get it to work, and I also don't understand how or whether tximport would help in this situation. Can I just use round(data) instead? Does anyone have a better suggestion for dealing with this kind of data? Thanks so much!

15 months ago by
Michael Love wrote:

hi,

Use tximport to import the RSEM estimated counts. This will take care of everything for you. If you encounter a problem, please report back with full documentation of your code and the error and your sessionInfo().

There is an example in the tximport vignette of importing RSEM estimated counts.
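For readers without the vignette at hand, the call looks roughly like the sketch below. The sample names and the `rsem/` directory are hypothetical placeholders, not something from this thread; the import itself only runs if tximport is installed and the per-sample gene-level RSEM output files actually exist.

```r
# Hypothetical sample names and directory layout; adjust to your own files.
samples <- paste0("sample", 1:6)
files <- file.path("rsem", paste0(samples, ".genes.results"))
names(files) <- samples

# Only attempt the import when tximport is installed and the files are present.
if (requireNamespace("tximport", quietly = TRUE) && all(file.exists(files))) {
  # txIn = FALSE, txOut = FALSE: the inputs are already gene-level,
  # and gene-level output is wanted.
  txi <- tximport::tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
  # Estimated counts are in txi$counts, effective gene lengths in txi$length.
  str(txi$counts)
}
```

The resulting list can then be passed to DESeq2's DESeqDataSetFromTximport() together with a sample table and a design formula.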

From the vignette, I think tximport expects a list of .genes.results files from RSEM. For some reason, I don't have those files. The only data I have is the tsv file with the expected counts in the above format (i.e. the first column is the list of genes, and the next 8 columns contain expected counts, 4 of which are controls).

Is it possible to make it work with tximport/deseq2 ? Thank you very much.

So this is a little sub-optimal compared to using tximport (which leverages the effective length information and isoform abundance estimates), but you can round the matrix of estimated counts (note, these are not normalized counts) and feed these into DESeqDataSetFromMatrix.
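A minimal sketch of that rounding route, using the six genes from the question as a toy matrix. The control/treated split of the six samples is an assumption made up for illustration (the thread does not say which samples belong to which group), and the DESeq2 step only runs if the package is installed.

```r
# Toy matrix copied from the question. A real analysis would read the full
# table instead, e.g. with read.delim(..., row.names = 1).
cts <- matrix(
  c(94.64, 278.03, 94.07,  96.00, 55.00, 64.03,
    64.28, 114.08, 98.88, 109.05, 95.32, 73.11,
     0.00,   2.00,  1.00,   1.00,  0.00,  1.00,
    24.00,   2.00, 18.00,   4.00, 35.00,  8.00,
     1.00,   1.00,  1.00,   0.00,  0.00,  0.00,
     0.84,   2.89,  0.00,   1.00,  2.00,  3.11),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("A1BG", "A1BG-AS1", "A1CF", "A2M", "A2M-AS1", "A2ML1"),
                  paste0("sample", 1:6))
)

# Round the estimated counts to whole numbers and store them as integers.
rounded <- round(cts)
storage.mode(rounded) <- "integer"

# ASSUMPTION: first three samples are controls, last three are treated.
coldata <- data.frame(
  condition = factor(rep(c("control", "treated"), each = 3)),
  row.names = colnames(rounded)
)

# Build the dataset only if DESeq2 is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = rounded,
                                        colData = coldata,
                                        design = ~ condition)
}
```

Rounding moves each value by at most 0.5, which is small relative to the counts of well-expressed genes. Six genes are far too few to fit the model, so the sketch stops at building the dataset; with the full matrix the next step would be DESeq(dds).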

15 months ago by
Walter F. Baumann 10 wrote:

The DESeq2 manual says: "As input, the DESeq2 package expects count data as obtained", which means it only handles whole numbers. I doubt that "round" is a good approach here, because you change the counts. Where does, e.g., the fractional part (0.64) of the A1BG count in sample1 come from?

15 months ago by
tungphannv0 wrote:

I think it comes from the way RSEM assigns counts to genes: RSEM estimates abundances for transcripts/genes, so the values are expected counts rather than whole-number tallies of reads.
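A toy illustration of how fractional values can arise, assuming (as RSEM's expected counts do) that a read which could have come from more than one gene is split according to the estimated probability of each origin. The four reads and their probabilities below are invented for illustration only.

```r
# Hypothetical probabilities that each read originated from gene A or gene B.
post <- rbind(
  read1 = c(A = 1.00, B = 0.00),  # maps uniquely to A
  read2 = c(A = 1.00, B = 0.00),  # maps uniquely to A
  read3 = c(A = 0.64, B = 0.36),  # ambiguous read, shared between A and B
  read4 = c(A = 0.00, B = 1.00)   # maps uniquely to B
)

# The expected count of a gene is the sum of these probabilities over reads.
expected_counts <- colSums(post)
expected_counts  # A: 2.64, B: 1.36
```

The total is still four reads; the ambiguous one is just shared between the two genes instead of being assigned wholly to either.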