Question: Kallisto output with DEXSeq
12 months ago, j2h2k2 wrote:

Hello, 

I am trying to use Kallisto to run a differential exon usage analysis on my dataset. I want to run DEXSeq but am running into an error. Kallisto's GenomeBam output does not include an NH tag, so DEXSeq's dexseq_count.py script refuses to run. When I try Kallisto's PseudoBam output instead, the script reports all reads as empty, probably because PseudoBam is not projected to genomic coordinates and so the coordinates do not match. Is there a workaround, or a way to change dexseq_count.py so that it does not require the NH tag?

Thank you.

Tags: dexseq, kallisto, nh_tag
Answer: Kallisto output with DEXSeq
12 months ago, Alejandro Reyes (Dana-Farber Cancer Institute, Boston, USA) wrote:

Hi! I have not personally tried something like that, but I know that others have used kallisto output as input to DEXSeq. See for example these two papers from Mark Robinson's lab:

1. https://f1000research.com/articles/5-1356/v1
2. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0862-3

However, these papers feed transcript-level counts estimated by kallisto directly into DEXSeq. This is an important distinction from what you are trying to do. The script `dexseq_count.py` is designed to assign reads mapped to the reference genome to exonic regions. I doubt that it will work for counting reads per exonic region starting from pseudo-alignments to a reference transcriptome. I am actually not sure whether it is possible to extract exon-level counts from the output of kallisto.

Best,
Alejandro 


Thank you for your reply. That makes sense to me, as I was thinking that kallisto seems to do the job of dexseq_count.py itself. The papers you attached do mention using DEXSeq for some transcript-level analysis, but I cannot get DEXSeqDataSet to run. Would you happen to know which of the DEXSeqDataSet commands can take kallisto output (and in what form) and turn it into a DEXSeq object that can be run through the DEXSeq command?

Thank you

written 12 months ago by j2h2k2
One option is to import the kallisto output into R with tximport (set txOut=TRUE to get the transcript-level count matrix). Then you can use DEXSeqDataSet() to create the DEXSeqDataSet object. You'll need to provide the featureID (which would be the rownames of the count matrix) and the groupID (which would be the corresponding gene for each transcript).
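
The steps described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested pipeline: `kallisto_to_dxd` is a hypothetical helper name, the `tx_id` and `gene_id` column names of the transcript-to-gene table are assumed, and the transcript IDs in the kallisto output are assumed to match that table exactly. The counts are rounded because DEXSeqDataSet() expects integer values.

```r
# Hypothetical helper; it is not part of tximport or DEXSeq.
#   files:   named character vector of kallisto abundance files, one per sample
#   samples: data.frame with one row per sample and a 'condition' column
#   tx2gene: data.frame with columns 'tx_id' and 'gene_id' (assumed names)
kallisto_to_dxd <- function(files, samples, tx2gene) {
  # txOut = TRUE keeps transcript-level estimates instead of gene summaries.
  txi <- tximport::tximport(files, type = "kallisto", txOut = TRUE)

  # tximport returns a list; take the counts matrix and round it, because
  # DEXSeqDataSet() only accepts integer-valued counts.
  counts <- round(txi$counts)
  stopifnot(ncol(counts) == nrow(samples))

  # Look up the parent gene of every transcript in the count matrix.
  gene_ids <- tx2gene$gene_id[match(rownames(counts), tx2gene$tx_id)]
  stopifnot(!anyNA(gene_ids))

  # Transcripts play the role of "exons" (featureID) within genes (groupID).
  DEXSeq::DEXSeqDataSet(
    countData  = counts,
    sampleData = samples,
    design     = ~ sample + exon + condition:exon,
    featureID  = rownames(counts),
    groupID    = as.character(gene_ids)
  )
}

# Example call (all inputs are placeholders you would build yourself):
# dxd <- kallisto_to_dxd(files, samples, tx2gene)
# res <- DEXSeq::DEXSeq(dxd)
```

The rows of `samples` should be in the same order as the entries of `files`, since the columns of the count matrix follow `files`.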
written 12 months ago by csoneson

So I have my tximport object from when I did DESeq2, and I tried to run it through DEXSeqDataSet. The function complained that it was a list, not a data frame or matrix, so I pulled out just the counts, which are in a matrix. However, now I am getting an error that not all values in the assay are integers. Do you know if there is a way around this? All my attempts to change the numbers to integers just convert the matrix into an integer vector, which is useless and cannot be converted back into a matrix.

Thank you.

written 12 months ago by j2h2k2

You can do round(counts) to get a matrix of integers. Also note that if you previously did differential gene expression analysis with DESeq2, your count matrix is probably on the gene level. For DEXSeq, you need a transcript-level count matrix (i.e., by setting txOut=TRUE in the tximport call). 
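
As a small illustration of why round() helps here, the base R snippet below uses a made-up toy matrix in place of the real tximport counts. It is a sketch of the general behaviour: as.integer() strips the dimensions and returns a plain vector, while round() keeps the matrix shape and its row and column names.

```r
# Toy matrix standing in for the tximport counts (values are made up).
counts <- matrix(
  c(10.4, 0.0, 3.7, 125.6, 8.2, 1.49),
  nrow = 3,
  dimnames = list(c("tx1", "tx2", "tx3"), c("s1", "s2"))
)

# as.integer() drops dim and dimnames, so the result is a plain vector.
flat <- as.integer(counts)
is.matrix(flat)      # FALSE

# round() returns a matrix with the same shape and names.
rounded <- round(counts)
is.matrix(rounded)   # TRUE

# Optional: change the storage type to integer without losing the shape.
storage.mode(rounded) <- "integer"
is.integer(rounded)  # TRUE
```

Note that as.integer() also truncates toward zero (3.7 becomes 3), whereas round() gives the nearest integer (3.7 becomes 4).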

written 12 months ago by csoneson

I was able to get the command to work and got DEXSeq to run for me.

Thank you very much.

written 12 months ago by j2h2k2