Question: tximport and de novo transcriptome
RMRG wrote (12 months ago):

Hi,

I typically run DE analyses through the Trinity pipeline, but because of certain features of the transcriptome I'm currently working on, I've found it essential to use a k-mer value higher than Trinity allows. I'm not familiar with the various software for doing DE analyses outside of Trinity.

I am working with a de novo assembly from a non-model organism (2 genotypes) and I don't have a high-quality genome assembly to go with it. I have quantified transcript-level abundances with salmon, and I'd like to get gene-level abundances, but I'm not quite sure how to do this. tximport seems to be commonly used for this, but in all of the examples I've seen, it appears to require a known set of genes, presumably from a sequenced genome project.

Is it possible to do what I want to do with tximport? Or some other program?

Thanks for your help!

 

 

Answer: tximport and de novo transcriptome
Michael Love wrote (12 months ago):

You can set txOut = TRUE to import at the transcript level only. Or, if you want to do any summarization, you can make up your own table: tx2gene is just a two-column data.frame (transcript ID, then gene ID).
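To make the idea of "summarization" concrete, here is a minimal Python sketch of what a transcript-to-gene table is used for: summing transcript-level values per gene. It is only an illustration, not what tximport does internally (tximport also handles transcript lengths and offsets). The column names follow salmon's quant.sf output; the transcript IDs, gene IDs, numbers, and the function name `summarize_to_gene` are made up for the example.

```python
import csv
import io
from collections import defaultdict


def summarize_to_gene(quant_rows, tx2gene):
    """Sum transcript-level NumReads and TPM per gene.

    quant_rows: iterable of dicts keyed by salmon quant.sf column names.
    tx2gene: dict mapping transcript ID -> gene ID (the two columns of a
             tx2gene table). Transcripts missing from tx2gene are skipped.
    """
    counts = defaultdict(float)
    tpm = defaultdict(float)
    for row in quant_rows:
        gene = tx2gene.get(row["Name"])
        if gene is None:
            continue
        counts[gene] += float(row["NumReads"])
        tpm[gene] += float(row["TPM"])
    return dict(counts), dict(tpm)


# Toy quant.sf content with made-up values (hypothetical transcripts).
QUANT_SF = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.0\t10.0\t100.0\n"
    "tx2\t1200\t1000.0\t5.0\t60.0\n"
    "tx3\t900\t700.0\t2.5\t20.0\n"
)

# Hypothetical transcript -> gene table: any grouping you can justify works.
TX2GENE = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

rows = list(csv.DictReader(io.StringIO(QUANT_SF), delimiter="\t"))
gene_counts, gene_tpm = summarize_to_gene(rows, TX2GENE)
print(gene_counts)
print(gene_tpm)
```

The point is that nothing in the table has to come from a genome annotation: any two-column transcript/gene grouping will do.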


Thanks!

But could you direct me to the software I could use to actually do the summarization, i.e. to get 'genes' from transcripts in a similar way to what Trinity does for de novo assemblies?

(Reply written 12 months ago by RMRG)
Answer: tximport and de novo transcriptome
James W. MacDonald wrote (12 months ago):

Trinity outputs both the transcript and gene ID, so you can use all the transcripts for each gene, just as you normally would with a more comprehensive transcriptome/genome. I generally use a two-step approach: I first generate a tx2gene data.frame based on what Trinity says are the transcript/gene combinations, and then import using tximport. The next step is generally to get rid of genes that have consistently low counts (of which there will be many, due to Trinity's greedy algorithm; lots of those transcripts aren't real). I then usually go back and use BLAST+ to align the filtered transcripts against some reasonable database of sequences, which invariably results in many transcripts being matched to the same gene. I then use that information to make an updated tx2gene and read the data back in.

That usually gets you to a reasonable set of genes, with many actually having some sort of tenuous annotation from BLAST.
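The two mapping steps described above can be sketched roughly as follows in Python. This is a hedged illustration, not the exact workflow used in the answer: it assumes Trinity-style transcript IDs in which the gene ID is the transcript ID without its trailing `_i<N>` isoform suffix, and BLAST+ tabular output (`-outfmt 6`) with the bitscore in the 12th column. The function names, IDs, and hit lines are invented for the example, and keeping the Trinity gene for transcripts without a hit is an assumption. The low-count filtering step is not shown.

```python
import re

# Trinity-style isoform suffix, e.g. "_i1" in "TRINITY_DN100_c0_g1_i1".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_tx2gene(transcript_ids):
    """Map each transcript ID to its Trinity 'gene' by dropping _i<N>."""
    return {tx: ISOFORM_SUFFIX.sub("", tx) for tx in transcript_ids}


def best_blast_hits(blast_lines):
    """Return the best subject per query (highest bitscore) from outfmt 6 lines."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def update_tx2gene(tx2gene, hits):
    """Use the BLAST subject as the gene where there is a hit; otherwise
    keep the Trinity gene (an assumption made for this sketch)."""
    return {tx: hits.get(tx, gene) for tx, gene in tx2gene.items()}


# Hypothetical transcript IDs and BLAST hits, for illustration only.
IDS = [
    "TRINITY_DN100_c0_g1_i1",
    "TRINITY_DN100_c0_g1_i2",
    "TRINITY_DN200_c0_g1_i1",
    "TRINITY_DN300_c1_g2_i1",
]
BLAST_LINES = [
    "TRINITY_DN100_c0_g1_i1\tprotX\t95.0\t300\t15\t0\t1\t300\t1\t300\t1e-100\t500",
    "TRINITY_DN100_c0_g1_i2\tprotX\t93.0\t290\t20\t0\t1\t290\t1\t290\t1e-90\t450",
    "TRINITY_DN200_c0_g1_i1\tprotY\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t80",
    "TRINITY_DN200_c0_g1_i1\tprotX\t90.0\t280\t28\t0\t1\t280\t5\t284\t1e-80\t400",
]

tx2gene = trinity_tx2gene(IDS)
updated = update_tx2gene(tx2gene, best_blast_hits(BLAST_LINES))

# Print a two-column table that could be saved and read in as tx2gene.
for tx, gene in updated.items():
    print(f"{tx}\t{gene}")
```

In this toy example, two separate Trinity "genes" collapse onto the same BLAST subject, which is the effect described above; whether a best-hit rule is strict enough will depend on the database and thresholds you choose.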


Oh, and if you are using salmon to do the mapping, particularly a newer version, you will probably want to set --incompatPrior to some non-zero value. With de novo transcriptomes in particular, you seem to end up with lots of incompatible libtypes, and if --incompatPrior is set to zero (the default now) you may end up with really low mapping rates.

(Reply written 12 months ago by James W. MacDonald)