Hello everyone. I have some RNA-seq data from different conditions and samples. Using the tximport function, I was able to map transcript IDs to gene IDs. Now I have a data matrix where rows are gene IDs and columns are TPM values for 21 samples. Since I have several transcript variants for each gene, their transcript IDs are different, but they map to the same gene ID. To get some biological meaning out of this, I want to ignore transcript variants and merge all TPM values for the same gene ID. What is the best way to do this? Sum them up? Or take a weighted sum (based on length)?
Since I am new to RNA-seq, I would appreciate your help :)
That doesn't seem to be true in our case. We got our reference database from UCI. The transcript variants/isoforms have different transcript IDs but, of course, have the same gene ID. So when tximport maps them, it treats each isoform separately, and in the output you get several rows with different transcript IDs and the same gene ID. I hope that makes the problem clear.
You will have to provide some data. Simply saying that you get something you don't want, without showing what you got, isn't helpful. You also need to show the code you used and the output from running
sessionInfo()
Please show some code: either you are using the wrong code, or your tx2gene map is not put together correctly.
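For reference, here is a minimal base-R sketch of what gene-level summarization amounts to, assuming a transcript-level TPM matrix and a two-column tx2gene table with transcript IDs in the first column and gene IDs in the second. All IDs and values below are made up for illustration, and the object names (tpm_tx, tx2gene, tpm_gene) are hypothetical. Because TPM is already length-normalized, the gene-level TPM is the plain sum of its transcripts' TPMs, with no length weighting needed.

```r
# Toy transcript-level TPM matrix: 3 transcripts x 2 samples (made-up values).
tpm_tx <- matrix(
  c(10, 5, 20,   # sampleA
     1, 2,  3),  # sampleB
  nrow = 3,
  dimnames = list(c("tx1", "tx2", "tx3"), c("sampleA", "sampleB"))
)

# Hypothetical tx2gene map: transcript ID first, gene ID second.
tx2gene <- data.frame(
  TXNAME = c("tx1", "tx2", "tx3"),
  GENEID = c("geneX", "geneX", "geneY"),
  stringsAsFactors = FALSE
)

# Look up the gene for every row of the matrix.
gene_of_row <- tx2gene$GENEID[match(rownames(tpm_tx), tx2gene$TXNAME)]

# A transcript missing from the map is a common cause of unexpected output,
# so fail early if any row could not be matched.
stopifnot(!anyNA(gene_of_row))

# Sum TPMs of all transcripts that share a gene ID: one row per gene.
tpm_gene <- rowsum(tpm_tx, group = gene_of_row)
print(tpm_gene)
```

If the result still has one row per transcript, the IDs in the first column of the map most likely do not match the row names of the matrix exactly (for example, because of version suffixes), which is worth checking before anything else.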