Hi,
I have data from two different RNA-seq experiments: one has expression values in RPKM and the other has expression values in FPKM.
I would like to know if there is a way to compare gene expression between the two experiments.
Thank you in advance
Were these data processed using the same pipeline? I mean, were the reads preprocessed, aligned, and "gene counted" uniformly? If not, I'd start by doing that.
Or is one of them RPKM and the other FPKM simply because one is single end vs. paired end?
Or?
Thanks Steve for your reply.
Both are paired-end reads. One was processed using the TopHat-Cufflinks pipeline, and the other was aligned using bwa and then RPKM was calculated.
Just out of curiosity, is there any way of comparing RPKM and FPKM when one data set is single-end and the other is paired-end? What about if both are paired-end and processed in the same way but have expression values in different units such as RPKM, FPKM, etc.?
Hi Asma,
What would you like to compare? If it is a statistical analysis, I would recommend dropping the RPKM and FPKM values and starting over by aligning both sets with the same aligner. Then use the raw counts in limma or edgeR for a proper statistical analysis.
Good luck!
Ben
Hi,
Thanks, Ben, for your reply. I would like to plot a heat map for selected genes in the two studies.
I would appreciate it if anyone could answer this question:
Just out of curiosity, is there any way of comparing RPKM and FPKM when one data set is single-end and the other is paired-end? What about if both are paired-end and processed in the same way but have expression values in different units such as RPKM, FPKM, etc.?
Thank you.
As far as I understand, RPKM = FPKM for single-end experiments, because each read is its own fragment. If you have paired-end experiments you should call it FPKM. I think your problem is that different pipelines use different methods to calculate RPKM/FPKM values. So if you want to compare RPKM/FPKM values between experiments, try to use the same method to produce these values. If you don't, you will have technical differences, which are not very interesting biologically. Correct me if I am wrong!
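To make the RPKM/FPKM point concrete, here is a minimal sketch of the usual per-kilobase-per-million calculation. It assumes you already have one count per gene plus a gene length in base pairs; the gene names and numbers are made up for illustration, and real pipelines differ in how they count and in which length they use, which is exactly why values from different pipelines are hard to compare. The arithmetic is the same for RPKM and FPKM. The only difference is whether the counts are reads or fragments (a read pair counted once).

```python
def per_kb_per_million(counts, lengths_bp):
    """Return RPKM (if counts are reads) or FPKM (if counts are fragments).

    counts:     dict mapping gene -> number of mapped reads or fragments
    lengths_bp: dict mapping gene -> gene/transcript length in base pairs
    """
    total = sum(counts.values())  # total mapped reads/fragments in the sample
    return {
        gene: counts[gene] * 1e9 / (total * lengths_bp[gene])
        for gene in counts
    }


# Hypothetical toy data, not taken from either experiment in this thread.
counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

values = per_kb_per_million(counts, lengths_bp)
for gene, value in values.items():
    print(gene, value)
```

Note that geneA and geneB end up with the same value even though geneB has three times the counts, because geneB is three times as long.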
OK, you should stop right here: TopHat is a splice aware aligner, and bwa isn't, so just start over from the raw reads.
It's not too laborious to realign the data. You can even try kallisto, which apparently can do the alignment for you on a laptop in less than ten minutes, so ... no excuse not to. Sticking with that theme, it looks like the artemis package might be helpful in analyzing the results downstream.
You ask if there's any way to compare RPKM and FPKM, etc -- the way I'd answer that is "I don't care". Get your data as comparable as you can, then follow any of the abundant tutorials available to analyze it.
After that, your next problem will be whether or not the comparisons you want to make with this data are confounded by batch, i.e., are all of your "cases" from the data that was processed as RPKMs, and the "controls" from the data processed as FPKMs?
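A quick way to check the batch question before doing any analysis is to tabulate condition against batch. The sketch below is only an illustration: the sample labels and the helper name `single_batch_conditions` are hypothetical, and it only flags the extreme case where a condition shows up in exactly one batch.

```python
from collections import defaultdict


def single_batch_conditions(samples):
    """List conditions that occur in only one batch.

    samples: iterable of (condition, batch) pairs, one per sample.
    Returns an empty list when there is only one batch overall,
    since there is then no batch effect to be confounded with.
    """
    batches_by_condition = defaultdict(set)
    all_batches = set()
    for condition, batch in samples:
        batches_by_condition[condition].add(batch)
        all_batches.add(batch)
    if len(all_batches) < 2:
        return []
    return sorted(
        condition
        for condition, batches in batches_by_condition.items()
        if len(batches) == 1
    )


# Hypothetical designs: (condition, batch) for each sample.
confounded = [("case", "study1"), ("case", "study1"),
              ("control", "study2"), ("control", "study2")]
balanced = [("case", "study1"), ("case", "study2"),
            ("control", "study1"), ("control", "study2")]

print(single_batch_conditions(confounded))
print(single_batch_conditions(balanced))
```

If every condition is flagged, as in the first design, a difference between conditions cannot be separated from a difference between studies.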
Thank you very much
Hi Steve,
When I wanted to use kallisto, I was criticized and told that I should use better-known (standard) programs such as Bowtie, TopHat, etc.
I know that kallisto is much faster than TopHat, so I would like to hear your comments.
All the programs you mention come from the same group (Lior Pachter's lab). Kallisto is intended to be faster than the tuxedo suite while being at least as 'good' and in some sense better.
What is the basis of the argument? I could see from a grantsmanship viewpoint that you might want to use programs that people in a study section might have heard of, but from a practical standpoint that's like saying you should drive a Prius because it is more well known than a Tesla Model S.
no you should drive a Prius because you can't afford the Tesla Model S but in the case of kallisto vs Tophat they're both free :-)
H.
free as in beer, but kallisto has a free-for-academic-use license whereas tophat is Boost.
Pall Melsted happened to be in town and gave a talk on his work to our department, and it was pretty impressive, but why would you care about what I think, anyway?
You can do some reading by starting at either of these two blog posts and continuing to dig from there:
You should, in particular, read the arXiv preprint. Unfortunately we can't really kick the tires on kallisto due to its license restrictions, but I just recently learned some folks have been looking at salmon, and it compares quite favorably vs. "traditional" alignments to the genome.
Anyway, the point of my original proposal was to provide you with a quick way to reprocess your reads so you can get to your analysis as soon as possible, but it's almost 20 days later, now, so ... ;-)
For what it's worth, someone posted the results of an analysis in which the reads were quantified in three different ways (with HTSeq, kallisto, and salmon); the conclusion was that they're all quite similar to each other.