RPKM and FPKM
Asma rabe ▴ 290
@asma-rabe-4697
Last seen 6.2 years ago
Japan

Hi,

I have data from two different RNA-seq experiments: one has expression values in RPKM and the other has expression values in FPKM.

I would like to know if there is a way to compare gene expression between the two experiments.

Thank you in advance

rnaseq • 4.5k views
ADD COMMENT
@steve-lianoglou-2771
Last seen 14 months ago
United States

Were these data processed using the same pipeline? I mean, were the reads preprocessed, aligned, and "gene counted" uniformly? If not, I'd start by doing that.

Or is one of them RPKM and the other FPKM simply because one is single end vs. paired end?

Or?

ADD COMMENT
Asma rabe ▴ 290
@asma-rabe-4697
Last seen 6.2 years ago
Japan

Thanks Steve for your reply.

Both are paired-end. One was processed using the TopHat-Cufflinks pipeline; the other was aligned using bwa, and then RPKM was calculated.

Just out of curiosity, is there any way of comparing RPKM and FPKM when one experiment is single-end and the other is paired-end? What if both are paired-end and processed in the same way but have expression values in different units, such as RPKM, FPKM, etc.?

ADD COMMENT

OK, you should stop right here: TopHat is a splice aware aligner, and bwa isn't, so just start over from the raw reads.

It's not too laborious to realign the data. You can even try kallisto, which apparently can do the alignment for you on a laptop in less than ten minutes, so ... no excuse not to. Sticking with that theme, it looks like the artemis package might be helpful in analyzing the results downstream.

You ask if there's any way to compare RPKM and FPKM, etc -- the way I'd answer that is "I don't care". Get your data as comparable as you can, then follow any of the abundant tutorials available to analyze it.

After that, your next problem will be whether or not the comparisons you want to make with this data are confounded by batch, i.e., whether all of your "cases" come from the data that was processed as RPKMs and all of your "controls" from the data processed as FPKMs.
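To make the batch question concrete, here is a minimal Python sketch of a confounding check. The sample labels (`"case"`, `"control"`, `"RPKM_study"`, `"FPKM_study"`) and both function names are hypothetical, made up for illustration; the idea is only to cross-tabulate condition against study and see whether any study contains more than one condition.

```python
from collections import defaultdict


def batch_by_condition(samples):
    """Cross-tabulate condition against batch.

    `samples` is an iterable of (condition, batch) pairs.
    Returns {condition: {batch: number_of_samples}}.
    """
    table = defaultdict(lambda: defaultdict(int))
    for condition, batch in samples:
        table[condition][batch] += 1
    return {cond: dict(batches) for cond, batches in table.items()}


def is_fully_confounded(samples):
    """Return True when no batch contains more than one condition.

    In that situation a difference between conditions cannot be
    separated from a difference between batches.
    """
    conditions_in_batch = defaultdict(set)
    for condition, batch in samples:
        conditions_in_batch[batch].add(condition)
    return all(len(conds) == 1 for conds in conditions_in_batch.values())


# Hypothetical design: every case comes from one study, every control from the other.
confounded = [
    ("case", "RPKM_study"),
    ("case", "RPKM_study"),
    ("control", "FPKM_study"),
    ("control", "FPKM_study"),
]

# Hypothetical design: both conditions are present in both studies.
balanced = [
    ("case", "RPKM_study"),
    ("control", "RPKM_study"),
    ("case", "FPKM_study"),
    ("control", "FPKM_study"),
]

print(batch_by_condition(confounded))
print(is_fully_confounded(confounded), is_fully_confounded(balanced))
```

If the first design describes your data, no amount of reprocessing will let you tell a biological difference from a study difference.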

 

ADD REPLY

Thank you very much

ADD REPLY

Hi Steve,

When I wanted to use kallisto, I was criticized and told that I should use more well-known (standard) programs such as Bowtie, TopHat, etc.

I know that kallisto is much faster than TopHat, so I would like to hear your comment.

ADD REPLY

All the programs you mention come from the same group (Lior Pachter's lab). Kallisto is intended to be faster than the tuxedo suite while being at least as 'good' and in some sense better.

What is the basis of the argument? I could see from a grantsmanship viewpoint that you might want to use programs that people in a study section might have heard of, but from a practical standpoint that's like saying you should drive a Prius because it is more well known than a Tesla Model S.

ADD REPLY

No, you should drive a Prius because you can't afford the Tesla Model S, but in the case of kallisto vs. TopHat they're both free :-)

H.

ADD REPLY

Free as in beer, but kallisto has a free-for-academic-use license, whereas TopHat is under the Boost license.

ADD REPLY

Pall Melsted happened to be in town and gave a talk on his work to our department, and it was pretty impressive, but why would you care about what I think, anyway?

You can do some reading by starting at either of these two blog posts and continuing to dig from there:

You should, in particular, read the arXiv preprint. Unfortunately we can't really kick the tires on kallisto due to its license restrictions, but I just recently learned some folks have been looking at salmon, and it compares quite favorably vs. "traditional" alignments to the genome.

Anyway, the point of my original proposal was to provide you with a quick way to reprocess your reads so you can get to your analysis as soon as possible, but it's almost 20 days later, now, so ... ;-)

For what it's worth, someone posted the results of an analysis in which the reads were quantified in three different ways (with HTSeq, kallisto, and salmon); his conclusion was that they're all quite similar to each other.

ADD REPLY
b.nota ▴ 360
@bnota-7379
Last seen 3.6 years ago
Netherlands

Hi Asma,

What would you like to compare? If it is a statistical analysis, I would recommend dropping the RPKM and FPKM values and starting over by aligning both sets with the same aligner. Then use raw counts in limma or edgeR for a proper statistical analysis.

Good luck!

Ben

ADD COMMENT
Asma rabe ▴ 290
@asma-rabe-4697
Last seen 6.2 years ago
Japan

Hi,

Thanks, Ben, for your reply. I would like to plot a heat map for selected genes in the two studies.

I would appreciate it if anyone could answer this question:

Just out of curiosity, is there any way of comparing RPKM and FPKM when one experiment is single-end and the other is paired-end? What if both are paired-end and processed in the same way but have expression values in different units, such as RPKM, FPKM, etc.?

Thank you.

ADD COMMENT
b.nota ▴ 360
@bnota-7379
Last seen 3.6 years ago
Netherlands

As far as I understand, RPKM = FPKM for single-end experiments, because each read then represents one fragment. If you have paired-end experiments, you should call it FPKM. I think your problem is that different pipelines use different methods to calculate RPKM/FPKM values. So if you want to compare RPKM/FPKM values between experiments, try to use the same method to produce these values. If you don't, you will have technical differences, which are not very interesting in a biological sense. Correct me if I am wrong!
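As a rough illustration of the point above, here is a minimal Python sketch of the usual "per kilobase per million" formula. The function name, the toy gene lengths and counts, and the two library-size choices are all hypothetical; real pipelines may count and normalize differently, so treat this as a sketch and not as what Cufflinks or any other tool actually does.

```python
def per_kilobase_per_million(counts, lengths_bp, library_size=None):
    """Scale counts by gene length (kb) and library size (millions).

    The arithmetic is the same for RPKM and FPKM: the result is RPKM
    when `counts` holds reads and FPKM when it holds fragments.
    `library_size` defaults to the sum of the counts supplied.
    """
    if library_size is None:
        library_size = sum(counts.values())
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return {
        gene: n * 1e9 / (library_size * lengths_bp[gene])
        for gene, n in counts.items()
    }


# Hypothetical toy data: three genes, 1,000 fragments counted in genes.
lengths_bp = {"geneA": 2000, "geneB": 1000, "geneC": 500}
fragment_counts = {"geneA": 400, "geneB": 100, "geneC": 500}

# Assumed pipeline 1: normalize by the fragments counted in genes.
fpkm_counted = per_kilobase_per_million(fragment_counts, lengths_bp)

# Assumed pipeline 2: normalize by all mapped fragments (2,000 here,
# an invented number that includes fragments outside annotated genes).
fpkm_mapped = per_kilobase_per_million(
    fragment_counts, lengths_bp, library_size=2000
)

for gene in fragment_counts:
    print(gene, fpkm_counted[gene], fpkm_mapped[gene])
```

Same counts, same gene lengths, yet every value differs by a factor of two purely because of the denominator. That is the kind of technical difference that disappears once both data sets go through the same method.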

ADD COMMENT
Asma rabe ▴ 290
@asma-rabe-4697
Last seen 6.2 years ago
Japan

Thank you very much..

ADD COMMENT
