Question

Calculating FPKM from raw count data using fpkm() in DESeq2

1

Beginner ▴ 60

@beginner-15939

Last seen 15 months ago

Switzerland

I have raw count data from featureCounts and want to do a survival analysis. For a specific gene, I want to classify the samples into Low and High groups based on an expression cutoff, using the maxstat package.

First I would like to convert the raw counts to FPKM, so I did the following.

              sample1   sample2   sample3   sample4   sample5
A1BG-AS1      195         612       145       131       300
A2M-AS1       373         445       573      1388      1386
A2ML1-AS1     75          27         45       18        35
A2ML1-AS2      0           0         0         0        0
AA06           0           0         0         0        0

I have a matrix like the one above, with genes as rows and samples as columns.

library("DESeq2")
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ Type)
dds
dds <- estimateSizeFactors(dds)

I then used the fpkm() function to calculate FPKM from the count data.

fpkm_data <- fpkm(dds)
Error in fpkm(dds) : rowRanges(object) has all ranges of zero width.
the user should instead supply a column, mcols(object)$basepairs,
which will be used to produce FPKM values

Sorry, I have no idea what to do now. Can you please tell me what I need to do to the data to calculate FPKM? Thank you.

deseq2 r fpkm() geneexpression • 12k views

ADD COMMENT • link updated 5.8 years ago by Michael Love 42k • written 5.8 years ago by Beginner ▴ 60

0

Hello,

I have the same question. I have a matrix with the gene lengths, which I calculated from the GFF file, and I want to add it to the dds so that I can run fpkm(dds). Can you help me with this?

ADD REPLY • link 4.8 years ago aapapaiwannou • 0

1

Just add the number of basepairs here:

mcols(dds)$basepairs = ...
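
As a minimal sketch of what that assignment could look like end to end: the counts below are copied from the question, while the `Type` labels in `coldata` and the values in `gene_length` are made-up placeholders (assumptions, not real annotation). The only requirement is that the lengths are in the same order as `rownames(dds)`.

```r
library(DESeq2)

# Counts copied from the question (genes as rows, samples as columns)
counts <- matrix(
  c(195, 612, 145,  131,  300,
    373, 445, 573, 1388, 1386,
     75,  27,  45,   18,   35,
      0,   0,   0,    0,    0,
      0,   0,   0,    0,    0),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("A1BG-AS1", "A2M-AS1", "A2ML1-AS1", "A2ML1-AS2", "AA06"),
    paste0("sample", 1:5)))
storage.mode(counts) <- "integer"

# Hypothetical sample table; the Type labels are invented for illustration
coldata <- data.frame(
  Type = factor(c("A", "A", "B", "B", "B")),
  row.names = colnames(counts))

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ Type)

# Hypothetical exonic lengths in basepairs, named by gene
gene_length <- c("A1BG-AS1" = 2000, "A2M-AS1" = 2500, "A2ML1-AS1" = 1200,
                 "A2ML1-AS2" = 900, "AA06" = 1500)

# Index by rownames(dds) so each length lines up with its gene
mcols(dds)$basepairs <- unname(gene_length[rownames(dds)])

# With basepairs present, fpkm() no longer needs rowRanges widths
fpkm_data <- fpkm(dds)
head(fpkm_data)
```

By default `fpkm()` uses `robust = TRUE`, which normalizes with size factors rather than the column sums of the raw counts; `fpkm(dds, robust = FALSE)` gives the plain total-count version.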

ADD REPLY • link 4.8 years ago Michael Love 42k

score 0 · Answer 1 · 2018-10-05

0

Michael Love 42k

@mikelove

Last seen 12 hours ago

United States

It’s not trivial to calculate the gene lengths. I don’t have any simple generic code to get the gene lengths (exonic basepairs).

I collaborated on Salmon which is a sophisticated model for estimating abundance, so that’s my solution if users want to work with expression values, rather than making arbitrary calculations on counts (arbitrary because you don’t know the gene length).
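
To make the dependence on gene length concrete, here is a small base R sketch of the plain FPKM arithmetic (fragments per kilobase per million, using raw column sums). The helper `fpkm_by_hand` and both length vectors are hypothetical and only for illustration; the counts are the first two samples of the first three genes in the question.

```r
# Plain FPKM: count / (gene length in kb) / (library size in millions)
fpkm_by_hand <- function(counts, basepairs) {
  stopifnot(length(basepairs) == nrow(counts), all(basepairs > 0))
  per_million <- sweep(counts, 2, colSums(counts) / 1e6, "/")  # scale by library size
  sweep(per_million, 1, basepairs / 1e3, "/")                  # scale by gene length
}

counts <- matrix(
  c(195, 612,
    373, 445,
     75,  27),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A1BG-AS1", "A2M-AS1", "A2ML1-AS1"),
                  c("sample1", "sample2")))

# Two made-up guesses at the lengths; only the first gene differs
len_a <- c(2000, 2500, 1200)
len_b <- c(4000, 2500, 1200)

fpkm_a <- fpkm_by_hand(counts, len_a)
fpkm_b <- fpkm_by_hand(counts, len_b)

# Ratio of the two results for the first gene
fpkm_b["A1BG-AS1", ] / fpkm_a["A1BG-AS1", ]
```

Doubling the assumed length of a gene halves its FPKM in every sample, so the values are only as meaningful as the lengths that go in.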

ADD COMMENT • link 5.8 years ago Michael Love 42k

0

I can actually get the gene lengths from the annotation GTF file. So, if I add that gene_length column to the matrix, will it work?
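
For reference, one way I could derive such lengths is sketched below. It assumes that "gene length" means the union of all annotated exons per gene, which is only one of several possible definitions. The tiny GTF written to a temporary file is invented so the example runs on its own; a real annotation file path would go in its place. Note that featureCounts also reports a Length column for each gene in its output, which may already be enough.

```r
library(rtracklayer)
library(GenomicRanges)

# Invented three-exon GTF, written to a temp file so the sketch is self-contained
gtf_lines <- c(
  paste("chr1", "src", "exon", 100, 199, ".", "+", ".",
        'gene_id "geneA"; transcript_id "tA1";', sep = "\t"),
  paste("chr1", "src", "exon", 150, 299, ".", "+", ".",
        'gene_id "geneA"; transcript_id "tA2";', sep = "\t"),
  paste("chr1", "src", "exon", 500, 549, ".", "+", ".",
        'gene_id "geneB"; transcript_id "tB1";', sep = "\t"))
gtf_path <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_path)

# Read the annotation and keep exon records only
gtf <- import(gtf_path, format = "gtf")
exons <- gtf[gtf$type == "exon"]

# Group exons by gene, merge overlapping exons, then sum the widths
exons_by_gene <- split(exons, exons$gene_id)
basepairs <- sum(width(reduce(exons_by_gene)))

# Named vector of exonic basepairs per gene; the two overlapping
# geneA exons (100-199 and 150-299) are merged before counting
basepairs
```

The resulting vector is named by gene ID, so it would need to be reordered to match `rownames(dds)` before being stored in `mcols(dds)$basepairs`.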

ADD REPLY • link 5.8 years ago Beginner ▴ 60

0

Try following the advice given in the message from the software and see how it works. If you get stuck then it’s appropriate to post.

ADD REPLY • link 5.8 years ago Michael Love 42k