rhart • 0
@rhart-13682
Last seen 2.1 years ago

I'd like to manually add a vector of transcript sizes to my DDS object.  I tried setting:

dds@assays$data$avgTxLength=data.frame(row.names=txOrder,"avgTxLength"=tlen[txOrder,])

Where txOrder is the row names from the dds data and tlen is the (out of order) transcript list for those row names.  The error I get when I try using fpkm() is "invalid 'dimnames' given for data frame".

How do I add this vector so I can output fpkm?
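For reference, ?fpkm describes avgTxLength as a matrix stored in assays(dds), so a data frame with a single column does not match what fpkm() expects there. A minimal sketch of the matrix form, assuming one fixed length per transcript repeated across all samples; the simulated dataset and the `len` values are made up for illustration:

```r
library(DESeq2)

# Simulated stand-in for a real transcript-level DESeqDataSet
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 50, m = 4)

# Hypothetical lengths, assumed to be in the same order as rownames(dds)
len <- sample(500:5000, nrow(dds))

# Repeat the length vector once per sample so the matrix has the same
# dimensions and dimnames as the count matrix
lenMat <- matrix(len, nrow = nrow(dds), ncol = ncol(dds),
                 dimnames = dimnames(counts(dds)))

# Use the assays() accessor rather than writing into dds@assays directly
assays(dds)[["avgTxLength"]] <- lenMat

fpkmMat <- fpkm(dds)
head(fpkmMat)
```

Going through the accessor lets the object's validity checks catch a mismatch in dimensions at assignment time instead of later inside fpkm().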

deseq2 fpkm • 1.7k views
@mikelove
Last seen 4 hours ago
United States

hi,

You say vector, which sounds like you just have a single number per gene. Is that correct? In that case, if you take a look at ?fpkm, the thing to do is just:

mcols(dds)$basepairs <- x

Where x is your vector of gene lengths, lined up with the rows of dds.

Nope, neither worked. I'm working at the transcript level and I have the length for each transcript (calculated with a simple script from the GTF). So tlen is a data frame with transcript_ids as row names and one column of data labeled "Len". txOrder is a character array with the transcript_id values used in my dds object (after filtering) in the correct order. Replacing "xxx" or "x" in the above examples with tlen[txOrder,] (which produces a numeric vector) still gives the "dimnames" error when I try fpkm().

Can you make a small reproducible example, so I can take a look? Ideally with just simulated data, e.g. makeExampleDESeqDataSet.

In creating a subset example object, I can now get the mcols(dds)$basepairs <- x method to work. So now I'll go back and re-create my original dds object; perhaps I screwed it up with my other trial methods. But thanks, this seems to have solved my problem!
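The mcols(dds)$basepairs approach discussed in this thread can be sketched on simulated data along the lines suggested above. This is only an illustration: the names tlen, txOrder, and "Len" follow the thread, while the dataset from makeExampleDESeqDataSet and the length values are made up.

```r
library(DESeq2)

# Simulated stand-in for a filtered transcript-level dataset
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 50, m = 4)

# Hypothetical length table mimicking the thread: IDs as row names,
# one column "Len", deliberately stored in a different order than dds
ids <- rownames(dds)
tlen <- data.frame(Len = sample(500:5000, length(ids)), row.names = ids)
tlen <- tlen[sample(nrow(tlen)), , drop = FALSE]

# Reorder the lengths to match the rows of dds; selecting the column
# by name guarantees a plain numeric vector
txOrder <- rownames(dds)
x <- tlen[txOrder, "Len"]
stopifnot(length(x) == nrow(dds), !anyNA(x))

# Attach the lengths where fpkm() looks for them
mcols(dds)$basepairs <- x

fpkmMat <- fpkm(dds)
head(fpkmMat)
```

The stopifnot() line is a cheap guard: an ID missing from tlen would show up as NA at this point rather than as a confusing error later.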


For bp gene lengths, should I use the original CDS length or the effective CDS length (CDS length - read length)? Thanks!


I don't have any preference here. I tend to use methods like Salmon for calculating abundance, rather than FPKM from counts.

@steve-lianoglou-2771
Last seen 6 hours ago
Denali

Your DDS is a subclass of SummarizedExperiment. rowData(DDS) should return a DataFrame of gene info. So, you should be able to do something like:

rowData(DDS)$txlength <- xxx

where xxx is a properly ordered vector of gene lengths.
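A short sketch of the rowData() route on simulated data. Since ?fpkm names basepairs as the column it reads, the sketch uses that column name instead of txlength; the dataset and the lengths in `x` are made up for illustration.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 50, m = 4)

# Hypothetical lengths, assumed to be in the same order as rownames(dds)
x <- sample(500:5000, nrow(dds))

# Add the column through the SummarizedExperiment accessor
rowData(dds)$basepairs <- x

# The same column should be visible through mcols()
sameColumn <- identical(rowData(dds)$basepairs, mcols(dds)$basepairs)
sameColumn

fpkmMat <- fpkm(dds)
head(fpkmMat)
```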
