Add avgTxLength to DESeqDataSet
rhart • 0
@rhart-13682

I'd like to manually add a vector of transcript lengths to my DESeqDataSet (dds) object. I tried setting:

dds@assays$data$avgTxLength=data.frame(row.names=txOrder,"avgTxLength"=tlen[txOrder,])

Here txOrder holds the row names of the dds object, and tlen is the (out-of-order) table of transcript lengths for those row names. When I then call fpkm(), I get the error "invalid 'dimnames' given for data frame".

How do I add this vector so I can output fpkm?

 

@mikelove

Hi,

You say vector, which sounds like you just have a single number per gene. Is that correct? In that case, if you take a look at ?fpkm, the thing to do is just:

mcols(dds)$basepairs <- x

Where x is your vector of gene lengths, lined up with rows of dds.
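
As a minimal sketch of the suggestion above, using simulated data from makeExampleDESeqDataSet rather than the poster's real object: the lengths in `len` are made up for illustration, and `robust = FALSE` is chosen only so the result can be checked by hand against fpm().

```r
library(DESeq2)

set.seed(1)
# Simulated dataset: 100 features, 4 samples
dds <- makeExampleDESeqDataSet(n = 100, m = 4)

# Hypothetical feature lengths, one per row of dds and in row order
len <- sample(500:5000, nrow(dds), replace = TRUE)

# fpkm() picks up a column named "basepairs" in mcols(dds)
mcols(dds)$basepairs <- len

# robust = FALSE uses column sums as library sizes instead of size factors
fpkm_mat <- fpkm(dds, robust = FALSE)

# Manual check: FPKM is FPM scaled to a per-kilobase length
manual <- 1e3 * fpm(dds, robust = FALSE) / len
stopifnot(isTRUE(all.equal(fpkm_mat, manual)))
```

The key point is that the vector must have one entry per row of dds, in the same order as rownames(dds).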


Nope, neither worked. 

I'm working at the transcript level and I have the length of each transcript (calculated from the GTF with a simple script). So tlen is a data frame with transcript_ids as row names and one column of data labeled "Len". txOrder is a character vector holding the transcript_id values used in my dds object (after filtering), in the correct order.

Replacing "xxx" or "x" in the above examples with tlen[txOrder,] (which produces a numeric vector) still gives the "dimnames" error when I try fpkm().
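
For readers following along, the reordering step described here might look like the sketch below. The contents of `tlen` and `txOrder` are made-up stand-ins with the same shape as in the post (a one-column data frame keyed by transcript ID, and a character vector of IDs); only base R is used.

```r
# Hypothetical stand-in for the length table: row names are transcript IDs,
# deliberately out of order and including one ID that was filtered out
tlen <- data.frame(
  Len = c(1500, 900, 2200, 640),
  row.names = c("tx3", "tx1", "tx4", "tx2")
)

# Hypothetical stand-in for rownames(dds) after filtering
txOrder <- c("tx1", "tx2", "tx3")

# Every ID must be present, otherwise the lookup silently yields NA
stopifnot(all(txOrder %in% rownames(tlen)))

# Index by name and by column so the result is a plain numeric vector
x <- tlen[txOrder, "Len"]

# Sanity checks before assigning x to the dds object
stopifnot(is.numeric(x), length(x) == length(txOrder), !anyNA(x))
```

Naming the column explicitly avoids depending on the one-column drop behavior of tlen[txOrder, ], and the checks catch missing or misspelled IDs before fpkm() is called.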

 


Can you make a small reproducible example, so I can take a look? Ideally with just simulated data, e.g. makeExampleDESeqDataSet.


In creating a subset example object, I can now get the mcols(dds)$basepairs <- x method to work. So now I'll go back and re-create my original dds object; perhaps I screwed it up with my other trial methods. But thanks, this seems to have solved my problem!

 


For the gene lengths in base pairs, should I use the original CDS length or the effective CDS length (CDS length minus read length)? Thanks!


I don't have any preference here. I tend to use methods like Salmon for calculating abundance, rather than FPKM from counts.

@steve-lianoglou-2771

Your DDS is a subclass of SummarizedExperiment, so rowData(DDS) should return a DataFrame of gene info. You should be able to do something like rowData(DDS)$txlength <- xxx

Where xxx is a properly ordered vector of gene lengths.
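
A small sketch of how this relates to fpkm(), again on simulated data. As far as I can tell, rowData() and mcols() expose the same per-row table on a DESeqDataSet, and fpkm() looks specifically for a column named basepairs, so the column name used here differs from the "txlength" placeholder above. The constant length of 1000 is arbitrary.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 50, m = 4)

# Arbitrary illustrative length of 1 kb for every feature
rowData(dds)$basepairs <- rep(1000, nrow(dds))

# The same column is visible through mcols()
stopifnot(identical(mcols(dds)$basepairs, rowData(dds)$basepairs))

# With every length equal to 1 kb, FPKM equals FPM
stopifnot(isTRUE(all.equal(
  fpkm(dds, robust = FALSE),
  fpm(dds, robust = FALSE)
)))
```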
