Question

Add avgTxLength to DESeqDataSet

rhart • 0

@rhart-13682

Last seen 2.5 years ago

United States

I'd like to manually add a vector of transcript sizes to my DDS object. I tried setting:

dds@assays$data$avgTxLength=data.frame(row.names=txOrder,"avgTxLength"=tlen[txOrder,])

Where txOrder is the row names from the dds data and tlen is the (out of order) transcript list for those row names. The error I get when I try using fpkm() is "invalid 'dimnames' given for data frame".

How do I add this vector so I can output fpkm?

deseq2 fpkm • 4.2k views

ADD COMMENT • link updated 4.6 years ago by Michael Love 43k • written 8.5 years ago by rhart • 0

score 1 · Answer 1 · 2017-08-08

Michael Love 43k

@mikelove

Last seen 3 days ago

United States

Hi,

You say vector, which sounds like you just have a single number per gene. Is that correct? In that case, if you take a look at ?fpkm, the thing to do is just:

mcols(dds)$basepairs <- x

Where x is your vector of gene lengths, lined up with rows of dds.
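As a minimal sketch of the approach above, the following uses simulated data from makeExampleDESeqDataSet. The lengths in `len` are invented purely for illustration, and the names `len`, `x`, and `fpkm_mat` are placeholders rather than anything from the original post. It assumes DESeq2 is installed and relies on fpkm() picking up a column named `basepairs` in mcols(dds), as described in ?fpkm.

```r
library(DESeq2)

set.seed(1)

# Simulated dataset: 200 features, 6 samples (rows are named gene1..gene200)
dds <- makeExampleDESeqDataSet(n = 200, m = 6)

# Made-up feature lengths, keyed by feature ID and deliberately shuffled
len <- setNames(sample(500:5000, nrow(dds), replace = TRUE), rownames(dds))
len <- len[sample(names(len))]

# Line the lengths up with the rows of dds before assigning them
x <- unname(len[rownames(dds)])
stopifnot(length(x) == nrow(dds), !anyNA(x))

# fpkm() looks for a column with this exact name
mcols(dds)$basepairs <- x

fpkm_mat <- fpkm(dds)
head(fpkm_mat)
```

The result should be a features-by-samples matrix with the same row names as dds. Storing the lengths under a different column name would presumably not be picked up by fpkm().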

ADD COMMENT • link 8.5 years ago Michael Love 43k

Nope, neither worked.

I'm working at the transcript level and I have the length of each transcript (calculated from the GTF with a simple script). So tlen is a data frame with transcript_ids as row names and a single column of data labeled "Len". txOrder is a character vector holding the transcript_id values used in my dds object (after filtering), in the correct order.

Replacing "xxx" or "x" in the above examples with tlen[txOrder,] (which produces a numeric vector) still gives the "dimnames" error when I try fpkm().
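One way to rule out an ordering or missing-ID problem before assigning the lengths is a quick sanity check like the one sketched below. It uses base R only, with a made-up `tlen` and `txOrder` that mimic the structure described above; the transcript IDs and lengths are hypothetical.

```r
# Hypothetical length table: transcript IDs as row names, one column "Len",
# stored in a different order than the dds rows
tlen <- data.frame(Len = c(1500, 800, 2300),
                   row.names = c("tx3", "tx1", "tx2"))

# Hypothetical row order of the dds object, i.e. rownames(dds)
txOrder <- c("tx1", "tx2", "tx3")

# Every transcript in the dds must have a length, with no duplicated IDs
stopifnot(all(txOrder %in% rownames(tlen)), !anyDuplicated(txOrder))

# Select the column by name so the result is always a plain numeric vector
x <- tlen[txOrder, "Len"]

stopifnot(is.numeric(x), length(x) == length(txOrder), !anyNA(x))
x
```

If all of these checks pass on the real data, the vector itself is probably fine and the problem is more likely in the dds object.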

ADD REPLY • link 8.5 years ago rhart • 0

Can you make a small reproducible example, so I can take a look? Ideally with just simulated data, e.g. makeExampleDESeqDataSet.

ADD REPLY • link 8.5 years ago Michael Love 43k

In creating a subset example object, I can now get the mcols(dds)$basepairs<-x method to work. So now I'll go back and re-create my original dds object--perhaps I screwed it up with my other trial methods. But thanks--this seems to have solved my problem!

ADD REPLY • link 8.5 years ago rhart • 0

For bp gene lengths, should I use the original CDS length or the effective CDS length (CDS length - read length)? Thanks!

ADD REPLY • link 4.6 years ago Shujun • 0

I don't have any preference here. I tend to use methods like Salmon for calculating abundance, rather than FPKM from counts.
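To get a feel for how much the choice of length matters, here is a rough base R sketch using the textbook FPKM definition (count per kilobase of feature per million reads, scaled by total library size). All numbers are invented, `fpkm_from` is a hypothetical helper, and the effective length follows the definition in the question (CDS length minus read length).

```r
# Invented example: two features with equal counts but different lengths
counts      <- c(geneA = 500, geneB = 500)
cds_len     <- c(geneA = 600, geneB = 3000)  # CDS length in bp
read_len    <- 100                           # assumed read length in bp
total_reads <- 2e7                           # assumed library size

# Textbook FPKM: count / (length in kb) / (library size in millions)
fpkm_from <- function(counts, len_bp, total) {
  counts / (len_bp / 1e3) / (total / 1e6)
}

fpkm_full <- fpkm_from(counts, cds_len, total_reads)
fpkm_eff  <- fpkm_from(counts, cds_len - read_len, total_reads)

# Relative change caused by switching to the effective length
ratio <- fpkm_eff / fpkm_full
round(ratio, 3)
```

In this toy case the shorter feature changes by a factor of 600/500 and the longer one by 3000/2900, so the choice mostly affects features whose length is close to the read length.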

ADD REPLY • link 4.6 years ago Michael Love 43k

score 0 · Answer 2 · 2017-08-08

Steve Lianoglou ★ 13k

@steve-lianoglou-2771

Last seen 1 day ago

United States

The class of your DDS object is a subclass of SummarizedExperiment, so rowData(DDS) should return a DataFrame of gene info. You should therefore be able to do something like:

rowData(DDS)$txlength <- xxx

where xxx is a properly ordered vector of gene lengths.

ADD COMMENT • link 8.5 years ago Steve Lianoglou ★ 13k

score 0 · Answer 3 · 2017-08-08

rhart • 0

@rhart-13682

Last seen 2.5 years ago

United States

(deleted)

ADD COMMENT • link 8.5 years ago rhart • 0