Question

Mapping DESeq2 output back to chromosomal positions

0

Entering edit mode

rbronste ▴ 60

@rbronste-12189

Last seen 4.5 years ago

So I output a matrix from DiffBind as my input into DESeq2, with the chromosome positions being removed from the count matrix, just leaving the first column key as a reference followed by the counts in all the replicates in the matrix:

dds<-DESeqDataSetFromMatrix(countData = countData,
                            colData = colData,
                            design =  ~ sex + treatment)

Following this I ran DESeq2 and have output the results as follows:

write.csv(as.data.frame(results), file="results.csv")

Now this of course is a limited list by LFC and adjusted p-val so I was wondering how I can remap this list to the original matrix output from DiffBind so I can recover the chromosome positions since this contains only the key.

Thanks!

deseq2 deseq diffbind differential binding analysis • 1.3k views

ADD COMMENT • link updated 6.4 years ago by Michael Love 41k • written 6.4 years ago by rbronste ▴ 60

0

Entering edit mode

I guess another way to look at is how to from the following results recover the associated countData?

head(results)
log2 fold change (MLE): +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
Wald test p-value: +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
DataFrame with 6 rows and 6 columns
        baseMean log2FoldChange     lfcSE      stat    pvalue      padj
       <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
440385  24.01924      -2.625855  1.659993 0.0000000        NA        NA
440386  30.58690       4.390651  1.499186 1.5946323        NA        NA
440387  32.86232       2.876218  1.407257 0.6226421 0.2667599         1
440388  19.38288      -1.482686  1.485418 0.0000000 0.9904758         1
440389  80.59038       1.277523  1.458940 0.0000000 0.6897729         1
440390  41.60001       0.122999  1.434733 0.0000000 0.9046071         1

ADD REPLY • link 6.4 years ago rbronste ▴ 60

score 1 · Answer 1 · 2017-12-19

1

Entering edit mode

Michael Love 41k

@mikelove

Last seen 14 hours ago

United States

You can match by the rownames, right? Here you would just use base R functionality, e.g. ordering by a character vector of rownames with square brackets or using the match() function.

Note that DESeqDataSet is actually built on RangedSummarizedExperiment objects, and so results() can output *ranges* in this case, see ?results. This would be my preferred option, to give DESeq2 ranged data to begin with, and then the package will keep track of this for you.

ADD COMMENT • link 6.4 years ago Michael Love 41k

0

Entering edit mode

I guess another way I considered doing it is by adding the (CHR, START, STOP) columns as metadata to the:

DESeqDataSetFromMatrix

As they would be there initially from DiffBind if I did not strip them out. This would leave them in the results if I understand correctly?

Is there a good way to go about this given the info above? Thanks!

ADD REPLY • link 6.4 years ago rbronste ▴ 60

1

Entering edit mode

I’d recommend making a RangedSummarizedExperiment object (it’s the same amount of work as what you are suggesting) then use DESeqDataSet(rse, design). Then everything will “just work”, and you can get a ranged results table.

ADD REPLY • link 6.4 years ago Michael Love 41k

0

Entering edit mode

Thanks for the advice!

I think the following (from DiffBind) should do the job of outputting the RangedSummarizedExperiment object that I can use in DESeq2.

dba.peakset

ADD REPLY • link 6.4 years ago rbronste ▴ 60