Question: Mapping DESeq2 output back to chromosomal positions
0
gravatar for rbronste
23 months ago by
rbronste60
rbronste60 wrote:

So I output a matrix from DiffBind as my input into DESeq2, with the chromosome positions being removed from the count matrix, just leaving the first column key as a reference followed by the counts in all the replicates in the matrix:

dds<-DESeqDataSetFromMatrix(countData = countData,
                            colData = colData,
                            design =  ~ sex + treatment)

Following this I ran DESeq2 and have output the results as follows:

write.csv(as.data.frame(results), file="results.csv")

Now this of course is a limited list by LFC and adjusted p-val so I was wondering how I can remap this list to the original matrix output from DiffBind so I can recover the chromosome positions since this contains only the key. 

Thanks!

 

ADD COMMENTlink modified 23 months ago by Michael Love26k • written 23 months ago by rbronste60

I guess another way to look at is how to from the following results recover the associated countData?

head(results)
log2 fold change (MLE): +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
Wald test p-value: +1,-0.333333333333333,-0.333333333333333,-0.333333333333333 
DataFrame with 6 rows and 6 columns
        baseMean log2FoldChange     lfcSE      stat    pvalue      padj
       <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
440385  24.01924      -2.625855  1.659993 0.0000000        NA        NA
440386  30.58690       4.390651  1.499186 1.5946323        NA        NA
440387  32.86232       2.876218  1.407257 0.6226421 0.2667599         1
440388  19.38288      -1.482686  1.485418 0.0000000 0.9904758         1
440389  80.59038       1.277523  1.458940 0.0000000 0.6897729         1
440390  41.60001       0.122999  1.434733 0.0000000 0.9046071         1
ADD REPLYlink modified 23 months ago • written 23 months ago by rbronste60
Answer: Mapping DESeq2 output back to chromosomal positions
1
gravatar for Michael Love
23 months ago by
Michael Love26k
United States
Michael Love26k wrote:

You can match by the rownames, right? Here you would just use base R functionality, e.g. ordering by a character vector of rownames with square brackets or using the match() function.

Note that DESeqDataSet is actually built on RangedSummarizedExperiment objects, and so results() can output *ranges* in this case, see ?results. This would be my preferred option, to give DESeq2 ranged data to begin with, and then the package will keep track of this for you.

ADD COMMENTlink written 23 months ago by Michael Love26k

I guess another way I considered doing it is by adding the (CHR, START, STOP) columns as metadata to the: 

DESeqDataSetFromMatrix

As they would be there initially from DiffBind if I did not strip them out. This would leave them in the results if I understand correctly? 

Is there a good way to go about this given the info above? Thanks!

ADD REPLYlink written 23 months ago by rbronste60
1

I’d recommend making a RangedSummarizedExperiment object (it’s the same amount of work as what you are suggesting) then use DESeqDataSet(rse, design). Then everything will “just work”, and you can get a ranged results table.

ADD REPLYlink modified 23 months ago • written 23 months ago by Michael Love26k

Thanks for the advice! 

I think the following (from DiffBind) should do the job of outputting the RangedSummarizedExperiment object that I can use in DESeq2.

dba.peakset
ADD REPLYlink written 23 months ago by rbronste60
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 373 users visited in the last hour