Paired analysis DESeq2 problems
georgina.fqw ▴ 10
@georginafqw-23788

Hi,

I have 50 samples from 25 individuals. These are paired samples (tumour and matched normal), and I want to test for differences between tumour and normal while taking the individual into account.

My colData looks like this:

Sample Condition
1 N
1 T
2 N
2 T
3 N
3 T
.
.
25 N
25 T

dds_1 <- DESeqDataSetFromMatrix(countData = count_matrix, colData=colData, design = ~ Condition)
dds_2 <- DESeqDataSetFromMatrix(countData = count_matrix, colData = colData, design = ~ Condition + Sample)
converting counts to integer mode
  the design formula contains one or more numeric variables with integer values,
  specifying a model with increasing fold change for higher values.
  did you mean for this to be a factor? if so, first convert
  this variable to a factor using the factor() function
  the design formula contains one or more numeric variables that have mean or
  standard deviation larger than 5 (an arbitrary threshold to trigger this message).
  it is generally a good idea to center and scale numeric variables in the design
  to improve GLM convergence.
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors

# perform DEA
dea_1 <- DESeq(dds_1)
dea_2 <- DESeq(dds_2)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
1 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest

result analysis

res_1 <- results(dea_1)
res_2 <- results(dea_2)

Results from the first run look OK:

> res_1
log2 fold change (MLE): Condition T vs N 
Wald test p-value: Condition T vs N 
DataFrame with 30161 rows and 6 columns

However, I am not sure whether the second analysis ran correctly, as I can only see "Sample":

> res_2
log2 fold change (MLE): Sample 
Wald test p-value: Sample 
DataFrame with 30161 rows and 6 columns

Thank you!

@mikelove

The note printed above, when the dataset was constructed, says clearly what to do, in my opinion.

You’ve coded sample as a numeric which is not what you want. Sample 5 is not sample 2 + sample 3.

Did the note not make sense?
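
A minimal sketch of the conversion the note asks for, assuming colData has the Sample and Condition columns shown in the question. The toy values (three individuals) are made up for illustration, and choosing N as the reference level is an assumption about which direction of comparison is wanted.

```r
# Toy colData mirroring the layout in the question (3 individuals only).
colData <- data.frame(
  Sample    = rep(1:3, each = 2),
  Condition = rep(c("N", "T"), times = 3)
)

# As read in, Sample is numeric, so a design would fit a single slope
# across the ID values rather than one effect per individual.
is_numeric_before <- is.numeric(colData$Sample)
n_sample_cols_numeric <- ncol(model.matrix(~ Sample, colData)) - 1

# Convert both columns to factors; N is made the reference level so that
# fold changes are reported as T vs N.
colData$Sample    <- factor(colData$Sample)
colData$Condition <- factor(colData$Condition, levels = c("N", "T"))

# With a factor, the design has one column per individual beyond the first.
n_sample_cols_factor <- ncol(model.matrix(~ Sample, colData)) - 1
```

Comparing the two column counts shows the difference: one slope for the numeric coding, versus one coefficient per non-reference individual for the factor coding.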


Thank you, Michael, for your help.

I have now converted Sample and Condition to factors (with as.factor), re-run the analysis, and got this for the second analysis:

> res_2
log2 fold change (MLE): Sample 29 vs 1 
Wald test p-value: Sample 29 vs 1 
DataFrame with 30161 rows and 6 columns

So it looks like the analysis compared sample 29 vs sample 1. How do I compare T vs N while taking Sample into account? Thanks again for your help!


This is covered in the documentation, see the vignette on designs with multiple factors.
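
For a simple paired design, a rough sketch of what that section describes: results() reports the last variable in the design by default, so the variable of interest goes last, or the comparison is requested explicitly. The toy colData below is made up for illustration and assumes Sample and Condition are already factors. The DESeq2 calls are guarded so they only run when the package is installed and a count_matrix with matching columns exists in the session.

```r
# Toy colData mirroring the question (3 individuals, hypothetical values).
# With real data this would be the full table, one row per count column.
colData <- data.frame(
  Sample    = factor(rep(1:3, each = 2)),
  Condition = factor(rep(c("N", "T"), times = 3), levels = c("N", "T"))
)

# Individual first, condition of interest last.
design <- ~ Sample + Condition

# Inspect the coefficients this design fits; the last one is T vs N.
coef_names <- colnames(model.matrix(design, colData))

# Only runs if DESeq2 is available and count_matrix matches colData.
if (requireNamespace("DESeq2", quietly = TRUE) &&
    exists("count_matrix") &&
    ncol(count_matrix) == nrow(colData)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = count_matrix,
                                        colData = colData,
                                        design = design)
  dds <- DESeq2::DESeq(dds)
  # Ask for the comparison explicitly instead of relying on the default.
  res <- DESeq2::results(dds, contrast = c("Condition", "T", "N"))
}
```

With the original order, ~ Condition + Sample, the same contrast argument should still return T vs N; only the default shown by a bare results() call changes.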


Thank you for the suggestion. I have read the vignette, but my design is not as complex as the one in "Group-specific condition effects, individuals nested within groups", and I am not sure whether I have to create another column like "ind.n" in that example. My "Sample" column already identifies the individuals (one individual, two samples). Maybe I misread it. Your help would be very much appreciated. Thank you!
