Paired analysis DESeq2 problems
georgina.fqw ▴ 10
@georginafqw-23788

Hi,

I have 50 samples from 25 individuals. These are paired samples (tumour and matched normal), and I want to see the difference between tumour and normal while taking the individuals into account.

my colData looks like this:

Sample Condition
1 N
1 T
2 N
2 T
3 N
3 T
...
25 N
25 T

dds_1 <- DESeqDataSetFromMatrix(countData = count_matrix, colData=colData, design = ~ Condition)
dds_2 <- DESeqDataSetFromMatrix(countData = count_matrix, colData = colData, design = ~ Condition + Sample)
converting counts to integer mode
the design formula contains one or more numeric variables with integer values,
specifying a model with increasing fold change for higher values.
did you mean for this to be a factor? if so, first convert
this variable to a factor using the factor() function
the design formula contains one or more numeric variables that have mean or
standard deviation larger than 5 (an arbitrary threshold to trigger this message).
it is generally a good idea to center and scale numeric variables in the design
to improve GLM convergence.
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
some variables in design formula are characters, converting to factors

# perform DEA
dea_1 <- DESeq(dds_1)
dea_2 <- DESeq(dds_2)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
1 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest


# result analysis

res_1 <- results(dea_1)
res_2 <- results(dea_2)


Results from the first run look OK

> res_1
log2 fold change (MLE): Condition T vs N
Wald test p-value: Condition T vs N
DataFrame with 30161 rows and 6 columns


However, I am not sure whether the second analysis ran correctly, as I can only see "Sample"

> res_2
log2 fold change (MLE): Sample
Wald test p-value: Sample
DataFrame with 30161 rows and 6 columns


Thank you!

DESeq2 • 382 views
@mikelove

The note printed above, when the dataset was created, says clearly what to do, in my opinion.

You’ve coded sample as a numeric which is not what you want. Sample 5 is not sample 2 + sample 3.

Did the note not make sense?
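
To make the numeric-versus-factor point concrete, here is a minimal base-R sketch. The three-individual `colData` is a made-up toy example, not the poster's data, and `model.matrix()` is used only to show which coefficients each coding produces.

```r
# Hypothetical toy colData: 3 individuals, each with a normal (N) and a tumour (T) sample
colData <- data.frame(
  Sample    = rep(1:3, each = 2),
  Condition = rep(c("N", "T"), times = 3)
)

# Sample left as numeric: the design gets ONE "Sample" column,
# i.e. a linear trend across sample IDs
mm_numeric <- model.matrix(~ Condition + Sample, data = colData)
colnames(mm_numeric)

# Sample as a factor: one indicator column per individual (relative to the first)
colData$Sample    <- factor(colData$Sample)
colData$Condition <- factor(colData$Condition, levels = c("N", "T"))
mm_factor <- model.matrix(~ Condition + Sample, data = colData)
colnames(mm_factor)
```

With the numeric coding the model fits a single slope for `Sample`; with the factor coding each individual gets its own term, which is what a paired design needs.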


Thank you, Michael, for your help.

I have now converted Sample and Condition to factors (with as.factor), re-run the analysis, and got this for the second analysis:

> res_2
log2 fold change (MLE): Sample 29 vs 1
Wald test p-value: Sample 29 vs 1
DataFrame with 30161 rows and 6 columns


So it looks like the analysis compared sample 29 vs sample 1. How can I compare T vs N while taking Sample into account? Thanks again for your help!


This is covered in the documentation; see the vignette on designs with multiple factors.
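
As a hedged sketch of the usual paired setup (an illustration, not a quote from the vignette): by default `results()` reports the last variable in the design formula, so putting the pairing factor first and `Condition` last makes the T vs N effect the default result. The toy `colData` below is invented; the DESeq2 calls are shown as comments and assume the `count_matrix` object from the question.

```r
# Hypothetical toy colData with both variables already coded as factors
colData <- data.frame(
  Sample    = factor(rep(1:3, each = 2)),
  Condition = factor(rep(c("N", "T"), times = 3), levels = c("N", "T"))
)

# Pairing factor first, variable of interest last
design_formula <- ~ Sample + Condition
mm <- model.matrix(design_formula, data = colData)

# The last coefficient is the tumour-vs-normal effect, adjusted for individual
last_coef <- tail(colnames(mm), 1)
last_coef

# With DESeq2 (assumes count_matrix from the question; not run here):
# dds <- DESeqDataSetFromMatrix(countData = count_matrix, colData = colData,
#                               design = ~ Sample + Condition)
# dds <- DESeq(dds)
# resultsNames(dds)   # lists the available coefficients
# res <- results(dds, contrast = c("Condition", "T", "N"))   # explicit T vs N
```

Naming the contrast explicitly gives the T vs N comparison regardless of the order of terms in the design.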


Thank you for the suggestion. I have read the vignette, but my design is not as complex as the one in "Group-specific condition effects, individuals nested within groups", and I am not sure whether I have to create another column like "ind.n" in that example. My "Sample" column already identifies the individuals (one individual, two samples). Maybe I misread it. Your help would be very much appreciated! Thank you!