I have 50 samples from 25 individuals. These are paired samples (tumour and matched normal), and I want to test for differences between tumour and normal while taking the individual into account.
My colData looks like this:
25 N 25 T
dds_1 <- DESeqDataSetFromMatrix(countData = count_matrix, colData = colData, design = ~ Condition)
dds_2 <- DESeqDataSetFromMatrix(countData = count_matrix, colData = colData, design = ~ Condition + Sample)

converting counts to integer mode
  the design formula contains one or more numeric variables with integer values,
  specifying a model with increasing fold change for higher values.
  did you mean for this to be a factor? if so, first convert
  this variable to a factor using the factor() function
  the design formula contains one or more numeric variables that have mean or
  standard deviation larger than 5 (an arbitrary threshold to trigger this message).
  it is generally a good idea to center and scale numeric variables in the design
  to improve GLM convergence.
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors

# perform DEA
dea_1 <- DESeq(dds_1)
dea_2 <- DESeq(dds_2)

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
1 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
res_1 <- results(dea_1)
res_2 <- results(dea_2)
The results from the first run look OK:
> res_1
log2 fold change (MLE): Condition T vs N
Wald test p-value: Condition T vs N
DataFrame with 30161 rows and 6 columns
However, I am not sure whether the second analysis ran correctly, as the results header only mentions "Sample":
> res_2
log2 fold change (MLE): Sample
Wald test p-value: Sample
DataFrame with 30161 rows and 6 columns
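The messages printed when building dds_2 suggest that Sample was read as a numeric variable, and results() reports the last variable in the design formula by default, which would explain why only "Sample" appears in the header. Below is a minimal base R sketch of how the paired design could be set up instead. It assumes that Sample holds a numeric individual ID (1 to 25) and that Condition holds "N"/"T"; the colData built here is hypothetical and only stands in for the real one.

```r
# Hypothetical colData: 25 individuals, each with a normal (N) and a tumour (T) sample.
# ASSUMPTION: Sample is a numeric individual ID, as the DESeq2 messages suggest.
colData <- data.frame(
  Sample    = rep(1:25, times = 2),
  Condition = rep(c("N", "T"), each = 25),
  stringsAsFactors = FALSE
)

# Convert both variables to factors; "N" is set explicitly as the reference level.
colData$Sample    <- factor(colData$Sample)
colData$Condition <- factor(colData$Condition, levels = c("N", "T"))

# Blocking factor (individual) first, variable of interest last.
design <- ~ Sample + Condition
mm <- model.matrix(design, data = colData)

ncol(mm)               # 26: intercept + 24 individual terms + 1 condition term
tail(colnames(mm), 1)  # "ConditionT", the tumour vs normal coefficient
```

With colData prepared this way, passing design = ~ Sample + Condition to DESeqDataSetFromMatrix should make results() report Condition T vs N by default, now adjusted for the individual. Alternatively, the comparison can be requested explicitly with results(dea_2, contrast = c("Condition", "T", "N")).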