**20**wrote:

Hello All,

I am performing a DEG analysis using DESeq2 and would really appreciate your advice on the results(contrast) section of my analysis.

I have 5 disease categories(RM, RP, SP_progress, SP_stable and healthy) for each category we extracted monocytes and treated these monocytes with different concentration of a chemical (10nm,50nm,100nm,200nm and 0nm/none). The goal is to perform DE analysis within each group by comparing monocytes with different treatment compositions (Eg: For disease category RM, I am interested in comparing 10nm vs 100nm, 10nm vs 50nm, 100nm vs none..etc). My code is as follows:

ddsComplete <- DESeqDataSetFromMatrix(countData=FilteredCounts, colData=phenoFile,design=~0+Monocytes:Disease)

ddsComplete <- DESeq(ddsComplete)

PhenoFile:

head(phenoFile) Position Disease Monocytes 1 A1 RM None 2 B1 RM 10nm 3 C1 RM 50nm 4 D1 RP 10nm 5 E1 RP 50nm 6 F1 RP 100nm

Results:

resultsNames(ddsComplete)

[1] "Monocytes100nm.DiseaseHealthy"

[2] "Monocytes10nm.DiseaseHealthy"

[3] "Monocytes50nm.DiseaseHealthy"

[4] "MonocytesNone.DiseaseHealthy"

[5] "Monocytes200nm.DiseaseHealthy"

[6] "Monocytes100nm.DiseaseRM"

[7] "Monocytes10nm.DiseaseRM"

[8] "Monocytes50nm.DiseaseRM"

[9] "MonocytesNone.DiseaseRM"

[10] "Monocytes200nm.DiseaseRM"

[11] "Monocytes100nm.DiseaseRP"

[12] "Monocytes10nm.DiseaseRP"

[13] "Monocytes50nm.DiseaseRP"

[14] "MonocytesNone.DiseaseRP"

[15] "Monocytes200nm.DiseaseRP"

[16] "Monocytes100nm.DiseaseSP_progress"

[17] "Monocytes10nm.DiseaseSP_progress"

[18] "Monocytes50nm.DiseaseSP_progress"

[19] "MonocytesNone.DiseaseSP_progress"

[20] "Monocytes200nm.DiseaseSP_progress"

[21] "Monocytes100nm.DiseaseSP_stable"

[22] "Monocytes10nm.DiseaseSP_stable"

[23] "Monocytes50nm.DiseaseSP_stable"

[24] "MonocytesNone.DiseaseSP_stable"

[25] "Monocytes200nm.DiseaseSP_stable"

In order to extract results for e.g. comparing 100nm vs 200nm for Disease category RP, I wrote the following contrast:

RP100_vs_200 = results(ddsComplete, contrast=c('Monocytes','100nm.DiseaseRP','200nm.DiseaseRP'))

>RP100_vs_200

log2 fold change (MLE): Monocytes 100nm.DiseaseRP vs 200nm.DiseaseRP

Wald test p-value: Monocytes 100nm.DiseaseRP vs 200nm.DiseaseRP

DataFrame with 16770 rows and 6 columns

baseMean log2FoldChange lfcSE stat pvalue padj

<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>

A1BG 0.77580621 -0.4489085 0.5952707 -0.75412502 0.45077415 0.9999971

A1BG-AS1 0.14957383 0.3799384 1.7488089 0.21725552 0.82800922 0.9999971

A1CF 0.01171596 -0.3450653 10.0207498 -0.03443507 0.97253022 0.9999971

A2M 0.46024794 -1.5849584 0.7909503 -2.00386595 0.04508442 0.9999971

A2M-AS1 0.01496816 -0.3450659 10.0207498 -0.03443514 0.97253016 0.9999971

... ... ... ... ... ... ...

ZYG11A 1.3799655 0.1101611 0.4711617 0.2338074 0.8151345 0.9999971

ZYG11B 29.3911744 -0.2092063 0.5674405 -0.3686841 0.7123632 0.9999971

ZYX 0.8827047 0.2417958 0.6352289 0.3806436 0.7034677 0.9999971

ZZEF1 0.4399459 -0.3553409 0.7771330 -0.4572459 0.6474943 0.9999971

ZZZ3 0.5939009 -0.5578059 0.6982475 -0.7988655 0.4243684 0.9999971

Could someone please help me understand how the contrast works? For the above result(RP100_vs_200) does the contrast function internally subsets groups and then generates table based upon the inputs given to it? If I was to subset my countData and phenoFile to just include disease category RP and re-run the analysis using design=~Monocytes, would the results for 100nm_vs_200nm be similiar? I expect the results to be little different for RP since the estimates would be different for subset RP considering the sample size.

Hope the question is clear! Thank you!!!