DESeq2 normalization of population data
@vrehaman-14957

Dear Team,

I want to perform differential expression analysis on counts generated by featureCounts. Each family has a different disease. Can we perform differential expression across all samples? Or, since the disease differs from family to family, should we analyze each family separately?

Example sample information is like below

SampleID  condition  Family
Sample1   Normal     Fam1
Sample2   Diseased   Fam1
Sample3   Normal     Fam2
Sample4   Diseased   Fam2
Sample5   Normal     Fam3
Sample6   Diseased   Fam3
Sample7   Normal     Fam4
Sample8   Diseased   Fam4
Sample9   Normal     Fam5
Sample10  Diseased   Fam5
 

I followed the post linked below to combine condition and family into a single group factor and obtain multiple comparisons. Is it OK to proceed like this?

DESEq2 comparison with mulitple cell types under 2 conditions

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata , design = ~ Family + condition)

dds$group <- factor(paste0(dds$Family, dds$condition))

design(dds) <- ~ group
dds <- DESeq(dds)
resultsNames(dds)

Could you please let me know your suggestions.

Thanks In Advance

Fazulur Rehaman

@mikelove

What kind of genes are you looking for? Genes commonly DE in diseased relative to normal?

You cannot perform DE within family here, because there are no "replicates" (I don't know if these are human donors, or what exactly a sample refers to...).


Dear Michael,

Thanks a lot for your quick response.

Please find below the details

> What kind of genes are you looking for? Genes commonly DE in diseased relative to normal?

We are looking for diabetes- and obesity-related genes that are commonly DE in diseased relative to normal samples.

> You cannot perform DE within family here, because there are no "replicates" (I don't know if these are human donors, or what exactly a sample refers to...).

Yes, these are human donors with diabetes or obesity. Each family has one control and one diseased sample (more than 6 families). The disease may be either diabetes or obesity.

Please suggest how I can proceed with DE.

Thanks In Advance

Fazulur Rehaman


You can use ~family + condition, which will control for family baseline, while finding genes where the diseased samples show DE relative to normal.
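To make the "control for family baseline" point concrete, here is a minimal base-R sketch of the design matrix that a formula like ~ family + condition expands to. The sample sheet is a hypothetical three-family toy (not the poster's data), and model.matrix() is used only to show the columns; DESeq2 itself is not called here.

```r
# Toy sample sheet mirroring the layout in the question (hypothetical values).
coldata <- data.frame(
  family    = factor(rep(c("Fam1", "Fam2", "Fam3"), each = 2)),
  condition = factor(rep(c("Normal", "Diseased"), times = 3),
                     levels = c("Normal", "Diseased"))
)

# Expand the formula into the design matrix a linear model would fit:
# an intercept, one baseline-shift column per non-reference family,
# and a single column for the condition effect shared across families.
X <- model.matrix(~ family + condition, data = coldata)
print(colnames(X))

# Samples minus estimated coefficients = residual degrees of freedom.
resid_df <- nrow(X) - qr(X)$rank
print(resid_df)
```

With one Normal and one Diseased sample per family, the shared condition column is what leaves residual degrees of freedom. A design with one coefficient per family-by-condition group would use one coefficient per sample and leave none, which is the "no replicates" problem mentioned above.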


Dear Michael,

Thanks a lot for your suggestions.

At first, I used the same model, ~family + condition.

Here are the details:

> which will control for family baseline, while finding genes where the diseased samples show DE relative to normal.

So this gives only one comparison, diseased vs. normal, irrespective of which disease it is, right?

> resultsNames(dds)
[1] "Intercept"                    "family_Fam139_vs_Fam10"
[3] "family_Fam193_vs_Fam10"       "family_Fam43_vs_Fam10"
[5] "family_Fam52_vs_Fam10"        "family_Fam53_vs_Fam10"
[7] "family_Fam55_vs_Fam10"        "family_Fam8_vs_Fam10"
[9] "condition_Normal_vs_Diseased"

Building the results table for the Diseased vs. Normal comparison:

res1 <- results(dds, contrast=c("condition","Diseased","Normal"))

resultsNames() lists the coefficient as "condition_Normal_vs_Diseased". Since we want genes DE in diseased relative to normal, I passed "Diseased" followed by "Normal" to results(). Could you confirm that this is OK?

resultsNames() also lists other possible contrasts, such as "family_Fam139_vs_Fam10" and "family_Fam43_vs_Fam10". How can I use them?

Please let me know your suggestions.

Thanks In Advance

Fazulur Rehaman


Take a look at the DESeq2 vignette note on factor levels, where this is discussed.

But in short, what you have above is fine: you can always specify the LFC you want using contrast. The object res1 is your results table of interest.

You do not use the other coefficients listed in resultsNames; they are nuisance coefficients for your purposes. Those coefficients control for the family baselines.
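The factor-level point can be illustrated with a small base-R analogy. The sketch below fits an ordinary linear model with lm() to hypothetical log2 values for one gene; it is not DESeq2's negative binomial fit, but the same rule applies: the reference level only decides the direction in which the condition coefficient is reported. All numbers and column names here are made up for illustration.

```r
# Toy log2 expression for one gene in three families (hypothetical numbers).
d <- data.frame(
  family    = factor(rep(c("Fam1", "Fam2", "Fam3"), each = 2)),
  condition = rep(c("Normal", "Diseased"), times = 3),
  log2expr  = c(5.0, 6.1, 7.0, 7.9, 4.0, 5.0)
)

# Default factor levels are alphabetical, so "Diseased" becomes the reference.
d$cond_default <- factor(d$condition)
print(levels(d$cond_default))

# Explicitly make "Normal" the reference level instead.
d$cond_releveled <- relevel(d$cond_default, ref = "Normal")

fit_default   <- lm(log2expr ~ family + cond_default,   data = d)
fit_releveled <- lm(log2expr ~ family + cond_releveled, data = d)

# Same effect, opposite sign: Normal vs Diseased and Diseased vs Normal.
normal_vs_diseased <- unname(coef(fit_default)["cond_defaultNormal"])
diseased_vs_normal <- unname(coef(fit_releveled)["cond_releveledDiseased"])
print(c(normal_vs_diseased = normal_vs_diseased,
        diseased_vs_normal = diseased_vs_normal))
```

This is why a coefficient named "condition_Normal_vs_Diseased" and a contrast given as Diseased followed by Normal describe the same comparison with opposite signs. Releveling the condition factor so that Normal is the reference before fitting would make the default coefficient match the direction of interest.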


Dear Michael,

Thanks a lot for your confirmation and suggestions on factor levels.

Thanks & Regards

Fazulur Rehaman


Dear Michael,

I have one more question. Since we have only one comparison, diseased relative to normal, is there any way to tell in which families a given gene is up- or down-regulated?

Please let me know.

Thanks & Regards

Fazulur Rehaman


You can look at plotCounts()
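plotCounts() shows the normalized counts of a single gene per sample, and with returnData = TRUE it returns a data frame instead of drawing the plot. As a rough sketch of how such per-gene data could be read family by family, the base-R example below builds a hypothetical data frame by hand (the column names count, condition and family, and all values, are assumptions for illustration) and reports the direction of change within each family.

```r
# Hypothetical per-sample normalized counts for ONE gene, built by hand
# so the example runs without DESeq2 installed.
df <- data.frame(
  count     = c(100, 410, 80, 300, 120, 110),
  condition = rep(c("Normal", "Diseased"), times = 3),
  family    = rep(c("Fam1", "Fam2", "Fam3"), each = 2)
)

# Family x condition table of counts (one sample per cell in this layout).
m <- tapply(df$count, list(df$family, df$condition), sum)

# Within-family log2 ratio of Diseased over Normal, and its direction.
log2ratio <- log2(m[, "Diseased"] / m[, "Normal"])
direction <- ifelse(log2ratio > 0, "up", "down")

print(data.frame(family = rownames(m),
                 log2ratio = round(log2ratio, 2),
                 direction = direction))
```

Because each family contributes only one sample per condition, these per-family ratios are descriptive only; they show which families follow the overall trend for a gene but carry no statistical test of their own.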
