Search
Question: about DEseq for spike-in datasets
0
3.1 years ago by
United States
gan.haiyun0 wrote:

I try to use DEseq2 to Identify the changed genes in my two chip-seq data.

Because our experiment design is use the spike-in chromatin to do

normalization, so we do not need estimate the effective library size.

I just assign a value to sizeFactors like:

> sizeFactors(data) <-- c(-1,-2)

> sizeFactors(data)

> hj67 hj68

1    2

> res = nbinomTest(data, "wt", "ko" );

> My question is :

1) I am not sure whether i can do like this ?

2) Another question is: when I want to the sizeFactors is 1 and 2,  why I need input -1 and -2 ?

modified 3.1 years ago • written 3.1 years ago by gan.haiyun0

Before I reply: Please correct the post type from "tutorial" for "question". Otherwise, this might end up anywhere.

1
3.1 years ago by
Simon Anders3.5k
Zentrum für Molekularbiologie, Universität Heidelberg
Simon Anders3.5k wrote:

2) Another question is: when I want to the sizeFactors is 1 and 2,  why I need input -1 and -2 ?

What makes you think that you have to input -1 and -2? This is, of course, wrong. If the size factors you calculated are 1.23 and 0.78, then you should put

sizeFactors(data) <- c( 1.23, 0.78 )

1) I am not sure whether i can do like this ?

Yes, of course, you can manually specify size factors, provided that they are correct. Whether your size factors are correct depends on how you calculated them. Your example, "1 and 2", does not look like realistic values.

However, please note that spike-ins are suitable and helpful for normalization only in certain circumstances. You would need to explain more about your experiment (and especially: on when exactly in the sample prep you spike them in) to see whether they are appropriate.

0
3.1 years ago by
United States
gan.haiyun0 wrote:

Thank you very much !

base on the spike-in, we decide the sizeFactors is 1, 3.82, 3.63.

so i run R:

> count_table <- read.table("count.txt", header=T, sep="\t", row.names=1);
> expt_design <- data.frame(row.names = colnames(count_table), condition = c("wt","ko","ko"));
> conditions = expt_design\$condition;
> library("DESeq");
> data <- newCountDataSet(count_table, conditions)
> sizeFactors(data) <- c(1, 3.82, 3.63)
> data = estimateDispersions(data)

> res = nbinomTest(data, "wt", "ko" );

when i running estimateDispersions, i get the following warning messages.

i am not sure this warning messages will effect the results ?

warning messages:

1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
2: step size truncated due to divergence
3: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
4: step size truncated due to divergence
5: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
6: step size truncated due to divergence 

If the function runs without printing Error, I believe you can ignore these Warnings, which come from intermediate steps, but in the end the dispersion estimation routine converged.

0
3.1 years ago by
Michael Love19k
United States
Michael Love19k wrote:

hi Haiyun,

A note: from your code I can tell you are not using DESeq2 but DESeq (the original software). Please post the output of

sessionInfo()

when posting to the support site so we can see what versions you are using. In the future, you might consider updating to DESeq2, as this is the version which is actively maintained.

0
3.1 years ago by
United States
gan.haiyun0 wrote:

Thank you very much.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] DESeq_1.10.1       lattice_0.20-29    locfit_1.5-9.1     Biobase_2.18.0
[5] BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] annotate_1.36.0      AnnotationDbi_1.20.7 DBI_0.2-7
[4] genefilter_1.40.0    geneplotter_1.36.0   grid_2.15.2
[7] IRanges_1.16.6       parallel_2.15.2      RColorBrewer_1.0-5
[10] RSQLite_0.11.4       splines_2.15.2       stats4_2.15.2
[13] survival_2.37-7      XML_3.98-1.1         xtable_1.7-3