about DEseq for spike-in datasets
4
0
Entering edit mode
gan.haiyun • 0
@ganhaiyun-8692
Last seen 8.7 years ago
United States

I try to use DEseq2 to Identify the changed genes in my two chip-seq data.

 Because our experiment design is use the spike-in chromatin to do

 normalization, so we do not need estimate the effective library size.

 I just assign a value to sizeFactors like:

> sizeFactors(data) <-- c(-1,-2)

> sizeFactors(data)

> hj67 hj68

   1    2

> res = nbinomTest(data, "wt", "ko" );

> My question is :

 

 1) I am not sure whether i can do like this ?

 2) Another question is: when I want to the sizeFactors is 1 and 2,  why I need input -1 and -2 ?

normalization deseq • 3.5k views
ADD COMMENT
0
Entering edit mode

Before I reply: Please correct the post type from "tutorial" for "question". Otherwise, this might end up anywhere.

ADD REPLY
1
Entering edit mode
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.7 years ago
Zentrum für Molekularbiologie, Universi…

To start with your second question:

 2) Another question is: when I want to the sizeFactors is 1 and 2,  why I need input -1 and -2 ?

What makes you think that you have to input -1 and -2? This is, of course, wrong. If the size factors you calculated are 1.23 and 0.78, then you should put

sizeFactors(data) <- c( 1.23, 0.78 )

 

1) I am not sure whether i can do like this ?

Yes, of course, you can manually specify size factors, provided that they are correct. Whether your size factors are correct depends on how you calculated them. Your example, "1 and 2", does not look like realistic values.

However, please note that spike-ins are suitable and helpful for normalization only in certain circumstances. You would need to explain more about your experiment (and especially: on when exactly in the sample prep you spike them in) to see whether they are appropriate.

ADD COMMENT
0
Entering edit mode
gan.haiyun • 0
@ganhaiyun-8692
Last seen 8.7 years ago
United States

Thank you very much !

base on the spike-in, we decide the sizeFactors is 1, 3.82, 3.63.

so i run R:

> count_table <- read.table("count.txt", header=T, sep="\t", row.names=1);
> expt_design <- data.frame(row.names = colnames(count_table), condition = c("wt","ko","ko"));
> conditions = expt_design$condition;
> library("DESeq");
> data <- newCountDataSet(count_table, conditions)
> sizeFactors(data) <- c(1, 3.82, 3.63)
> data = estimateDispersions(data) 

> res = nbinomTest(data, "wt", "ko" );

when i running estimateDispersions, i get the following warning messages. 

i am not sure this warning messages will effect the results ?

 

warning messages:

1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
2: step size truncated due to divergence 
3: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
4: step size truncated due to divergence 
5: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
6: step size truncated due to divergence 
ADD COMMENT
0
Entering edit mode

If the function runs without printing Error, I believe you can ignore these Warnings, which come from intermediate steps, but in the end the dispersion estimation routine converged.

ADD REPLY
0
Entering edit mode
@mikelove
Last seen 5 minutes ago
United States

hi Haiyun,

A note: from your code I can tell you are not using DESeq2 but DESeq (the original software). Please post the output of 

sessionInfo()

when posting to the support site so we can see what versions you are using. In the future, you might consider updating to DESeq2, as this is the version which is actively maintained.

ADD COMMENT
0
Entering edit mode
gan.haiyun • 0
@ganhaiyun-8692
Last seen 8.7 years ago
United States

Thank you very much.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DESeq_1.10.1       lattice_0.20-29    locfit_1.5-9.1     Biobase_2.18.0    
[5] BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] annotate_1.36.0      AnnotationDbi_1.20.7 DBI_0.2-7           
 [4] genefilter_1.40.0    geneplotter_1.36.0   grid_2.15.2         
 [7] IRanges_1.16.6       parallel_2.15.2      RColorBrewer_1.0-5  
[10] RSQLite_0.11.4       splines_2.15.2       stats4_2.15.2       
[13] survival_2.37-7      XML_3.98-1.1         xtable_1.7-3        

ADD COMMENT

Login before adding your answer.

Traffic: 938 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6