Question: Normalization using DESeq
2.7 years ago, aangajala wrote:

I would like to use DESeq2 to normalize RNA-seq data, because I need to compare the expression of a gene across different samples. Is there a way to do the normalization in DESeq2 and export the result to CSV, so that I can analyze the data in Excel? Also, can a CPM matrix be considered normalized data? As I understand it, the CPM matrix is created based on the design matrix. So far I have used primary tumor and normal samples, but going further I will need to compare groups within the primary tumors. Thanks so much in advance.

normalization deseq2 • 3.1k views
modified 2.7 years ago • written 2.7 years ago by aangajala

Thanks a million for your reply. Regarding the first two lines of code in your answer, do you know whether they generate counts that are normalized both row-wise and column-wise?

I am getting an error; please see the output and sessionInfo() below:

> dds <- Data

> dds <- estimateSizeFactors(dds)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘estimateSizeFactors’ for signature ‘"data.frame"’
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Biobase_2.34.0       BiocGenerics_0.20.0  BiocInstaller_1.24.0

loaded via a namespace (and not attached):
[1] tools_3.3.2




written 2.7 years ago by aangajala

Take a look at the DESeq2 vignette and manual. You have to create a DESeqDataSet first; if you have a matrix of counts, you can use DESeqDataSetFromMatrix().
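
A minimal sketch of that step, assuming DESeq2 is installed. The object name `Data` follows the session output above, but its contents, the sample names, the `coldata` table and the `condition` column are all made up for illustration.

```r
# Requires the Bioconductor package DESeq2.
library(DESeq2)

# Hypothetical raw counts as a data.frame: genes in rows, samples in columns.
Data <- data.frame(
  tumor1  = c(120L, 30L, 0L, 560L, 75L),
  tumor2  = c(200L, 45L, 3L, 610L, 90L),
  normal1 = c(80L, 60L, 1L, 300L, 40L),
  normal2 = c(95L, 52L, 2L, 280L, 55L),
  row.names = paste0("gene", 1:5)
)

# Hypothetical sample table: one row per count column, in the same order.
coldata <- data.frame(
  condition = factor(c("tumor", "tumor", "normal", "normal")),
  row.names = colnames(Data)
)

# estimateSizeFactors() has no method for a data.frame,
# so the counts are wrapped in a DESeqDataSet first.
dds <- DESeqDataSetFromMatrix(
  countData = as.matrix(Data),
  colData   = coldata,
  design    = ~ condition
)

dds <- estimateSizeFactors(dds)
mat <- counts(dds, normalized = TRUE)
```

The key point is that `estimateSizeFactors()` is called on the DESeqDataSet rather than on the raw data.frame, which is what triggered the "unable to find an inherited method" error.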

written 2.7 years ago by Michael Love

Thanks so much for the reply. I changed type= for the DESeq matrix and it worked out. :)

Question #1: Is it possible to have multiple groups in the design matrix? In the documentation I read, there were only two groups.

Question #2: Which version of R should I use for DESeq2? I am getting the following warning; should I ignore it? Warning in install.packages :  package ‘DeSeq’ is not available (for R version 3.3.2)

Question #3: I have obtained "mat", the normalized counts. Are the counts normalized by both rows and columns?

Thanks so much in advance.


modified 2.7 years ago • written 2.7 years ago by aangajala

You can have any number of groups or even more complex designs in DESeq2. I don't know where you read anything about only two groups.
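
As a rough illustration of a design with more than two groups, the base-R sketch below expands a three-level factor into a design matrix with `model.matrix()`. The group names and the `coldata` table are hypothetical; with DESeq2 the same formula would be passed as `design = ~ group`.

```r
# Hypothetical sample table with three groups (names are made up).
coldata <- data.frame(
  group = factor(
    c("normal", "normal", "tumorA", "tumorA", "tumorB", "tumorB"),
    levels = c("normal", "tumorA", "tumorB")  # first level is the reference
  )
)

# Expand the formula into a design matrix: an intercept plus
# one indicator column per non-reference group.
design_matrix <- model.matrix(~ group, data = coldata)

colnames(design_matrix)
```

After running `DESeq()` on a dataset built with this design, a specific pair of groups can be compared with `results(dds, contrast = c("group", "tumorB", "tumorA"))`.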

You should use the latest version of R for the latest version of Bioconductor (and therefore DESeq2):

The version of Bioconductor which is installed depends on the version of R you have. If you download the latest version of R, you will get the latest version of Bioconductor.

If you loaded a counts matrix, the counts are normalized by columns.

There is a new pipeline, called tximport, which can be used upstream of DESeq2, in which the counts are normalized by both rows and columns. tximport is a separate Bioconductor package.

written 2.7 years ago by Michael Love

You should read over the DESeq2 vignette and manuals more closely. Be very careful when typing things out. Every letter counts, and you have to capitalize things properly. For example, there is no such package "DeSeq". The package name is "DESeq2".

written 2.7 years ago by Michael Love

Hello Dr. Love,

Thanks so much for your help again. I truly appreciate it. I will be careful with my typing.

I have spent quite a bit of time reading about tximport, which would be perfect for my data. It looks like the input should be the output of the "kallisto" software, with multiple files. In my case I have two files (one with counts, one with metadata). Also, I understand that once txi$counts is created it can be imported into DESeq2, but I am having trouble creating it.

Please help.

written 2.7 years ago by aangajala

If you have only counts and the sample info you can't use tximport. You would need the reads in FASTQ format to start.
modified 2.7 years ago • written 2.7 years ago by Michael Love

Thank you for the reply, Dr. Love. If possible, please tell me about any other tools that are helpful for row-wise normalization. I have FPKM-UQ from TCGA, which is already normalized by column; I need to normalize by rows or samples. Thanks.

written 2.7 years ago by aangajala

I don't really have any more advice.

This support forum, especially the package-tagged posts, is more for support about specific software, and I've told you all there is to say about obtaining normalized counts with DESeq2.

For more general purpose advice about bioinformatics, you should try posting here:

written 2.7 years ago by Michael Love

Okay thank you so much.

written 2.7 years ago by aangajala

Answer: Normalization using DESeq
2.7 years ago, Michael Love (United States) wrote:

If you want to use DESeq2 to produce normalized counts you can do:

dds <- estimateSizeFactors(dds)
mat <- counts(dds, normalized=TRUE)

If you want to produce CPMs with DESeq2 you can do:

mat <- fpm(dds)

Note that the fpm() function will produce CPMs which do not necessarily add up to 1e6, because it is using a robust definition of library size instead of the column sum.
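
To see why the column sums can differ from 1e6, here is a base-R sketch on a toy matrix. It is not DESeq2's code: it assumes the robust library size is the median-of-ratios size factor multiplied by the geometric mean of the column sums, and the counts and sample names are made up.

```r
# Toy raw counts (hypothetical): 4 genes x 3 samples.
raw_counts <- matrix(
  c(10, 20, 30, 40,     # s1
    20, 40, 60, 80,     # s2: s1 sequenced twice as deeply
    10, 20, 30, 400),   # s3: like s1, but one gene is strongly up
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3"))
)

# Median-of-ratios size factors: per sample, the median ratio of each
# gene's count to that gene's geometric mean across samples.
log_geo_means <- rowMeans(log(raw_counts))
size_factors <- apply(raw_counts, 2, function(col) {
  keep <- is.finite(log_geo_means) & col > 0
  exp(median(log(col[keep]) - log_geo_means[keep]))
})

# Plain CPM: divide each column by its own sum.
cpm_plain <- 1e6 * sweep(raw_counts, 2, colSums(raw_counts), "/")

# Robust variant (assumption stated above): the library size is the
# size factor times the geometric mean of the column sums.
robust_lib <- size_factors * exp(mean(log(colSums(raw_counts))))
cpm_robust <- 1e6 * sweep(raw_counts, 2, robust_lib, "/")

colSums(cpm_plain)   # every column sums to 1e6
colSums(cpm_robust)  # columns no longer sum to 1e6
```

In this toy example s1 and s3 receive the same size factor, because the single strongly expressed gene in s3 does not move the median ratio, whereas it inflates the column sum used by the plain CPM.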

What you do with these values downstream is up to you.
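
For the CSV export asked about in the question, a minimal base-R sketch follows. The small `mat` here is a made-up stand-in for the matrix returned by `counts(dds, normalized=TRUE)`, and the file name is hypothetical.

```r
# Stand-in for the normalized count matrix (values are made up).
mat <- matrix(
  c(151.2, 37.8, 0.0, 160.4, 36.1, 2.4),
  nrow = 3,
  dimnames = list(c("gene1", "gene2", "gene3"), c("tumor1", "normal1"))
)

# Write to a temporary location; replace with a path of your choice.
out_file <- file.path(tempdir(), "normalized_counts.csv")

# Gene names are written as the first column, sample names as the header.
write.csv(as.data.frame(mat), file = out_file)

# Read the file back to confirm the round trip.
check <- read.csv(out_file, row.names = 1)
```

The resulting file opens directly in Excel, with one row per gene and one column per sample.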

written 2.7 years ago by Michael Love