Normalization using DESeq
aangajala ▴ 20
@aangajala-12237
Last seen 3.7 years ago

I would like to use DESeq2 to normalize RNA-seq data, because I need to compare the expression of a gene across different samples. Is there a way to do the normalization in DESeq2 and export the data to CSV, so that I can analyze it in Excel? Also, please tell me whether a CPM matrix can be considered normalized data; as I understand it, the CPM matrix is created based on the design matrix. In my case, I used primary tumor and normal samples, but going further I will need to compare between groups within primary tumor. Thanks so much in advance.

normalization deseq2 • 6.6k views

Thanks a million for your reply. Regarding the first two lines of code above, do you know whether they generate counts that are normalized both row-wise and column-wise?

I am getting an error; please see the output below, followed by sessionInfo():

> dds <- Data

> dds <- estimateSizeFactors(dds)
Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function ‘estimateSizeFactors’ for signature ‘"data.frame"’
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Biobase_2.34.0       BiocGenerics_0.20.0  BiocInstaller_1.24.0

loaded via a namespace (and not attached):
[1] tools_3.3.2


Take a look at the DESeq2 vignette and manual. You have to create a DESeqDataSet first, if you have a matrix of data you can use DESeqDataSetFromMatrix.
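The error above arises because `Data` is a data.frame, and `estimateSizeFactors` has no method for that class. A minimal sketch of the conversion follows, assuming a count table with genes in rows and samples in columns plus a sample table with a `condition` column. The object names (`count_df`, `sample_info`) and the toy values are hypothetical stand-ins, not the poster's data.

```r
library(DESeq2)

# Toy stand-ins for the two input files (values are made up for illustration)
set.seed(1)
count_df <- as.data.frame(matrix(rnbinom(60, mu = 100, size = 5), nrow = 10,
                                 dimnames = list(paste0("gene", 1:10),
                                                 paste0("sample", 1:6))))
sample_info <- data.frame(condition = factor(rep(c("normal", "tumor"), each = 3)),
                          row.names = colnames(count_df))

# Build a DESeqDataSet from a count matrix rather than a data.frame;
# rows of colData must be in the same order as the columns of countData
dds <- DESeqDataSetFromMatrix(countData = as.matrix(count_df),
                              colData = sample_info,
                              design = ~ condition)

# The two lines from the answer now dispatch on the right class
dds <- estimateSizeFactors(dds)
mat <- counts(dds, normalized = TRUE)
```

Note that the `sessionInfo()` output above lists no DESeq2 among the attached packages, so `library(DESeq2)` also needs to be called first.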


Thanks so much for the reply. I changed type= for DESeq matrix and it worked out. :)

Question#1:

Is it possible to have multiple groups in Design matrix? In the documentation I read there were only two groups.

Question#2:

Which version of R should I use for DESeq2? I am getting the following warning; shall I ignore it? Warning in install.packages :  package ‘DeSeq’ is not available (for R version 3.3.2)

Question#3: I have obtained "mat", the normalized counts. I would like to know whether the counts are normalized both by rows and by columns.


You can have any number of groups or even more complex designs in DESeq2. I don't know where you read anything about only two groups.
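As a hedged illustration of a design with more than two groups, the sketch below relabels simulated data into three hypothetical groups (`normal`, `tumorA`, `tumorB`; these names are assumptions for the example) and extracts one pairwise comparison between the two tumor subgroups.

```r
library(DESeq2)

# Simulated counts; the three group labels are hypothetical
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$condition <- factor(rep(c("normal", "tumorA", "tumorB"), each = 4))
design(dds) <- ~ condition

dds <- DESeq(dds)

# Compare two of the three groups: tumorB versus tumorA
res <- results(dds, contrast = c("condition", "tumorB", "tumorA"))
```

Any other pair of levels can be compared by changing the last two entries of `contrast`.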

https://cran.r-project.org/

http://bioconductor.org/install/

If you loaded a counts matrix, the counts are normalized by columns.
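To make "normalized by columns" concrete, here is a simplified base-R sketch of the median-of-ratios idea behind size factors: one scaling factor is computed per sample (column) and every count in that column is divided by it. This is an illustration with made-up counts, not DESeq2's own code, and it omits the handling of zero counts.

```r
# Toy count matrix: genes in rows, samples in columns (made-up values);
# sample2 and sample3 are sample1 sequenced 2x and 3x as deeply
counts_mat <- matrix(c(10, 20, 30,
                       100, 200, 300,
                       50, 100, 150,
                       8, 16, 24),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Per-gene log geometric mean across samples
log_geo_means <- rowMeans(log(counts_mat))

# One size factor per sample: median ratio of its counts to the geometric means
size_factors <- apply(counts_mat, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})

# Divide each column by its size factor
normalized <- sweep(counts_mat, 2, size_factors, "/")
```

Because the three toy samples differ only in depth, the normalized columns come out identical. No gene-specific (row-wise) factor is applied anywhere.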

There is a new pipeline, called tximport, which can be used upstream of DESeq2, in which the counts are normalized by both rows and columns. tximport is a separate Bioconductor package.


You should read over the DESeq2 vignette and manuals more closely. Be very careful when typing things out. Every letter counts, and you have to capitalize things properly. For example, there is no such package "DeSeq". The package name is "DESeq2".


Hello Dr. Love,

Thanks so much for your help again. I truly appreciate it. I will be careful with my typing.

I have spent quite a bit of time reading about tximport, which would be perfect for my data. It looks like the input should be the output of the "kallisto" software, with multiple files. In my case I have two files (one with counts, one with metadata). Also, once txi$counts is created (which I am having trouble doing), it can be imported into DESeq2.

If you have only counts and the sample info you can't use tximport. You would need the reads in FASTQ format to start.

Thank you for the reply, Dr. Love. If possible, please tell me about any other tools that are helpful for row-wise normalization. I have FPKM-UQ data from TCGA, which is already normalized by column; I need to normalize by rows or samples. Thanks.


I don't really have any more advice.

This support forum, especially the package-tagged posts, is more for support about specific software, and I've told you all there is to say about obtaining normalized counts with DESeq2.

For more general purpose advice about bioinformatics, you should try posting here:

http://www.biostars.org


Okay thank you so much.

@mikelove
Last seen 1 day ago
United States

If you want to use DESeq2 to produce normalized counts you can do:

dds <- estimateSizeFactors(dds)
mat <- counts(dds, normalized=TRUE)

If you want to produce CPMs with DESeq2 you can do:

mat <- fpm(dds)

Note that the fpm() function will produce CPMs which do not necessarily add up to 1e6, because it is using a robust definition of library size instead of the column sum.
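The difference can be seen with the `robust` argument of `fpm()`. The sketch below uses simulated data from `makeExampleDESeqDataSet()` purely for illustration; the object names `cpm_robust` and `cpm_plain` are arbitrary.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 200, m = 6)  # simulated example data
dds <- estimateSizeFactors(dds)

cpm_robust <- fpm(dds, robust = TRUE)   # default: library size derived from size factors
cpm_plain  <- fpm(dds, robust = FALSE)  # library size taken as the column sum

colSums(cpm_robust)  # generally close to, but not exactly, 1e6
colSums(cpm_plain)   # 1e6 for every sample
```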

What you do with these values downstream is up to you.
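For the CSV export asked about in the question, base R's `write.csv()` is one option. The sketch below uses a small made-up matrix as a stand-in for `mat`, and the output file name is hypothetical.

```r
# Stand-in for the matrix returned by counts(dds, normalized = TRUE)
mat <- matrix(c(18.2, 18.2, 181.7, 181.7), nrow = 2, byrow = TRUE,
              dimnames = list(c("gene1", "gene2"), c("sample1", "sample2")))

# Hypothetical output location; replace with a path of your choice
out_file <- file.path(tempdir(), "normalized_counts.csv")
write.csv(as.data.frame(mat), file = out_file)

# Read it back to confirm genes are rows and samples are columns
check <- read.csv(out_file, row.names = 1)
```

The resulting file opens directly in Excel, with gene identifiers in the first column.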