Data normalization
nolwenn

Hello,

I'm new to bioinformatics and I would like to normalize htseq-count data for a survival analysis. How do I choose the best normalization?

Thank you for your help!

@mikelove

See the DESeq2 vignette on variance stabilizing transformations. This is one approach to dealing with sequencing depth and heteroskedasticity, so you can work with scaled, transformed data downstream. The data is approximately log2 scale after running vst.

I tried this transformation:

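library(DESeq2)

# Build the dataset from the raw count matrix and the sample annotations
DESeq_object <- DESeqDataSetFromMatrix(countData = count,
                                       colData = coldata,
                                       design = ~ gender)
# Variance stabilizing transformation (output is roughly on the log2 scale)
vst_object <- varianceStabilizingTransformation(DESeq_object)
boxplot(assay(vst_object))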

I get this boxplot: [image: boxplot of the normalized data]

The medians are not all aligned. How can I align them to get a good normalization?

VST is not quantile normalization. It adjusts for sequencing depth and transforms to log2 scale. It also does not perform sample QC for you.
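
To make "adjusts for sequencing depth" a bit more concrete: the size factors DESeq2 uses come from the median-of-ratios method, which gives one scaling factor per sample. Here is a minimal base-R sketch of that idea. The toy counts are made up and `median_ratio_size_factors` is a hypothetical helper name for illustration; on real data you would call `estimateSizeFactors()` rather than this.

```r
# Toy count matrix: genes in rows, samples in columns (made-up values).
# sample2 has 2x and sample3 has 4x the depth of sample1.
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                    50, 100, 200),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Hypothetical helper: median-of-ratios size factors, one per sample.
median_ratio_size_factors <- function(counts) {
  log_counts <- log(counts)
  # Per-gene log geometric mean across samples (the pseudo-reference sample)
  log_geo_means <- rowMeans(log_counts)
  # Genes with a zero count in any sample have a -Inf mean and are skipped
  keep <- is.finite(log_geo_means)
  apply(log_counts[keep, , drop = FALSE], 2, function(col) {
    exp(median(col - log_geo_means[keep]))
  })
}

sf <- median_ratio_size_factors(counts)

# Divide each sample (column) by its own size factor
normalized <- sweep(counts, 2, sf, "/")
print(sf)
print(normalized)
```

Each sample is only divided by a single number, so the shape of its distribution is untouched. That is why boxplot medians can still differ afterwards, whereas quantile normalization forces every sample onto the same distribution.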

And for the survival analysis part, I would take the VST expression levels and then follow my tutorial from Step 2: "Tutorial: Survival analysis with gene expression".

I would like to predict how many months a patient survives after diagnosis. Do I need to test each gene independently?

Can I use the median ratio normalization method together with the VST? How can I do QC? I'm sorry, I'm new to bioinformatics.

This is out of scope for what I can provide on the support site. Perhaps consult a bioinformatician, or a general-purpose site such as Biostars (see Kevin's post above).

Thank you for your help.

I'm sorry, I have another question about vst. I tried to use vst and plot the data with meanSdPlot as in the vignette, and I don't see a real difference between the log2-transformed normalized data and the VST data. Is that normal?

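library(vsn)  # provides meanSdPlot()

# log2 of the size-factor-normalized counts, with a pseudocount of 1
DESeq_object <- estimateSizeFactors(DESeq_object)
counts_normalized <- counts(DESeq_object, normalized = TRUE)
meanSdPlot(log2(counts_normalized + 1), ranks = FALSE)

[image: meanSdPlot of the log2-transformed normalized counts]

# Variance stabilizing transformation, blind to the design
vst_object <- varianceStabilizingTransformation(DESeq_object, blind = TRUE)
meanSdPlot(assay(vst_object), ranks = FALSE)

[image: meanSdPlot of the VST data]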

Thank you!

As I said, the VST is similar to a log2 transformation, but it stabilizes the variance of low-count genes better.
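
To illustrate what "stabilization of variance" means here, the base-R sketch below simulates Poisson counts at a few mean levels and compares the spread after `log2(x + 1)` with the spread after a square root. Everything in it is an assumption made for illustration: the means, the seed, and above all the square root, which is the textbook variance-stabilizing transform for Poisson counts and only a stand-in. DESeq2's VST is a different function, derived from the dispersion-mean trend fitted to your data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical expression levels: mean counts from low to high
means <- c(2, 20, 200, 2000)

# Simulate 10,000 Poisson counts at each mean
sims <- lapply(means, function(mu) rpois(10000, lambda = mu))

# Standard deviation after log2(x + 1): depends strongly on the mean
sd_log2 <- sapply(sims, function(x) sd(log2(x + 1)))

# Standard deviation after sqrt(x), a stand-in variance-stabilizing transform
sd_sqrt <- sapply(sims, function(x) sd(sqrt(x)))

print(data.frame(mean = means,
                 sd_log2 = round(sd_log2, 3),
                 sd_sqrt = round(sd_sqrt, 3)))
```

On the log2 scale the low-count level is much noisier than the high-count level, while after the stabilizing transform the spread is roughly the same at every level. With real data the two meanSdPlot panels can still look alike when most genes have moderate or high counts, because the two transformations mainly differ at the low-count end.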

The user has now posted on Biostars: https://www.biostars.org/p/9464165/#9464165

Thank you for your answer!
