Data normalization
nolwenn • 0
@c3b2b52f
France

Hello,

I'm new to bioinformatics and I would like to normalize HTSeq-count data for a survival analysis. How can I choose the best normalization?

Thank you for your help!

DESeq2 Normalization • 133 views
@mikelove
United States

See the DESeq2 vignette on variance stabilizing transformations. This is one approach to dealing with sequencing depth and heteroskedasticity, so you can work with scaled, transformed data downstream. The data are approximately on the log2 scale after running vst.


I tried this transformation:

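library(DESeq2)

# Build the dataset from the raw count matrix and the sample table
DESeq_object <- DESeqDataSetFromMatrix(countData = count,
                                       colData = coldata,
                                       design = ~ gender)
# Variance stabilizing transformation (output is roughly on the log2 scale)
vst_object <- varianceStabilizingTransformation(DESeq_object)
boxplot(assay(vst_object))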


I get this boxplot:

The medians are not all aligned. How can I align them to get a good normalization?


VST is not quantile normalization. It adjusts for sequencing depth and transforms to log2 scale. It also does not perform sample QC for you.
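To illustrate why adjusting for sequencing depth need not align the sample medians, here is a minimal base-R sketch of a median-of-ratios style size factor calculation. The toy count matrix is hypothetical, and the code is a simplified illustration of the idea, not the DESeq2 implementation (which, among other things, handles genes containing zero counts).

```r
# Hypothetical toy counts: 4 genes (rows) x 3 samples (columns).
# Samples 2 and 3 were sequenced roughly 2x and 4x deeper than sample 1,
# and gene4 is additionally higher in sample 3.
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                    50, 100, 200,
                     5,  10,  80),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Reference: per-gene geometric mean across samples (computed on the log scale)
log_geo_means <- rowMeans(log(counts))

# Size factor per sample: median ratio of its counts to the reference
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})

# Divide each column by its size factor to adjust for depth
normalized <- sweep(counts, 2, size_factors, "/")

print(size_factors)
print(apply(normalized, 2, median))
```

In this toy example the depth differences are removed, yet the median of sample 3 still differs from the others because one of its genes genuinely behaves differently. Forcing all medians or quantiles to match is what quantile normalization does, which is a different procedure.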


And for the survival analysis part, I would take the VST expression levels and then follow my tutorial from Step 2: "Tutorial: Survival analysis with gene expression".


I would like to predict how many months a patient survives after diagnosis. Do I need to test each gene independently?
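One common way to screen genes one at a time is to fit a separate univariate Cox proportional hazards model per gene. The sketch below is only an illustration under stated assumptions, not the code from the tutorial mentioned above: the expression matrix and clinical table are simulated stand-ins (in practice the expression values would be something like `assay(vst_object)`), and `cox_one_gene` is a hypothetical helper name.

```r
library(survival)  # recommended package shipped with most R installations

set.seed(1)
n_samples <- 60
n_genes <- 5

# Hypothetical stand-in for VST expression values: genes in rows, samples in columns
expr <- matrix(rnorm(n_genes * n_samples, mean = 8, sd = 1),
               nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("sample", seq_len(n_samples))))

# Hypothetical clinical data: follow-up time in months, event indicator (1 = death)
clinical <- data.frame(time = rexp(n_samples, rate = 1 / 24),
                       status = rbinom(n_samples, 1, 0.7))

# Fit one univariate Cox model for a single gene's expression vector
cox_one_gene <- function(x) {
  fit <- coxph(Surv(time, status) ~ x, data = cbind(clinical, x = x))
  s <- summary(fit)$coefficients
  c(hazard_ratio = s[1, "exp(coef)"], p_value = s[1, "Pr(>|z|)"])
}

# Apply to every gene, then correct for multiple testing (Benjamini-Hochberg)
results <- as.data.frame(t(apply(expr, 1, cox_one_gene)))
results$p_adjusted <- p.adjust(results$p_value, method = "BH")
print(results)
```

Because the data here are random, no gene is expected to be associated with survival; the point is only the shape of the per-gene loop and the multiple-testing adjustment.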


Can I use both the median-of-ratios normalization and the VST method? How can I do QC? I'm sorry, I'm new to bioinformatics.


This is out of scope for what I can provide on the support site. Perhaps consult a bioinformatician, or a general-purpose site such as Biostars (see Kevin's post above).


Thank you for your help.


I'm sorry, I have a new question about vst. I tried to use vst and plot the data with meanSdPlot as in the vignette, and I don't see a real difference between the normalized data and the vst data. Is that normal?

library(vsn)  # provides meanSdPlot()

# log2 of size-factor-normalized counts (pseudocount of 1)
DESeq_object <- estimateSizeFactors(DESeq_object)
counts_normalized <- counts(DESeq_object, normalized = TRUE)
meanSdPlot(log2(counts_normalized + 1), ranks = FALSE)

# Variance stabilizing transformation
vst_object <- varianceStabilizingTransformation(DESeq_object, blind = TRUE)
meanSdPlot(assay(vst_object), ranks = FALSE)


Thank you!


As I said, VST is similar to a log2 transformation, but it stabilizes the variance of low-count genes better.
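The low-count issue with a plain log2 transformation can be seen in a small base-R simulation. This is only a sketch under simplifying assumptions: the counts are simulated Poisson draws with made-up means, and the square root is used as a classical textbook variance-stabilizing transform for Poisson data. It is not the transformation DESeq2's VST applies.

```r
set.seed(42)
n <- 10000

# Hypothetical genes: one with low counts, one with high counts
low_counts  <- rpois(n, lambda = 5)
high_counts <- rpois(n, lambda = 1000)

# Standard deviation after log2(count + 1): depends strongly on the mean
sd_log_low  <- sd(log2(low_counts + 1))
sd_log_high <- sd(log2(high_counts + 1))

# Standard deviation after a square-root transform: roughly constant for Poisson data
sd_sqrt_low  <- sd(sqrt(low_counts))
sd_sqrt_high <- sd(sqrt(high_counts))

print(c(log2_low = sd_log_low, log2_high = sd_log_high))
print(c(sqrt_low = sd_sqrt_low, sqrt_high = sd_sqrt_high))
```

After log2(count + 1), the low-count gene is much noisier than the high-count gene, whereas a variance-stabilizing transform keeps the two spreads comparable. On real data the difference between the two meanSdPlot outputs is mostly visible at the low-mean end of the plot.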


The user has now posted on Biostars: https://www.biostars.org/p/9464165/#9464165
