Dear all (or, more specifically, Michael Love, if I get lucky) :)
Hope that you are doing well.
I am contacting you regarding the R package DESeq2. I have been using this package for some years now, but a question only came up this week while I was brainstorming with a biostatistics technician at my institute. :)
For exploratory analysis of RNA-seq data, we have the option of using either the rlog or the VST transformation.
In your paper "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2" you have this statement:
while the VST is also effective at stabilizing variance, it does not directly take into account differences in size factors; and in datasets with large variation in sequencing depth (dynamic range of size factors ≳4) we observed undesirable artifacts in the performance of the VST.
But the help page for the R vst function says:
The rlog is less sensitive to size factors, which can be an issue when size factors vary widely.
After reading these two statements I am confused about which transformation copes better with differences in size factors. Can you help me?
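To make the question concrete, this is roughly how I would check the dynamic range of the size factors before choosing a transformation. It is only a sketch: the dataset is simulated with makeExampleDESeqDataSet, and the size factors I pass to it are made up for illustration, not taken from my real data.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for my real dataset; the sizeFactors values are hypothetical
dds <- makeExampleDESeqDataSet(n = 500, m = 6,
                               sizeFactors = c(0.5, 0.5, 1, 1, 2, 2))

# Estimate size factors with the default median-of-ratios method
dds <- estimateSizeFactors(dds)
sf <- sizeFactors(dds)

# Dynamic range of the size factors: largest divided by smallest
sfRange <- max(sf) / min(sf)
sfRange
```

With my real object I would simply run the last three lines on my own dds and compare the result with the value of about 4 mentioned in the paper.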
While I am here, I would also like to ask about a warning that I had never seen before. I have searched on Google for an answer without success.
> dds <- DESeqDataSetFromMatrix(countData = table, colData = data, design = ~RIN+group)
> the design formula contains one or more numeric
> variables that have mean or standard deviation larger than 5 (an
> arbitrary threshold to trigger this message). it is generally a good
> idea to center and scale numeric variables in the design to improve
> GLM convergence.
Is this a problem for the exploratory analysis? And what about the differential expression (DEG) analysis?
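If I understand the message correctly, it is suggesting something like the following. This is a minimal sketch with a made-up sample table (the RIN values, the group labels and the column name RIN_scaled are hypothetical), and I have not confirmed that this is the recommended way to do it.

```r
# Hypothetical sample table standing in for my real colData
data <- data.frame(
  RIN   = c(6.1, 7.4, 8.0, 8.9, 9.3, 7.7),
  group = factor(c("ctrl", "ctrl", "ctrl", "case", "case", "case"))
)

# Center RIN to mean 0 and scale it to standard deviation 1.
# scale() returns a one-column matrix, so convert it back to a plain vector.
data$RIN_scaled <- as.numeric(scale(data$RIN))

mean(data$RIN_scaled)
sd(data$RIN_scaled)

# The design would then use the scaled column instead of the raw one:
# design = ~ RIN_scaled + group
```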
Also, after that I get:
> dds <- estimateSizeFactors(dds, controlGenes=index)
> dds <- DESeq(dds)
> using pre-existing size factors
> estimating dispersions
> gene-wise dispersion estimates
> mean-dispersion relationship
> final dispersion estimates
> fitting model and testing
> 1 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
I searched on Google and found that you had already replied to someone about this (https://github.com/mikelove/DESeq2/issues/3 ...), so I did:
> dds<-estimateSizeFactors(dds, controlGenes=index)
> dds<-estimateDispersions(dds)
> dds<-nbinomWaldTest(dds, maxit=5000)
I am still getting the same warning. Should I keep increasing maxit, even though I will probably end up with the same message, or can I remove the row that fails to converge at this step?
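In case it helps, this is how I would find and drop the row in question. It is a sketch on simulated data from makeExampleDESeqDataSet rather than my real object, and the names conv and ddsClean are my own. I am assuming that subsetting on mcols(dds)$betaConv is an acceptable way to remove the row.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for my real dataset
dds <- makeExampleDESeqDataSet(n = 200, m = 6)

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds, maxit = 500)

# Logical flag per gene: TRUE if the coefficients converged
conv <- mcols(dds)$betaConv

# Names of the rows that did not converge (may be empty for simulated data)
rownames(dds)[which(!conv)]

# Keep only the converged rows; which() also drops rows where the flag is NA
ddsClean <- dds[which(conv), ]
```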
Thanks in advance, Andreia
Thanks for the quick reply.