My name is Mattia Furlan, and I am a student majoring in Physics of Complex Systems at the University of Turin (IT).
For my master's thesis project, I am currently working on the inference of transcription rates in eukaryotic cells from time-course RNA-seq data.
To this end I use DESeq2 to "prepare" the raw count data coming from the sequencing experiments, following this workflow:
- creation of a DESeq data set, function DESeqDataSetFromMatrix
- execution of the DESeq protocol, function DESeq
- normalization of the counts, function counts(... , normalized = TRUE)
- estimation of the variance for each gene and each time point as
  mu + alpha*mu^2, with mu <- assays( ... )[['mu']] and alpha <- dispersions( ... )
I would appreciate feedback on whether this procedure is correct.
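To make the last step of the list explicit, here is a minimal sketch of the arithmetic only, written in plain Python so that it runs without DESeq2. The numbers are made up for illustration, and the names `mu`, `alpha` and `nb_variance` are placeholders of mine: in the real workflow `mu` would be the matrix taken from assays( ... )[['mu']] and `alpha` the vector from dispersions( ... ), with one dispersion value per gene.

```python
# Toy values, made up for illustration; they do not come from a real DESeq2 fit.
# mu[g][t]: fitted mean for gene g in column t (stands in for assays( ... )[['mu']])
# alpha[g]: dispersion of gene g (stands in for dispersions( ... ))
mu = [
    [10.0, 20.0, 40.0],
    [100.0, 50.0, 25.0],
]
alpha = [0.25, 0.5]


def nb_variance(mu, alpha):
    """Return mu + alpha * mu^2 for every gene (row) and every column."""
    return [
        [m + a * m ** 2 for m in row]   # same dispersion for all columns of a gene
        for row, a in zip(mu, alpha)    # pair each gene's row with its dispersion
    ]


variance = nb_variance(mu, alpha)
for gene_index, row in enumerate(variance):
    print(gene_index, row)
```

The point of the sketch is only that each gene's single dispersion value is applied to all of that gene's fitted means, so the result has the same shape as the matrix of means.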
I am also considering applying the rlog transformation to my data, modifying the workflow above as follows:
- creation of a DESeq data set
- execution of the DESeq protocol
- rlog transformation, function rlog(... , blind = TRUE)
- back-transformation to the count scale as 2^assay( "rlog data" )
- estimation of the variance as described above
Is this modification legitimate? In particular, after the rlog transformation, is the variance computed in this way still a good estimate of the dispersion of the real counts?
Thank you all in advance for your attention. Best regards.