Hiya,
I have RNA-seq data for 105 samples, in two expression files: (i) the gene counts matrix (rows genes, columns samples) and (ii) the VST-normalised matrix from DESeq2 (rows genes, columns samples).
LCA <- as.matrix(read.delim("Counts_filtered100.txt", header=TRUE, row.names=1))
pheno_LCA <- read.csv("SampleData_.csv", row.names=1)
pheno_LCA$Cat <- factor(pheno_LCA$Cat, levels = c("A","B","C","D"))
full_mod = model.matrix(~as.factor(Cat), data=pheno_LCA)
null_mod = model.matrix(~1, data=pheno_LCA)
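Before running sva on these objects, it may be worth confirming that the rows of the sample sheet are in the same order as the columns of the expression matrix, since the model matrices are built from the sample sheet alone. A minimal sketch on toy data follows; `counts_toy`, `pheno_toy` and `mod_toy` are hypothetical stand-ins for `LCA`, `pheno_LCA` and `full_mod`, not the real files.

```r
# Toy stand-ins with the same layout as LCA (genes x samples) and pheno_LCA
# (one row per sample). All values here are made up for illustration.
counts_toy <- matrix(1:12, nrow = 3,
                     dimnames = list(paste0("gene", 1:3), paste0("S", 1:4)))
pheno_toy <- data.frame(
  Cat = factor(c("A", "B", "C", "D"), levels = c("A", "B", "C", "D")),
  row.names = paste0("S", c(2, 1, 4, 3))  # deliberately out of order
)

# Reorder the sample sheet so its rows follow the columns of the count matrix
pheno_toy <- pheno_toy[colnames(counts_toy), , drop = FALSE]
stopifnot(identical(rownames(pheno_toy), colnames(counts_toy)))

# The model matrix should then have one row per column of the expression matrix
mod_toy <- model.matrix(~ Cat, data = pheno_toy)
dim(mod_toy)
```

If the `stopifnot()` line fails on the real data, the sample names differ between the two files and need reconciling before any model is fitted.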
I ran the following four options for (i) and for (ii):
svaobj = svaseq(LCA,full_mod,null_mod,n.sv=NULL,numSVmethod="be",B=20)
svaobj = svaseq(LCA,full_mod,null_mod,n.sv=NULL,numSVmethod="leek")
svaobj = sva(LCA,full_mod,null_mod,n.sv=NULL,numSVmethod="be",B=20)
svaobj = sva(LCA,full_mod,null_mod,n.sv=NULL,numSVmethod="leek")
RESULTS:
A/ sva function on counts data:
(i) Method = Leek, Number of significant surrogate variables is: 101
Iteration (out of 5 ):
Error in density.default(x, adjust = adj) : 'x' contains missing values
In addition: Warning message: In pf(fstats, df1 = (df1 - df0), df2 = (n - df1)) : NaNs produced
(ii) Method = Be, Number of significant surrogate variables is: 1
B/ sva function on VST DESeq2 output data:
(i) Method = Leek, Number of significant surrogate variables is: 1
(ii) Method = Be, Number of significant surrogate variables is: 15
A/ svaSeq function on counts data:
(i) Method = Leek, Number of significant surrogate variables is: 2
(ii) Method = Be, Number of significant surrogate variables is: 8
B/ svaSeq function on VST DESeq2 output data:
(i) Method = Leek, No significant surrogate variables
(ii) Method = Be, Number of significant surrogate variables is: 18
My questions are:
Am I correct in thinking that normalised counts (i.e. VST from DESeq2) should be used with svaSeq for RNA-seq gene expression data, and that the svaSeq B/ (ii) result (VST data, method Be, 18 surrogate variables) is therefore the correct output to take forward?
What is the cause of the error in the sva A/ (i) run (counts data, method Leek)? I can't find a reason for it: there are no rows in the counts matrix with a sum of zero.
Is there anywhere that the methods "leek" and "be" are described or contrasted, please? I couldn't see this in the manual. Also, how does one choose what "B" should be? I have just used 20, as that was the value in the example I found.
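One possible lead on the sva error with counts and the Leek method, offered as a guess rather than a confirmed diagnosis: the warning comes from a call of the form pf(fstats, df1 = (df1 - df0), df2 = (n - df1)), and with 105 samples, a 4-column full model and 101 estimated surrogate variables, the residual degrees of freedom would be zero. The arithmetic can be sketched in base R; the F statistic of 2.5 is an arbitrary illustrative value, and the assumption that sva appends all 101 surrogate variables to the model before this call is not verified here.

```r
n_samples <- 105  # samples reported in the post
n_model   <- 4    # columns of full_mod: intercept plus 3 terms for levels A-D
n_sv      <- 101  # surrogate variables reported by the "leek" estimate on raw counts

# Residual degrees of freedom if every surrogate variable joins the model
resid_df <- n_samples - (n_model + n_sv)
resid_df

# An F distribution with zero denominator degrees of freedom is undefined,
# so pf() returns NaN (the warning is suppressed here to keep output clean).
# 2.5 is an arbitrary F statistic used only for illustration.
p <- suppressWarnings(pf(2.5, df1 = n_model - 1, df2 = resid_df))
is.nan(p)
```

If this is what is happening, the missing values passed to density() would follow from NaN p-values rather than from anything wrong with individual rows of the counts matrix.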
best wishes, Bex