Katherine
@katherine-24218
Hey,
I'm looking for differentially expressed genes in my dataset between different surfaces in marine water.
I have 4 surfaces: water, glass, PET and PE
I have 5 locations: A, B, C, D and E
I also seem to have a batch effect, as locations B and D were run on the sequencer on a different day from A, C and E.
I tried to correct for this by designing a matrix that includes location as an effect and then using duplicateCorrelation to compensate for the batch effect. See the code below:
dge<- DGEList(counts=OTU,samples=targets, genes=tax, group=group)
keep<- filterByExpr(dge, min.count = 1)
dge <- dge[keep,,keep.lib.sizes=FALSE]
dge <- calcNormFactors(dge)
design <- model.matrix(~0+group+location)
v <- voom(dge,design, plot=TRUE)
corfit <- duplicateCorrelation(v,design,block=batch)
But I got this warning:
Warning message: In atanh(pmax(-1, rho)) : NaNs produced
Any suggestions would be much appreciated!
The day a run was done on the instrument rarely causes technical artifacts, unless there was a big QC problem with one of the runs. The simpler explanation is that B and D really are different from the others. If the instrument really were causing a problem, its effect would be completely confounded with location, and no algorithm can fix that.
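To see the confounding concretely, here is a minimal base-R sketch. It assumes a hypothetical sample sheet with one sample per surface and location (the real `targets` table is not shown in the question) and labels the two sequencing days `day1` and `day2`, which are made-up names. Because every location falls in exactly one batch, adding batch to a design that already contains location adds a column but no rank, so the batch effect cannot be estimated separately from location. This is also a likely reason `duplicateCorrelation` has nothing left to estimate when `block=batch` and location is already in the design.

```r
# Hypothetical sample sheet: 4 surfaces x 5 locations, one sample each.
# The real 'targets' table from the question is not shown, so this is an assumption.
targets <- expand.grid(
  surface  = c("water", "glass", "PET", "PE"),
  location = c("A", "B", "C", "D", "E")
)

# Batch as described in the question: B and D were sequenced on a different day.
# The labels "day1"/"day2" are illustrative only.
targets$batch <- factor(ifelse(targets$location %in% c("B", "D"), "day2", "day1"))

# Each location appears in exactly one batch, so batch is nested in location.
tab <- table(targets$location, targets$batch)
print(tab)

# Design with location only, and design with location plus batch.
design_loc  <- model.matrix(~ 0 + surface + location, data = targets)
design_both <- model.matrix(~ 0 + surface + location + batch, data = targets)

# Compare the number of columns with the matrix rank.
rank_loc  <- qr(design_loc)$rank
rank_both <- qr(design_both)$rank

cat("location only:   columns =", ncol(design_loc),  " rank =", rank_loc,  "\n")
cat("location + batch: columns =", ncol(design_both), " rank =", rank_both, "\n")
```

The second design has one more column than its rank, meaning the batch coefficient is not estimable once location is in the model. In practice this means a design such as `~0+group+location` already absorbs any day-of-run difference between B, D and the other locations, so no extra batch term or blocking step is needed for it.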