I am analyzing RNA-seq data for lung cancer patients available through cBioPortal/TCGA. Specifically, I want to pass the patient RNA-seq count data to varianceStabilizingTransformation (VST) from DESeq2 before doing downstream clustering. When I do so, I receive the error: "Error in DESeqDataSet(se, design = design, ignoreRank): some values in assay are negative". The count data from cBioPortal does indeed contain negative values. This makes sense considering that the count data had already been log transformed (and potentially normalized; I haven't been able to verify this in the cBioPortal documentation or in other literature) before I obtained it.
I've read the DESeq2 vignette and related literature, but I am struggling with how to handle these negative values. I've seen some researchers simply set negative values to 0. Others add a constant to the entire matrix so that all values are > 0. Colleagues have suggested simply 'undoing' the log transformation before passing the count data to VST. How should I handle these negative values before passing the data to VST? Below is my code as well as a sample of my input patient data. Thanks in advance!
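library(DESeq2)
# Read the tab-delimited expression file downloaded from cBioPortal
analysis <- read.delim("C:/..../analysis.txt")
# Convert the data frame to a matrix before passing it to DESeq2
gene_matrix <- as.matrix(analysis)
stabilized <- varianceStabilizingTransformation(gene_matrix, blind = TRUE, fitType = "parametric")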
>> Error in DESeqDataSet(se, design = design, ignoreRank): some values in assay are negative.
Patient Data Sample (directly from cBioPortal)
SAMPLE_ID Gene A Gene B Gene C Gene D Gene E Gene F Gene G
TCGA-05-4249-01 -0.7275 -0.7416 -0.8330 5.1667 -0.7212 -0.2081 0.9704
TCGA-05-4384-01 -0.8908 -0.8282 -1.1507 -0.1649 -0.4065 -0.8573 0.3573
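To make the three workarounds I mentioned concrete, here is a minimal sketch of each one applied to the two sample rows above, using base R only. The object names, the offset in option 2, and the log base in option 3 are my own assumptions; in particular I have not been able to confirm that the values are log2-transformed. As far as I understand, DESeq2 expects non-negative integer counts, so the last lines check whether any of the results is integer-valued.

```r
# Two sample rows copied from the cBioPortal data shown above
# (samples in rows, genes in columns).
expr <- matrix(
  c(-0.7275, -0.7416, -0.8330,  5.1667, -0.7212, -0.2081, 0.9704,
    -0.8908, -0.8282, -1.1507, -0.1649, -0.4065, -0.8573, 0.3573),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("TCGA-05-4249-01", "TCGA-05-4384-01"),
                  paste("Gene", LETTERS[1:7]))
)

# Option 1: set negative values to 0.
clamped <- pmax(expr, 0)

# Option 2: add a constant so that every value is > 0.
# The offset (1 - min) is a hypothetical choice, not a recommended value.
shifted <- expr + (1 - min(expr))

# Option 3: 'undo' the log transformation.
# ASSUMPTION: the values are plain log2(x) with no pseudocount (unverified).
unlogged <- 2^expr

# Check whether each result consists of whole numbers only.
sapply(list(clamped = clamped, shifted = shifted, unlogged = unlogged),
       function(m) all(m == round(m)))
```

All three options remove the negative values, but I don't know which of them, if any, gives VST an input it can treat as counts.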