I am quite new to bioinformatics analysis and am interested in using the GSVA function with NanoString nCounter data. I have read that "the NanoString nCounter is most similar to RNA-Seq in that it processes discrete counts of measurement similar to RNA-Seq." I was therefore under the impression that, when passing NanoString data to GSVA, I should treat it like RNA-seq data. Is this a reasonable assumption?
After reading through the GSVA manual, it is unclear to me whether the raw NanoString expression data should be pre-processed (normalized according to the nCounter manual and then converted to log2) before being passed to the GSVA function. Based on the following passage from the GSVA manual, I am wondering whether normalizing and log2-transforming the data is optional:
"By default, kcdf="Gaussian" which is suitable when input expression values are continuous, such as microarray fluorescent units in logarithmic scale, RNA-seq log-CPMs, log-RPKMs or log-TPMs. When input expression values are integer counts, such as those derived from RNA-seq experiments, then this argument should be set to kcdf="Poisson"."
I would prefer to perform the pre-processing (normalization and log2 conversion) myself before passing the expression data to the GSVA function. Am I correct to use the default, kcdf="Gaussian", in this case? Is it also possible to pass the raw NanoString counts to the GSVA function and use the alternative, kcdf="Poisson"? If I did use the alternative, would the data be normalized internally by the function? If so, could someone point me to where I can read about the normalization that is performed within the function?
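To make concrete what I mean by pre-processing, here is a minimal sketch of the steps I have in mind. It is written in plain Python purely for illustration (the actual GSVA call would of course be in R). The gene names and counts are made up, the helper names are hypothetical, and the housekeeping-gene scaling is only my simplified reading of the nCounter normalization, not the official procedure.

```python
import math

def geometric_mean(values):
    # Geometric mean via logs; assumes all values are strictly positive.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalize_and_log2(counts, housekeeping, pseudocount=1.0):
    """counts maps gene name -> list of raw counts, one entry per sample."""
    n_samples = len(next(iter(counts.values())))
    # Per-sample geometric mean of the housekeeping genes.
    hk_means = [
        geometric_mean([counts[g][j] for g in housekeeping])
        for j in range(n_samples)
    ]
    # Scale every sample so its housekeeping mean matches the average one.
    target = sum(hk_means) / n_samples
    factors = [target / m for m in hk_means]
    # Apply the scaling factor, then log2-transform with a pseudocount.
    return {
        gene: [math.log2(c * factors[j] + pseudocount) for j, c in enumerate(row)]
        for gene, row in counts.items()
    }

# Made-up example: sample 2 has exactly twice the counts of sample 1.
raw = {
    "GENE_A": [100, 200],
    "GENE_B": [50, 100],
    "HK1": [400, 800],
    "HK2": [100, 200],
}
log_expr = normalize_and_log2(raw, housekeeping=["HK1", "HK2"])
```

The resulting continuous, log-scale matrix is what I would pass to GSVA with kcdf="Gaussian", whereas the untouched integer counts in `raw` are what I would pass with kcdf="Poisson".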
I look forward to your response.